From cs at zip.com.au  Sat Aug  1 05:56:17 2015
From: cs at zip.com.au (Cameron Simpson)
Date: Sat, 1 Aug 2015 13:56:17 +1000
Subject: [Python-ideas] A different format for PI?
In-Reply-To:
References:
Message-ID: <20150801035617.GA23723@cskk.homeip.net>

On 23Jul2015 13:16, Abe Dillon wrote:
>Is there a forum or something similar related to python-ideas? If there
>isn't, I think there should be.
>
>The mailing list format is restrictive. There's no good way to search past
>discussions and the digests I get are disorganized and difficult to follow.
>I'd like to contribute, but I don't know if my ideas are topics that have
>already been discussed in depth or if they're actually new.

Others have posted about the many ways to search the archives. I want to
add 2 recommendations:

First, and incredibly important, if you want to participate in a list, or
even just to lurk, do not use digest mode. It is a disaster (regardless of
the list). Instead:

 - switch your subscription to individual messages

 - add a filing rule to your mailer to move all python-list (or whichever
   list) to its own folder. I put python-list, tutor, python-dev and
   python-ideas all in a shared "python" folder myself.

 - put your mail reader in "threads" mode, where all messages in a
   particular discussion are grouped together

This removes all the (misleading) attractions of "digest" mode.

Secondly, go to the python-ideas archive here:

  https://mail.python.org/pipermail/python-ideas/

and download all the past archives (those links labelled "Gzip'd Text ...").
Ungzip and concatenate into a large file. This can now be searched on your
local machine.

With some effort these archives can be converted into a normal mbox mail
file, which can be browsed in threading mode, should you wish to follow a
particular discussion from the past.

Cheers,
Cameron Simpson

All mail clients suck. This one just sucks less.
- www.mutt.org

From gmludo at gmail.com  Sat Aug  1 15:48:03 2015
From: gmludo at gmail.com (Ludovic Gasc)
Date: Sat, 1 Aug 2015 15:48:03 +0200
Subject: [Python-ideas] Concurrency Modules
In-Reply-To: <55B872BB.5080603@mail.de>
References: <559EFB73.5050606@mail.de>
 <9c139305-f583-46c1-b819-6a98dbd04acc@googlegroups.com>
 <55B2B0FB.1060409@mail.de> <55B3C93D.9090601@mail.de>
 <55B5508E.1000201@mail.de> <55B872BB.5080603@mail.de>
Message-ID:

2015-07-29 8:29 GMT+02:00 Sven R. Kunze :

> Thanks Ludovic.
>
> On 28.07.2015 22:15, Ludovic Gasc wrote:
>
> Hello,
>
> This discussion is pretty interesting to try to list when each
> architecture is the most efficient, based on the need.
>
> However, just a small precision: multiprocess/multiworker isn't antinomic
> with AsyncIO: You can have an event loop in each process to try to combine
> the "best" of two "worlds".
> As usual in IT, it isn't a silver bullet that will cure cancer,
> however, at least to my understanding, it should be useful for some
> business needs like server daemons.
>
>
> I think that should be clear for everybody using any of these modules. But
> you are right to point it out explicitly.
>

Based on my discussions at EuroPython and PyCON-US, it's certainly clear to
the middle-class management of the Python community, but not really to the
typical Python end-dev: several people tried to troll me by claiming that
multiprocessing is more efficient than AsyncIO.

To me, it was an opportunity to turn a negative troll attempt into a
positive exchange about efficiency, and about understanding before
trolling ;-)

More seriously, I have the feeling that it isn't very clear to everybody,
especially to newcomers.
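The "event loop in each process" pattern mentioned above can be sketched
roughly as follows. This is only an illustration under stated assumptions,
not code from API-Hour or gunicorn: the names handle, worker_main,
run_worker and run_all are hypothetical, asyncio.run requires a Python
version newer than the ones current in this thread, and the "fork" start
method assumes a POSIX system.

```python
import asyncio
import multiprocessing


async def handle(job_id):
    # Stand-in for I/O-bound work such as a network request.
    await asyncio.sleep(0.01)
    return f"job {job_id} done"


async def worker_main(job_ids):
    # Within one process, the event loop interleaves the waiting jobs.
    return await asyncio.gather(*(handle(job_id) for job_id in job_ids))


def run_worker(job_ids):
    # Each worker process creates and runs its own, independent event loop.
    return asyncio.run(worker_main(job_ids))


def run_all(batches):
    # Assumption: a POSIX system where the "fork" start method is available.
    context = multiprocessing.get_context("fork")
    with context.Pool(processes=len(batches)) as pool:
        # One batch of jobs per worker process.
        return pool.map(run_worker, batches)


if __name__ == "__main__":
    print(run_all([[1, 2], [3, 4]]))
```

The processes provide parallelism across cores, while the loop inside each
process handles many waiting connections cheaply; whether the combination
pays off still depends on the workload.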
> It isn't a crazy new idea, this design pattern is implemented since a long
> time ago at least in Nginx: http://www.aosabook.org/en/nginx.html
>
> If you are interested in to use this design pattern to build a HTTP server
> only, you can use easily aiohttp.web+gunicorn:
> http://aiohttp.readthedocs.org/en/stable/gunicorn.html
> If you want to use any AsyncIO server protocol (aiohttp.web, panoramisk,
> asyncssh, irc3d), you can use API-Hour: http://www.api-hour.io
>
> And if you want to implement by yourself this design pattern, be my guest,
> if a Python peon like me has implemented API-Hour, everybody on this
> mailing-list can do that.
>
> For communication between workers, I use Redis, however, you have plenty
> of solutions to do that.
> As usual, before to select a communication mechanism you should benchmark
> based on your use cases: some results should surprise you.
>
>
> I hope not to disappoint you.
>

Don't worry about that, don't hesitate to "hit", I have a very strong
shield against disappointment ;-)

> I actually strive not to do that manually for each tiny bit of program
>

You're right, micro-benchmarks aren't a good approach for deciding the
macro architecture of an application.

> (assuming there are many place in the code base where a project could
> benefit from concurrency).
>

As usual, it depends on your architecture/needs.
If you do much more network I/O than CPU work, the time spent waiting on
the network should favor more concurrency.

> Personally, I use benchmarks for optimizing problematic code.
>
> But if Python would be able to do that without choosing the right and
> correctly configured approach (to be determined by benchmarks) that would
> be awesome. As usual, that needs time to evolve.
>

It should be technically possible, however, I don't believe much in
implicit optimizations hidden from the end-dev: it's very complicated to
hide the magic, few people have the skills to implement that, and the day
you have an issue, you're almost alone.
See PyPy: one day they will certainly provide a good solution for that,
however, it isn't trivial to implement; just look at the time they need.

Over time, I believe more and more in educating developers, helping them to
understand the big picture and to use optimizations explicitly: the
learning curve is steeper, however, in the end you have more autonomous
developers who will solve more problems and are less afraid to break out
of the standard frame to innovate.

I don't have scientific proof of that, it's only a feeling.

However, again, the two approaches aren't antinomic: each time we get an
automagic optimization without side effects, like computed gotos, I will
use it.

> I found that benchmark resulted improvements do not last forever,
> unfortunately, and that most of the time nobody is able to keep track of
> everything. So, as soon as something changes, you need to start anew. That
> is not acceptable for me.
>

I fully agree with you: as long as it works, don't break it for the
pleasure.
Moreover, instead of trashing your full stack for efficiency reasons (for
example, dropping all your Python code to migrate to Go), where you need to
relearn everything, you should maybe first look for a solution in your
current stack.
At least for me, it was much less complicated to migrate to Python 3, the
multiworker pattern and AsyncIO than to migrate to Go/NodeJS/Erlang/...
Moreover, with a niche language, it's more complicated to find developers
and harder to spot impostors: some people use little-used alternative
languages only to try to convince others that they are good developers.

Another solution is to add more servers to handle the load, but that isn't
always the solution with the smallest TCO: don't forget to count sysadmin
costs plus the complexity of debugging when you have an issue in
production.

> Btw. that is also a reason why I said recently (another topic on this
> list), 'if Python could optimize that without my attention that would be
> great'.
> The simplest solution and therefore the easiest to comprehend for
> all team members is the way to go.
>

Again, I strongly agree with you, however, given the age of Python and the
size of the performance community we have (PyPy, Numba, Cython, Pyston...),
I believe that fewer and fewer automagic solutions without side effects
will be found.

Not impossible, but harder and harder (I secretly hope that somebody will
prove me wrong ;-) )

Maybe by "stealing" some optimizations from other languages?
I don't have the technical level to help with that, I'm more of a business
logic dev than a low-level dev.

> If that is not efficient enough that is actually a Python issue.
> Readability counts most. And fortunately, most of the cases that attitude
> works perfectly with Python. :)
>

Again and again, I agree with you: the combination of community size (big
toolbox and a lot of developers) and newcomer-friendly readability is
clearly a big win-win, at least for me.
The only issue I had was efficiency: with the success of our company, we
couldn't let the programming language/framework stop us from quickly
building efficient daemons, which is why I quickly dropped PHP and Ruby in
the past.
Now, with our new stack, based on the trusted predictions of our
fortune-telling telephony service department, we could survive a long time
before having to replace some Python parts with C or something else.

Have a nice week-end.

>
>
> Have a nice week.
>
> PS: Thank you everybody for EuroPython, it was amazing ;-)
>
> --
> Ludovic Gasc (GMLudo)
> http://www.gmludo.eu/
>
> 2015-07-26 23:26 GMT+02:00 Sven R.
Kunze :
>
>> Next update:
>>
>>
>> Improving Performance by Running Independent Tasks Concurrently - A Survey
>>
>>
>>                | processes               | threads                    | coroutines
>> ---------------+-------------------------+----------------------------+-------------------------
>> purpose        | cpu-bound tasks         | cpu- & i/o-bound tasks     | i/o-bound tasks
>>                |                         |                            |
>> managed by     | os scheduler            | os scheduler + interpreter | customizable event loop
>> controllable   | no                      | no                         | yes
>>                |                         |                            |
>> parallelism    | yes                     | depends (cf. GIL)          | no
>> switching      | at any time             | after any bytecode         | at user-defined points
>> shared state   | no                      | yes                        | yes
>>                |                         |                            |
>> startup impact | biggest/medium*         | medium                     | smallest
>> cpu impact**   | biggest                 | medium                     | smallest
>> memory impact  | biggest                 | medium                     | smallest
>>                |                         |                            |
>> pool module    | multiprocessing.Pool    | multiprocessing.dummy.Pool | asyncio.BaseEventLoop
>> solo module    | multiprocessing.Process | threading.Thread           | ---
>>
>>
>> *
>> biggest - if spawn (fork+exec) and always on Windows
>> medium - if fork alone
>>
>> **
>> due to context switching
>>
>>
>> On 26.07.2015 14:18, Paul Moore wrote:
>>
>> Just as a note - even given the various provisos and "it's not that
>> simple" comments that have been made, I found this table extremely
>> useful. Like any such high-level summary, I expect to have to take it
>> with a pinch of salt, but I don't see that as an issue - anyone who
>> doesn't fully appreciate that there are subtleties, probably wouldn't
>> read a longer explanation anyway.
>>
>> So many thanks for taking the time to put this together (and for
>> continuing to improve it).
>>
>> You are welcome. :)
>>
>> +1 on something like this ending up in the Python docs somewhere.
>>
>> Not sure how the process for this is but I think the Python gurus will
>> find a way.
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>

From srkunze at mail.de  Sat Aug  1 19:29:04 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Sat, 01 Aug 2015 19:29:04 +0200
Subject: [Python-ideas] fork - other approaches
Message-ID: <20150801172904.C34AA87403@smtp04.mail.de>

Thanks everybody for inspiring me with alternative ways of working with
pools.

I am very certain that any of them will work as intended. However, they do
not zero in 100% on my main intentions:

1) easy to understand
2) exchangeable (seq <-> par)


A) pmap

It originates from map and allows easy exchangeability back and forth
between sequential and concurrent/parallel execution.

However, I have to admit that I have difficulties changing all the 'for
loops' to map (mentally as well as for real). The 'for loop' IS the most
used loop construct in business applications and I do not see it going away
because of something else (such as map).


B) with Pool()

It removes the need to close and join the pool which removes the visual
clutter from the source code. That as such is great.

However, exchangeability is clearly not given and the same issue concerning
understandability as with pmap arises.


C) apply

Nick's approach of providing a 'call_in_background' solution comes close to
what would solve the issues at hand.

However, it reminds me of apply (deprecated built-in function for calling
other functions). So, a better name for it would be 'bg_apply'.


All of these approaches basically rip the function call out of the
programmer's view.

It is no longer

    function(arg)

but

    apply(function, arg)           # or
    bg_apply(function, arg)        # or
    bg_apply_many(function, args)


I don't see this going well in production and in code reviews.
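For readers who want to see what the call shapes above look like with
today's standard library, here is a minimal sketch. It assumes that
bg_apply and bg_apply_many are thin, hypothetical wrappers around
concurrent.futures.Executor.submit; the thread does not define them, and
Nick's actual helper may differ.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: a single shared thread pool with an arbitrary size.
_executor = ThreadPoolExecutor(max_workers=4)


def bg_apply(function, *args, **kwargs):
    # Hypothetical helper: schedule function(*args, **kwargs) in the
    # background and return a Future for its result.
    return _executor.submit(function, *args, **kwargs)


def bg_apply_many(function, args):
    # Hypothetical helper: one background call per argument.
    return [bg_apply(function, arg) for arg in args]


def square(x):
    return x * x


direct = square(3)                           # plain call: function(arg)
future = bg_apply(square, 3)                 # background call, returns a Future
futures = bg_apply_many(square, [1, 2, 3])   # several background calls
results = [f.result() for f in futures]      # .result() blocks until done
```

The function and its argument do move into the argument list of another
call, which is the readability cost described above; in exchange, the pool
and the returned Future objects are ordinary, inspectable objects.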
So, an expression keyword like 'fork' would still be better at least from
my perspective. It would tell me: 'it's not my responsibility anymore;
delegate this to someone else and get me a handle of the future result'.

Best,
Sven

From srkunze at mail.de  Sat Aug  1 19:36:27 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Sat, 01 Aug 2015 19:36:27 +0200
Subject: [Python-ideas] fork
Message-ID: <20150801173628.29BAB873FE@smtp04.mail.de>

Thanks everybody for the feedback on 'fork'.

Let me address the issues and specify it further:

1) Process vs. Thread vs. Coroutine

From my understanding, the main fallacy here is that the caller would be
able to decide which type of pool is best suited. Take create_thumbnail as
an example. You do not know whether this is cpu-bound or io-bound; you can
just make a guess or try it out.

But who knows then? I would say: the callee.

create_thumbnail is cpu-bound when doing the work itself on the machine.
create_thumbnail is io-bound when delegating the work to, say, a web
service.

SAME FUNCTIONALITY, SAME NAME, SAME API, DIFFERENT POOLS REQUIRED.

This said, I would propose something like a marking solution:

@cpu_bound
def create_thumbnail(image):
    # impl

@io_bound
def create_thumbnail(image):
    # impl

(coroutines are already marked as such)

From this, the Python interpreter should be able to infer which type of
pool is appropriate.

2) Pool size

Do lists have a fixed length? Do I need to define their lengths right from
the start? Do I know them in advance?

I think the answers to these questions are obvious. I don't understand why
it should be different for the size of the pools. They could grow and
shrink depending on the workload and the available resources.
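The marking idea from 1) can be approximated in library code today. The
sketch below is only an assumption about how such markers might behave:
cpu_bound, io_bound and fork are hypothetical names implemented here as
ordinary functions, not interpreter features, and the fixed pool size of 4
is arbitrary. The two thumbnail functions get different names only so that
both can live in one file.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_bound(func):
    # Hypothetical marker: tag the function and return it unchanged, so it
    # stays picklable by name for use in a process pool.
    func.pool_kind = "process"
    return func


def io_bound(func):
    # Hypothetical marker for work that mostly waits on I/O.
    func.pool_kind = "thread"
    return func


_executors = {}


def _executor_for(kind):
    # Create each pool lazily, the first time it is needed.
    if kind not in _executors:
        factory = ProcessPoolExecutor if kind == "process" else ThreadPoolExecutor
        _executors[kind] = factory(max_workers=4)
    return _executors[kind]


def fork(func, *args, **kwargs):
    # Library stand-in for the proposed keyword: choose the pool from the
    # callee's marker (unmarked functions default to threads) and return a
    # Future.
    kind = getattr(func, "pool_kind", "thread")
    return _executor_for(kind).submit(func, *args, **kwargs)


@io_bound
def fetch_thumbnail(image):
    # Pretend this delegates the work to a web service.
    return f"thumbnail of {image} (remote)"


@cpu_bound
def render_thumbnail(image):
    # Pretend this does the image processing locally.
    return f"thumbnail of {image} (local)"


if __name__ == "__main__":
    print(fork(fetch_thumbnail, "cat.png").result())
```

Here the callee declares what kind of work it does and the caller only
writes fork(f, x). What the sketch does not provide is pools that grow and
shrink with the workload, as asked for in 2).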
3) Pool Management in General

There is a reason why I hesitate to explicitly manage pools. Our code runs
on a plethora of platforms ranging from few to many hardware threads. We
actually do not want to integrate platform-specific properties right into
the source. The point of having parallelism and concurrency is to squeeze
more out of the machines and get better response times. Anything else
wouldn't be honest in my opinion (apart from researching and
experimenting).

Thus, a practical solution needs to be simple and universal. Explicitly
setting the size of the pool is not universal and definitely not easy.

It doesn't need to be perfect. Even if a first draft implementation would
simply define pools having exactly 4 processes/threads/coroutines, that
would be awesome. Even cutting execution time in half would be an amazing
accomplishment.

Maybe, even 'fork' is too complicated. It could work without it given the
decorators above. But then, we could not decide whether to run things in
parallel or sequentially. I think I do not like that.

4) Keyword 'fork'

Well, first shot. If you have a better one, I am all in for it (4 letters
or shorter only ;) )... Or maybe something like 'par' for parallel or 'con'
for concurrent.

5) Awaiting the Completion of Something

As Andrew proposed, using the return value should result in blocking.

What if there is no result to wait for? That one is harder but I think
another keyword like 'wait' or 'await' should work here fine.

for image in images:
    fork create_thumbnail(image)
wait
print(get_size_of_thumbnail_dir())

6) Exceptions

As close to sequential execution as possible.

That is, when some function is forked out and raises an exception, it
should behave as if it were a normal function call.

for image in images:
    fork create_thumbnail(image)  # I would like to see that in my stacktrace

Also true for expressions. '+=' might raise an exception because, say,
huge_calculation returns 'None'.
Although the actual evaluation of the sum needs to take place only at the
print statement, I would like to see the exception raised at the
highlighted place:

end_result = 0
for items in items_list:
    end_result += fork huge_calculation(items)  # stacktrace for '+=' should be here
print(end_result)                               # not here

Best,
Sven

From eric at trueblade.com  Sat Aug  1 19:43:49 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Sat, 1 Aug 2015 13:43:49 -0400
Subject: [Python-ideas] Briefer string format
In-Reply-To: <55B3E9B2.50709@trueblade.com>
References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com>
 <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com>
 <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com>
 <20150723033112.GJ25179@ando.pearwood.info>
 <20150723142213.GK25179@ando.pearwood.info>
 <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au>
 <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com>
Message-ID: <55BD0555.6010204@trueblade.com>

On 7/25/2015 3:55 PM, Eric V. Smith wrote:
> In trying to understand the issues for a PEP, I'm working on a sample
> implementation. There, I've just disallowed concatenation entirely.
> Compared to all of the other issues, it's really insignificant. I'll put
> it back at some point.

I'm basically done with my implementation of f-strings.

I really can't decide if I want to allow adjacent f-string concatenation
or not. I'm leaning towards not. I don't like mixing compile-time
concatenation with run-time expression evaluation. But my mind is not
made up.

One issue that has cropped up:

Should we support !s and !r, like str.format does?
It's not really needed, since with f-strings you can just call str or repr
yourself:

>>> f'{"foo":10}'
'foo       '
>>> f'{repr("foo"):10}'
"'foo'     "

Do we also need to support:

>>> f'{"foo"!r}'
"'foo'"

With str.format, !s and !r are needed because you can't put the call to
repr in str.format's very limited expression syntax. But since f-strings
support arbitrary expressions, it's not needed. Still, I'm leaning
toward including it for two reasons: it's concise, and there's no reason
to be arbitrarily incompatible with str.format. If I include !s and !r,
then the only way that str.format differs from f-string expressions is
in non-numeric subscripting (unfortunate, but discussed previously and I
think required). This ignores the fact that f-string expressions
encompass all Python expressions, while str.format is extremely limited.

I'll start working on the PEP shortly.

Eric.

From Steve.Dower at microsoft.com  Sat Aug  1 20:07:09 2015
From: Steve.Dower at microsoft.com (Steve Dower)
Date: Sat, 1 Aug 2015 18:07:09 +0000
Subject: [Python-ideas] Briefer string format
In-Reply-To: <55BD0555.6010204@trueblade.com>
References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com>
 <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com>
 <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com>
 <20150723033112.GJ25179@ando.pearwood.info>
 <20150723142213.GK25179@ando.pearwood.info>
 <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au>
 <85y4i4n02h.fsf@benfinney.id.au>
 <55B3E9B2.50709@trueblade.com>,<55BD0555.6010204@trueblade.com>
Message-ID:

+1 for !r/!s and not being arbitrarily incompatible with existing
formatting. (I also really like being able to align string literals using
an f-string. That seems to come up all the time in my shorter scripts for
headings etc.)

Cheers,
Steve

Top-posted from my Windows Phone
________________________________
From: Eric V. Smith
Sent: 8/1/2015 10:44
To: python-ideas at python.org
Subject: Re: [Python-ideas] Briefer string format

On 7/25/2015 3:55 PM, Eric V. Smith wrote:
> In trying to understand the issues for a PEP, I'm working on a sample
> implementation. There, I've just disallowed concatenation entirely.
> Compared to all of the other issues, it's really insignificant. I'll put
> it back at some point.

I'm basically done with my implementation of f-strings.

I really can't decide if I want to allow adjacent f-string concatenation
or not. I'm leaning towards not. I don't like mixing compile-time
concatenation with run-time expression evaluation. But my mind is not
made up.

One issue that has cropped up:

Should we support !s and !r, like str.format does? It's not really
needed, since with f-strings you can just call str or repr yourself:

>>> f'{"foo":10}'
'foo       '
>>> f'{repr("foo"):10}'
"'foo'     "

Do we also need to support:

>>> f'{"foo"!r}'
"'foo'"

With str.format, !s and !r are needed because you can't put the call to
repr in str.format's very limited expression syntax. But since f-strings
support arbitrary expressions, it's not needed. Still, I'm leaning
toward including it for two reasons: it's concise, and there's no reason
to be arbitrarily incompatible with str.format. If I include !s and !r,
then the only way that str.format differs from f-string expressions is
in non-numeric subscripting (unfortunate, but discussed previously and I
think required). This ignores the fact that f-string expressions
encompass all Python expressions, while str.format is extremely limited.

I'll start working on the PEP shortly.

Eric.

From steve at pearwood.info  Sat Aug  1 20:25:45 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 2 Aug 2015 04:25:45 +1000
Subject: [Python-ideas] Briefer string format
In-Reply-To: <55BD0555.6010204@trueblade.com>
References: <55AFE66E.9070007@trueblade.com>
 <20150723033112.GJ25179@ando.pearwood.info>
 <20150723142213.GK25179@ando.pearwood.info>
 <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au>
 <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com>
 <55BD0555.6010204@trueblade.com>
Message-ID: <20150801182545.GC25179@ando.pearwood.info>

On Sat, Aug 01, 2015 at 01:43:49PM -0400, Eric V. Smith wrote:

> I really can't decide if I want to allow adjacent f-string concatenation
> or not. I'm leaning towards not. I don't like mixing compile-time
> concatenation with run-time expression evaluation. But my mind is not
> made up.

There's no harm in allowing implicit concatenation between f-strings.
Possible confusion only creeps in when you allow implicit concatenation
between f- and non-f-strings.

> One issue that has cropped up:
>
> Should we support !s and !r, like str.format does? It's not really
> needed, since with f-strings you can just call str or repr yourself:
[...]
> With str.format, !s and !r are needed because you can't put the call to
> repr in str.format's very limited expression syntax. But since f-strings
> support arbitrary expressions, it's not needed.

Wait, did I miss something? Does this mean that f-strings will
essentially be syntactic sugar for str(eval(s))?

    f"[i**2 for i in sequence]"

    f = lambda s: str(eval(s))
    f("[i**2 for i in sequence]")


--
Steve

From eric at trueblade.com  Sat Aug  1 20:25:52 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Sat, 01 Aug 2015 14:25:52 -0400
Subject: [Python-ideas] Briefer string format
In-Reply-To:
References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com>
 <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com>
 <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com>
 <20150723033112.GJ25179@ando.pearwood.info>
 <20150723142213.GK25179@ando.pearwood.info>
 <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au>
 <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com>,
 <55BD0555.6010204@trueblade.com>
Message-ID: <55BD0F30.70305@trueblade.com>

On 08/01/2015 02:07 PM, Steve Dower wrote:
> +1 for !r/!s and not being arbitrarily incompatible with existing
> formatting. (I also really like being able to align string literals
> using an f-string. That seems to come up all the time in my shorter
> scripts for headings etc.)

I meant to write that last example as:

>>> f'{"foo"!r:10}'
"'foo'     "

You can of course mix !r with a string format specifier.

Eric.

> Cheers,
> Steve
>
> Top-posted from my Windows Phone
> ------------------------------------------------------------------------
> From: Eric V. Smith
> Sent: 8/1/2015 10:44
> To: python-ideas at python.org
> Subject: Re: [Python-ideas] Briefer string format
>
> On 7/25/2015 3:55 PM, Eric V. Smith wrote:
>> In trying to understand the issues for a PEP, I'm working on a sample
>> implementation. There, I've just disallowed concatenation entirely.
>> Compared to all of the other issues, it's really insignificant. I'll put
>> it back at some point.
>
> I'm basically done with my implementation of f-strings.
>
> I really can't decide if I want to allow adjacent f-string concatenation
> or not. I'm leaning towards not. I don't like mixing compile-time
> concatenation with run-time expression evaluation. But my mind is not
> made up.
>
> One issue that has cropped up:
>
> Should we support !s and !r, like str.format does?
> It's not really needed, since with f-strings you can just call str or
> repr yourself:
>
>>>> f'{"foo":10}'
> 'foo       '
>>>> f'{repr("foo"):10}'
> "'foo'     "
>
> Do we also need to support:
>
>>>> f'{"foo"!r}'
> "'foo'"
>
> With str.format, !s and !r are needed because you can't put the call to
> repr in str.format's very limited expression syntax. But since f-strings
> support arbitrary expressions, it's not needed. Still, I'm leaning
> toward including it for two reasons: it's concise, and there's no reason
> to be arbitrarily incompatible with str.format. If I include !s and !r,
> then the only way that str.format differs from f-string expressions is
> in non-numeric subscripting (unfortunate, but discussed previously and I
> think required). This ignores the fact that f-string expressions
> encompass all Python expressions, while str.format is extremely limited.
>
> I'll start working on the PEP shortly.
>
> Eric.
>

From eric at trueblade.com  Sat Aug  1 20:51:09 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Sat, 1 Aug 2015 14:51:09 -0400
Subject: [Python-ideas] Briefer string format
In-Reply-To: <20150801182545.GC25179@ando.pearwood.info>
References: <55AFE66E.9070007@trueblade.com>
 <20150723033112.GJ25179@ando.pearwood.info>
 <20150723142213.GK25179@ando.pearwood.info>
 <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au>
 <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com>
 <55BD0555.6010204@trueblade.com>
 <20150801182545.GC25179@ando.pearwood.info>
Message-ID: <55BD151D.6060702@trueblade.com>

On 8/1/2015 2:25 PM, Steven D'Aprano wrote:
> On Sat, Aug 01, 2015 at 01:43:49PM -0400, Eric V. Smith wrote:
>> With str.format, !s and !r are needed because you can't put the call to
>> repr in str.format's very limited expression syntax. But since f-strings
>> support arbitrary expressions, it's not needed.
>
> Wait, did I miss something? Does this mean that f-strings will
> essentially be syntactic sugar for str(eval(s))?
>
>     f"[i**2 for i in sequence]"
>
>     f = lambda s: str(eval(s))
>     f("[i**2 for i in sequence]")

Well, it's somewhat more complex. It's true that:

>>> sequence=[1,2,3]
>>> f"{[i**2 for i in sequence]}"
'[1, 4, 9]'

But it's more complex when there are format specifiers and literals
involved.

Basically, the idea is that:

f'a{expr1:spec1}b{expr2:spec2}c'

is shorthand for:

''.join(['a', expr1.__format__(spec1), 'b', expr2.__format__(spec2), 'c'])

The expressions can indeed be arbitrarily complex expressions. Because
only string literals are supported, it's just the same as if you'd written
the expressions not inside of a string (as shown above). It's not like
you're eval-ing user supplied strings.

Eric.
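
The equivalence described in the message above can be checked directly on
any Python version that ships f-strings (3.6 or later, an assumption since
the feature was still a proposal when this was written). The variable names
below are illustrative only, and the built-in format(value, spec) is used
as the usual way of calling value.__format__(spec).

```python
name = "foo"
value = 3.14159
sequence = [1, 2, 3]

# An f-string with literal parts and two formatted expressions ...
via_fstring = f"a{name:>10}b{value:.2f}c"

# ... and the join-of-__format__ expansion it is shorthand for.
via_join = "".join(["a", format(name, ">10"), "b", format(value, ".2f"), "c"])

# The !r conversion applies repr() before the format spec is used.
with_conversion = f"{name!r:10}"
explicit_repr = format(repr(name), "10")

# Arbitrary expressions are compiled as ordinary code, not eval-ed strings.
squares = f"{[i**2 for i in sequence]}"
```

With these definitions, via_fstring equals via_join, with_conversion equals
explicit_repr, and squares equals str([1, 4, 9]).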

From abarnert at yahoo.com  Sun Aug  2 01:30:05 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sat, 1 Aug 2015 16:30:05 -0700
Subject: [Python-ideas] fork - other approaches
In-Reply-To: <20150801172904.C34AA87403@smtp04.mail.de>
References: <20150801172904.C34AA87403@smtp04.mail.de>
Message-ID: <24B23DC0-E484-47EC-89D8-6CA19C4FE864@yahoo.com>

On Aug 1, 2015, at 10:29, Sven R. Kunze wrote:
>
> Thanks everybody for inspiring me with alternative ways of working with pools.
>
> I am very certain that any of them will work as intended. However, they do not zero in 100% on my main intentions:
>
> 1) easy to understand
> 2) exchangeable (seq <-> par)
>
>
> A) pmap
>
> It originates from map and allows easy exchangeability back and forth sequential and concurrent/parallel execution.
>
> However, I have to admit that I have difficulties to change all the 'for loops' to map (mentally as well as for real).

You probably don't have to--or want to--change all the for loops. It's
very rare that you have a huge sequence of separate loops that all
contribute equally to performance and are all parallelizable with the same
granularity and so on. Usually, there is one loop that you want to
parallelize, and that solves the problem for your entire program.

> The 'for loop' IS the most used loop construct in business applications and I do not see it going away because of something else (such as map).

Of course the for statement isn't going away. Neither are comprehensions.
And neither are map and other higher-order functions. They do related but
slightly different things, and a language that tried to force them all
into the same construct would be an unpleasant language. That's why
they've coexisted for decades in Python without any of them going away.

But you're the one who's trying to do that. In order to avoid having to
learn about any other ways to write flow control, you want to change the
language so you can disguise all flow control as the kind you already know
how to write.
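
To make the exchangeability point under A) concrete, here is a small
sketch comparing a for loop, the built-in map, and Executor.map from
concurrent.futures. The create_thumbnail body and the file names are
placeholders, and a thread pool is chosen only for brevity; treat it as an
illustration rather than a recommendation for any particular workload.

```python
from concurrent.futures import ThreadPoolExecutor


def create_thumbnail(image):
    # Placeholder for the real work.
    return f"thumb-{image}"


images = ["a.png", "b.png", "c.png"]

# 1) The familiar for loop.
looped = []
for image in images:
    looped.append(create_thumbnail(image))

# 2) The same computation written with the built-in map.
mapped = list(map(create_thumbnail, images))

# 3) Switching to concurrent execution changes only which map is called;
#    Executor.map yields results in the order of the inputs.
with ThreadPoolExecutor(max_workers=4) as executor:
    pooled = list(executor.map(create_thumbnail, images))
```

Only the one loop that matters for performance needs to be rewritten into
form 2 before it can be switched to form 3; the remaining for loops in a
program can stay as they are.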

> B) with Pool()
>
> It removes the need to close and join the pool which removes the visual clutter from the source code. That as such is great.

It also means you can't forget to clean up the pool, you can't
accidentally try to use the results before they're ready, etc.

The with statement is one of the key tools in using Python effectively,
and I personally wouldn't trust a developer who didn't understand it to
start doing multicore optimizations on my code.

Also, if you're learning from the examples at the top of the docs and
haven't seen with Pool before, I suspect either you're still using Python
2.x (in which case you need to upgrade to 3.5 before you can start
proposing new features for 3.6) or reading the 2.7 docs while using 3.x
(in which case, don't do that).

> However, exchangeability is clearly not given and the same issue concerning understandability like pmap arises.

It's still calling map, so if you don't understand even the basics of
higher-order functions, I suppose you still won't understand it. But
again, that's a pretty basic and key thing, and I wouldn't go assigning
multicore optimization tasks to a developer who couldn't grasp the
concept.

> C) apply
>
> Nick's approach of providing a 'call_in_background' solution comes almost close to what would solve the issues at hand.
>
> However, it reminds me of apply (deprecated built-in function for calling other functions). So, a better name for it would be 'bg_apply'.

The problem with apply is that it's almost always completely unnecessary,
and can be written as a one-liner when it is; its presence encouraged
people from other languages where it _is_ necessary to overuse it in
Python.

But unfortunately, there is a bit of a mix between functions that "apply"
other functions--including Pool.apply_async--and those that "call" other
functions and even those that "submit" them. There's really no difference,
so it would be nice if Python were consistent in the naming.
And, since Pool uses the "apply" terminology, I think you may be right here.

I disagree about abbreviating background to "bg", however. You're only going to be writing this a few times in your program, but you'll be reading those few places quite often, and the fact that they're backgrounding code will likely be important to understanding and debugging that code. So I'd stick with the PEP 8 recommendation and spell it out.

But of course your mileage may vary. Since this is a function you're writing based on Nick's blog post, you can call it whatever makes sense in your particular app. (And, even if it makes it into the stdlib, there's nothing stopping you from writing "bg_apply = apply_in_background" or "from asyncio import apply_in_background as bg_apply" if you really want to.)

> All of these approaches basically rip the function call out of the programmer's view.
>
> It is no longer
>
>     function(arg)
>
> but
>
>     apply(function, arg) # or
>     bg_apply(function, arg) # or
>     bg_apply_many(function, args)
>
>
> I don't see this going well in production and in code reviews.

Using a higher-order function when there's no need for it certainly should be rejected in code review--which is why Python no longer has the "apply" function. But using one when it's appropriate--like calling map when you want to map a function over an iterable and get back an iterable of results--is a different story. If you're afraid of doing that because you're afraid it won't pass code reviews, then either you have insufficient faith in your coworkers, or you need to find a new job.

> So, an expression keyword like 'fork' would still be better at least from my perspective. It would tell me: 'it's not my responsibility anymore; delegate this to someone else and get me a handle of the future result'.
You still haven't answered any of the issues I or anyone else raised with this: fork strongly implies forking new processes rather than submitting to a pool, there's no obvious or visible way to control what kind of pool you're using or how you're using it, there's nowhere to look up what kind of future-like object you get back or what its API is, it's insufficiently useful as a statement but looks clumsy and unpythonic as an expression, etc.

Using Pool.map--or Executor.map, which is what I think you really want here (it provides real composable futures, it lets you switch between threads and processes in one central place, etc., and you appear to have no need for the lower-level features of the pool, like controlling batching)--avoids all of those problems.

It's worth noting that there are some languages where a solution like this could be more appropriate. For example, in a pure immutable functional language, you really could just have the user start up tasks and let the implementation decide how to pool things, how to partition them among green threads/OS threads/processes manually, etc. because that would be a transparent optimization. For example, an Erlang implementation could use static analysis or runtime tracing to recognize that some processes communicate more heavily than others and partition them into OS processes in a way that minimizes the cost of that communication, and that would be pretty nifty. But a Python implementation couldn't do that, because any of those tasks might write to a shared variable that another task needs, or try to return some unpicklable object, etc.

Of course the hope is that in the long run, something like PyPy's STM will be so universally usable that neither you nor the implementation will ever need to make such decisions. But until then, it has to be you, the user, who makes them.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From abarnert at yahoo.com Sun Aug 2 02:02:53 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 1 Aug 2015 17:02:53 -0700 Subject: [Python-ideas] fork In-Reply-To: <20150801173628.29BAB873FE@smtp04.mail.de> References: <20150801173628.29BAB873FE@smtp04.mail.de> Message-ID: On Aug 1, 2015, at 10:36, Sven R. Kunze wrote: > > Thanks everybody for the feedback on 'fork'. > > Let me address the issues and specify it further: > > > 1) Process vs. Thread vs. Coroutine > > From my understanding, the main fallacy here is that the caller would be able to decide which type of pool is best suited. > > Take create_thumbnail as an example. You do not know whether this is cpu-bound or io-bound; you can just make a guess or try it out. > > But who knows then? I would say: the callee. > > create_thumbnail is cpu-bound when doing the work itself on the machine. > create_thumbnail is io-bound when delegating the work to, say, a web service. There's a whole separate thread going on about making it easier to understand the distinctions between coroutine/thread/process, separate tasks/pools/executors, etc. There's really no way to take that away from the programmer, but Python (and, more importantly, the Python docs) could do a lot to make that easier. Your idea of having a single global "pool manager" object, where you could submit tasks and, depending on how they're marked, they get handled differently might have merit. But that's something you could build pretty easily on top of concurrent.futures (at least for threads vs. processes; you can add in coroutines later, because they're not quite as easy to integrate), upload to PyPI, and start getting experience with before trying to push it into the stdlib, much less the core language. (Notice that Greg Ewing had a proposal a few years ago that was very similar to the recent async/await change, but he couldn't sell anyone on it. 
But then, after extensive experience with the asyncio module, first as tulip on PyPI and then added to the stdlib, the need for the new syntax became more obvious to everyone, and people--including Guido--who had rejected Greg's proposal out of hand enthusiastically supported the new proposal.)

> Same functionality, same name, same API, different pools required.
>
>
> This said, I would propose something like a marking solution:
>
> @cpu_bound
> def create_thumbnail(image):
>     # impl
>
> @io_bound
> def create_thumbnail(image):
>     # impl
>
> (coroutines are already marked as such)
>
> From this, the Python interpreter should be able to infer which type of pool is appropriate.
>
>
> 2) Pool size
>
> Do lists have a fixed length? Do I need to define their lengths right from the start? Do I know them in advance?
>
> I think the answers to these questions are obvious. I don't understand why it should be different for the size of the pools. They could grow and shrink depending on the workload and the available resources.

The available resources rarely change at runtime.

If you're doing CPU-bound work, the number of cores is unlikely to change during a run. (In rare cases, you might want to sometimes count hyperthreads as separate cores and sometimes not, but that would depend on intimate knowledge of the execution characteristics of the tasks you're submitting in two different places.)

Similarly, if you're doing threads, the ideal pool size usually depends more on what you're waiting for than on what you're doing--12 threads may be great for submitting URLs to arbitrary servers on the internet, 4 threads may be better for submitting to a specific web service that you've configured to match, 16 threads may be better for a simulation with 2^n bodies, etc.

Sometimes these really do need to grow and shrink configurably--not during a run, but during a deployment. In that case, you should store them in a config file rather than hard coding them.
Then your sysadmin/deploy manager/whatever can learn how to test and configure them. For a real-life example (although not in Python), I know Level3 configured their video servers to use 4 processes of 4 threads per machine, while Akamai used 1 process of 16 threads (actually 2, but the second only for failover, not used live). Why? I have no idea, but presumably they tested the software with their machines and their networks and came to different results, and it's a good thing their software allowed them to configure it so they could each save that 1.3% heat or whatever it was they were trying to optimize. > 3) Pool Management in General > > There is a reason why I hesitate to explicitly manage pools. Our code runs on a plethora of platforms ranging from few to many hardware threads. We actually do not want to integrate platform-specific properties right into the source. The point of having parallelism and concurrency is to squeeze out more of the machines and get better response times. Anything else wouldn't be honest in my opinion (besides from researching and experimenting). Which is exactly why some apps should expose these details to the sysadmin as configuration variables. Hiding the details inside the interpreter would make that harder, not easier. > Thus, a practical solution needs to be simple and universal. Explicitly setting the size of the pool is not universal and definitely not easy. If you want universal and easy, the default value is the number of CPUs, which is often the best value to use. When you don't need to manually configure things to squeeze out the last few %, just rely on the defaults. When you do need to, it should be as easy as possible. And that's the way things currently are. > It doesn't need to be perfect. Even if a first draft implementation would simply define pools having exactly 4 processes/threads/coroutines, that would be awesome. Even cutting execution time into half would be an amazing accomplishment. 
> > Maybe, even 'fork' is too complicated. It could work without it given the decorators above. But then, we could not decide whether to run things in parallel or sequentially. I think I do not like that. > > > 4) Keyword 'fork' > > Well, first shot. If you have a better one, I am all in for it (4 letters or shorter only ;) )... Or maybe something like 'par' for parallel or 'con' for concurrent. > > > 5) Awaiting the Completion of Something > > As Andrew proposed, using the return value should result in blocking. > > What if there is no result to wait for? > That one is harder but I think another keyword like 'wait' or 'await' should work here fine. > > for image in images: > fork create_thumbnail(image) > wait > print(get_size_of_thumbnail_dir()) This only allows you to wait on everything to finish, or nothing at all. Very often, you want to wait on things in whatever order they come in. Or wait until the first task has finished. Or wait on them in the order they were submitted (which still allows you to get some pipelining over waiting on all). This is a well-known problem, and the standard solution across many languages is futures. The concurrent.futures module and the asyncio module are both designed around futures. You can explicitly wait on a future, or chain further operations onto a future--and, more importantly, you can compose futures into various kinds of group-waiting objects (wait for all, wait for any, wait for all or until first error, wait in any order, wait in specified order) that are themselves futures. If you want to try to collapse futures into syntax, you need something that still retains all of the power of futures. A single keyword isn't going to do that. Also, note that await is already a keyword in Python; it's used to explicitly block until another coroutine is ready. 
In other words, it's a syntactic form of the very simplest way to use futures (and note that, because futures are composable, anything can ultimately be reduced to "block until this one future is ready"). The reason the thread/process futures don't have such a keyword is that they don't need one; just calling a function blocks on it, and, because threads and processes are preemptive rather than cooperative, that works without blocking any other tasks. So, instead of writing "await futures.wait(iterable_of_futures, return_when=FIRST_EXCEPTION)" you just write the same thing without "await" and it already does what you want.

> 6) Exceptions
>
> As close to sequential execution as possible.
>
> That is, when some function is forked out and raises an exception, it should behave as if it were a normal function call.
>
> for image in images:
>     fork create_thumbnail(image) # I would like to see that in my stacktrace

Futures already take care of this. They automatically transport exceptions (with stack traces) across the boundary to reraise where they're waited for.

> Also true for expressions. '+=' might raise an exception because, say, huge_calculation returns 'None'. Although the actual evaluation of the sum needs to take place only at the print statement, I would like to see the exception raised at the highlighted place:
>
> end_result = 0
> for items in items_list:
>     end_result += fork huge_calculation(items) # stacktrace for '+=' should be here
> print(end_result) # not here

In this code, your += isn't inside a "fork", so there's no way the implementation could know that you want it delayed. What you're asking for here is either implicit lazy evaluation, contagious futures, or dataflow variables, all of which are much more radical changes to the language than just adding syntactic sugar for explicit futures.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From stephen at xemacs.org Sun Aug 2 02:23:22 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 02 Aug 2015 09:23:22 +0900
Subject: [Python-ideas] fork - other approaches
In-Reply-To: <20150801172904.C34AA87403@smtp04.mail.de>
References: <20150801172904.C34AA87403@smtp04.mail.de>
Message-ID: <87bneqczd1.fsf@uwakimon.sk.tsukuba.ac.jp>

Sven R. Kunze writes:

 > I am very certain that any of them will work as intended. However,
 > they do not zero in 100% on my main intentions:
 >
 > 1) easy to understand
 > 2) exchangeable (seq <-> par)

Exchangeability is a property of the computational structure, not of the language syntax. In particular, in languages that *have* a for loop, you're also going to have side effects, and exchangeability will fail if those side effects interact between problem components.

Therefore you need at least two syntaxes: one to express sequential iteration, and one to express parallelizable computations. Since the "for" syntax in Python has always meant sequential iteration, and the computations allowed in the suite are unrestricted, you'd just be asking for trouble.

 > So, an expression keyword like 'fork' would still be better at
 > least from my perspective. It would tell me: 'it's not my
 > responsibility anymore; delegate this to someone else and get me a
 > handle of the future result'.

But now you run into the problem that "for" is not an expression in Python (and surely never will be). You need something that (1) takes a "set-ish"[1] of "problems" and a function to map over them, or (2) a set-ish of problem-function pairs applying the functions to the problems, and then (3) *returns* a "set-ish" of results. (That's just a somewhat more computational expression of your words that I quote.)

In other words, "fork" can't be a statement, it has to be an expression, and in Python that expression is the function "map".

What am I missing?

Footnotes:
[1] Possibly an iterable.
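
As a rough illustration of variant (2) above, not taken from the thread: a set-ish of function-problem pairs goes in, and a list of results comes back. The helper name `solve_all` and the sample pairs are hypothetical; `submit` and `Future.result` are the stdlib `concurrent.futures` API.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_all(pairs, max_workers=4):
    """Apply each function to its problem concurrently.

    `pairs` is an iterable of (function, problem) tuples. The results
    are returned as a list in the same order as the input pairs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submitting returns a "handle of the future result" immediately.
        futures = [executor.submit(func, problem) for func, problem in pairs]
        # result() blocks until that task is done and re-raises any
        # exception the task raised.
        return [future.result() for future in futures]


pairs = [(len, "spam"), (sum, [1, 2, 3]), (sorted, [3, 1, 2])]
results = solve_all(pairs)
```

This only behaves like the sequential loop when the tasks do not interact through side effects, which is exactly the caveat about exchangeability made above.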
From eric at trueblade.com Sun Aug 2 17:37:43 2015 From: eric at trueblade.com (Eric V. Smith) Date: Sun, 2 Aug 2015 11:37:43 -0400 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BD0555.6010204@trueblade.com> References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com> <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com> <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> Message-ID: <55BE3947.9020000@trueblade.com> On 8/1/2015 1:43 PM, Eric V. Smith wrote: > On 7/25/2015 3:55 PM, Eric V. Smith wrote: >> In trying to understand the issues for a PEP, I'm working on a sample >> implementation. There, I've just disallowed concatentation entirely. >> Compared to all of the other issues, it's really insignificant. I'll put >> it back at some point. > > I'm basically done with my implementation of f-strings. Here's another issue. I can't imagine this will happen often, but it should be addressed. It has to do with literal expressions that begin with a left brace. For example, this expression: >>> {x: y for x, y in [(1, 2), (3, 4)]} {1: 2, 3: 4} If you want to put it in an f-string, you'd naively write: >>> f'expr={{x: y for x, y in [(1, 2), (3, 4)]}}' 'expr={x: y for x, y in [(1, 2), (3, 4)]}' But as you see, this won't work because the doubled '{' and '}' chars are just interpreted as escaped braces, and the result is an uninterpreted string literal, with the doubled braces replaced by undoubled ones. There's currently no way around this. 
You could try putting a space between the left braces, but that fails with IndentationError: >>> f'expr={ {x: y for x, y in [(1, 2), (3, 4)]}}' File "", line 1 {x: y for x, y in [(1, 2), (3, 4)]} ^ IndentationError: unexpected indent In the PEP I'm going to specify that leading spaces are skipped in an expression. So that last example will now work: >>> f'expr={ {x: y for x, y in [(1, 2), (3, 4)]}}' 'expr={1: 2, 3: 4}' Note that the right braces in that last example aren't interpreted as a doubled '}'. That's because the first one is part of the expression, and the second one ends the expression. The only time doubling braces matters is inside the string literal portion of an f-string. I'll reflect this "skip leading white space" decision in the PEP. Eric. From python at mrabarnett.plus.com Sun Aug 2 17:57:27 2015 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 2 Aug 2015 16:57:27 +0100 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BE3947.9020000@trueblade.com> References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com> <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com> <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <55BE3947.9020000@trueblade.com> Message-ID: <55BE3DE7.5080203@mrabarnett.plus.com> On 2015-08-02 16:37, Eric V. Smith wrote: > On 8/1/2015 1:43 PM, Eric V. Smith wrote: >> On 7/25/2015 3:55 PM, Eric V. Smith wrote: >>> In trying to understand the issues for a PEP, I'm working on a sample >>> implementation. There, I've just disallowed concatentation entirely. >>> Compared to all of the other issues, it's really insignificant. I'll put >>> it back at some point. >> >> I'm basically done with my implementation of f-strings. > > Here's another issue. 
I can't imagine this will happen often, but it > should be addressed. It has to do with literal expressions that begin > with a left brace. > > For example, this expression: >>>> {x: y for x, y in [(1, 2), (3, 4)]} > {1: 2, 3: 4} > > If you want to put it in an f-string, you'd naively write: > >>>> f'expr={{x: y for x, y in [(1, 2), (3, 4)]}}' > 'expr={x: y for x, y in [(1, 2), (3, 4)]}' > > But as you see, this won't work because the doubled '{' and '}' chars > are just interpreted as escaped braces, and the result is an > uninterpreted string literal, with the doubled braces replaced by > undoubled ones. > > There's currently no way around this. You could try putting a space > between the left braces, but that fails with IndentationError: > >>>> f'expr={ {x: y for x, y in [(1, 2), (3, 4)]}}' > File "", line 1 > {x: y for x, y in [(1, 2), (3, 4)]} > ^ > IndentationError: unexpected indent > Why is there an IndentationError? It's an expression, not a statement, so leading spaces should be ignored. They're not at the Python prompt, but, then, that accepts both statements and expressions. > In the PEP I'm going to specify that leading spaces are skipped in an > expression. So that last example will now work: > >>>> f'expr={ {x: y for x, y in [(1, 2), (3, 4)]}}' > 'expr={1: 2, 3: 4}' > > Note that the right braces in that last example aren't interpreted as a > doubled '}'. That's because the first one is part of the expression, and > the second one ends the expression. The only time doubling braces > matters is inside the string literal portion of an f-string. > > I'll reflect this "skip leading white space" decision in the PEP. > From eric at trueblade.com Sun Aug 2 18:00:59 2015 From: eric at trueblade.com (Eric V. 
Smith) Date: Sun, 2 Aug 2015 12:00:59 -0400 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BE3DE7.5080203@mrabarnett.plus.com> References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com> <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com> <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <55BE3947.9020000@trueblade.com> <55BE3DE7.5080203@mrabarnett.plus.com> Message-ID: <55BE3EBB.1020102@trueblade.com> Eric. On 8/2/2015 11:57 AM, MRAB wrote: > On 2015-08-02 16:37, Eric V. Smith wrote: >> On 8/1/2015 1:43 PM, Eric V. Smith wrote: >>> On 7/25/2015 3:55 PM, Eric V. Smith wrote: >>>> In trying to understand the issues for a PEP, I'm working on a sample >>>> implementation. There, I've just disallowed concatentation entirely. >>>> Compared to all of the other issues, it's really insignificant. I'll >>>> put >>>> it back at some point. >>> >>> I'm basically done with my implementation of f-strings. >> >> Here's another issue. I can't imagine this will happen often, but it >> should be addressed. It has to do with literal expressions that begin >> with a left brace. >> >> For example, this expression: >>>>> {x: y for x, y in [(1, 2), (3, 4)]} >> {1: 2, 3: 4} >> >> If you want to put it in an f-string, you'd naively write: >> >>>>> f'expr={{x: y for x, y in [(1, 2), (3, 4)]}}' >> 'expr={x: y for x, y in [(1, 2), (3, 4)]}' >> >> But as you see, this won't work because the doubled '{' and '}' chars >> are just interpreted as escaped braces, and the result is an >> uninterpreted string literal, with the doubled braces replaced by >> undoubled ones. >> >> There's currently no way around this. 
You could try putting a space >> between the left braces, but that fails with IndentationError: >> >>>>> f'expr={ {x: y for x, y in [(1, 2), (3, 4)]}}' >> File "", line 1 >> {x: y for x, y in [(1, 2), (3, 4)]} >> ^ >> IndentationError: unexpected indent >> > Why is there an IndentationError? It's an expression, not a statement, > so leading spaces should be ignored. Good question. I'm parsing it with PyParser_ASTFromString. Maybe I'm missing a compiler flag there which will ignore leading spaces. But in any event, the result is the same: You'll need to add a space here in order to disambiguate it from doubled braces. That's really the crux of the issue. Eric. From 4kir4.1i at gmail.com Sun Aug 2 19:09:12 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Sun, 02 Aug 2015 20:09:12 +0300 Subject: [Python-ideas] Concurrency Modules References: <559EFB73.5050606@mail.de> <9c139305-f583-46c1-b819-6a98dbd04acc@googlegroups.com> <55B2B0FB.1060409@mail.de> <55B3C93D.9090601@mail.de> <55B5508E.1000201@mail.de> <55B872BB.5080603@mail.de> Message-ID: <87egjlr51j.fsf@gmail.com> Ludovic Gasc writes: > 2015-07-29 8:29 GMT+02:00 Sven R. Kunze : > >> Thanks Ludovic. >> >> On 28.07.2015 22:15, Ludovic Gasc wrote: >> >> Hello, >> >> This discussion is pretty interesting to try to list when each >> architecture is the most efficient, based on the need. >> >> However, just a small precision: multiprocess/multiworker isn't antinomic >> with AsyncIO: You can have an event loop in each process to try to combine >> the "best" of two "worlds". >> As usual in IT, it isn't a silver bullet that will care the cancer, >> however, at least to my understanding, it should be useful for some >> business needs like server daemons. >> >> >> I think that should be clear for everybody using any of these modules. But >> you are right to point it out explicitly. 
>> > > Based on my discussions at EuroPython and PyCON-US, it's certainly clear > for the middle-class management of Python community, however, not really > from the typical Python end-dev: Several persons tried to troll me that > multiprocessing is more efficient than AsyncIO. > > To me, it was a opportunity to transform the negative troll attempt to a > positive exchange about efficiency and understand before to troll ;-) > More seriously, I've the feeling that it isn't very clear for everybody, > especially for the new comers. > Do you mean those trolls that measure first then make conclusions ;) Could you provide an evidence-based description of the issue such as http://www.mailinator.com/tymaPaulMultithreaded.pdf but for Python? From xavier.combelle at gmail.com Sun Aug 2 21:12:21 2015 From: xavier.combelle at gmail.com (Xavier Combelle) Date: Sun, 2 Aug 2015 21:12:21 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BE3EBB.1020102@trueblade.com> References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com> <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com> <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <55BE3947.9020000@trueblade.com> <55BE3DE7.5080203@mrabarnett.plus.com> <55BE3EBB.1020102@trueblade.com> Message-ID: You could disambiguate with parenthesis like this f'expr={({x: y for x, y in [(1, 2), (3, 4)]})}' 2015-08-02 18:00 GMT+02:00 Eric V. Smith : > > > Eric. > > On 8/2/2015 11:57 AM, MRAB wrote: > > On 2015-08-02 16:37, Eric V. Smith wrote: > >> On 8/1/2015 1:43 PM, Eric V. Smith wrote: > >>> On 7/25/2015 3:55 PM, Eric V. Smith wrote: > >>>> In trying to understand the issues for a PEP, I'm working on a sample > >>>> implementation. 
There, I've just disallowed concatentation entirely. > >>>> Compared to all of the other issues, it's really insignificant. I'll > >>>> put > >>>> it back at some point. > >>> > >>> I'm basically done with my implementation of f-strings. > >> > >> Here's another issue. I can't imagine this will happen often, but it > >> should be addressed. It has to do with literal expressions that begin > >> with a left brace. > >> > >> For example, this expression: > >>>>> {x: y for x, y in [(1, 2), (3, 4)]} > >> {1: 2, 3: 4} > >> > >> If you want to put it in an f-string, you'd naively write: > >> > >>>>> f'expr={{x: y for x, y in [(1, 2), (3, 4)]}}' > >> 'expr={x: y for x, y in [(1, 2), (3, 4)]}' > >> > >> But as you see, this won't work because the doubled '{' and '}' chars > >> are just interpreted as escaped braces, and the result is an > >> uninterpreted string literal, with the doubled braces replaced by > >> undoubled ones. > >> > >> There's currently no way around this. You could try putting a space > >> between the left braces, but that fails with IndentationError: > >> > >>>>> f'expr={ {x: y for x, y in [(1, 2), (3, 4)]}}' > >> File "", line 1 > >> {x: y for x, y in [(1, 2), (3, 4)]} > >> ^ > >> IndentationError: unexpected indent > >> > > Why is there an IndentationError? It's an expression, not a statement, > > so leading spaces should be ignored. > > Good question. I'm parsing it with PyParser_ASTFromString. Maybe I'm > missing a compiler flag there which will ignore leading spaces. > > But in any event, the result is the same: You'll need to add a space > here in order to disambiguate it from doubled braces. That's really the > crux of the issue. > > Eric. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ron3200 at gmail.com Sun Aug 2 21:29:39 2015 From: ron3200 at gmail.com (Ron Adam) Date: Sun, 02 Aug 2015 15:29:39 -0400 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BE3947.9020000@trueblade.com> References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com> <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com> <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <55BE3947.9020000@trueblade.com> Message-ID: On 08/02/2015 11:37 AM, Eric V. Smith wrote: > On 8/1/2015 1:43 PM, Eric V. Smith wrote: >> On 7/25/2015 3:55 PM, Eric V. Smith wrote: > If you want to put it in an f-string, you'd naively write: > >>>> f'expr={{x: y for x, y in [(1, 2), (3, 4)]}}' > 'expr={x: y for x, y in [(1, 2), (3, 4)]}' This probably doesn't work either... f'expr={{{x: y for x, y in [(1, 2), (3, 4)]}}}' Escaping "{{{" needs to resolve for left to right to work. Which is weird. > But as you see, this won't work because the doubled '{' and '}' chars > are just interpreted as escaped braces, and the result is an > uninterpreted string literal, with the doubled braces replaced by > undoubled ones. > > There's currently no way around this. You could try putting a space > between the left braces, but that fails with IndentationError: Could two new escape characters be added to python strings? "\{" and "\}" f'expr={\{x: y for x, y in [(1, 2), (3, 4)]\}}' Ron From srkunze at mail.de Sun Aug 2 22:56:52 2015 From: srkunze at mail.de (Sven R. 
Kunze) Date: Sun, 02 Aug 2015 22:56:52 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BD0555.6010204@trueblade.com> References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com> <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com> <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> Message-ID: <55BE8414.9010805@mail.de> On 01.08.2015 19:43, Eric V. Smith wrote: > Should we support !s and !r, like str.format does? It's not really > needed, since with f-strings you can just call str or repr yourself: Just to get a proper understanding here: What is the recommended usage? repr and str or !r and !s? From rosuav at gmail.com Mon Aug 3 01:30:12 2015 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 3 Aug 2015 09:30:12 +1000 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BE3947.9020000@trueblade.com> References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com> <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com> <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <55BE3947.9020000@trueblade.com> Message-ID: On Mon, Aug 3, 2015 at 1:37 AM, Eric V. Smith wrote: > There's currently no way around this. 
You could try putting a space > between the left braces, but that fails with IndentationError: > >>>> f'expr={ {x: y for x, y in [(1, 2), (3, 4)]}}' > File "", line 1 > {x: y for x, y in [(1, 2), (3, 4)]} > ^ > IndentationError: unexpected indent > > In the PEP I'm going to specify that leading spaces are skipped in an > expression. So that last example will now work: > >>>> f'expr={ {x: y for x, y in [(1, 2), (3, 4)]}}' > 'expr={1: 2, 3: 4}' > > Note that the right braces in that last example aren't interpreted as a > doubled '}'. That's because the first one is part of the expression, and > the second one ends the expression. The only time doubling braces > matters is inside the string literal portion of an f-string. Sounds good. And even though your }} is perfectly valid, I'd recommend people use spaces at both ends: >>> f'expr={ {x: y for x, y in [(1, 2), (3, 4)]} }' 'expr={1: 2, 3: 4}' which presumably would be valid too. It's a narrow enough case (expressions beginning or ending with a brace) that the extra spaces won't be a big deal IMO. ChrisA From python-ideas at mgmiller.net Mon Aug 3 01:46:24 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Sun, 02 Aug 2015 16:46:24 -0700 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BD151D.6060702@trueblade.com> References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> Message-ID: <55BEABD0.7000604@mgmiller.net> Hi, I don't understand how we got to arbitrary expressions. There was probably an edge case or two, but I wasn't expecting str(eval(s)) to be the answer, and one I'm not sure I'd want. -Mike > On 8/1/2015 2:25 PM, Steven D'Aprano wrote: >> Wait, did I miss something? 
Does this mean that f-strings will >> essentially be syntactic sugar for str(eval(s))? >> From abarnert at yahoo.com Mon Aug 3 03:08:32 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 2 Aug 2015 18:08:32 -0700 Subject: [Python-ideas] Concurrency Modules In-Reply-To: <87egjlr51j.fsf@gmail.com> References: <559EFB73.5050606@mail.de> <9c139305-f583-46c1-b819-6a98dbd04acc@googlegroups.com> <55B2B0FB.1060409@mail.de> <55B3C93D.9090601@mail.de> <55B5508E.1000201@mail.de> <55B872BB.5080603@mail.de> <87egjlr51j.fsf@gmail.com> Message-ID: On Aug 2, 2015, at 10:09, Akira Li <4kir4.1i at gmail.com> wrote: > > Ludovic Gasc writes: > >> 2015-07-29 8:29 GMT+02:00 Sven R. Kunze : >> >>> Thanks Ludovic. >>> >>> On 28.07.2015 22:15, Ludovic Gasc wrote: >>> >>> Hello, >>> >>> This discussion is pretty interesting to try to list when each >>> architecture is the most efficient, based on the need. >>> >>> However, just a small precision: multiprocess/multiworker isn't antinomic >>> with AsyncIO: You can have an event loop in each process to try to combine >>> the "best" of two "worlds". >>> As usual in IT, it isn't a silver bullet that will care the cancer, >>> however, at least to my understanding, it should be useful for some >>> business needs like server daemons. >>> >>> >>> I think that should be clear for everybody using any of these modules. But >>> you are right to point it out explicitly. >> >> Based on my discussions at EuroPython and PyCON-US, it's certainly clear >> for the middle-class management of Python community, however, not really >> from the typical Python end-dev: Several persons tried to troll me that >> multiprocessing is more efficient than AsyncIO. >> >> To me, it was a opportunity to transform the negative troll attempt to a >> positive exchange about efficiency and understand before to troll ;-) >> More seriously, I've the feeling that it isn't very clear for everybody, >> especially for the new comers. 
>
> Do you mean those trolls that measure first then make
> conclusions ;)
>
> Could you provide an evidence-based description of the issue such as
> http://www.mailinator.com/tymaPaulMultithreaded.pdf
> but for Python?

The whole point of that post, and of the older von Behrens paper it references, is that a threading-like API can be built that uses explicit cooperative threading and dynamic stacks, and that avoids all of the problems with threads while retaining almost all of the advantages.

That sounds great. Which is probably why it's exactly what Python asyncio does. Just like von Behrens's thread package, it uses an event loop around poll (or something better) to drive a scheduler for coroutines. The only difference is that Python has coroutines natively, unlike Java or C, and with a nice API, so there's no reason to hide that API. (But if you really want to, you can just use gevent without its monkeypatching library, and then you've got an almost exact equivalent.)

In other words, in the terms used by mailinator, asyncio is exactly the thread package they suggest using instead of an event package. Their evidence is that something like asyncio can be built for Java, and we don't need evidence that something like asyncio could be built for Python because Guido already built it. You could compare asyncio with the coroutine API to asyncio with the lower-level callback API (or Twisted with inline callbacks to Twisted with coroutines, etc.), but what would be the point?

Of course multiprocessing vs. asyncio is a completely different question. Now that we have reasonably similar, well-polished APIs for both, people can start running comparisons. But it's pretty easy to predict what they'll find: for some applications, multiprocessing is better; for others, asyncio is better; for others, a simple combination of the two easily beats either alone; for others, it really doesn't make much difference because concurrency isn't even remotely the key issue.
The only thing that really matters to anyone is which is better for _their_ application, and that's something you can't extrapolate from a completely different test any better than you can guess it. From eric at trueblade.com Mon Aug 3 04:43:03 2015 From: eric at trueblade.com (Eric V. Smith) Date: Sun, 2 Aug 2015 22:43:03 -0400 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BEABD0.7000604@mgmiller.net> References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> Message-ID: <55BED537.8020000@trueblade.com> On 8/2/2015 7:46 PM, Mike Miller wrote: > Hi, > > I don't understand how we got to arbitrary expressions. I think here: https://mail.python.org/pipermail/python-ideas/2015-July/034701.html > There was probably an edge case or two, but I wasn't expecting > str(eval(s)) to be the answer, and one I'm not sure I'd want. As I pointed out earlier, it's not exactly str(eval(s)). Also, what's your concern with the suggested approach? There are no security concerns as there would be with eval-ing arbitrary strings. Eric. From eric at trueblade.com Mon Aug 3 04:43:50 2015 From: eric at trueblade.com (Eric V. 
Smith) Date: Sun, 2 Aug 2015 22:43:50 -0400 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BE8414.9010805@mail.de> References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com> <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com> <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <55BE8414.9010805@mail.de> Message-ID: <55BED566.5000109@trueblade.com> On 8/2/2015 4:56 PM, Sven R. Kunze wrote: > On 01.08.2015 19:43, Eric V. Smith wrote: >> Should we support !s and !r, like str.format does? It's not really >> needed, since with f-strings you can just call str or repr yourself: > > Just to get a proper understanding here: > > What is the recommended usage? repr and str or !r and !s? They're equivalent, so it wouldn't matter. I'd probably use !r and !s, myself. Eric. 
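For reference, the equivalence described above can be checked directly. This is only a minimal sketch, assuming an interpreter in which f-strings exist in the form that was later released (Python 3.6 or newer); the variable names are illustrative and do not come from the thread.

```python
# Illustrative value; any object whose str() and repr() differ will do.
value = "spam"

# Conversion flags, written the same way as in str.format().
with_flags = f"{value!s} {value!r}"

# Calling str() and repr() directly inside the replacement expression.
with_calls = f"{str(value)} {repr(value)}"

# The same conversion flags used with str.format(), for comparison.
with_format = "{!s} {!r}".format(value, value)

print(with_flags)
print(with_flags == with_calls == with_format)
```

Both spellings should produce the same text here, so the choice between them is a matter of style rather than behaviour.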
From breamoreboy at yahoo.co.uk Mon Aug 3 05:14:55 2015 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Mon, 3 Aug 2015 04:14:55 +0100 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com> <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com> <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <55BE3947.9020000@trueblade.com> <55BE3DE7.5080203@mrabarnett.plus.com> <55BE3EBB.1020102@trueblade.com> Message-ID: On 02/08/2015 20:12, Xavier Combelle wrote: > You could disambiguate with parenthesis like this f'expr={({x: y for x, > y in [(1, 2), (3, 4)]})}' > What on earth happened to "Readability counts"? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence From python-ideas at mgmiller.net Mon Aug 3 09:05:09 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Mon, 03 Aug 2015 00:05:09 -0700 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BED537.8020000@trueblade.com> References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> Message-ID: <55BF12A5.4050300@mgmiller.net> On 08/02/2015 07:43 PM, Eric V. Smith wrote: > On 8/2/2015 7:46 PM, Mike Miller wrote: >> I don't understand how we got to arbitrary expressions. 
> I think here:
> https://mail.python.org/pipermail/python-ideas/2015-July/034701.html

In that message, GvR seems to be exploring the options. I could be wrong, but from reading again, he appears to favor keeping it to .format() syntax?

>> There was probably an edge case or two, but I wasn't expecting

Did anyone discover the strategy below wasn't possible (moving the identifiers)?

    f'{x}{y}'           -->  '{}{}'.format(x, y)
    f'{x:10}{y[name]}'  -->  '{:10}{[name]}'.format(x, y)

> Also, what's your concern with the suggested approach? There are no security concerns
> as there would be with eval-ing arbitrary strings.

That's true I suppose, and perhaps I'm being irrational, but it feels like a complex solution, a Pandora's box if you will. To put it another way, it's way more power than I was expecting. It's rare that I use even the advanced features of .format() as it is.

I'm guessing that despite the magic happening behind the scenes, people will still think of the format string as an (interpolated) string, like the shell. If they want to write arbitrary expressions they can already do that in Python and then format a string with the answers. This will be another way to write code that's (as far as I know) not strictly necessary.

Also, I thought that the simpler the concept, the greater the likelihood of PEP acceptance. Binding the format string to .format() does that, in the mind at least, if not the implementation.

Still, if this is what most people want, I'll keep quiet from now on. ;)

(Thanks for taking this on, btw.)
-Mike From steve at pearwood.info Mon Aug 3 16:08:52 2015 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 4 Aug 2015 00:08:52 +1000 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BED537.8020000@trueblade.com> References: <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> Message-ID: <20150803140851.GD3737@ando.pearwood.info> On Sun, Aug 02, 2015 at 10:43:03PM -0400, Eric V. Smith wrote: > On 8/2/2015 7:46 PM, Mike Miller wrote: > > Hi, > > > > I don't understand how we got to arbitrary expressions. > > I think here: > https://mail.python.org/pipermail/python-ideas/2015-July/034701.html > > > There was probably an edge case or two, but I wasn't expecting > > str(eval(s)) to be the answer, and one I'm not sure I'd want. > > As I pointed out earlier, it's not exactly str(eval(s)). Also, what's > your concern with the suggested approach? There are no security concerns > as there would be with eval-ing arbitrary strings. Language features should be no more powerful than they need to be. It isn't just *security* that we should be concerned about, its also about readability, learnability, the likelihood of abuse by writing unmaintainable Perlish one-liners, and the general increase in complexity. Or to put it another way... YAGNI. We started of with a fairly simple and straightforward feature request: to make it easy to substitute named variables in format strings. We ought to be somewhat cautious about accepting even that limited version. After all, hundreds of languages don't have such a feature, and Python worked perfectly well without it for over 20 years. This doesn't add anything to the language that cannot already be done with % and str.format(). 
But suddenly we've gone from a feature request that has been routinely denied many times in the past (having variables be automatically substituted into strings), to full-blown evaluation of arbitrarily complex expressions being discussed as if it were a done-deal. I've heard of the trick of asking for a pony if you actually want a puppy, but this is the first time I've seen somebody ask for a puppy and be given a thoroughbred. Anyway, there's no harm done, since this is going through the PEP process. It just strikes me as so unlike the usual conservatism, particularly when it comes to syntax changes, that it surprised me. Perhaps somebody slipped something in the water? :-) -- Steve From rymg19 at gmail.com Mon Aug 3 16:22:24 2015 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Mon, 03 Aug 2015 09:22:24 -0500 Subject: [Python-ideas] Briefer string format In-Reply-To: <20150803140851.GD3737@ando.pearwood.info> References: <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <20150803140851.GD3737@ando.pearwood.info> Message-ID: <5C07D0CA-DCD2-4692-9D01-AA2AF6F61D0E@gmail.com> On August 3, 2015 9:08:52 AM CDT, Steven D'Aprano wrote: >On Sun, Aug 02, 2015 at 10:43:03PM -0400, Eric V. Smith wrote: >> On 8/2/2015 7:46 PM, Mike Miller wrote: >> > Hi, >> > >> > I don't understand how we got to arbitrary expressions. >> >> I think here: >> https://mail.python.org/pipermail/python-ideas/2015-July/034701.html >> >> > There was probably an edge case or two, but I wasn't expecting >> > str(eval(s)) to be the answer, and one I'm not sure I'd want. >> >> As I pointed out earlier, it's not exactly str(eval(s)). Also, what's >> your concern with the suggested approach? 
>> There are no security concerns
>> as there would be with eval-ing arbitrary strings.
>
>Language features should be no more powerful than they need to be. It
>isn't just *security* that we should be concerned about, its also about
>readability, learnability, the likelihood of abuse by writing
>unmaintainable Perlish one-liners, and the general increase in
>complexity.
>
>Or to put it another way... YAGNI.
>
>We started of with a fairly simple and straightforward feature request:
>to make it easy to substitute named variables in format strings. We
>ought to be somewhat cautious about accepting even that limited
>version.
>After all, hundreds of languages don't have such a feature, and Python
>worked perfectly well without it for over 20 years. This doesn't add
>anything to the language that cannot already be done with % and
>str.format().
>
>But suddenly we've gone from a feature request that has been routinely
>denied many times in the past (having variables be automatically
>substituted into strings), to full-blown evaluation of arbitrarily
>complex expressions being discussed as if it were a done-deal.
>
>I've heard of the trick of asking for a pony if you actually want a
>puppy, but this is the first time I've seen somebody ask for a puppy
>and be given a thoroughbred.
>
>Anyway, there's no harm done, since this is going through the PEP
>process. It just strikes me as so unlike the usual conservatism,
>particularly when it comes to syntax changes, that it surprised me.
>Perhaps somebody slipped something in the water? :-)

Nah, we've just reached the maximum number of times we can use % formatting before slowly drifting away into madness.

--
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.
From guido at python.org Mon Aug 3 16:44:42 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 3 Aug 2015 16:44:42 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: <20150803140851.GD3737@ando.pearwood.info> References: <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <20150803140851.GD3737@ando.pearwood.info> Message-ID: Steven, can you back down on the rhetoric? I don't think it's called for here. -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Mon Aug 3 17:14:56 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Mon, 03 Aug 2015 17:14:56 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BED566.5000109@trueblade.com> References: <76E93441-D589-4E1C-9A03-8448F0EF0B73@gmail.com> <55AD0376.7020000@trueblade.com> <55AD2B15.1080909@trueblade.com> <55AD368D.7020108@trueblade.com> <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <55BE8414.9010805@mail.de> <55BED566.5000109@trueblade.com> Message-ID: <55BF8570.9040409@mail.de> Maybe, it's just me but looking at the Zen of Python: "There should be one-- and preferably only one --obvious way to do it." it doesn't seem to be right to have both of them. On 03.08.2015 04:43, Eric V. Smith wrote: > On 8/2/2015 4:56 PM, Sven R. Kunze wrote: >> On 01.08.2015 19:43, Eric V. Smith wrote: >>> Should we support !s and !r, like str.format does? 
It's not really >>> needed, since with f-strings you can just call str or repr yourself: >> Just to get a proper understanding here: >> >> What is the recommended usage? repr and str or !r and !s? > They're equivalent, so it wouldn't matter. I'd probably use !r and !s, > myself. > > Eric. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From srkunze at mail.de Mon Aug 3 19:11:15 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Mon, 03 Aug 2015 19:11:15 +0200 Subject: [Python-ideas] fork In-Reply-To: References: <20150801173628.29BAB873FE@smtp04.mail.de> Message-ID: <55BFA0B3.1010702@mail.de> On 02.08.2015 02:02, Andrew Barnert wrote: > Your idea of having a single global "pool manager" object, where you > could submit tasks and, depending on how they're marked, they get > handled differently might have merit. But that's something you could > build pretty easily on top of concurrent.futures (at least for threads > vs. processes; you can add in coroutines later, because they're not > quite as easy to integrate), upload to PyPI, You mean something like this? https://pypi.python.org/pypi/xfork -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Mon Aug 3 19:36:36 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Mon, 03 Aug 2015 19:36:36 +0200 Subject: [Python-ideas] fork - other approaches In-Reply-To: <87bneqczd1.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20150801172904.C34AA87403@smtp04.mail.de> <87bneqczd1.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <55BFA6A4.2090806@mail.de> On 02.08.2015 02:23, Stephen J. Turnbull wrote: > Sven R. Kunze writes: > > > I am very certain that any them will work as intended. 
However, > > they do not zero in 100% on my main intentions: > > > > 1) easy to understand > > 2) exchangeable (seq par) > > Exchangeability is a property of the computational structure, not of > the language syntax. It is a property of both and this thread is about the latter. I am glad ThreadPool and ProcessPool have the same API. That is very helpful. > In particular, in languages that *have* a for > loop, you're also going to have side effects, and exchangeability will > fail if those side effects interact between problem components. > > Therefore you need at least two syntaxes: one to express sequential > iteration, and one to express parallizable computations. Since the > "for" syntax in Python has always meant sequential iteration, and the > computations allowed in the suite are unrestricted, you'd just be > asking for trouble. Sorry? From liik.joonas at gmail.com Mon Aug 3 21:29:04 2015 From: liik.joonas at gmail.com (Joonas Liik) Date: Mon, 3 Aug 2015 22:29:04 +0300 Subject: [Python-ideas] fork - other approaches In-Reply-To: <55BFA6A4.2090806@mail.de> References: <20150801172904.C34AA87403@smtp04.mail.de> <87bneqczd1.fsf@uwakimon.sk.tsukuba.ac.jp> <55BFA6A4.2090806@mail.de> Message-ID: perhaps something could be done with the "with" statement? with someParallelExecutor() as ex: # do something # the context manager might impose some restrictions on what can be done # .. but the context manager needs to get at the code in order to execute it in parallel somehow.. 
current = ex.get_next_item_or_something() do_something_in_parrallel_maybe() doSomethingElseWithNastyCapsCuzThisBitWasOriginallyFromJava() #:P ex.poke_at_the_cm() ex.current_iteration_variables.x = "i got no real cause for this line but it seems like doing it this way might possibly not be completely useless" now_notice_how_i_have_described_my_computation_as_a_list_of_steps_and_if_this_variable_name_was_shorter_itd_even_prolly_be_pythonic = True using the "for" keyword does fele nice and fuzzy but "with" is much closer to what we actually want. the problem really is that the with doesnt have enough power to change the execution of the inner block at the moment. From stephen at xemacs.org Tue Aug 4 04:46:24 2015 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 04 Aug 2015 11:46:24 +0900 Subject: [Python-ideas] fork - other approaches In-Reply-To: <55BFA6A4.2090806@mail.de> References: <20150801172904.C34AA87403@smtp04.mail.de> <87bneqczd1.fsf@uwakimon.sk.tsukuba.ac.jp> <55BFA6A4.2090806@mail.de> Message-ID: <8737zzdb3z.fsf@uwakimon.sk.tsukuba.ac.jp> Sven R. Kunze writes: > > Exchangeability is a property of the computational structure, not of > > the language syntax. > It is a property of both and this thread is about the latter. I am glad > ThreadPool and ProcessPool have the same API. That is very helpful. That's because they *can* have the same API, because the computational structure is mostly the same, and where it isn't, little to no confusion is possible. For example, the fact that a process-oriented task doesn't lock variables when reading or writing them is unlikely to matter because that task can't access global objects of the parent program anyway. In order to take advantage of that aspect of threads, you need to rewrite the task. Perhaps a better way to express what I meant is "Syntax can express exchangeability already present in the computational structure. It cannot impose exchangeability not present in the computational structure." 
> > In particular, in languages that *have* a for > > loop, you're also going to have side effects, and exchangeability will > > fail if those side effects interact between problem components. > > > > Therefore you need at least two syntaxes: one to express sequential > > iteration, and one to express parallizable computations. Since the > > "for" syntax in Python has always meant sequential iteration, and the > > computations allowed in the suite are unrestricted, you'd just be > > asking for trouble. > Sorry? Exactly what I said: you're trying to change a statement that has always meant sequential iteration of statements containing side effects like assignments, and have it also mean parallel execution where side effects need to be carefully controlled. That will cause trouble for people reading the code (eg, they now have to understand any function calls recursively to understand whether there might be any ambiguities), even if it doesn't necessarily cause trouble for you writing it. From abarnert at yahoo.com Tue Aug 4 05:21:45 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 3 Aug 2015 20:21:45 -0700 Subject: [Python-ideas] fork In-Reply-To: <55BFA0B3.1010702@mail.de> References: <20150801173628.29BAB873FE@smtp04.mail.de> <55BFA0B3.1010702@mail.de> Message-ID: <6A8EA952-ED98-4C26-9A40-54BE54367849@yahoo.com> On Aug 3, 2015, at 10:11, Sven R. Kunze wrote: > >> On 02.08.2015 02:02, Andrew Barnert wrote: >> Your idea of having a single global "pool manager" object, where you could submit tasks and, depending on how they're marked, they get handled differently might have merit. But that's something you could build pretty easily on top of concurrent.futures (at least for threads vs. processes; you can add in coroutines later, because they're not quite as easy to integrate), upload to PyPI, > > You mean something like this? > > https://pypi.python.org/pypi/xfork Did you just write this today? 
Then yes, that proves my point about how easy it is to write it. Now you just have to get people using it, get some experience with it, etc. and you can come back with a proposal to put something like this in the stdlib, add syntactic support, etc. that it will be hard for anyone to disagree with. (Or to discover that it has flaws that need to be fixed, or fundamental flaws that can't be fixed, before making the proposal.) One quick comment: from my experience (mostly with other languages that are very different from Python, so I can't promise how well it applies here...), implicit futures without implicit laziness or even an explicit delay mechanism are not as useful as they look at first glance. Code that forks off 8 Fibonacci calls, but waits for each one's result before forking off the next one, might as well have just stayed sequential. And if you're going to use the result by forking off another job, then it's actually more convenient to use explicit futures like the ones in the stdlib. One slightly bigger idea: If you really want to pursue your implicit-as-possible design further, you might want to consider making the decorators replace the function with an object whose __call__ method just implicitly submits it to the pool. Then you can use normal function-calling syntax and pretend everything is magic. You can even add operator dunder methods to your future class that do the same thing (so "result * 2" just builds a new future out of "self.get() * 2", either submitted to the pool, probably better, tacked on as an add_done_callback). I think there's a limit to how far you can push this without some mechanism to mark when you need to actual value (in ML-derived languages and C++, static types make this easier: a cast, implicit or explicit, forces a wait; in Python, that doesn't work), but it might be worth exploring that limit. Or it might be better to just stop at the magic function calls and leave the futures alone. 
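The decorator idea above can be sketched roughly as follows. This is a minimal illustration, not the xfork implementation: it assumes a thread pool from concurrent.futures as the shared pool, and the names `_pool`, `Forked` and `square` are hypothetical. It also shows the earlier point about submitting every task before waiting on any result.

```python
import functools
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared pool; a thread pool avoids pickling the wrapped function.
_pool = ThreadPoolExecutor(max_workers=4)


class Forked:
    """Wrap a function so that an ordinary call submits it to the pool."""

    def __init__(self, func):
        functools.update_wrapper(self, func)
        self._func = func

    def __call__(self, *args, **kwargs):
        # Returns a concurrent.futures.Future instead of the plain result.
        return _pool.submit(self._func, *args, **kwargs)


@Forked
def square(n):
    return n * n


# Submit all the work first, and only then wait for the results;
# waiting on each future right after submitting it would be sequential.
futures = [square(n) for n in range(8)]
results = [future.result() for future in futures]
print(results)
```

The explicit `.result()` call is still visible here; hiding it behind operator methods on the future is the further step discussed above and is not attempted in this sketch.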
-------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Tue Aug 4 20:09:33 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 04 Aug 2015 20:09:33 +0200 Subject: [Python-ideas] fork In-Reply-To: <6A8EA952-ED98-4C26-9A40-54BE54367849@yahoo.com> References: <20150801173628.29BAB873FE@smtp04.mail.de> <55BFA0B3.1010702@mail.de> <6A8EA952-ED98-4C26-9A40-54BE54367849@yahoo.com> Message-ID: <55C0FFDD.5020002@mail.de> On 04.08.2015 05:21, Andrew Barnert wrote: > On Aug 3, 2015, at 10:11, Sven R. Kunze > wrote: >> >> On 02.08.2015 02:02, Andrew Barnert wrote: >>> Your idea of having a single global "pool manager" object, where you >>> could submit tasks and, depending on how they're marked, they get >>> handled differently might have merit. But that's something you could >>> build pretty easily on top of concurrent.futures (at least for >>> threads vs. processes; you can add in coroutines later, because >>> they're not quite as easy to integrate), upload to PyPI, >> >> You mean something like this? >> >> https://pypi.python.org/pypi/xfork > > Did you just write this today? Then yes, that proves my point about > how easy it is to write it. Now you just have to get people using it, > get some experience with it, etc. and you can come back with a > proposal to put something like this in the stdlib, add syntactic > support, etc. that it will be hard for anyone to disagree with. (Or to > discover that it has flaws that need to be fixed, or fundamental flaws > that can't be fixed, before making the proposal.) I presented it today. The team members already showed interest. They also noted they like its simplicity. The missing syntax support seemed like minor issue compared to what complexity is hidden. Others admitted they knew about the existence of concurrent.futures and such but never used it due to - its complexity - AND *drum roll* the '.result()' of the future objects As it seems, it doesn't feel natural. 
> One quick comment: from my experience (mostly with other languages > that are very different from Python, so I can't promise how well it > applies here...), implicit futures without implicit laziness or even > an explicit delay mechanism are not as useful as they look at first > glance. Code that forks off 8 Fibonacci calls, but waits for each > one's result before forking off the next one, might as well have just > stayed sequential. And if you're going to use the result by forking > off another job, then it's actually more convenient to use explicit > futures like the ones in the stdlib. > > One slightly bigger idea: If you really want to pursue your > implicit-as-possible design further, you might want to consider making > the decorators replace the function with an object whose __call__ > method just implicitly submits it to the pool. I added two new decorators for this. But they don't work with the @ syntax. It seems like a well-known issue of Python: _pickle.PicklingError: Can't pickle : it's not the same object as __main__.fib_fork Would be great if somebody could fix that. > Then you can use normal function-calling syntax and pretend everything > is magic. You can even add operator dunder methods to your future > class that do the same thing (so "result * 2" just builds a new future > out of "self.get() * 2", either submitted to the pool, probably > better, tacked on as an add_done_callback). I think there's a limit to > how far you can push this without some mechanism to mark when you need > to actual value (in ML-derived languages and C++, static types make > this easier: a cast, implicit or explicit, forces a wait; in Python, > that doesn't work), but it might be worth exploring that limit. Or it > might be better to just stop at the magic function calls and leave the > futures alone. I actually like the idea of contagious futures and I might outline why this is not an issue with the current Python language. 
Have a look at the following small interactive Python session:

>>> 3+4
7
>>> _
7
>>> a=9
>>> _
7
>>> a+=10
>>> _
7
>>> a
19
>>> _
19
>>>

Question: When has the add operation been executed?

Answer: Unknown from the programmer's perspective.

Only requirement: Exceptions are raised exactly where the operation is supposed to take place in the source code (even if the operation that raises the exception is performed later).

Best,
Sven

From eric at trueblade.com Tue Aug 4 20:32:38 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Tue, 04 Aug 2015 14:32:38 -0400
Subject: [Python-ideas] Briefer string format
In-Reply-To: <55BED537.8020000@trueblade.com>
References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com>
Message-ID: <55C10546.8000304@trueblade.com>

On 08/02/2015 10:43 PM, Eric V. Smith wrote:
> On 8/2/2015 7:46 PM, Mike Miller wrote:
>> Hi,
>>
>> I don't understand how we got to arbitrary expressions.
>
> I think here:
> https://mail.python.org/pipermail/python-ideas/2015-July/034701.html

Actually, a better link is:
https://mail.python.org/pipermail/python-ideas/2015-July/034729.html
where I discuss the pros and cons of str.format-like expressions, versus full expressions. Plus, Guido's response.

I hope to have the first draft of a PEP ready in the next few days. I'll also look at putting my implementation online somewhere.

Eric.

From srkunze at mail.de Tue Aug 4 20:34:41 2015
From: srkunze at mail.de (Sven R.
Kunze)
Date: Tue, 04 Aug 2015 20:34:41 +0200
Subject: [Python-ideas] fork - other approaches
In-Reply-To: <8737zzdb3z.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20150801172904.C34AA87403@smtp04.mail.de> <87bneqczd1.fsf@uwakimon.sk.tsukuba.ac.jp> <55BFA6A4.2090806@mail.de> <8737zzdb3z.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <55C105C1.7010206@mail.de>

On 04.08.2015 04:46, Stephen J. Turnbull wrote:
> Perhaps a better way to express what I meant is "Syntax can express
> exchangeability already present in the computational structure. It
> cannot impose exchangeability not present in the computational
> structure."

I completely agree with this. So, we still need a syntax. ;)

As the table in the thread 'Concurrency Modules' suggests, coroutines aren't that different if they can fit into that matrix alongside processes and threads. Just internal technical differences and therefore different properties (let me stress that: that is overly desirable) but a common usage still leaves much to be desired.

> Exactly what I said: you're trying to change a statement that has
> always meant sequential iteration of statements containing side
> effects like assignments, and have it also mean parallel execution
> where side effects need to be carefully controlled. That will cause
> trouble for people reading the code (eg, they now have to understand
> any function calls recursively to understand whether there might be
> any ambiguities), even if it doesn't necessarily cause trouble for you
> writing it.

I never said I wanted to change the 'for' loop. Your logic ('you need this, so you need that and thus you need these') came to that conclusion, but it definitely wasn't me. And I am not sure I agree with that conclusion.
From abarnert at yahoo.com Tue Aug 4 21:38:57 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 4 Aug 2015 12:38:57 -0700 Subject: [Python-ideas] fork In-Reply-To: <55C0FFDD.5020002@mail.de> References: <20150801173628.29BAB873FE@smtp04.mail.de> <55BFA0B3.1010702@mail.de> <6A8EA952-ED98-4C26-9A40-54BE54367849@yahoo.com> <55C0FFDD.5020002@mail.de> Message-ID: On Aug 4, 2015, at 11:09, Sven R. Kunze wrote: > >> On 04.08.2015 05:21, Andrew Barnert wrote: >>> On Aug 3, 2015, at 10:11, Sven R. Kunze wrote: >>> >>>> On 02.08.2015 02:02, Andrew Barnert wrote: >>>> Your idea of having a single global "pool manager" object, where you could submit tasks and, depending on how they're marked, they get handled differently might have merit. But that's something you could build pretty easily on top of concurrent.futures (at least for threads vs. processes; you can add in coroutines later, because they're not quite as easy to integrate), upload to PyPI, >>> >>> You mean something like this? >>> >>> https://pypi.python.org/pypi/xfork >> >> Did you just write this today? Then yes, that proves my point about how easy it is to write it. Now you just have to get people using it, get some experience with it, etc. and you can come back with a proposal to put something like this in the stdlib, add syntactic support, etc. that it will be hard for anyone to disagree with. (Or to discover that it has flaws that need to be fixed, or fundamental flaws that can't be fixed, before making the proposal.) > > I presented it today. The team members already showed interest. They also noted they like its simplicity. The missing syntax support seemed like minor issue compared to what complexity is hidden. > > Others admitted they knew about the existence of concurrent.futures and such but never used it due to > - its complexity > - AND *drum roll* the '.result()' of the future objects > As it seems, it doesn't feel natural. 
I don't know how to put this nicely, but I think anyone who finds the complexity of concurrent.futures too daunting to even attempt to learn it should not be working on any code that uses less explicit concurrency. I have taught concurrent.futures to rank novices in a brief personal session or a single StackOverflow answer and they responded, "Wow, I didn't realize it could be this simple". Someone who can't grasp it is almost certain to be someone who introduces races all over your code and can't even understand the problem, much less debug it. >> One quick comment: from my experience (mostly with other languages that are very different from Python, so I can't promise how well it applies here...), implicit futures without implicit laziness or even an explicit delay mechanism are not as useful as they look at first glance. Code that forks off 8 Fibonacci calls, but waits for each one's result before forking off the next one, might as well have just stayed sequential. And if you're going to use the result by forking off another job, then it's actually more convenient to use explicit futures like the ones in the stdlib. >> >> One slightly bigger idea: If you really want to pursue your implicit-as-possible design further, you might want to consider making the decorators replace the function with an object whose __call__ method just implicitly submits it to the pool. > > I added two new decorators for this. But they don't work with the @ syntax. It seems like a well-known issue of Python: > > _pickle.PicklingError: Can't pickle : it's not the same object as __main__.fib_fork > > Would be great if somebody could fix that. > >> Then you can use normal function-calling syntax and pretend everything is magic. You can even add operator dunder methods to your future class that do the same thing (so "result * 2" just builds a new future out of "self.get() * 2", either submitted to the pool, probably better, tacked on as an add_done_callback). 
I think there's a limit to how far you can push this without some mechanism to mark when you need to actual value (in ML-derived languages and C++, static types make this easier: a cast, implicit or explicit, forces a wait; in Python, that doesn't work), but it might be worth exploring that limit. Or it might be better to just stop at the magic function calls and leave the futures alone. > > I actually like the idea of contagious futures and I might outline why this is not an issue with the current Python language. > > Have a look at the following small interactive Python session: > > >>> 3+4 > 7 > >>> _ > 7 > >>> a=9 > >>> _ > 7 > >>> a+=10 > >>> _ > 7 > >>> a > 19 > >>> _ > 19 > >>> > > > Question: > When has the add operation being executed? > > Answer: > Unknown from the programmer's perspective. Not true. The language clearly defines when each step happens. The a.__add__ method is called, then the result is assigned to a, then the statement finishes. (Then, in the next statement, nothing happens--except, because this is happening in the interactive interpreter, and it's an expression statement, after the statement finishes doing nothing, the value of the expression is assigned to _ and its repr is printed out.) This ordering relationship may be very important if the variable a is shared by multiple threads, especially if more than one thread may modify it, especially if you're using non-atomic operations like += (where another thread can read, use, and assign the variable between the __add__ call and the assignment). If a references a mutable object with an __iadd__ method, the variable doesn't even need to be shared, only the value, for this to matter. The only way to safely ignore these problems is to never share any variables or any mutable values between threads. (This is why concurrency features are easier to design in pure functional languages.) 
Hiding this fact when you or the people you're hiding it from don't even understand the issue is exactly how you create races. > Only requirement: > Exceptions are raised exactly where the operation is supposed to take place in the source code (even if the operation that raises the exception is performed later). -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-ideas at mgmiller.net Tue Aug 4 22:05:34 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Tue, 04 Aug 2015 13:05:34 -0700 Subject: [Python-ideas] Briefer string format In-Reply-To: <55C10546.8000304@trueblade.com> References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <55C10546.8000304@trueblade.com> Message-ID: <55C11B0E.9010608@mgmiller.net> Hi, In that message there was a logical step that I don't follow: > For example: > '{a.foo}'.format(a=b[c]) > > If we limit f-strings to just what str.format() string expressions can > represent, it would be impossible to represent this with an f-string, > without an intermediate assignment. > For example: > f'{a[2:3]:20d}' > > We need to extract the expression "a[2:3]" and the format spec "20d". I > can't just scan for a colon any more, I've got to actually parse the > expression until I find a "}", ":", or "!" that's not part of the > expression so that I know where it ends. There was a solution to this that came up early in the discussion, moving the identifier only: f'{x}{y}' --> '{}{}'.format(x, y) f'{x:10}{y[name]}' --> '{:10}{[name]}'.format(x, y) I missed the part where this was rejected. 
As far as I can tell from your message, it is because it would be hard to parse? But, it seems no harder than other solutions. I've whipped up a simple implementation below.

Also, Guido sounds supportive of your general process, but to my knowledge has not explicitly called for arbitrary expressions to be included. Perhaps he could do that, or encourage us to find a more conservative solution?

Sorry to be a pain, but I think this part is important to get right.

-Mike

Simple script to illustrate (just ascii, only one format op supported).

TL;DR: the idea is to grab the identifier portion by examining the class of each character, then move it over to a .format function call argument.

    import string

    idchars = string.ascii_letters + string.digits + '_'  # + unicode letters
    capture = None
    isid = None
    fstring = '{a[2:3]:20d}'
    #~ fstring = '{a.foo}'
    identifier = []
    fmt_spec = []

    for char in fstring:
        print(char + ', ', end='')
        if char == '{':
            print('start_capture ', end='')
            capture = True
            isid = True
        elif char == '}':
            print('end_capture')
            capture = False
            break
        else:
            if capture:
                if (char in idchars) and isid:
                    identifier.append(char)
                else:
                    isid = False
                    fmt_spec.append(char)

    identifier = ''.join(identifier)
    fmt_spec = ''.join(fmt_spec)

    print()
    print('identifier:', repr(identifier))
    print('fmt_spec: ', repr(fmt_spec))
    print('result: ', "'{%s}'.format(%s)" % (fmt_spec, identifier))

And the results:

    ?python3 fstr.py
    {, start_capture a, [, 2, :, 3, ], :, 2, 0, d, }, end_capture

    identifier: 'a'
    fmt_spec: '[2:3]:20d'
    result: '{[2:3]:20d}'.format(a)

On 08/04/2015 11:32 AM, Eric V. Smith wrote:
> Actually, a better link is:
> https://mail.python.org/pipermail/python-ideas/2015-July/034729.html
> where I discuss the pros and cons of str.format-like expressions, versus
> full expressions. Plus, Guido's response.

From eric at trueblade.com Tue Aug 4 22:20:10 2015
From: eric at trueblade.com (Eric V.
Smith) Date: Tue, 4 Aug 2015 16:20:10 -0400 Subject: [Python-ideas] Briefer string format In-Reply-To: <55C11B0E.9010608@mgmiller.net> References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <55C10546.8000304@trueblade.com> <55C11B0E.9010608@mgmiller.net> Message-ID: <55C11E7A.9030606@trueblade.com> On 8/4/2015 4:05 PM, Mike Miller wrote: > Hi, > > In that message there was a logical step that I don't follow: > >> For example: >> '{a.foo}'.format(a=b[c]) >> >> If we limit f-strings to just what str.format() string expressions can >> represent, it would be impossible to represent this with an f-string, >> without an intermediate assignment. > >> For example: >> f'{a[2:3]:20d}' >> >> We need to extract the expression "a[2:3]" and the format spec "20d". I >> can't just scan for a colon any more, I've got to actually parse the >> expression until I find a "}", ":", or "!" that's not part of the >> expression so that I know where it ends. > > There was a solution to this that came up early in the discussion, > moving the identifier only: > > f'{x}{y}' --> '{}{}'.format(x, y) > f'{x:10}{y[name]}' --> '{:10}{[name]}'.format(x, y) > > I missed the part where this was rejected. As far as I can tell from > your message, it is because it would be hard to parse? But, it seems no > harder than other solutions. I've whipped up a simple implementation > below. 
> It's rejected because .format treats: '{:10}{[name]}'.format(x, y) --> format(x, '10') + format(y['name']) and we (for some definition of "we") would like: f'{x:10}{y[name]}' --> format(x, '10') + format(y[name]) It's the change from y[name] to y['name'] that Guido rejected for f-strings. And I agree: it's unfortunate that str.format works this way. It would have been better just to say that the subscripted value must be a literal number for str.format, but it's too late for that. It's not hard to parse either way. All of the machinery exists to use either the str.format approach, or the full expression approach. > Also, Guido sounds supportive of your general process, but to my > knowledge has not explicitly called for arbitrary expressions to be > included. Perhaps he could do that, or encourage us to find a more > conservative solution? True, he hasn't definitively stated his approval for arbitrary expressions. I think it logically follows from our discussions. But if he'd like to rule on it one way or the other before I'm done with the PEP draft, that's fine with me. Or, we can just wait for the PEP. Personally, now that I have a working implementation that I've been using, I have to say that full expressions are pretty handy. And while I agree you don't want to be putting hyper-complicated dict comprehensions with lots of function calls into an f-string, the same can be said of many places we allow expressions. > Sorry to be a pain, but I think this part is important to get right. No problem. It's all part of the discussion. Eric. 
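The difference Eric describes between str.format subscripts and full expressions can be sketched as follows. This is only an illustration: the dictionary `y`, the variable `name`, and their values are made up here and do not come from the thread. The second form spells out the `format(x, '10') + format(y[name])` expansion from the message by hand, rather than relying on any particular f-string implementation.

```python
# Hypothetical values chosen for illustration; they are not from the thread.
x = 42
name = "key"
y = {"name": "literal", "key": "variable"}

# str.format treats the subscript as the literal string 'name',
# so this looks up y['name'].
via_format = "{:10}{[name]}".format(x, y)

# Evaluating y[name] as a full expression uses the value of the
# variable `name` instead, so this looks up y['key'].
via_expression = format(x, "10") + format(y[name])

print(repr(via_format))      # '        42literal'
print(repr(via_expression))  # '        42variable'
```

The same source text `y[name]` therefore selects different dictionary entries under the two readings, which is the incompatibility under discussion.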
From guido at python.org Tue Aug 4 22:52:49 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Aug 2015 22:52:49 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: <55C11E7A.9030606@trueblade.com> References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <55C10546.8000304@trueblade.com> <55C11B0E.9010608@mgmiller.net> <55C11E7A.9030606@trueblade.com> Message-ID: OK, fine, I'll say right now that I agree with Eric's arguments for full expressions. (Though honestly the whole look of f-strings hasn't quite grown on me. I wish I could turn back the clock and make expression substitution a feature of all string literals, perhaps using \{...}, which IIRC I've seen in some other language.) On Tue, Aug 4, 2015 at 10:20 PM, Eric V. Smith wrote: > On 8/4/2015 4:05 PM, Mike Miller wrote: > > Hi, > > > > In that message there was a logical step that I don't follow: > > > >> For example: > >> '{a.foo}'.format(a=b[c]) > >> > >> If we limit f-strings to just what str.format() string expressions can > >> represent, it would be impossible to represent this with an f-string, > >> without an intermediate assignment. > > > >> For example: > >> f'{a[2:3]:20d}' > >> > >> We need to extract the expression "a[2:3]" and the format spec "20d". I > >> can't just scan for a colon any more, I've got to actually parse the > >> expression until I find a "}", ":", or "!" that's not part of the > >> expression so that I know where it ends. 
> > > > There was a solution to this that came up early in the discussion, > > moving the identifier only: > > > > f'{x}{y}' --> '{}{}'.format(x, y) > > f'{x:10}{y[name]}' --> '{:10}{[name]}'.format(x, y) > > > > I missed the part where this was rejected. As far as I can tell from > > your message, it is because it would be hard to parse? But, it seems no > > harder than other solutions. I've whipped up a simple implementation > > below. > > > > It's rejected because .format treats: > '{:10}{[name]}'.format(x, y) --> format(x, '10') + format(y['name']) > > and we (for some definition of "we") would like: > f'{x:10}{y[name]}' --> format(x, '10') + format(y[name]) > > It's the change from y[name] to y['name'] that Guido rejected for > f-strings. And I agree: it's unfortunate that str.format works this way. > It would have been better just to say that the subscripted value must be > a literal number for str.format, but it's too late for that. > > It's not hard to parse either way. All of the machinery exists to use > either the str.format approach, or the full expression approach. > > > Also, Guido sounds supportive of your general process, but to my > > knowledge has not explicitly called for arbitrary expressions to be > > included. Perhaps he could do that, or encourage us to find a more > > conservative solution? > > True, he hasn't definitively stated his approval for arbitrary > expressions. I think it logically follows from our discussions. But if > he'd like to rule on it one way or the other before I'm done with the > PEP draft, that's fine with me. Or, we can just wait for the PEP. > > Personally, now that I have a working implementation that I've been > using, I have to say that full expressions are pretty handy. And while I > agree you don't want to be putting hyper-complicated dict comprehensions > with lots of function calls into an f-string, the same can be said of > many places we allow expressions. 
> > > Sorry to be a pain, but I think this part is important to get right. > > No problem. It's all part of the discussion. > > Eric. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Tue Aug 4 23:03:27 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 04 Aug 2015 23:03:27 +0200 Subject: [Python-ideas] fork In-Reply-To: References: <20150801173628.29BAB873FE@smtp04.mail.de> <55BFA0B3.1010702@mail.de> <6A8EA952-ED98-4C26-9A40-54BE54367849@yahoo.com> <55C0FFDD.5020002@mail.de> Message-ID: <55C1289F.10109@mail.de> On 04.08.2015 21:38, Andrew Barnert wrote: > I think anyone who finds the complexity of concurrent.futures too > daunting to even attempt to learn it should not be working on any code > that uses less explicit concurrency. I am sorry because I disagree here with you. > I have taught concurrent.futures to rank novices in a brief personal > session or a single StackOverflow answer and they responded, "Wow, I > didn't realize it could be this simple". Nobody says that concurrent.futures is not an vast improvement over previous approaches. But it is still not the end of the line of simplifications. > Someone who can't grasp it is almost certain to be someone who > introduces races all over your code and can't even understand the > problem, much less debug it. Nobody wants races, yet everybody still talks about them. Don't allow races in the first place and be done with it. > Not true. The language clearly defines when each step happens. The > a.__add__ method is called, then the result is assigned to a, then the > statement finishes. 
(Then, in the next statement, nothing > happens--except, because this is happening in the interactive > interpreter, and it's an expression statement, after the statement > finishes doing nothing, the value of the expression is assigned to _ > and its repr is printed out.) Where can find this definition in the docs? To me, we are talking about class customization as described on reference/datamodel.html. Seems like an implementation detail, not a language detail. I am not saying, CPython doesn't do it like that, but I saying the Python language could support lazy evaluation and not disagreeing with the docs. > This ordering relationship may be very important if the variable a is > shared by multiple threads, especially if more than one thread may > modify it, especially if you're using non-atomic operations like += > (where another thread can read, use, and assign the variable between > the __add__ call and the assignment). If a references a mutable object > with an __iadd__ method, the variable doesn't even need to be shared, > only the value, for this to matter. The only way to safely ignore > these problems is to never share any variables or any mutable values > between threads. Mutual variables are global variables. And these have gone out of style quite some time ago. Btw. this is races again and I thought we agreed on not having them because nobody really can/wants to debug them. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Tue Aug 4 23:16:58 2015 From: eric at trueblade.com (Eric V. 
Smith) Date: Tue, 4 Aug 2015 17:16:58 -0400 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <55C10546.8000304@trueblade.com> <55C11B0E.9010608@mgmiller.net> <55C11E7A.9030606@trueblade.com> Message-ID: <55C12BCA.5080303@trueblade.com> On 8/4/2015 4:52 PM, Guido van Rossum wrote: > OK, fine, I'll say right now that I agree with Eric's arguments for full > expressions. Thanks. > (Though honestly the whole look of f-strings hasn't quite grown on me. I > wish I could turn back the clock and make expression substitution a > feature of all string literals, perhaps using \{...}, which IIRC I've > seen in some other language.) Well, we could do that with a future statement. It might be tough to ever make it the default, though. But since it would only be literals, it's easy enough to find. Eric. From eric at trueblade.com Tue Aug 4 23:59:03 2015 From: eric at trueblade.com (Eric V. 
Smith) Date: Tue, 4 Aug 2015 17:59:03 -0400 Subject: [Python-ideas] Briefer string format In-Reply-To: <55C12BCA.5080303@trueblade.com> References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <55C10546.8000304@trueblade.com> <55C11B0E.9010608@mgmiller.net> <55C11E7A.9030606@trueblade.com> <55C12BCA.5080303@trueblade.com> Message-ID: <708534C6-1A53-4036-B3EC-8B8EAB30B9E8@trueblade.com> > On Aug 4, 2015, at 5:16 PM, Eric V. Smith wrote: > >> On 8/4/2015 4:52 PM, Guido van Rossum wrote: >> (Though honestly the whole look of f-strings hasn't quite grown on me. I >> wish I could turn back the clock and make expression substitution a >> feature of all string literals, perhaps using \{...}, which IIRC I've >> seen in some other language.) > > Well, we could do that with a future statement. It might be tough to > ever make it the default, though. By which I meant __future__ import. > But since it would only be literals, it's easy enough to find. By which I meant the affected literals would be easy to find and mechanically convert. Sorry for the inexact wording. Eric. 
From python-ideas at mgmiller.net Wed Aug 5 01:05:54 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Tue, 04 Aug 2015 16:05:54 -0700 Subject: [Python-ideas] Briefer string format In-Reply-To: <55C11E7A.9030606@trueblade.com> References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <55C10546.8000304@trueblade.com> <55C11B0E.9010608@mgmiller.net> <55C11E7A.9030606@trueblade.com> Message-ID: <55C14552.9090807@mgmiller.net> On 08/04/2015 01:20 PM, Eric V. Smith wrote: > It's rejected because .format treats: > '{:10}{[name]}'.format(x, y) --> format(x, '10') + format(y['name']) Isn't this what already happens? Seems odd to go in a different direction just avoid an implementation that already exists, even though it may not be perfect. Perhaps it's time to deprecate the troublesome syntax? Fortunately there's plenty of time before the next version of python to figure this out. 
-Mike

From python-ideas at mgmiller.net Wed Aug 5 01:29:09 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Tue, 04 Aug 2015 16:29:09 -0700
Subject: [Python-ideas] Briefer string format
In-Reply-To: <55C14552.9090807@mgmiller.net>
References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <55C10546.8000304@trueblade.com> <55C11B0E.9010608@mgmiller.net> <55C11E7A.9030606@trueblade.com> <55C14552.9090807@mgmiller.net>
Message-ID: <55C14AC5.3030001@mgmiller.net>

Sorry to reply to myself...

I'm hoping we could consider a .format()-only implementation as Plan B, alongside your Plan A with arbitrary expressions.

-Mike

From sag150430 at utdallas.edu Wed Aug 5 02:22:51 2015
From: sag150430 at utdallas.edu (Grayson, Samuel Andrew)
Date: Wed, 5 Aug 2015 00:22:51 +0000
Subject: [Python-ideas] Use the plus operator to concatenate iterators
Message-ID:

Concatenation is the most fundamental operation that can be done on iterators. In fact, we already do that with lists.

    [1, 2, 3] + [4, 5, 6]
    # evaluates to [1, 2, 3, 4, 5, 6]

I propose:

    iter([1, 2, 3]) + iter([4, 5, 6])
    # evaluates to something like itertools.chain(iter([1, 2, 3]), iter([4, 5, 6]))
    # equivalent to iter([1, 2, 3, 4, 5, 6])

There is some python2 code where:

    a = dict(zip('abcd', range(4)))
    isinstance(a.values(), list)
    alphabet = a.keys() + a.values()

In python2, this `alphabet` becomes a list of all values and keys

In current python3, this raises:

    TypeError: unsupported operand type(s) for +: 'dict_keys' and 'dict_values'

But in my proposal, it works just fine.
`alphabet` becomes an iterator over all values and keys (similar to the python2 case). Sincerely, Sam G -------------- next part -------------- An HTML attachment was scrubbed... URL: From ron3200 at gmail.com Wed Aug 5 02:32:06 2015 From: ron3200 at gmail.com (Ron Adam) Date: Tue, 04 Aug 2015 20:32:06 -0400 Subject: [Python-ideas] Briefer string format In-Reply-To: <55C14552.9090807@mgmiller.net> References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <55C10546.8000304@trueblade.com> <55C11B0E.9010608@mgmiller.net> <55C11E7A.9030606@trueblade.com> <55C14552.9090807@mgmiller.net> Message-ID: On 08/04/2015 07:05 PM, Mike Miller wrote: > > On 08/04/2015 01:20 PM, Eric V. Smith wrote: >> It's rejected because .format treats: >> '{:10}{[name]}'.format(x, y) --> format(x, '10') + format(y['name']) > > Isn't this what already happens? Seems odd to go in a different direction > just avoid an implementation that already exists, even though it may not be > perfect. > > Perhaps it's time to deprecate the troublesome syntax? > > Fortunately there's plenty of time before the next version of python to > figure this out. Since "f" strings don't exist yet, they could be handled with a different method. '{x:10}{y[name]}'.__fmt__(x=x, y=y, name=name) The string isn't altered here, which may help with error messages, and all names are supplied as keywords explicitly. But is there a migration path that would work? 
Cheers, Ron From joejev at gmail.com Wed Aug 5 02:43:12 2015 From: joejev at gmail.com (Joseph Jevnik) Date: Tue, 4 Aug 2015 20:43:12 -0400 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: Message-ID: Iterators all all different types though. iter(list) returns a list_iterator type, iter(dict.keys()) returns a dict_keys_iterator type and so on. Is your suggestion that the standard lib types do this? How do we update all of the existing iterators not in the stdlib that do not do this? Finally, how is this better than itertools.chain? On Tue, Aug 4, 2015 at 8:22 PM, Grayson, Samuel Andrew < sag150430 at utdallas.edu> wrote: > Concatenation is the most fundamental operation that can be done on > iterators. In fact, we already do that with lists. > > [1, 2, 3] + [4, 5, 6] > # evaluates to [1, 2, 3, 4, 5, 6] > > I propose: > > iter([1, 2, 3]) + iter([4, 5, 6]) > # evaluates to something like itertools.chain(iter([1, 2, 3]), > iter([4, 5, 6])) > # equivalent to iter([1, 2, 3, 4, 5, 6]) > > There is some python2 code where: > > a = dict(zip('abcd', range(4))) > isinstance(a.values(), list) > alphabet = a.keys() + a.values() > > In python2, this `alphabet` becomes a list of all values and keys > > In current python3, this raises: > > TypeError: unsupported operand type(s) for +: 'dict_keys' and > 'dict_values' > > But in my proposal, it works just fine. `alphabet` becomes an iterator > over all values and keys (similar to the python2 case). > > Sincerely, > Sam G > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexander.belopolsky at gmail.com Wed Aug 5 03:01:07 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 4 Aug 2015 21:01:07 -0400 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: Message-ID: On Tue, Aug 4, 2015 at 8:43 PM, Joseph Jevnik wrote: > Iterators all all different types though. iter(list) returns a list_iterator > type, iter(dict.keys()) returns a dict_keys_iterator type and so on. Is your > suggestion that the standard lib types do this? How do we update all of the > existing iterators not in the stdlib that do not do this? In theory, this can be done inside PyNumber_Add(x, y). It already checks for numbers or sequences and failing that can check for the __next__ method on its first operand and return itertools.chain(x, y). > Finally, how is this better than itertools.chain? Shorter. Especially when you chain more than two iterators. Nevertheless, I am -1 on the idea. It is bad enough that Python abuses + as sequences concatenation operator. 
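For comparison, a minimal sketch of what the existing itertools.chain spelling looks like for the dict example from the original proposal, and for chaining more than two iterables. The variable names `error`, `alphabet` and `three_way` are introduced here for illustration only.

```python
import itertools

a = dict(zip("abcd", range(4)))

# In Python 3, dict views do not support +, so this raises TypeError.
try:
    a.keys() + a.values()
except TypeError as exc:
    error = str(exc)

# itertools.chain concatenates any iterables lazily; list() is used here
# only to materialise the result for display.
alphabet = list(itertools.chain(a.keys(), a.values()))

# chain accepts any number of iterables in a single call.
three_way = list(itertools.chain([1, 2, 3], [4, 5, 6], [7]))

print(error)
print(alphabet)
print(three_way)
```

The multi-argument call is the case where a hypothetical `x + y + z` would be shorter to write, which is the trade-off being weighed above.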
From rosuav at gmail.com Wed Aug 5 03:01:37 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 5 Aug 2015 11:01:37 +1000
Subject: [Python-ideas] Use the plus operator to concatenate iterators
In-Reply-To:
References:
Message-ID:

On Wed, Aug 5, 2015 at 10:22 AM, Grayson, Samuel Andrew wrote:
> I propose:
>
>     iter([1, 2, 3]) + iter([4, 5, 6])
>     # evaluates to something like itertools.chain(iter([1, 2, 3]), iter([4, 5, 6]))
>     # equivalent to iter([1, 2, 3, 4, 5, 6])

Try this:

    class iter:
        iter = iter # snapshot the original iter()
        def __init__(self, iterable):
            self.it = self.iter(iterable)
            self.next = []
        def __iter__(self): return self
        def __next__(self):
            while self.next:
                try: return next(self.it)
                except StopIteration: self.it, *self.next = self.next
            return next(self.it) # Allow StopIteration to bubble when it's the last one
        def __add__(self, other):
            result = self.__class__(self.it)
            result.next = self.next + [self.iter(other)]
            return result

As long as you explicitly call iter() on something, you get the
ability to add two iterators together. I haven't checked for odd edge
cases, but something like this does work, and should work on all
Python versions.

ChrisA

From joejev at gmail.com Wed Aug 5 03:02:52 2015
From: joejev at gmail.com (Joseph Jevnik)
Date: Tue, 4 Aug 2015 21:02:52 -0400
Subject: [Python-ideas] Use the plus operator to concatenate iterators
In-Reply-To:
References:
Message-ID:

That would get _really_ messy with iterators that define an __add__,
sequence concat or number add function

On Tue, Aug 4, 2015 at 9:01 PM, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

> On Tue, Aug 4, 2015 at 8:43 PM, Joseph Jevnik wrote:
> > Iterators are all different types though. iter(list) returns a list_iterator
> > type, iter(dict.keys()) returns a dict_keys_iterator type and so on. Is your
> > suggestion that the standard lib types do this? How do we update all of the
> > existing iterators not in the stdlib that do not do this?
> > In theory, this can be done inside PyNumber_Add(x, y). It already > checks for numbers or sequences and failing that can check for the > __next__ method on its first operand and return itertools.chain(x, > y). > > > Finally, how is this better than itertools.chain? > > Shorter. Especially when you chain more than two iterators. > > Nevertheless, I am -1 on the idea. It is bad enough that Python > abuses + as sequences concatenation operator. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Aug 5 03:06:53 2015 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 5 Aug 2015 11:06:53 +1000 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: Message-ID: On Wed, Aug 5, 2015 at 11:01 AM, Alexander Belopolsky wrote: > Nevertheless, I am -1 on the idea. It is bad enough that Python > abuses + as sequences concatenation operator. I'm -1 on this idea, but I disagree that concrete sequence concatenation is bad. (I'm not sure it applies to sequence protocol, incidentally; it's specific to lists and tuples.) Being able to add a list and a list is a perfectly reasonable feature. ChrisA From alexander.belopolsky at gmail.com Wed Aug 5 03:17:53 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 4 Aug 2015 21:17:53 -0400 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: Message-ID: On Tue, Aug 4, 2015 at 9:06 PM, Chris Angelico wrote: > (I'm not sure it applies to sequence protocol, > incidentally; it's specific to lists and tuples.) 
https://docs.python.org/3/c-api/sequence.html#c.PySequence_Concat

From alexander.belopolsky at gmail.com Wed Aug 5 03:21:44 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 4 Aug 2015 21:21:44 -0400
Subject: [Python-ideas] Use the plus operator to concatenate iterators
In-Reply-To:
References:
Message-ID:

On Tue, Aug 4, 2015 at 9:06 PM, Chris Angelico wrote:
> Being able to add a list and a list is a perfectly reasonable feature.

Sure

>>> from numpy import array
>>> array([1,2,3]) + array([3,2,1])
array([4, 4, 4])

.. and how do I concatenate those things?

From sag150430 at utdallas.edu Wed Aug 5 03:21:45 2015
From: sag150430 at utdallas.edu (Grayson, Samuel Andrew)
Date: Wed, 5 Aug 2015 01:21:45 +0000
Subject: [Python-ideas] Use the plus operator to concatenate iterators
In-Reply-To:
References: ,
Message-ID:

In the case that `__add__` is already defined by the iterator, I propose to
use that. Otherwise, I propose to concatenate the iterators.

This serves two purposes: convenience and backwards-compatibility/consistency.

Convenience: Imagine having to do `operator.mul(5, 4)` instead of `5 * 4`, or
`list_chain([1, 2, 3], [4, 5, 6])` instead of `[1, 2, 3] + [4, 5, 6]`.
Following this pattern for commonly used operators, we would write
`range(5) + 'abc'` instead of `itertools.chain(range(5), 'abc')`. (Notice also
that if you act like a list, but redefine the __add__ method, then the default
behavior is overridden. In the same way, I propose that if you redefine the
__add__ method, then the proposed default behavior (concatenation) is
overridden.)

Backwards-compatibility: This helps backwards-compatibility where lists in
python2 changed to iterators in python3, everywhere except for concatenation
via the plus operator.
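
The "use __add__ if it exists, otherwise concatenate" rule described above can be approximated in pure Python. This is only a sketch: add_or_chain is a hypothetical helper invented for illustration, not part of the proposal or of any library, and the real proposal would change the behaviour of the + operator itself rather than add a function.

```python
import itertools


def add_or_chain(left, right):
    """Hypothetical helper: try the ordinary +, fall back to chaining."""
    try:
        # Ordinary addition covers numbers, lists, numpy arrays and anything
        # else that defines __add__ or __radd__.
        return left + right
    except TypeError:
        # Neither operand supports +. Calling iter() here makes a
        # non-iterable operand fail immediately rather than on first use.
        return itertools.chain(iter(left), iter(right))


list_sum = add_or_chain([1, 2], [3])             # list.__add__ wins
number_sum = add_or_chain(2, 3)                  # int.__add__ wins
chained = list(add_or_chain(iter([1, 2]), "ab")) # fallback: concatenation
```

One limitation of this sketch is that a TypeError raised for an unrelated reason inside a user-defined __add__ would also trigger the fallback, which is part of why the thread treats the dispatch question as non-trivial.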
From liik.joonas at gmail.com Wed Aug 5 03:32:15 2015 From: liik.joonas at gmail.com (Joonas Liik) Date: Wed, 5 Aug 2015 04:32:15 +0300 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: Message-ID: no opinion if its a good idea or not, however, if iterator + iterator... iterator * number From rosuav at gmail.com Wed Aug 5 03:34:56 2015 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 5 Aug 2015 11:34:56 +1000 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: Message-ID: On Wed, Aug 5, 2015 at 11:32 AM, Joonas Liik wrote: > no opinion if its a good idea or not, however, if iterator + iterator... > > iterator * number Almost completely useless. Multiplying an *iterable* by an integer will often be useful (eg multiplying list by int), but multiplying an *iterator* (or even just adding one to itself) is going to be useless, because the first time through it will exhaust it, and any well-behaved iterator will remain exhausted once it's ever raised StopIteration. (Note that the 'class iter' that I posted earlier is NOT well-behaved. You can add something onto an exhausted iterator and rejuvenate it. It'd take a couple extra lines of code to fix that.) ChrisA From sag150430 at utdallas.edu Wed Aug 5 03:31:37 2015 From: sag150430 at utdallas.edu (Grayson, Samuel Andrew) Date: Wed, 5 Aug 2015 01:31:37 +0000 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: , Message-ID: I propose that if the iterator class overrides the __add__ method, then use that instead of concatenation. 
Therefore

    numpy.array([1, 2, 3]) + numpy.array([3, 2, 1])
    # evaluates to numpy.array([4, 4, 4])

It is already done with the __mul__ function

    a = [1, 2, 3] # or a tuple
    a * 2
    # evaluates to [1, 2, 3, 1, 2, 3]

    b = np.array([1, 2, 3])
    b * 2
    # evaluates to np.array([2, 4, 6])

Sincerely,
Sam G

From cs at zip.com.au Wed Aug 5 04:03:33 2015
From: cs at zip.com.au (Cameron Simpson)
Date: Wed, 5 Aug 2015 12:03:33 +1000
Subject: [Python-ideas] proposal: "python -m foo" should bind sys.modules['foo']
In-Reply-To: <55C0B32E.7090900@wielicki.name>
References: <55C0B32E.7090900@wielicki.name>
Message-ID: <20150805020333.GA25179@cskk.homeip.net>

Hello all,

This is a writeup of a proposal I floated here:

    https://mail.python.org/pipermail/python-list/2015-August/694905.html

last Sunday. If the response is positive I wish to write a PEP.

Briefly, users naturally expect that the command:

    python -m module_name ...

used to invoke modules in "main program" mode on the command line, imports the
module as "module_name". It does not; it imports it as "__main__". An import
within the program of "module_name" makes a new instance of the module, which
causes cognitive dissonance and has the side effect that now the program has
two instances of the module.

What I propose is that the above command line _should_ bind
sys.modules['module_name'] as well as binding '__main__' as it does currently.
I'm proposing that the python -m option have this effect (python pseudocode):

    % python -m module.name ...

runs:

    # pseudocode, with values hardwired for clarity
    import sys
    M = new_empty_module(name='__main__', qualname='module.name')
    sys.modules['__main__'] = M
    sys.modules['module.name'] = M
    # load the module code from wherever (not necessarily a file - CPython
    # already must do this phase)
    M.execfile('/path/to/module/name.py')

Specifically, this would make the following two changes to current practice:

1) the module is imported _once_, and bound to both its canonical name and
   also to __main__.
2) imported modules acquire a new attribute __qualname__ (analogous to the
   recent __qualname__ on functions). This is always the canonical name of the
   module as resolved by the importer. For most modules __name__ will be the
   same as __qualname__, but for the "main" module __name__ will be '__main__'.

This change has the following advantages:

The current standard boilerplate:

    if __name__ == '__main__':
        ... invoke "main program" here ...

continues to work unchanged.

Importantly, if the program then issues "import module_name", it is already
there and the existing instance is found and used.

The thread referenced above outlines my most recent encounter with this and
the trouble it caused me. Followup messages include some support for this
proposed change, and some criticism.

The critiquing article included some workarounds for this multiple module
situation, but they were (1) somewhat dependent on modules coming from a file
pathname and (2) cumbersome and require every end user to adopt these changes
if affected by the situation. I'd like to avoid that.

Cheers,
Cameron Simpson

The reasonable man adapts himself to the world; the unreasonable one persists
in trying to adapt the world to himself. Therefore all progress depends on
the unreasonable man. - George Bernard Shaw

From random832 at fastmail.us Wed Aug 5 04:37:00 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Tue, 04 Aug 2015 22:37:00 -0400
Subject: [Python-ideas] Use the plus operator to concatenate iterators
In-Reply-To:
References:
Message-ID: <1438742220.33029.347997193.05339E19@webmail.messagingengine.com>

On Tue, Aug 4, 2015, at 21:34, Chris Angelico wrote:
> Almost completely useless.
Multiplying an *iterable* by an integer
> will often be useful (eg multiplying list by int), but multiplying an
> *iterator* (or even just adding one to itself) is going to be useless,
> because the first time through it will exhaust it, and any
> well-behaved iterator will remain exhausted once it's ever raised
> StopIteration. (Note that the 'class iter' that I posted earlier is
> NOT well-behaved. You can add something onto an exhausted iterator and
> rejuvenate it. It'd take a couple extra lines of code to fix that.)

Suppose iterator * number returns a new iterator which will iterate
through the original iterator once, caching the results, and then yield
the cached results n-1 times.

From sag150430 at utdallas.edu Wed Aug 5 04:37:03 2015
From: sag150430 at utdallas.edu (Grayson, Samuel Andrew)
Date: Wed, 5 Aug 2015 02:37:03 +0000
Subject: [Python-ideas] Use the plus operator to concatenate iterators
In-Reply-To:
References: ,
Message-ID:

If we are barking up that tree,

    import itertools

    _old_iter = iter

    class iter (object):
        def __init__(self, it):
            self.it = _old_iter(it)
        def __iter__(self):
            return self
        def __next__(self):
            return next(self.it)
        def __add__(self, other):
            return iter(itertools.chain(self.it, other))
        def __radd__(self, other):
            return iter(itertools.chain(other, self.it))

    >>> list('wxy' + iter([1, 2]) + range(3, 5) + 'abc')
    ['w', 'x', 'y', 1, 2, 3, 4, 'a', 'b', 'c']

From rosuav at gmail.com Wed Aug 5 04:44:57 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 5 Aug 2015 12:44:57 +1000
Subject: [Python-ideas] Use the plus operator to concatenate iterators
In-Reply-To: <1438742220.33029.347997193.05339E19@webmail.messagingengine.com>
References: <1438742220.33029.347997193.05339E19@webmail.messagingengine.com>
Message-ID:

On Wed, Aug 5, 2015 at 12:37 PM, wrote:
> On Tue, Aug 4, 2015, at 21:34, Chris Angelico wrote:
>> Almost completely useless.
Multiplying an *iterable* by an integer
>> will often be useful (eg multiplying list by int), but multiplying an
>> *iterator* (or even just adding one to itself) is going to be useless,
>> because the first time through it will exhaust it, and any
>> well-behaved iterator will remain exhausted once it's ever raised
>> StopIteration. (Note that the 'class iter' that I posted earlier is
>> NOT well-behaved. You can add something onto an exhausted iterator and
>> rejuvenate it. It'd take a couple extra lines of code to fix that.)
>
> Suppose iterator * number returns a new iterator which will iterate
> through the original iterator once, caching the results, and then yield
> the cached results n-1 times.

That's easily spelled "list(iterator) * number", apart from the fact
that it makes a concrete result list. I don't think it needs language
support.

ChrisA

From steve at pearwood.info Wed Aug 5 04:15:37 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 5 Aug 2015 12:15:37 +1000
Subject: [Python-ideas] Use the plus operator to concatenate iterators
In-Reply-To:
References:
Message-ID: <20150805021532.GI3737@ando.pearwood.info>

On Wed, Aug 05, 2015 at 12:22:51AM +0000, Grayson, Samuel Andrew wrote:

> Concatenation is the most fundamental operation that can be done on iterators.

Surely "get next value" is the most fundamental operation that can be
done on iterators. Supporting concatenation is not even part of the
definition of iterator.

But having said that, concatenation does make sense as an iterator
method. Python chooses to make that a function, itertools.chain, rather
than a method or operator. See below.

> In fact, we already do that with lists.
>
>     [1, 2, 3] + [4, 5, 6]
>     # evaluates to [1, 2, 3, 4, 5, 6]
>
> I propose:
>
>     iter([1, 2, 3]) + iter([4, 5, 6])
>     # evaluates to something like itertools.chain(iter([1, 2, 3]), iter([4, 5, 6]))
>     # equivalent to iter([1, 2, 3, 4, 5, 6])

I don't entirely dislike this.
I'm not a big fan of Python's choice to use + for concatenation, but the principle of supporting concatenation for iterators makes sense. But, "iterator" isn't a type in Python, it is a protocol, so there are a whole lot of *different* types that count as iterators, and as far as I can see, they don't share any common superclass apart from object itself. I count at least nine in the builtins alone: range_iterator, list_iterator, tuple_iterator, str_iterator, set_iterator, dict_keyiterator, dict_valueiterator, dict_itemiterator, generator (These are distinct from the types range, list, tuple, etc.) and custom-made iterators don't have to inherit from any special class, they just need to obey the protocol. So where would you put the __add__ and __radd__ methods? The usual Pythonic solution to the problem of where to put a method that needs to operate on a lot of disparate types with no shared superclass is to turn it into a function. We already have that: itertools.chain. A bonus with chain is that there is no need to manually convert each argument to an iterator first, it does it for you: chain(this, that, another) versus iter(this) + iter(that) + iter(another) And the bonus with chain() is that you can start using it *right now*, and not wait another two years for Python 3.6. > There is some python2 code where: > > a = dict(zip('abcd', range(4))) > isinstance(a.values(), list) > alphabet = a.keys() + a.values() > > In python2, this `alphabet` becomes a list of all values and keys > > In current python3, this raises: > > TypeError: unsupported operand type(s) for +: 'dict_keys' and 'dict_values' > > But in my proposal, it works just fine. `alphabet` becomes an iterator > over all values and keys (similar to the python2 case). dict_keys and dict_values are not iterators, they are set-like views, and concatenating them does not make sense. 
The Python 2 equivalent of Python 3's `a.keys() + a.values()` is

    a.viewkeys() + a.viewvalues()

which also raises TypeError, as it should. Or to put it another way, the
Python 3 equivalent of the Python 2 code is this:

    list(a.keys()) + list(a.values())

Either way, since dict keys and values aren't iterators, any change to
the iterator protocol or support for iterator concatenation won't change
them.

--
Steve

From steve at pearwood.info Wed Aug 5 05:01:47 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 5 Aug 2015 13:01:47 +1000
Subject: [Python-ideas] Use the plus operator to concatenate iterators
In-Reply-To: <1438742220.33029.347997193.05339E19@webmail.messagingengine.com>
References: <1438742220.33029.347997193.05339E19@webmail.messagingengine.com>
Message-ID: <20150805030147.GL3737@ando.pearwood.info>

On Tue, Aug 04, 2015 at 10:37:00PM -0400, random832 at fastmail.us wrote:

> Suppose iterator * number returns a new iterator which will iterate
> through the original iterator once, caching the results, and then yield
> the cached results n-1 times.

Repetition on an arbitrary iterator is ambiguous. If I say, "repeat the
list [1,2,3] twice" there is no ambiguity, I must get [1, 2, 3, 1, 2, 3]
or there is some problem. But iterators can have non-deterministic
lengths and values:

    def gen():
        while random.random() < 0.9:
            yield random.random()

What does it mean to "repeat gen twice"? It might mean either of:

- generate one run of values using gen, then repeat those same values;

- generate two runs of values using gen.

Both are easy to write, e.g.:

    list(gen())*2 # explicitly cache the values, then repeat
    chain(gen(), gen()) # explicitly use two separate runs

In the face of ambiguity, resist the temptation to guess. There's no
obvious right behaviour here, whichever you bake into iterator * you
will make about half the users unhappy because it doesn't support their
use-case.

Not every simple expression needs to be an operator.
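
The two readings of "repeat gen twice" above can be put side by side in a small runnable sketch. The fixed seed is an assumption added here only so that both variants start from the same random state and the difference is visible; it is not part of the original argument.

```python
import itertools
import random


def gen():
    # Same generator as in the message above: random length, random values.
    while random.random() < 0.9:
        yield random.random()


# Reading 1: one run of values, cached and then repeated.
random.seed(0)
cached_twice = list(gen()) * 2

# Reading 2: two separate runs, each drawing fresh random numbers.
random.seed(0)
two_runs = list(itertools.chain(gen(), gen()))

run_length = len(cached_twice) // 2
```

In the first variant the second half is by construction a copy of the first half. In the second variant only the first run matches, because the second call to gen() continues from wherever the random number generator was left.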
--
Steve

From rosuav at gmail.com Wed Aug 5 06:29:17 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 5 Aug 2015 14:29:17 +1000
Subject: [Python-ideas] Use the plus operator to concatenate iterators
In-Reply-To: <20150805030147.GL3737@ando.pearwood.info>
References: <1438742220.33029.347997193.05339E19@webmail.messagingengine.com> <20150805030147.GL3737@ando.pearwood.info>
Message-ID:

On Wed, Aug 5, 2015 at 1:01 PM, Steven D'Aprano wrote:
> Repetition on an arbitrary iterator is ambiguous. If I say, "repeat the
> list [1,2,3] twice" there is no ambiguity, I must get [1, 2, 3, 1, 2, 3]
> or there is some problem. But iterators can have non-deterministic
> lengths and values:
>
>     def gen():
>         while random.random() < 0.9:
>             yield random.random()
>
> What does it mean to "repeat gen twice"? It might mean either of:
>
> - generate one run of values using gen, then repeat those same values;
>
> - generate two runs of values using gen.

Actually, that's a generator, which is an example of an *iterable*,
not an *iterator*. An iterator would be gen(), not gen. There's no
logical way to go back to the iterator's source and say "Please sir, I
want some more"; compare:

    x = iter([1,2,3,4,5])
    next(x); next(x)
    print(list(x*2))

What does "doubling" an iterator that's already partly consumed do?
Asking to go back to the original list is wrong; asking to duplicate
(by caching) the results that we'd already get makes sense. With a
generator, it's no different - you could chain two generator objects
called from the same function, but the generator object doesn't know
what function it was called from (at least, I don't think it does). So
there's only one possible meaning for doubling an iterator, and it's
list(x).
ChrisA From ericsnowcurrently at gmail.com Wed Aug 5 07:01:23 2015 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 4 Aug 2015 23:01:23 -0600 Subject: [Python-ideas] proposal: "python -m foo" should bind sys.modules['foo'] In-Reply-To: <20150805020333.GA25179@cskk.homeip.net> References: <55C0B32E.7090900@wielicki.name> <20150805020333.GA25179@cskk.homeip.net> Message-ID: On Aug 4, 2015 8:30 PM, "Cameron Simpson" wrote: > > Hello all, > > This is a writeup of a proposal I floated here: > https://mail.python.org/pipermail/python-list/2015-August/694905.html > last Sunday. If the response is positive I wish to write a PEP. Be sure to read through PEP 495. -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From cs at zip.com.au Wed Aug 5 07:46:30 2015 From: cs at zip.com.au (Cameron Simpson) Date: Wed, 5 Aug 2015 15:46:30 +1000 Subject: [Python-ideas] proposal: "python -m foo" should bind sys.modules['foo'] In-Reply-To: References: Message-ID: <20150805054630.GA66989@cskk.homeip.net> On 04Aug2015 23:01, Eric Snow wrote: >On Aug 4, 2015 8:30 PM, "Cameron Simpson" wrote: >> This is a writeup of a proposal I floated here: >> https://mail.python.org/pipermail/python-list/2015-August/694905.html >> last Sunday. If the response is positive I wish to write a PEP. > >Be sure to read through PEP 495. Hmm. Done: http://legacy.python.org/dev/peps/pep-0495/ Was there something specific I should have been looking for in there? Cheers, Cameron Simpson Raw data, like raw sewage, needs some processing before it can be spread around. The opposite is true of theories. - James A. 
Carr

From ericsnowcurrently at gmail.com Wed Aug 5 10:14:55 2015
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 5 Aug 2015 02:14:55 -0600
Subject: [Python-ideas] proposal: "python -m foo" should bind sys.modules['foo']
In-Reply-To: <20150805054630.GA66989@cskk.homeip.net>
References: <20150805054630.GA66989@cskk.homeip.net>
Message-ID:

On Aug 4, 2015 11:46 PM, "Cameron Simpson" wrote:
>
> On 04Aug2015 23:01, Eric Snow wrote:
>> Be sure to read through PEP 495.

Sorry, I meant 395.

-eric

From cs at zip.com.au Wed Aug 5 11:01:54 2015
From: cs at zip.com.au (Cameron Simpson)
Date: Wed, 5 Aug 2015 19:01:54 +1000
Subject: [Python-ideas] proposal: "python -m foo" should bind sys.modules['foo']
In-Reply-To:
References:
Message-ID: <20150805090154.GA81586@cskk.homeip.net>

On 05Aug2015 02:14, Eric Snow wrote:
>On Aug 4, 2015 11:46 PM, "Cameron Simpson" wrote:
>>
>> On 04Aug2015 23:01, Eric Snow wrote:
>>> Be sure to read through PEP 495.
>
>Sorry, I meant 395.

Ah, ok, many thanks. I've now read this, particularly this section:

    http://legacy.python.org/dev/peps/pep-0395/#fixing-pickling-without-breaking-introspection

I see that Guido has lent Nick the time machine, as that section outlines a
scheme almost word for word what I propose. Though not quite.

I see that this was withdrawn, and after reading the whole PEP and the
withdrawal statement at the top I think there are two problems with the PEP.
One is that, as stated, several of these issues have since been addressed
elsewhere (though not my issue). The other is that it tried to address a whole
host of issues which are related more by sharing the import system than
necessarily being closely related of themselves, though clearly there are
several scenarios that need considering to ensure that one fix doesn't make
other things worse.
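
For readers who have not run into the dual import, here is a small sketch that reproduces it. The module name demo_mod and the helper run_demo are hypothetical names made up for this illustration. It assumes a default CPython setup in which "python -m" puts the current directory on sys.path (so no -P option or PYTHONSAFEPATH setting).

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Source of a throwaway module that imports itself by its canonical name
# when it is run as the main program.
MODULE_SOURCE = textwrap.dedent('''
    import sys
    print("executing body as", __name__)
    if __name__ == "__main__":
        import demo_mod
        print("same object:", sys.modules["__main__"] is demo_mod)
''')


def run_demo():
    """Write demo_mod.py to a temporary directory and run 'python -m demo_mod'."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_mod.py").write_text(MODULE_SOURCE)
        proc = subprocess.run(
            [sys.executable, "-m", "demo_mod"],
            cwd=tmp, capture_output=True, text=True, check=True,
        )
        return proc.stdout.splitlines()


output_lines = run_demo()
```

The output shows the module body running twice, once as __main__ and once as demo_mod, and the final line reports that the two module objects are not the same. Under the proposal, the second execution would not happen and the identity check would succeed.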
I still wish to put forth my proposal on its own, probably PEPed, for the
following reasons:

(a) at present the multiple import via __main__/"python -m" is still not fixed

(b) the fix here:

    http://legacy.python.org/dev/peps/pep-0395/#fixing-dual-imports-of-the-main-module

    seems more oriented around keeping sys.path sane than directly avoiding a
    dual import

(c) my suggestion reuses the __qualname__ proposal almost as PEP 395 suggested

(d) it can't break anything because modules do not presently have a
    __qualname__

(e) it would automatically remove a very surprising edge case that is very
    easy to trip over, i.e. by doing nothing very weird, just plain old
    imports.

Therefore I'd still like commentary on my quite limited and small proposal,
with an eye to PEPing it and actually getting it approved.

Cheers,
Cameron Simpson

Yesterday, I was running a CNC plasma cutter that's controlled by Windows XP.
This is a machine that moves around a plasma torch that cuts thick steel
plate. A "New Java update is available" window popped up while I was working.
Not good.
- John Nagle From guido at python.org Wed Aug 5 11:49:08 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Aug 2015 11:49:08 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: <55C14AC5.3030001@mgmiller.net> References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <55C10546.8000304@trueblade.com> <55C11B0E.9010608@mgmiller.net> <55C11E7A.9030606@trueblade.com> <55C14552.9090807@mgmiller.net> <55C14AC5.3030001@mgmiller.net> Message-ID: I can only promise that will be considered if you write up a full proposal and specification, to compete with Eric's PEP. (I won't go as far as requiring you to provide an implementation, like Eric.) On Wed, Aug 5, 2015 at 1:29 AM, Mike Miller wrote: > Sorry to reply to myself... > > I'm hoping we could consider a .format()-only implementation as Plan B, > alongside your Plan A with arbitrary expressions. > > > -Mike > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed Aug 5 12:49:42 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 05 Aug 2015 12:49:42 +0200 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: Message-ID: <55C1EA46.8080907@mail.de> 1) It always sucks when moving from lists to iterators. 2) As you showed, transition to Python 3 is made easier. 
Thus: +1 for me On 05.08.2015 02:22, Grayson, Samuel Andrew wrote: > Concatenation is the most fundamental operation that can be done on > iterators. In fact, we already do that with lists. > > [1, 2, 3] + [4, 5, 6] > # evaluates to [1, 2, 3, 4, 5, 6] > > I propose: > > iter([1, 2, 3]) + iter([4, 5, 6]) > # evaluates to something like itertools.chain(iter([1, 2, 3]), > iter([4, 5, 6])) > # equivalent to iter([1, 2, 3, 4, 5, 6]) > > There is some python2 code where: > > a = dict(zip('abcd', range(4))) > isinstance(a.values(), list) > alphabet = a.keys() + a.values() > > In python2, this `alphabet` becomes a list of all values and keys > > In current python3, this raises: > > TypeError: unsupported operand type(s) for +: 'dict_keys' and > 'dict_values' > > But in my proposal, it works just fine. `alphabet` becomes an iterator > over all values and keys (similar to the python2 case). > > Sincerely, > Sam G > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Aug 5 13:13:24 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Aug 2015 13:13:24 +0200 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: <55C1EA46.8080907@mail.de> References: <55C1EA46.8080907@mail.de> Message-ID: Honestly, this has been brought up and rejected so many times I don't think it's even worth discussing. Maybe someone should write a PEP with the proposal so that we can reject it and direct future discussion to the PEP. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed Aug 5 13:34:17 2015 From: srkunze at mail.de (Sven R. 
Kunze) Date: Wed, 05 Aug 2015 13:34:17 +0200 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: <55C1EA46.8080907@mail.de> Message-ID: <55C1F4B9.6000209@mail.de> Good idea. What are the reasons for the rejections? On 05.08.2015 13:13, Guido van Rossum wrote: > Honestly, this has been brought up and rejected so many times I don't > think it's even worth discussing. Maybe someone should write a PEP > with the proposal so that we can reject it and direct future > discussion to the PEP. > > -- > --Guido van Rossum (python.org/~guido ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Aug 5 13:47:11 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Aug 2015 13:47:11 +0200 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: <55C1F4B9.6000209@mail.de> References: <55C1EA46.8080907@mail.de> <55C1F4B9.6000209@mail.de> Message-ID: Read the thread. On Wed, Aug 5, 2015 at 1:34 PM, Sven R. Kunze wrote: > Good idea. > > What are the reasons for the rejections? > > > On 05.08.2015 13:13, Guido van Rossum wrote: > > Honestly, this has been brought up and rejected so many times I don't > think it's even worth discussing. Maybe someone should write a PEP with the > proposal so that we can reject it and direct future discussion to the PEP. > > -- > --Guido van Rossum (python.org/~guido ) > > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed Aug 5 16:00:52 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 05 Aug 2015 16:00:52 +0200 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: <55C1EA46.8080907@mail.de> <55C1F4B9.6000209@mail.de> Message-ID: <55C21714.5020705@mail.de> 60% of the thread are about the ominous * operator on lists. << Seriously? 
The thread started with a simple and useful idea; not with some rarely used feature. 35% of the thread are about some internal implementation issues. <<< Maybe not easy but I am certain you will sort that out. 5% about its usefulness and simplicity. <<< I stick to that. On 05.08.2015 13:47, Guido van Rossum wrote: > Read the thread. > > On Wed, Aug 5, 2015 at 1:34 PM, Sven R. Kunze > wrote: > > Good idea. > > What are the reasons for the rejections? > > > On 05.08.2015 13:13, Guido van Rossum wrote: >> Honestly, this has been brought up and rejected so many times I >> don't think it's even worth discussing. Maybe someone should >> write a PEP with the proposal so that we can reject it and direct >> future discussion to the PEP. >> >> -- >> --Guido van Rossum (python.org/~guido ) > > > > > -- > --Guido van Rossum (python.org/~guido ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From edk141 at gmail.com Wed Aug 5 16:15:40 2015 From: edk141 at gmail.com (Ed Kellett) Date: Wed, 05 Aug 2015 14:15:40 +0000 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: <55C21714.5020705@mail.de> References: <55C1EA46.8080907@mail.de> <55C1F4B9.6000209@mail.de> <55C21714.5020705@mail.de> Message-ID: On Wed, 5 Aug 2015 at 15:01 Sven R. Kunze wrote: > 60% of the thread are about the ominous * operator on lists. << Seriously? > The thread started with a simple and useful idea; not with some rarely used > feature. > 35% of the thread are about some internal implementation issues. <<< Maybe > not easy but I am certain you will sort that out. > 5% about its usefulness and simplicity. <<< I stick to that. > I don't think the type issue is an "internal implementation issue". I don't think it should get in the way of doing something useful, but it is a problem that'd need to be solved. That said, there are solutions, and as far as I can tell it's the only problem that's been raised in the thread. 
My own view is that this would be a good thing, but only because of the deeper problem that it's inconvenient to call functions in Python (they quickly make for rparen soup) and impossible to define custom operators. `+` shouldn't really mean concatenation, but it *does* mean concatenation in Python, and I think concatenation is a pretty reasonable thing to want to do. The clumsiness of itertools.chain on iterators compared with + on lists feels like discouragement from using iterators, even where they're clearly the better solution. edk -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Aug 5 16:16:28 2015 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 5 Aug 2015 15:16:28 +0100 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: <55C21714.5020705@mail.de> References: <55C1EA46.8080907@mail.de> <55C1F4B9.6000209@mail.de> <55C21714.5020705@mail.de> Message-ID: On 5 August 2015 at 15:00, Sven R. Kunze wrote: > 60% of the thread are about the ominous * operator on lists. << Seriously? > The thread started with a simple and useful idea; not with some rarely used > feature. > 35% of the thread are about some internal implementation issues. <<< Maybe > not easy but I am certain you will sort that out. > 5% about its usefulness and simplicity. <<< I stick to that. There is no single iterator type that can have a "+" operator defined on it. If you pick one or more such types, other (possibly user defined) types will not work with "+" and this will confuse users. itertools.chain is available to do this in an unambiguous and straightforward manner, its only downside (for some people) is that it's slightly more verbose. 
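As a rough illustration of the itertools.chain approach mentioned above (the generator and the sample data are invented for this sketch; only itertools.chain itself comes from the thread):

```python
import itertools


def numbers():
    # A generator function; calling it returns a generator (an iterator).
    yield 1
    yield 2


# chain() accepts any mix of iterables: a generator, a list iterator
# and a dict view are consumed one after the other, lazily.
combined = itertools.chain(numbers(), iter([3, 4]), {"a": 0}.keys())
result = list(combined)
print(result)

# chain.from_iterable() does the same for an iterable of iterables.
flattened = list(itertools.chain.from_iterable([[1, 2], (3,), "ab"]))
print(flattened)
```

None of the arguments needs to define "+" for this to work, which is the point being made about chain being unambiguous.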
Paul

From p.f.moore at gmail.com Wed Aug 5 16:26:21 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 5 Aug 2015 15:26:21 +0100
Subject: [Python-ideas] Use the plus operator to concatenate iterators
In-Reply-To:
References: <55C1EA46.8080907@mail.de> <55C1F4B9.6000209@mail.de> <55C21714.5020705@mail.de>
Message-ID:

On 5 August 2015 at 15:15, Ed Kellett wrote:
> That said, there are solutions, and as far as I can tell it's the only
> problem that's been raised in the thread.

Are there solutions? No-one has come up with one, to my knowledge.

You need to cover

1. All of the many internal iterator types in Python:

>>> type(iter([]))
<class 'list_iterator'>
>>> type({}.keys())
<class 'dict_keys'>
>>> type({}.values())
<class 'dict_values'>
>>> type({}.items())
<class 'dict_items'>
>>> type(iter((1,2)))
<class 'tuple_iterator'>

Please don't dismiss this as "just a matter of coding". It's also
necessary to remember to do this for every new custom iterator type.

2. Generator functions and expressions:

>>> def f():
...     yield 1
...     yield 2
...
>>> type(f())
<class 'generator'>
>>> type((i for i in [1,2,3]))
<class 'generator'>

"just another internal type", yeah I know...

3. User defined iterators:

>>> class I:
...     def __next__(self):
...         return 1
...
>>> type(I())
<class '__main__.I'>

It is *not* acceptable to ask all users to retro-fit an "__add__"
method to all their custom iterator types. Even if it were, this would
fail for types which are their own iterator and have a custom __add__
of their own.

4. Iterators defined in 3rd party modules, which is similar to
user-defined iterators, but with a much wider user base, and much more
stringent compatibility issues to consider.

Hand-waving away the real issues here is not an option.
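The list above can be checked interactively. The following is only a sketch (the labels and the gen() helper are invented here); it prints, for a few of the objects mentioned, whether each is an iterator, whether it is merely iterable, and whether its type defines __add__:

```python
from collections.abc import Iterable, Iterator


def gen():
    yield 1


# Labels are made up for this sketch; the objects mirror the list above.
samples = {
    "list iterator": iter([]),
    "tuple iterator": iter((1, 2)),
    "generator": gen(),
    "dict keys view": {}.keys(),
    "list": [],
}

report = {}
for label, obj in samples.items():
    report[label] = (
        isinstance(obj, Iterator),     # has __next__ and __iter__
        isinstance(obj, Iterable),     # has __iter__
        hasattr(type(obj), "__add__"), # does the type define "+"?
    )
    print(label, type(obj).__name__, report[label])
```

Run on CPython 3, this shows a dict view reported as iterable but not as an iterator, and none of the iterator types defining __add__, so each type would need separate treatment.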
Paul From abarnert at yahoo.com Wed Aug 5 16:30:24 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 5 Aug 2015 07:30:24 -0700 Subject: [Python-ideas] fork In-Reply-To: <55C1289F.10109@mail.de> References: <20150801173628.29BAB873FE@smtp04.mail.de> <55BFA0B3.1010702@mail.de> <6A8EA952-ED98-4C26-9A40-54BE54367849@yahoo.com> <55C0FFDD.5020002@mail.de> <55C1289F.10109@mail.de> Message-ID: <991E71A1-FDB3-4BE5-B607-B74850903DCD@yahoo.com> On Aug 4, 2015, at 14:03, Sven R. Kunze wrote: > >> On 04.08.2015 21:38, Andrew Barnert wrote: >> I think anyone who finds the complexity of concurrent.futures too daunting to even attempt to learn it should not be working on any code that uses less explicit concurrency. > > I am sorry because I disagree here with you. > >> I have taught concurrent.futures to rank novices in a brief personal session or a single StackOverflow answer and they responded, "Wow, I didn't realize it could be this simple". > > Nobody says that concurrent.futures is not an vast improvement over previous approaches. But it is still not the end of the line of simplifications. > >> Someone who can't grasp it is almost certain to be someone who introduces races all over your code and can't even understand the problem, much less debug it. > > Nobody wants races, yet everybody still talks about them. Don't allow races in the first place and be done with it. What does that even mean? How would you not allow races? If you let people throw arbitrary tasks at a thread pool, with no restriction on mutable shared state, you've allowed races. >> Not true. The language clearly defines when each step happens. The a.__add__ method is called, then the result is assigned to a, then the statement finishes. (Then, in the next statement, nothing happens--except, because this is happening in the interactive interpreter, and it's an expression statement, after the statement finishes doing nothing, the value of the expression is assigned to _ and its repr is printed out.) 
> > Where can find this definition in the docs? > > To me, we are talking about class customization as described on reference/datamodel.html. Seems like an implementation detail, not a language detail. No, the data model is a feature of the language, not one specific implementation. The fact that you can define classes that work the same way as builtin types like int is a fundamental feature. It's something Guido and others worked very hard on making true back in Python 2.2-2.3. It's one of the things that makes Python or C++ more pleasant to use than Tcl or Java. Any implementation that didn't do the same would not be Python, and would not run a good deal of Python code. > I am not saying, CPython doesn't do it like that, but I saying the Python language could support lazy evaluation and not disagreeing with the docs. > >> This ordering relationship may be very important if the variable a is shared by multiple threads, especially if more than one thread may modify it, especially if you're using non-atomic operations like += (where another thread can read, use, and assign the variable between the __add__ call and the assignment). If a references a mutable object with an __iadd__ method, the variable doesn't even need to be shared, only the value, for this to matter. The only way to safely ignore these problems is to never share any variables or any mutable values between threads. > > Mutual variables are global variables. And these have gone out of style quite some time ago. No. Shared values include global variables, nonlocal variables used by two closures from the same scope, attributes of objects passed to both functions, members of collections passed to both functions, etc. The existence of all of these other things is why global variables are not necessary. They have many advantage over globals, allowing you to better control how state is shared, to share it reentrantly, to make it more explicit in the code, etc. 
But because they have all the same benefits, they also have the exact same race problem when used to share state between threads. > Btw. this is races again and I thought we agreed on not having them because nobody really can/wants to debug them. And how do you propose "not having them"? It's not impossible to write purely functional code that doesn't use any mutable state, in which case it doesn't matter whether your state is shared. But the fact that your example uses += proves that this isn't your intention. If you take the code from your example and run it in two threads simultaneously, you have a race. The fact that you didn't intend to create a race because you don't understand that doesn't mean the problem isn't there, it just means you have no idea you've just written buggy code and no idea how to test for it or debug it. And that's exactly the problem. What makes concurrent code with shared state hard, more than anything else, is people who don't realize what's hard about it and write code that seems to work but doesn't. Making it easier for such people to write broken code without even realizing they're doing so is not a good thing. From abarnert at yahoo.com Wed Aug 5 16:34:15 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 5 Aug 2015 07:34:15 -0700 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: <55C1EA46.8080907@mail.de> <55C1F4B9.6000209@mail.de> <55C21714.5020705@mail.de> Message-ID: <6C599652-7A86-401A-B573-0CAFC1B55E91@yahoo.com> On Aug 5, 2015, at 07:26, Paul Moore wrote: > >> On 5 August 2015 at 15:15, Ed Kellett wrote: >> That said, there are solutions, and as far as I can tell it's the only >> problem that's been raised in the thread. > > Are there solutions? No-one has come up with one, to my knowledge. > > You need to cover > > 1. 
All of the many internal iterator types in Python: > >>>> type(iter([])) > >>>> type({}.keys()) > >>>> type({}.values()) > >>>> type({}.items()) > These last three are not iterators, they're views. The fact that the OP and at least one person explaining the problem both seem to think otherwise implies that the problem is even bigger: we'd need to add the operator to not just all possible iterator types, but all possible iterable types. That's an even more insurmountable task--especially since many iterable types already have a perfectly good meaning for the + operator. From edk141 at gmail.com Wed Aug 5 16:38:49 2015 From: edk141 at gmail.com (Ed Kellett) Date: Wed, 05 Aug 2015 14:38:49 +0000 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: <55C1EA46.8080907@mail.de> <55C1F4B9.6000209@mail.de> <55C21714.5020705@mail.de> Message-ID: On Wed, 5 Aug 2015 at 15:26 Paul Moore wrote: > On 5 August 2015 at 15:15, Ed Kellett wrote: > > That said, there are solutions, and as far as I can tell it's the only > > problem that's been raised in the thread. > > Are there solutions? No-one has come up with one, to my knowledge. > > You need to cover > > 1. All of the many internal iterator types in Python: > > >>> type(iter([])) > > >>> type({}.keys()) > > >>> type({}.values()) > > >>> type({}.items()) > > >>> type(iter((1,2))) > > > Please don't dismiss this as "just a matter of coding". It's also > necessary to remember to do this for every new custom iterator type. > > 2. Generator functions and expressions: > > >>> def f(): > ... yield 1 > ... yield 2 > ... > >>> type(f()) > > >>> type((i for i in [1,2,3])) > > > "just another internal type", yeah I know... > > 3. User defined iterators: > > >>> class I: > ... def __next__(self): > ... return 1 > ... > >>> type(I()) > > > It is *not* acceptable to ask all users to retro-fit an "__add__ > method to all their custom iterator types. 
Even if it were, this would > fail for types which are their own iterator and have a custom __add__ > of their own. > > 4. Iterators defined in 3rd party modules, which is similar to > user-defined iterators, but with a much wider user base, and much more > stringent compatibility issues to consider. > > Hand-waving away the real issues here is not an option. > Yes, this is why I replied to say they were real issues. Well, one solution would be to have + special-case iterators, and try an iterator-add if there would be a TypeError otherwise. I think this is horrible; I'm just mentioning it for completeness. The other solution would be an Iterator ABC that you'd inherit as a mixin, and that recognizes as a subclass anything that implements the current iterator protocol + __add__. Python could do this for 1 and 2. 3 and 4 would be more difficult and ultimately require the author of the class to either inherit Iterator or write an __add__, but this might be eased a bit by having iter() wrap iterators that aren't instances of Iterator. > Even if it were, this would > fail for types which are their own iterator and have a custom __add__ > of their own. Yes. I don't think there's a solution to this part. As I said before, I think the root of the problem is Python's lack of user-defined operators. edk From ncoghlan at gmail.com Wed Aug 5 16:58:47 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Aug 2015 00:58:47 +1000 Subject: [Python-ideas] proposal: "python -m foo" should bind sys.modules['foo'] In-Reply-To: <20150805090154.GA81586@cskk.homeip.net> References: <20150805090154.GA81586@cskk.homeip.net> Message-ID: On 5 August 2015 at 19:01, Cameron Simpson wrote: > Ah, ok, many thanks.
I've now read this, particularly this section: > > http://legacy.python.org/dev/peps/pep-0395/#fixing-pickling-without-breaking-introspection > > I see that Guido has lent Nick the time machine, as that section outlines a > scheme almost word for word what I propose. Though not quite. > > I see that this was withdrawn, and after reading the whole PEP and the > withdrawal statement at the top I think there are two problems with the > PEP. One is that, as stated, several of these issues have since been > addressed elsewhere (though not my issue). The other is that it tried to > address a whole host of issues which are related more by sharing the import > system than necessarily being closely related of themselves, though clearly > there are several scenarios that need considering to ensure that one fix > doesn't make other things worse. Right, the withdrawal was of that *specific* PEP, since it hadn't aged well, and covered various things that could be tackled independently. > I still wish to put forth my proposal on its own, probably PEPed, for the > following reasons: > > (a) at present the multiple import via __main__/"python -m" is still not > fixed > > (b) that the fix here: > > http://legacy.python.org/dev/peps/pep-0395/#fixing-dual-imports-of-the-main-module > > seems more oriented around keeping sys.path sane than directly avoiding a > dual import > > (c) my suggestion both reuses __qualname__ proposal almost as PEP 395 > suggested > > (d) can't break anything because modules do not presently have a > __qualname__ From an *implementation* perspective, you'll want to look at Eric's own PEP 451: https://www.python.org/dev/peps/pep-0451/ While I mentioned it in relation to pickle compatibility in the PEP 395 withdrawal notice, it's also relevant to reducing the risk of falling into the double import trap. In particular, __spec__.name already holds the additional state we need to fix this behaviour (i.e.
the original module name), I just haven't found the opportunity to go back and update runpy to take advantage of PEP 451 to address this and other limitations. It would definitely be good to have a PEP addressing that. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From breamoreboy at yahoo.co.uk Wed Aug 5 17:04:55 2015 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Wed, 5 Aug 2015 16:04:55 +0100 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: <55C1EA46.8080907@mail.de> <55C1F4B9.6000209@mail.de> <55C21714.5020705@mail.de> Message-ID: On 05/08/2015 15:16, Paul Moore wrote: > On 5 August 2015 at 15:00, Sven R. Kunze wrote: >> 60% of the thread are about the ominous * operator on lists. << Seriously? >> The thread started with a simple and useful idea; not with some rarely used >> feature. >> 35% of the thread are about some internal implementation issues. <<< Maybe >> not easy but I am certain you will sort that out. >> 5% about its usefulness and simplicity. <<< I stick to that. > > There is no single iterator type that can have a "+" operator defined > on it. If you pick one or more such types, other (possibly user > defined) types will not work with "+" and this will confuse users. > > itertools.chain is available to do this in an unambiguous and > straightforward manner, its only downside (for some people) is that > it's slightly more verbose. > > Paul c = itertools.chain should be short enough for those people. I've discarded ch as it's too often used as a character, and cn as it's likely to get confused with some type of connector. Me, I'll stick with the readable version. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. 
Mark Lawrence From rosuav at gmail.com Wed Aug 5 17:07:39 2015 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 6 Aug 2015 01:07:39 +1000 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: References: <55C1EA46.8080907@mail.de> <55C1F4B9.6000209@mail.de> <55C21714.5020705@mail.de> Message-ID: On Thu, Aug 6, 2015 at 12:38 AM, Ed Kellett wrote: > Python could do this for 1 and 2. 3 and 4 would be more difficult and > ultimately require the author of the class to either inherit Iterator or > write an __add__, but this might be eased a bit by having iter() wrap > iterators that aren't instances of Iterator. > I offered a solution along these lines early in the thread - effectively, you replace the builtin iter() with a class which wraps the iterator in a way that allows addition as chaining. As far as I can see, there *is no solution* that will accept arbitrary iterables and meaningfully add them, so you're going to have to call iter() at some point anyway. Why not use that as the hook that adds your new method? Note that a production-ready version of 'class iter' would need a couple of additional features. One would be a "pass-through" mode - have __new__ recognize that it's being called on an instance of itself, and swiftly return it, thus making iter() idempotent. Another would be supporting the iter(callable, sentinel) form, which shouldn't be hard (just tweak __new__ and __init__ to allow two parameters). There may be other requirements too, but probably nothing insurmountable. 
ChrisA From pmiscml at gmail.com Wed Aug 5 17:23:20 2015 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Wed, 5 Aug 2015 18:23:20 +0300 Subject: [Python-ideas] proposal: "python -m foo" should bind sys.modules['foo'] In-Reply-To: References: <20150805054630.GA66989@cskk.homeip.net> Message-ID: <20150805182320.052ab9d7@x230> Hello, On Wed, 5 Aug 2015 02:14:55 -0600 Eric Snow wrote: > On Aug 4, 2015 11:46 PM, "Cameron Simpson" wrote: > > > > On 04Aug2015 23:01, Eric Snow wrote: > >> Be sure to read through PEP 495. > > Sorry, I meant 395. I'm sorry for possibly hijacking this thread, but it touches very much issue I had on my mind for a while: being able to run modules inside package as normal scripts. As this thread already has few people knowledgeable of peculiarities of package imports, perhaps they can suggest something. Scenario: There's a directory ("pkg" (representing Python namespace package)), and inside it, there's bar.py of not relevant content and foo.py with "from . import bar". What I'd like to do is (while inside "pkg" directory): python3 foo.py Current behavior: SystemError: Parent module '' not loaded, cannot perform relative import Expected behavior: "from . import bar" in foo.py is successful. 
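One workaround that is often suggested, which does not give the exact behaviour asked for above because it requires starting from the parent directory, is "python3 -m pkg.foo". The sketch below builds a throwaway copy of the described layout (the file contents are invented for the illustration) and runs it both ways:

```python
import pathlib
import subprocess
import sys
import tempfile


def run_both_ways():
    """Build pkg/foo.py and pkg/bar.py in a temp dir and run foo two ways."""
    with tempfile.TemporaryDirectory() as root:
        pkg = pathlib.Path(root, "pkg")
        pkg.mkdir()
        (pkg / "bar.py").write_text("VALUE = 42\n")
        (pkg / "foo.py").write_text("from . import bar\nprint(bar.VALUE)\n")

        # Inside pkg/: run the file directly as a script.
        direct = subprocess.run(
            [sys.executable, "foo.py"],
            cwd=str(pkg), capture_output=True, text=True,
        )
        # From the parent directory: run it as a module of the package.
        as_module = subprocess.run(
            [sys.executable, "-m", "pkg.foo"],
            cwd=root, capture_output=True, text=True,
        )
        return direct, as_module


direct, as_module = run_both_ways()
print("direct run exit code:", direct.returncode)
print("-m run output:", as_module.stdout.strip())
```

In the second form the interpreter knows that foo belongs to the package pkg, so the relative import can be resolved.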
> > -eric -- Best regards, Paul mailto:pmiscml at gmail.com From python-ideas at mgmiller.net Wed Aug 5 17:42:52 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 05 Aug 2015 08:42:52 -0700 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55AFE66E.9070007@trueblade.com> <20150723033112.GJ25179@ando.pearwood.info> <20150723142213.GK25179@ando.pearwood.info> <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <55C10546.8000304@trueblade.com> <55C11B0E.9010608@mgmiller.net> <55C11E7A.9030606@trueblade.com> <55C14552.9090807@mgmiller.net> <55C14AC5.3030001@mgmiller.net> Message-ID: <55C22EFC.5000600@mgmiller.net> Ok, thank you. -Mike On 08/05/2015 02:49 AM, Guido van Rossum wrote: > I can only promise that will be considered if you write up a full proposal and > specification, to compete with Eric's PEP. (I won't go as far as requiring you > to provide an implementation, like Eric.) From p.f.moore at gmail.com Wed Aug 5 18:43:07 2015 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 5 Aug 2015 17:43:07 +0100 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: <6C599652-7A86-401A-B573-0CAFC1B55E91@yahoo.com> References: <55C1EA46.8080907@mail.de> <55C1F4B9.6000209@mail.de> <55C21714.5020705@mail.de> <6C599652-7A86-401A-B573-0CAFC1B55E91@yahoo.com> Message-ID: On 5 August 2015 at 15:34, Andrew Barnert wrote: > On Aug 5, 2015, at 07:26, Paul Moore wrote: >> >>> On 5 August 2015 at 15:15, Ed Kellett wrote: >>> That said, there are solutions, and as far as I can tell it's the only >>> problem that's been raised in the thread. >> >> Are there solutions? No-one has come up with one, to my knowledge. >> >> You need to cover >> >> 1. 
All of the many internal iterator types in Python: >> >>>>> type(iter([])) >> >>>>> type({}.keys()) >> >>>>> type({}.values()) >> >>>>> type({}.items()) >> > > These last three are not iterators, they're views. The fact that the OP and at least one person explaining the problem both seem to think otherwise implies that the problem is even bigger: we'd need to add the operator to not just all possible iterator types, but all possible iterable types. That's an even more insurmountable task--especially since many iterable types already have a perfectly good meaning for the + operator. I understand that - that was really my point, that before anyone can claim that "there are solutions" they need to be sure they understand what the problem is - and part of that is getting people to be clear on what they mean when they say "iterators" - as you say, iterator vs iterable is a big problem here (and the OP specifically wanted this for views, *not* iterators...) Sorry for not being clear that I was explaining that other people weren't being clear :-) Paul From guido at python.org Wed Aug 5 18:48:11 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Aug 2015 18:48:11 +0200 Subject: [Python-ideas] Use the plus operator to concatenate iterators In-Reply-To: <55C21714.5020705@mail.de> References: <55C1EA46.8080907@mail.de> <55C1F4B9.6000209@mail.de> <55C21714.5020705@mail.de> Message-ID: On Wed, Aug 5, 2015 at 4:00 PM, Sven R. Kunze wrote: > 60% of the thread are about the ominous * operator on lists. << Seriously? > The thread started with a simple and useful idea; not with some rarely used > feature. > 35% of the thread are about some internal implementation issues. <<< Maybe > not easy but I am certain you will sort that out. > What seems an internal implementation issue to you is a philosophical issue to others. You dismiss it at your peril (i.e. a waste of your time).
Binary operators in Python are always implemented by letting one or the other argument provide the implementation, never by having the language pick an implementation (not even a default implementation). What you see in the C code that might seem to contradict this is either an optimization or an ancient artifact, not to be copied. > 5% about its usefulness and simplicity. <<< I stick to that. > Good luck. I'm checking out of this thread until a PEP is deemed ready for review. -- --Guido van Rossum (python.org/~guido) From eric at trueblade.com Wed Aug 5 20:56:52 2015 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 05 Aug 2015 14:56:52 -0400 Subject: [Python-ideas] String interpolation for all literal strings Message-ID: <55C25C74.50008@trueblade.com> In the "Briefer string format" thread, Guido suggested [1] in passing that it would have been nice if all literal strings had always supported string interpolation. I've come around to this idea as well, and I'm going to propose it for inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider either modifying it or creating a new (and very similar) PEP. The concept would be that all strings are scanned for \{ and } pairs. If any are found, then they'd be interpreted in the same way as in the other discussion on "f-strings". That is, the expression between the \{ and } would be extracted and searched for conversion characters and format specifiers. The expression would be evaluated, converted if needed, have its __format__ method called, and the resulting string inserted back in to the original string. Because strings containing \{ are currently valid, we'd have to introduce this feature with a __future__ import statement. How we transition to having this be the default interpretation of strings is up in the air.
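The scanning steps described above (extract the expression, evaluate it, apply a conversion if present, call __format__, substitute the result) can be approximated at run time. This is only a sketch: the interpolate() helper, its regular expression and the explicit namespace argument are assumptions made for illustration, whereas the proposal would do the work in the compiler, and this simplified pattern cannot handle expressions that themselves contain "!", ":" or braces:

```python
import re

# Matches \{expr}, \{expr!r} and \{expr:spec}. Deliberately simplistic.
_FIELD = re.compile(r"\\\{([^{}!:]+)(?:!([rsa]))?(?::([^{}]*))?\}")
_CONVERSIONS = {"r": repr, "s": str, "a": ascii}


def interpolate(template, namespace):
    """Hypothetical helper: expand \\{...} fields using names from namespace."""

    def replace(match):
        expr, conversion, spec = match.groups()
        value = eval(expr, {}, namespace)       # evaluate the expression
        if conversion:
            value = _CONVERSIONS[conversion](value)  # convert if requested
        return format(value, spec or "")        # calls value.__format__(spec)

    return _FIELD.sub(replace, template)


text = interpolate(
    r"Hello \{name!r}, total: \{price * qty:.2f}",
    {"name": "Ann", "price": 2.5, "qty": 3},
)
print(text)
```

A raw string is used for the template here only so that the backslash reaches the helper unchanged.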
Guido privately suggested that it might be nice to also support the 'f' modifier on strings, to give the identical behavior. This way, you could start using the feature without requiring the __future__ import. While I'm not crazy about having two ways to achieve the same thing, I do think it might be nice to support interpolated strings without requiring the __future__ import. Eric. [1] https://mail.python.org/pipermail/python-ideas/2015-August/034928.html From joejev at gmail.com Wed Aug 5 21:18:06 2015 From: joejev at gmail.com (Joseph Jevnik) Date: Wed, 5 Aug 2015 15:18:06 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C25C74.50008@trueblade.com> References: <55C25C74.50008@trueblade.com> Message-ID: raw-strings will not be scanned, correct? On Wed, Aug 5, 2015 at 2:56 PM, Eric V. Smith wrote: > In the "Briefer string format" thread, Guido suggested [1] in passing > that it would have been nice if all literal strings had always supported > string interpolation. > > I've come around to this idea as well, and I'm going to propose it for > inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider > either modifying it or creating a new (and very similar) PEP. > > The concept would be that all strings are scanned for \{ and } pairs. If > any are found, then they'd be interpreted in the same was as the other > discussion on "f-strings". That is, the expression between the \{ and } > would be extracted and searched for conversion characters and format > specifiers. The expression would be evaluated, converted if needed, have > its __format__ method called, and the resulting string inserted back in > to the original string. > > Because strings containing \{ are currently valid, we'd have to > introduce this feature with a __future__ import statement. How we > transition to having this be the default interpretation of strings is up > in the air. 
> > Guido privately suggested that it might be nice to also support the 'f' > modifier on strings, to give the identical behavior. This way, you could > start using the feature without requiring the __future__ import. While > I'm not crazy about having two ways to achieve the same thing, I do > think it might be nice to support interpolated strings without requiring > the __future__ import. > > Eric. > > > [1] https://mail.python.org/pipermail/python-ideas/2015-August/034928.html > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Wed Aug 5 21:28:40 2015 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 5 Aug 2015 15:28:40 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: <6E80A8E4-B0DF-46DE-8D65-E634B0FFCEB2@trueblade.com> > On Aug 5, 2015, at 3:18 PM, Joseph Jevnik wrote: > > raw-strings will not be scanned, correct? Good question. I would expect them to be scanned. Eric. > >> On Wed, Aug 5, 2015 at 2:56 PM, Eric V. Smith wrote: >> In the "Briefer string format" thread, Guido suggested [1] in passing >> that it would have been nice if all literal strings had always supported >> string interpolation. >> >> I've come around to this idea as well, and I'm going to propose it for >> inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider >> either modifying it or creating a new (and very similar) PEP. >> >> The concept would be that all strings are scanned for \{ and } pairs. If >> any are found, then they'd be interpreted in the same was as the other >> discussion on "f-strings". 
That is, the expression between the \{ and } >> would be extracted and searched for conversion characters and format >> specifiers. The expression would be evaluated, converted if needed, have >> its __format__ method called, and the resulting string inserted back in >> to the original string. >> >> Because strings containing \{ are currently valid, we'd have to >> introduce this feature with a __future__ import statement. How we >> transition to having this be the default interpretation of strings is up >> in the air. >> >> Guido privately suggested that it might be nice to also support the 'f' >> modifier on strings, to give the identical behavior. This way, you could >> start using the feature without requiring the __future__ import. While >> I'm not crazy about having two ways to achieve the same thing, I do >> think it might be nice to support interpolated strings without requiring >> the __future__ import. >> >> Eric. >> >> >> [1] https://mail.python.org/pipermail/python-ideas/2015-August/034928.html >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ From joejev at gmail.com Wed Aug 5 21:29:39 2015 From: joejev at gmail.com (Joseph Jevnik) Date: Wed, 5 Aug 2015 15:29:39 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <6E80A8E4-B0DF-46DE-8D65-E634B0FFCEB2@trueblade.com> References: <55C25C74.50008@trueblade.com> <6E80A8E4-B0DF-46DE-8D65-E634B0FFCEB2@trueblade.com> Message-ID: I would very much expect them not to be scanned. This would make working with things that actually have braces really annoying. On Wed, Aug 5, 2015 at 3:28 PM, Eric V. Smith wrote: > > On Aug 5, 2015, at 3:18 PM, Joseph Jevnik wrote: > > raw-strings will not be scanned, correct?
> > > Good question. I would expect them to be scanned. > > Eric. > > > On Wed, Aug 5, 2015 at 2:56 PM, Eric V. Smith wrote: > >> In the "Briefer string format" thread, Guido suggested [1] in passing >> that it would have been nice if all literal strings had always supported >> string interpolation. >> >> I've come around to this idea as well, and I'm going to propose it for >> inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider >> either modifying it or creating a new (and very similar) PEP. >> >> The concept would be that all strings are scanned for \{ and } pairs. If >> any are found, then they'd be interpreted in the same was as the other >> discussion on "f-strings". That is, the expression between the \{ and } >> would be extracted and searched for conversion characters and format >> specifiers. The expression would be evaluated, converted if needed, have >> its __format__ method called, and the resulting string inserted back in >> to the original string. >> >> Because strings containing \{ are currently valid, we'd have to >> introduce this feature with a __future__ import statement. How we >> transition to having this be the default interpretation of strings is up >> in the air. >> >> Guido privately suggested that it might be nice to also support the 'f' >> modifier on strings, to give the identical behavior. This way, you could >> start using the feature without requiring the __future__ import. While >> I'm not crazy about having two ways to achieve the same thing, I do >> think it might be nice to support interpolated strings without requiring >> the __future__ import. >> >> Eric. 
>>
>>
>> [1]
>> https://mail.python.org/pipermail/python-ideas/2015-August/034928.html

From pmiscml at gmail.com Wed Aug 5 21:33:05 2015
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Wed, 5 Aug 2015 22:33:05 +0300
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C25C74.50008@trueblade.com>
References: <55C25C74.50008@trueblade.com>
Message-ID: <20150805223305.381bf23b@x230>

Hello,

On Wed, 05 Aug 2015 14:56:52 -0400
"Eric V. Smith" wrote:

> In the "Briefer string format" thread, Guido suggested [1] in passing
> that it would have been nice if all literal strings had always
> supported string interpolation.

Cute! I wonder how many more years we have to wait till Guido says that
it would have been nice to support stream syntax and braces for compound
statements right from the beginning. Just imagine the full power of
lambdas in our hands! Arghhh! With all the unbelievable goodness being
pushed into the language nowadays, someone should really start pushing
in the direction of supporting alternative syntaxes.

--
Best regards,
Paul mailto:pmiscml at gmail.com

From yselivanov.ml at gmail.com Wed Aug 5 21:34:54 2015
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 5 Aug 2015 15:34:54 -0400
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C25C74.50008@trueblade.com>
References: <55C25C74.50008@trueblade.com>
Message-ID: <55C2655E.8040907@gmail.com>

On 2015-08-05 2:56 PM, Eric V. Smith wrote:
> In the "Briefer string format" thread, Guido suggested [1] in passing
> that it would have been nice if all literal strings had always supported
> string interpolation.
> > I've come around to this idea as well, and I'm going to propose it for > inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider > either modifying it or creating a new (and very similar) PEP. > > The concept would be that all strings are scanned for \{ and } pairs. If > any are found, then they'd be interpreted in the same was as the other > discussion on "f-strings". That is, the expression between the \{ and } > would be extracted and searched for conversion characters and format > specifiers. The expression would be evaluated, converted if needed, have > its __format__ method called, and the resulting string inserted back in > to the original string. > > Because strings containing \{ are currently valid, we'd have to > introduce this feature with a __future__ import statement. How we > transition to having this be the default interpretation of strings is up > in the air. Have you considered using '#{..}' syntax (used by Ruby and CoffeeScript)? '\{..}' feels unbalanced and weird. Yury From yselivanov.ml at gmail.com Wed Aug 5 21:36:29 2015 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Wed, 5 Aug 2015 15:36:29 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <6E80A8E4-B0DF-46DE-8D65-E634B0FFCEB2@trueblade.com> References: <55C25C74.50008@trueblade.com> <6E80A8E4-B0DF-46DE-8D65-E634B0FFCEB2@trueblade.com> Message-ID: <55C265BD.2060902@gmail.com> On 2015-08-05 3:28 PM, Eric V. Smith wrote: >> >On Aug 5, 2015, at 3:18 PM, Joseph Jevnik wrote: >> > >> >raw-strings will not be scanned, correct? > Good question. I would expect them to be scanned. I think by definition raw strings should stay untouched. 
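One reason the raw-string question matters: regular expressions, which are usually written as raw strings, already give meaning to both {..} and \{. A small illustration using the standard re module; the two patterns are made up for the example and do not come from the thread:

```python
import re

count_pattern = r"a{2}"     # braces as a repetition count: matches "aa"
literal_pattern = r"a\{2}"  # escaped brace: matches the literal text "a{2}"

matches_count = re.fullmatch(count_pattern, "aa") is not None
matches_literal = re.fullmatch(literal_pattern, "a{2}") is not None
print(matches_count, matches_literal)
```

If raw strings were scanned for \{ and } pairs, the second pattern would stop meaning what it means today, which is the practical argument for leaving them untouched.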
Yury From joejev at gmail.com Wed Aug 5 21:36:21 2015 From: joejev at gmail.com (Joseph Jevnik) Date: Wed, 5 Aug 2015 15:36:21 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150805223305.381bf23b@x230> References: <55C25C74.50008@trueblade.com> <20150805223305.381bf23b@x230> Message-ID: Paul: There are projects out there to support alternative syntax. Look at macropy or quasiquotes On Wed, Aug 5, 2015 at 3:33 PM, Paul Sokolovsky wrote: > Hello, > > On Wed, 05 Aug 2015 14:56:52 -0400 > "Eric V. Smith" wrote: > > > In the "Briefer string format" thread, Guido suggested [1] in passing > > that it would have been nice if all literal strings had always > > supported string interpolation. > > Cute! Wonder, how many more years we have to wait till Guido says that > it would have been nice to support stream syntax and braces compound > statements right from the beginning. Just imagine full power of lambdas > in our hands! Arghhh! With all unbelievable goodness being pushed into > the language nowadays, someone should really start pushing into > direction of supporting alternative syntaxes. > > > -- > Best regards, > Paul mailto:pmiscml at gmail.com > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Wed Aug 5 21:53:16 2015 From: barry at python.org (Barry Warsaw) Date: Wed, 5 Aug 2015 15:53:16 -0400 Subject: [Python-ideas] String interpolation for all literal strings References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> Message-ID: <20150805155316.567e5d16@anarchist.wooz.org> On Aug 05, 2015, at 03:34 PM, Yury Selivanov wrote: >On 2015-08-05 2:56 PM, Eric V. Smith wrote: >> The concept would be that all strings are scanned for \{ and } pairs. 
I think it's a very interesting idea too, although the devil is in the details. Since this will be operating on string literals, they'd be scanned at compile time right? Agreed that raw strings probably shouldn't be scanned. Since it may happen that some surprising behavior occurs (long after it's past __future__), there should be some way to prevent scanning. To me that either means r'' strings don't get scanned or f'' is required. I'm still unclear on what the difference would be between f'' strings and these currently mythical scanned-strings are, but I'll wait for the PEP. >Have you considered using '#{..}' syntax (used by Ruby and >CoffeeScript)? > >'\{..}' feels unbalanced and weird. As it does for me. Let's see what particular color Eric reaches for. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From robertc at robertcollins.net Wed Aug 5 21:59:40 2015 From: robertc at robertcollins.net (Robert Collins) Date: Thu, 6 Aug 2015 07:59:40 +1200 Subject: [Python-ideas] Reprs of classes and functions In-Reply-To: References: Message-ID: I plan on committing http://bugs.python.org/issue13224 tomorrowish (the patch is a little stale and I want to poke around some to be sure its ok). If anyone objects please follow up in the issue. -Rob On 26 February 2015 at 16:28, Guido van Rossum wrote: > I didn't read all that, but my use case is that for typing.py (PEP 484) I > want "nice" reprs for higher-order classes, and if str(int) returned just > 'int' it would be much simpler to e.g. make the str() or repr() of > Optional[int] return "Optional[int]". I've sort of solved this already in > https://github.com/ambv/typehinting/blob/master/prototyping/typing.py, but > it would still be nice if I didn't have to work so hard at it. 
> > On Wed, Feb 25, 2015 at 5:41 PM, Andrew Barnert > wrote: >> >> On Feb 25, 2015, at 8:21, Guido van Rossum wrote: >> >> A variant I've often wanted is to make str() return the "clean" type only >> (e.g. 'int') while leaving repr() alone -- this would address most problems >> Steven brought up. >> >> >> On further thought, I'm not sure backward compatibility can be dismissed. >> Surely code shouldn't care about the str of a callable, right? But yeah, >> sometimes it might care, if it's, say, a reflective framework for bridging >> or debugging or similar. >> >> I once wrote a library that exposed Python objects over AppleEvents. I >> needed to know all the bound methods of the object. For pure-Python objects, >> this is easy, but for builtin/extension objects, the bound methods >> constructed from both slot-wrapper and method-descriptor are of type >> method-wrapper. (The bug turned up when an obscure corner of our code that >> tried lxml, cElementTree, then ElementTree was run on a 2.x system without >> lxml installed, something we'd left out of the test matrix for that module.) >> >> When you print out the method, it's obvious what type it is. But how do >> you deal with it programmatically? Despite what the docs say, >> method-descriptor does not return true for either isbuiltin or isroutine. In >> fact, nothing in inspect is much help. And the type isn't in types. Printing >> out the method, it's dead obvious what it is. >> >> As it turned out, callable(x) and (hasattr(x, 'im_self') or hasattr(x, >> '__self__')) was good enough for my particular use case, at least in all >> Python interpreters/versions we supported (it even works in PyPy with CPyExt >> extension types and an early version of NumPyPy). But what would a good >> general solution be? 
>> >> I think dynamically constructing the type from int().__add__ and then >> verifying that all extension types in all interpreters and versions I care >> about pass isinstance(that) unless they look like pure-Python bound methods >> might be the best way. And even if I did resort to using the string >> representation, I'd probably use the repr rather than the str, and I'd >> probably look at type(x) rather than x itself. But still, I can imagine >> someone deciding "whatever I do is going to be hacky, because there's no >> non-hacky way to get this information, so I might as well go with the >> simplest hack that's worked on every CPython and PyPy version from 2.2 to >> today" and use "... or str(x).startswith('> >> >> On Wed, Feb 25, 2015 at 12:12 AM, Serhiy Storchaka >> wrote: >>> >>> This idea is already casually mentioned, but sank deep into the threads >>> of the discussion. Raise it up. >>> >>> Currently reprs of classes and functions look as: >>> >>> >>> int >>> >>> >>> int.from_bytes >>> >>> >>> open >>> >>> >>> import collections >>> >>> collections.Counter >>> >>> >>> collections.Counter.fromkeys >>> > >>> >>> collections.namedtuple >>> >>> >>> What if change default reprs of classes and functions to just full >>> qualified name __module__ + '.' + __qualname__ (or just __qualname__ if >>> __module__ is builtins)? This will look more neatly. And such reprs are >>> evaluable. 
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>
> --
> --Guido van Rossum (python.org/~guido)

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

From guido at python.org Wed Aug 5 22:03:49 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Aug 2015 22:03:49 +0200
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C2655E.8040907@gmail.com>
References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com>
Message-ID:

On Wed, Aug 5, 2015 at 9:34 PM, Yury Selivanov wrote:

> On 2015-08-05 2:56 PM, Eric V. Smith wrote:
>
>> In the "Briefer string format" thread, Guido suggested [1] in passing
>> that it would have been nice if all literal strings had always supported
>> string interpolation.
>>
>> I've come around to this idea as well, and I'm going to propose it for
>> inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider
>> either modifying it or creating a new (and very similar) PEP.
>>
>> The concept would be that all strings are scanned for \{ and } pairs. If
>> any are found, then they'd be interpreted in the same way as the other
>> discussion on "f-strings". That is, the expression between the \{ and }
>> would be extracted and searched for conversion characters and format
>> specifiers. The expression would be evaluated, converted if needed, have
>> its __format__ method called, and the resulting string inserted back in
>> to the original string.
>>
>> Because strings containing \{ are currently valid, we'd have to
>> introduce this feature with a __future__ import statement. How we
>> transition to having this be the default interpretation of strings is up
>> in the air.
>>
>
> Have you considered using '#{..}' syntax (used by Ruby and
> CoffeeScript)?
>

Well, I feel bound by *some* backward compatibility... Python string
literals don't treat anything as special except \ followed by certain
characters. It feels better to add to the set of "certain characters"
(which we've done before) than to add a completely new escape sequence.

> '\{..}' feels unbalanced and weird.
>

Not more or less than '#{..}'. I looked through
https://en.wikipedia.org/wiki/String_interpolation for what other
languages do, and it reminded me that Swift uses '\(..)' -- that would
also be a possibility, but '\{..}' feels closer to the existing PEP 3101
'{..}'.format(..) syntax.

And I did indeed mean for r-strings not to be interpolated (since they
are exempt from \ interpretation).

We should look a bit more into how this proposal interacts with regular
expressions (where \{ can be used to avoid the special meaning of {..}).
I think \(..) would be more cumbersome than \{..}, since () is more
common in regular expressions than {}.

BTW an idea on the transition: with a __future__ import \{..} is enabled
in all non-raw strings, without a __future__ import you can still use
\{..} inside f-literals. (Because having to add a __future__ import
interrupts one's train of thought.)

--
--Guido van Rossum (python.org/~guido)

From python-ideas at mgmiller.net Wed Aug 5 22:46:05 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Wed, 05 Aug 2015 13:46:05 -0700
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To:
References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com>
Message-ID: <55C2760D.2010507@mgmiller.net>

Sounds awesome. Making it default could be a killer feature for Python 4.0 ;)

-Mike

On 08/05/2015 01:03 PM, Guido van Rossum wrote:
> BTW an idea on the transition: with a __future__ import \{..} is enabled in all
> non-raw strings, without a __future__ import you can still use \{..} inside
> f-literals. (Because having to add a __future__ import interrupts one's train of
> thought.)

From eric at trueblade.com Wed Aug 5 22:50:35 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 05 Aug 2015 16:50:35 -0400
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C265BD.2060902@gmail.com>
References: <55C25C74.50008@trueblade.com> <6E80A8E4-B0DF-46DE-8D65-E634B0FFCEB2@trueblade.com> <55C265BD.2060902@gmail.com>
Message-ID: <55C2771B.9060302@trueblade.com>

On 08/05/2015 03:36 PM, Yury Selivanov wrote:
> On 2015-08-05 3:28 PM, Eric V. Smith wrote:
>>> >On Aug 5, 2015, at 3:18 PM, Joseph Jevnik wrote:
>>> >
>>> >raw-strings will not be scanned, correct?
>> Good question. I would expect them to be scanned.
>
> I think by definition raw strings should stay untouched.

The sub-thread about regular expressions has me pretty much convinced
that I agree.

Eric.

From oscar.j.benjamin at gmail.com Wed Aug 5 23:24:12 2015
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Wed, 5 Aug 2015 22:24:12 +0100
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C25C74.50008@trueblade.com>
References: <55C25C74.50008@trueblade.com>
Message-ID:

On 5 August 2015 at 19:56, Eric V.
Smith wrote: > > In the "Briefer string format" thread, Guido suggested [1] in passing > that it would have been nice if all literal strings had always supported > string interpolation. > > I've come around to this idea as well, and I'm going to propose it for > inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider > either modifying it or creating a new (and very similar) PEP. > > The concept would be that all strings are scanned for \{ and } pairs. If > any are found, then they'd be interpreted in the same was as the other > discussion on "f-strings". That is, the expression between the \{ and } > would be extracted and searched for conversion characters and format > specifiers. The expression would be evaluated, converted if needed, have > its __format__ method called, and the resulting string inserted back in > to the original string. I strongly dislike this idea. One of the things I like about Python is the fact that a string literal is just a string literal. I don't want to have to scan through a large string and try to work out if it really is just a literal or a dynamic context-dependent expression. I would hold this objection if the proposal was a limited form of variable interpolation (akin to .format) but if any string literal can embed arbitrary expressions than I *really* don't like that idea. It would be better if strings that have this magic behaviour are at least explicitly marked. The already proposed f-prefix requires a single character to prefix the string but that single character would communicate quite a lot when looking at unfamiliar code. It's already necessary to check for prefixes at the beginning of a string literal but it's not necessary to read the whole (potentially large) thing in order to understand how it interacts with the surrounding code. I don't want to have to teach my students about this when explaining how strings work in Python. 
I was already thinking that I would just leave f-strings out of my introductory programming course because they're redundant and so jarring against the way that Python code normally looks (this kind of thing is not helpful to people who are just learning about statements, expressions, scope, execution etc). I also don't want to have to read/debug code that is embedded in string literals: message = '''\ ... x = \{__import__('sys').exit()} ... ''' > Because strings containing \{ are currently valid, we'd have to > introduce this feature with a __future__ import statement. How we > transition to having this be the default interpretation of strings is up > in the air. This is a significant compatibility break. I don't see any benefit to justify it. Why is print('x = \{x}, y = \{y}') better than print(f'x = {x}, y = {y}') and even if you do prefer the former is it really worth breaking existing code? > Guido privately suggested that it might be nice to also support the 'f' > modifier on strings, to give the identical behavior. This way, you could > start using the feature without requiring the __future__ import. While > I'm not crazy about having two ways to achieve the same thing, I do > think it might be nice to support interpolated strings without requiring > the __future__ import. What would be the point? If both are available then I would just always use the f-string since I prefer local explicitness over the global effect of a __future__ import. Or is there a plan to introduce the f-prefix and then deprecate it in the even more distant future when all strings behave that way? 
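For comparison, the two explicit spellings can be put side by side. This is only a sketch and assumes the f-prefix form as it later shipped in Python 3.6 (PEP 498); the \{..} spelling discussed above is not valid interpolation syntax in any released Python, and the variable names below are illustrative:

```python
x, y = 3, 4

with_format = "x = {}, y = {}".format(x, y)  # plain literal plus an explicit call
with_prefix = f"x = {x}, y = {y}"            # literal marked by its prefix

# Any expression is allowed inside the braces, not just a bare name,
# which is the behaviour the objection above is concerned with.
computed = f"hypotenuse = {(x * x + y * y) ** 0.5}"

print(with_format)
print(with_prefix)
print(computed)
```

The prefix keeps the difference visible at the start of the literal, so a reader does not have to scan the whole string to know whether it can execute code.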
-- Oscar From gmludo at gmail.com Thu Aug 6 01:16:03 2015 From: gmludo at gmail.com (Ludovic Gasc) Date: Thu, 6 Aug 2015 01:16:03 +0200 Subject: [Python-ideas] Concurrency Modules In-Reply-To: References: <559EFB73.5050606@mail.de> <9c139305-f583-46c1-b819-6a98dbd04acc@googlegroups.com> <55B2B0FB.1060409@mail.de> <55B3C93D.9090601@mail.de> <55B5508E.1000201@mail.de> <55B872BB.5080603@mail.de> <87egjlr51j.fsf@gmail.com> Message-ID: +14 Thank you Andrew for your answer. @Akira: Measure, profile, and benchmark your projects: learning curve is more complex, however, at the end you'll can filter easier the ideas from the community on your projects. A lot of "good" practices are counter-efficient like micro-services: if you push micro-services pattern to the extreme, you'll add latency because you'll generate more internal traffic for one HTTP request. It doesn't mean that you must have a monolithic daemon, only to slice pragmatically your services. I've a concrete example of an open source product that abuses this pattern and where I've measured concrete efficiency impacts before and after microservices introduction. I can't cite his name because we use that on production, I want to keep a good relationship with them. -- Ludovic Gasc (GMLudo) http://www.gmludo.eu/ 2015-08-03 3:08 GMT+02:00 Andrew Barnert via Python-ideas < python-ideas at python.org>: > On Aug 2, 2015, at 10:09, Akira Li <4kir4.1i at gmail.com> wrote: > > > > Ludovic Gasc writes: > > > >> 2015-07-29 8:29 GMT+02:00 Sven R. Kunze : > >> > >>> Thanks Ludovic. > >>> > >>> On 28.07.2015 22:15, Ludovic Gasc wrote: > >>> > >>> Hello, > >>> > >>> This discussion is pretty interesting to try to list when each > >>> architecture is the most efficient, based on the need. > >>> > >>> However, just a small precision: multiprocess/multiworker isn't > antinomic > >>> with AsyncIO: You can have an event loop in each process to try to > combine > >>> the "best" of two "worlds". 
> >>> As usual in IT, it isn't a silver bullet that will care the cancer, > >>> however, at least to my understanding, it should be useful for some > >>> business needs like server daemons. > >>> > >>> > >>> I think that should be clear for everybody using any of these modules. > But > >>> you are right to point it out explicitly. > >> > >> Based on my discussions at EuroPython and PyCON-US, it's certainly clear > >> for the middle-class management of Python community, however, not really > >> from the typical Python end-dev: Several persons tried to troll me that > >> multiprocessing is more efficient than AsyncIO. > >> > >> To me, it was a opportunity to transform the negative troll attempt to a > >> positive exchange about efficiency and understand before to troll ;-) > >> More seriously, I've the feeling that it isn't very clear for everybody, > >> especially for the new comers. > > > > Do you mean those trolls that measure first then make > > conclusions ;) > > > > Could you provide an evidence-based description of the issue such as > > http://www.mailinator.com/tymaPaulMultithreaded.pdf > > but for Python? > > The whole point of that post, and of the older von Behrens paper is > references, is that a threading-like API can be built that uses explicit > cooperative threading and dynamic stacks, and that avoids all of the > problems with threads while retaining almost all of the advantages. > > That sounds great. Which is probably why it's exactly what Python asyncio > does. Just like von Behrens's thread package, it uses an event loop around > poll (or something better) to drive a scheduler for coroutines. The only > difference is that Python has coroutines natively, unlike Java or C, and > with a nice API, so there's no reason not to hide that API. (But if you > really want to, you can just use gevent without its monkeypatching library, > and then you've got an almost exact equivalent.) 
> > In other words, in the terms used by mailinator, asyncio is exactly the > thread package they suggest using instead of an event package. Their > evidence that something like asyncio can be built for Java, and we don't > need evidence that something like asyncio could be built for Python because > Guido already built it. You could compare asyncio with the coroutine API to > asyncio with the lower-level callback API (or Twisted with inline callbacks > to Twisted with coroutines, etc.), but what would be the point? > > Of course multiprocessing vs. asyncio is a completely different question. > Now that we have reasonably similar, well-polished APIs for both, people > can start running comparisons. But it's pretty easy to predict what they'll > find: for some applications, multiprocessing is better; for others, asyncio > is better; for others, a simple combination of the two easily beats either > alone; for others, it really doesn't make much difference because > concurrency isn't even remotely the key issue. The only thing that really > matters to anyone is which is better for _their_ application, and that's > something you can't extrapolate from a completely different test any better > than you can guess it. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pmiscml at gmail.com Thu Aug 6 01:32:10 2015 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Thu, 6 Aug 2015 02:32:10 +0300 Subject: [Python-ideas] Running scripts with relative imports directly, was: Re: proposal: "python -m foo" In-Reply-To: <20150805182320.052ab9d7@x230> References: <20150805054630.GA66989@cskk.homeip.net> <20150805182320.052ab9d7@x230> Message-ID: <20150806023210.18b5387d@x230> Hello, On Wed, 5 Aug 2015 18:23:20 +0300 Paul Sokolovsky wrote: > I'm sorry for possibly hijacking this thread, but it touches very much > issue I had on my mind for a while: being able to run modules inside > package as normal scripts. As this thread already has few people > knowledgeable of peculiarities of package imports, perhaps they can > suggest something. > > Scenario: > > There's a directory ("pkg" (representing Python namespace package)), > and inside it, there's bar.py of not relevant content and foo.py with > "from . import bar". What I'd like to do is (while inside "pkg" > directory): > > python3 foo.py > > Current behavior: > > SystemError: Parent module '' not loaded, cannot perform relative > import > > Expected behavior: > > "from . import bar" in foo.py is successful. Perhaps I was asking something dumb, or everyone just takes for granted that nowadays Python code can't be developed comfortably without IDE, or 2+ console windows open, or something. But I find it quite a sign of problem, because if one accepts that one can't just run a script right away, but needs to do extra steps, then well, that enterprisey niche is pretty crowded and there're more choices how to make it more complicated. So, I did my homework (beyond just googling, which unfortunately didn't turn up much), and being able to do it with a simple "loader" and "single command line switch" like: python3 -mruninpkg script.py arg1 arg2 arg3 restored my piece of mind. The script is here: https://github.com/pfalcon/py-runinpkg . 
Hope someone else will find its existence insightful, or maybe someone
will suggest how to make it better (I see the bottleneck in that it's
not possible to make mod.__name__ an empty string, and that's what would
be needed here to avoid the double-import problem).

I actually did another googling session, and there's definitely a niche
for such a solution, as this 10-year-old article shows:
http://code.activestate.com/recipes/307772-executing-modules-inside-packages-with-m/

If only there were a good, widely known inventory of them for different
use cases...

--
Best regards,
Paul mailto:pmiscml at gmail.com

From ben+python at benfinney.id.au Thu Aug 6 01:47:05 2015
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 06 Aug 2015 09:47:05 +1000
Subject: [Python-ideas] Running scripts with relative imports directly, was: Re: proposal: "python -m foo"
References: <20150805054630.GA66989@cskk.homeip.net> <20150805182320.052ab9d7@x230> <20150806023210.18b5387d@x230>
Message-ID: <85twsdjo1y.fsf@benfinney.id.au>

Paul Sokolovsky writes:

> I find it quite a sign of problem, because if one accepts that one
> can't just run a script right away, but needs to do extra steps, then
> well, that enterprisey niche is pretty crowded and there're more
> choices how to make it more complicated.

Python's BDFL has spoken of running modules with relative import as
top-level scripts:

    I'm -1 on this and on any other proposed twiddlings of the __main__
    machinery. The only use case seems to be running scripts that happen
    to be living inside a module's directory, which I've always seen as
    an antipattern. To make me change my mind you'd have to convince me
    that it isn't.

He doesn't describe (that I can find) what makes him think it's an
antipattern, so I'm not clear on how he expects to be convinced it's a
valid pattern.

Nonetheless, that appears to be the hurdle you'd need to confront.

--
 \     “Skepticism is the highest duty and blind faith the one |
  `\    unpardonable sin.” - Thomas Henry Huxley, _Essays on |
_o__)   Controversial Questions_, 1889 |
Ben Finney

From rosuav at gmail.com Thu Aug 6 01:59:57 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 6 Aug 2015 09:59:57 +1000
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To:
References: <55C25C74.50008@trueblade.com>
Message-ID:

On Thu, Aug 6, 2015 at 7:24 AM, Oscar Benjamin wrote:
> What would be the point? If both are available then I would just
> always use the f-string since I prefer local explicitness over the
> global effect of a __future__ import. Or is there a plan to introduce
> the f-prefix and then deprecate it in the even more distant future
> when all strings behave that way?

If it's done that way, the f-prefix will be like the u-prefix in
Python 3.3+, where it's permitted for compatibility with older
versions, but unnecessary. Future directives are the same - you can
legally put "from __future__ import nested_scopes" into Python 3.6 and
not get an error, even though it's now pure noise. I don't have a
problem with that.

Whether or not it's good for string literals to support interpolation,
though, I'm not sure about. The idea that stuff should get
interpolated into strings fits a shell scripting language perfectly,
but I'm not fully convinced it's a good thing for an applications
language. How shelly is Python? Or, what other non-shell languages
have this kind of feature? PHP does (which is hardly an
advertisement!); I can't think of any others off hand, any pointers?

Side point: My preferred bike shed color is \{...}, despite its
similarity to \N{...}; Unicode entity escapes aren't common, and most
of the names have spaces in them anyway, so there's unlikely to be
real confusion. (You might have a module constant
INFINITY=float("inf"), and then \N{INFINITY} will differ from
\{INFINITY}. That's the most likely confusion I can think of.) But
that's insignificant. All spellings will come out fairly similar in
practice.
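The INFINITY case can be made concrete. The sketch below assumes the f-prefix spelling that Python 3.6 eventually adopted (PEP 498), because \{INFINITY} itself never became interpolation syntax; the constant name comes from the example above, the other names are illustrative:

```python
INFINITY = float("inf")  # the module constant from the example above

named_escape = "\N{INFINITY}"  # Unicode name lookup, resolved by the compiler
interpolated = f"{INFINITY}"   # expression evaluated when the line runs

print(named_escape)  # a single character, U+221E
print(interpolated)  # the text "inf"
```

The two spellings differ by one letter in the source but produce unrelated results: one is a character looked up by its Unicode name, the other is the formatted value of a variable.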
ChrisA

From cs at zip.com.au Thu Aug 6 02:07:53 2015
From: cs at zip.com.au (Cameron Simpson)
Date: Thu, 6 Aug 2015 10:07:53 +1000
Subject: [Python-ideas] proposal: "python -m foo" should bind sys.modules['foo']
In-Reply-To:
References:
Message-ID: <20150806000753.GA32566@cskk.homeip.net>

On 06Aug2015 00:58, Nick Coghlan wrote:
>On 5 August 2015 at 19:01, Cameron Simpson wrote:
>> Ah, ok, many thanks. I've now read this, particularly this section:
>> http://legacy.python.org/dev/peps/pep-0395/#fixing-pickling-without-breaking-introspection
[...]
>From an *implementation* perspective, you'll want to look at Eric's
>own PEP 451: https://www.python.org/dev/peps/pep-0451/

Ah. Yes. Thanks.

On that basis I withdraw my .__qualname__ suggestion because there exists
module.__spec__.name.

So it now reduces the proposal to making the -m option do this:

    % python -m module.name ...

runs (framed loosely like
https://www.python.org/dev/peps/pep-0451/#how-loading-will-work):

    # pseudocode, with values hardwired for clarity
    import sys
    from types import ModuleType
    module = ModuleType('module.name')
    module.__name__ = '__main__'
    sys.modules['__main__'] = module
    sys.modules['module.name'] = module
    ... load module code ...

I suspect "How Reloading Will Work" would need to track both module.__name__
and module.__spec__.name to reattach the module to both entries in
sys.modules.

>In particular, __spec__.name already holds the additional state we
>need to fix this behaviour (i.e. the original module name), I just
>haven't found the opportunity to go back and update runpy to take
>advantage of PEP 451 to address this and other limitations.
>It would definitely be good to have a PEP addressing that.

I'd like to have a go at addressing just the change I outline above, in the
interests of just getting it done. Is that too narrow a change or PEP topic?

Are there specific other things I should be considering/addressing that might
be affected by my suggestion?

Also, where do I find the source for runpy to peruse?
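The effect of the double binding can be tried out in isolation. What follows is a minimal sketch of the idea only, not how runpy behaves today: the helper names and the module name "demo_mod" are hypothetical, no module code is actually loaded, and the real __main__ is saved and restored so the experiment leaves the interpreter as it found it.

```python
import importlib
import sys
from types import ModuleType


def run_as_main(real_name):
    """Bind one fresh module object under '__main__' and under real_name."""
    module = ModuleType(real_name)
    module.__name__ = "__main__"     # what the running script would see
    sys.modules["__main__"] = module
    sys.modules[real_name] = module  # the extra binding being proposed
    return module


def demo(real_name="demo_mod"):  # "demo_mod" is a made-up name
    saved_main = sys.modules["__main__"]  # keep the real __main__ alive
    try:
        module = run_as_main(real_name)
        # A later import of the real name is answered from sys.modules,
        # so it yields the same object instead of a second copy.
        imported = importlib.import_module(real_name)
        return imported is module and imported.__name__ == "__main__"
    finally:
        sys.modules["__main__"] = saved_main
        sys.modules.pop(real_name, None)


print(demo())
```

With only the '__main__' binding, the import inside demo() would go looking for a file and, for a real module, execute its code a second time under a different name; that is the double import the proposal is meant to remove.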
Cheers, Cameron Simpson A program in conformance will not tend to stay in conformance, because even if it doesn't change, the standard will. - Norman Diamond From abarnert at yahoo.com Thu Aug 6 02:13:41 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 5 Aug 2015 17:13:41 -0700 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On Aug 5, 2015, at 16:59, Chris Angelico wrote: > > On Thu, Aug 6, 2015 at 7:24 AM, Oscar Benjamin > wrote: >> What would be the point? If both are available then I would just >> always use the f-string since I prefer local explicitness over the >> global effect of a __future__ import. Or is there a plan to introduce >> the f-prefix and then deprecate it in the even more distant future >> when all strings behave that way? > > If it's done that way, the f-prefix will be like the u-prefix in > Python 3.3+, where it's permitted for compatibility with older > versions, but unnecessary. Future directives are the same - you can > legally put "from __future__ import nested_scopes" into Python 3.6 and > not get an error, even though it's now pure noise. I don't have a > problem with that. > > Whether or not it's good for string literals to support interpolation, > though, I'm not sure about. The idea that stuff should get > interpolated into strings fits a shell scripting language perfectly, > but I'm not fully convinced it's a good thing for an applications > language. How shelly is Python? Or, what other non-shell languages > have this kind of feature? PHP does (which is hardly an > advertisement!); I can't think of any others off hand, any pointers? Guido's specific inspiration was Swift, which is about as "applicationy" a language as you can get. 
From rosuav at gmail.com Thu Aug 6 02:26:48 2015 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 6 Aug 2015 10:26:48 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On Thu, Aug 6, 2015 at 10:13 AM, Andrew Barnert wrote: >> Whether or not it's good for string literals to support interpolation, >> though, I'm not sure about. The idea that stuff should get >> interpolated into strings fits a shell scripting language perfectly, >> but I'm not fully convinced it's a good thing for an applications >> language. How shelly is Python? Or, what other non-shell languages >> have this kind of feature? PHP does (which is hardly an >> advertisement!); I can't think of any others off hand, any pointers? > > Guido's specific inspiration was Swift, which is about as "applicationy" a language as you can get. Thanks. If anyone else wants to read up on that: https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/StringsAndCharacters.html I poked around with a few Swift style guides, and they seem to assume that interpolation is a good and expected thing, which is promising. No proof, of course, but the converse would have been strong evidence. Count me as +0.5 on this. ChrisA From python at mrabarnett.plus.com Thu Aug 6 02:28:16 2015 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 6 Aug 2015 01:28:16 +0100 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> Message-ID: <55C2AA20.1010104@mrabarnett.plus.com> On 2015-08-05 21:03, Guido van Rossum wrote: > On Wed, Aug 5, 2015 at 9:34 PM, Yury Selivanov > wrote: > > On 2015-08-05 2:56 PM, Eric V. Smith wrote: > > In the "Briefer string format" thread, Guido suggested [1] in > passing > that it would have been nice if all literal strings had always > supported > string interpolation. 
> > I've come around to this idea as well, and I'm going to propose > it for > inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider > either modifying it or creating a new (and very similar) PEP. > > The concept would be that all strings are scanned for \{ and } > pairs. If > any are found, then they'd be interpreted in the same was as the > other > discussion on "f-strings". That is, the expression between the > \{ and } > would be extracted and searched for conversion characters and format > specifiers. The expression would be evaluated, converted if > needed, have > its __format__ method called, and the resulting string inserted > back in > to the original string. > > Because strings containing \{ are currently valid, we'd have to > introduce this feature with a __future__ import statement. How we > transition to having this be the default interpretation of > strings is up > in the air. > > > Have you considered using '#{..}' syntax (used by Ruby and > CoffeeScript)? > > > Well, I feel bound by *some* backward compatibility... Python string > literals don't treat anything special except \ followed by certain > characters. It feels better to add to the set of "certain characters" > (which we've done before) than to add a completely new escape sequence. > > '\{..}' feels unbalanced and weird. > > > Not more or less than '#{..}'. I looked through > https://en.wikipedia.org/wiki/String_interpolation for what other > languages do, and it reminded me that Swift uses '\(..)' -- that would > also be a possibility, but '\{..}' feels closer to the existing PEP 3101 > '{..}.format(..) syntax. > What that page shows me is how common it is to use $ for interpolation; it's even used in Python's own string.Template! > And I did indeed mean for r-strings not to be interpolated (since they > are exempt from \ interpretation). 
> > We should look a bit more into how this proposal interacts with regular > expressions (where \{ can be used to avoid the special meaning of {..}). > I think \(..) would be more cumbersome than \{..}, since () is more > common in regular expressions than {}. > > BTW an idea on the transition: with a __future__ import \{..} is enabled > in all non-raw strings, without a __future__ import you can still use > \{..} inside f-literals. (Because having to add a __future__ import > interrupts one's train of thought.) > I'd prefer interpolated string literals to be marked, leaving unmarked literals as they are (except for rejecting unknown escapes!). From abarnert at yahoo.com Thu Aug 6 02:45:59 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 5 Aug 2015 17:45:59 -0700 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On Aug 5, 2015, at 17:26, Chris Angelico wrote: > > On Thu, Aug 6, 2015 at 10:13 AM, Andrew Barnert wrote: >>> Whether or not it's good for string literals to support interpolation, >>> though, I'm not sure about. The idea that stuff should get >>> interpolated into strings fits a shell scripting language perfectly, >>> but I'm not fully convinced it's a good thing for an applications >>> language. How shelly is Python? Or, what other non-shell languages >>> have this kind of feature? PHP does (which is hardly an >>> advertisement!); I can't think of any others off hand, any pointers? >> >> Guido's specific inspiration was Swift, which is about as "applicationy" a language as you can get. > > Thanks. If anyone else wants to read up on that: > > https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/StringsAndCharacters.html > > I poked around with a few Swift style guides, and they seem to assume > that interpolation is a good and expected thing, which is promising. 
> No proof, of course, but the converse would have been strong evidence. I personally love the feature in Swift, and I've worked with other people who even considered it one of the main reasons to switch from ObjC, and haven't heard anyone who actually used it complain about it. And there are blog posts by iOS app developers that seem to agree. Of course that's hardly a scientific survey. Especially since ObjC kind of sucks for string formatting (it's basically C90 printf with more verbose syntax). I have seen plenty of people complain about other things about Swift's strings (strings of Unicode graphemes clusters aren't randomly accessible, and the fact that regexes and some other string-related features work in terms of UTF-16 code units makes it even worse), but not about the interpolation. From tjreedy at udel.edu Thu Aug 6 03:38:44 2015 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 5 Aug 2015 21:38:44 -0400 Subject: [Python-ideas] fork In-Reply-To: <55C1289F.10109@mail.de> References: <20150801173628.29BAB873FE@smtp04.mail.de> <55BFA0B3.1010702@mail.de> <6A8EA952-ED98-4C26-9A40-54BE54367849@yahoo.com> <55C0FFDD.5020002@mail.de> <55C1289F.10109@mail.de> Message-ID: On 8/4/2015 5:03 PM, Sven R. Kunze wrote: >> Not true. The language clearly defines when each step happens. The >> a.__add__ method is called, a.__iadd__, if it exists. https://docs.python.org/3/reference/datamodel.html#emulating-numeric-types >> then the result is assigned to a, then the >> statement finishes. (Then, in the next statement, nothing >> happens--except, because this is happening in the interactive >> interpreter, and it's an expression statement, after the statement >> finishes doing nothing, the value of the expression is assigned to _ >> and its repr is printed out.) > > Where can find this definition in the docs? 
https://docs.python.org/3/reference/simple_stmts.html#augmented-assignment-statements

--
Terry Jan Reedy

From tjreedy at udel.edu Thu Aug 6 03:58:07 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 5 Aug 2015 21:58:07 -0400
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C2655E.8040907@gmail.com>
References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com>
Message-ID:

On 8/5/2015 3:34 PM, Yury Selivanov wrote:

> '\{..}' feels unbalanced and weird.

Escape both. The closing } is also treated specially, and not inserted
into the string. The compiler scans linearly from left to right, but
human eyes are not so constrained.

s = "abc\{kjljid some long expression jk78738}def"

versus

s = "abc\{kjljid some long expression jk78738\}def"

and how about

s = "abc\{kjljid some {long} expression jk78738\}def"

--
Terry Jan Reedy

From dan at tombstonezero.net Thu Aug 6 04:20:29 2015
From: dan at tombstonezero.net (Dan Sommers)
Date: Thu, 6 Aug 2015 02:20:29 +0000 (UTC)
Subject: [Python-ideas] String interpolation for all literal strings
References: <55C25C74.50008@trueblade.com>
Message-ID:

On Thu, 06 Aug 2015 09:59:57 +1000, Chris Angelico wrote:

> Whether or not it's good for string literals to support interpolation,
> though, I'm not sure about. The idea that stuff should get
> interpolated into strings fits a shell scripting language perfectly,
> but I'm not fully convinced it's a good thing for an applications
> language ...

I had that same reaction: string interpolation is a shell-scripty
thing. That said, my shell has printf as a built in function, and my OS
comes with /usr/bin/printf whether I want it or not.

> ... How shelly is Python? Or, what other non-shell languages have this
> kind of feature? PHP does (which is hardly an advertisement!); I can't
> think of any others off hand, any pointers?

Ruby has this kind of feature.
Common Lisp's format string is an entire DSL, but that DSL is like printf in that the string describes the formatting and the remaining arguments to the format function provide the data, rather than the string naming local variables or containing expressions to be evaluated. From rosuav at gmail.com Thu Aug 6 04:32:12 2015 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 6 Aug 2015 12:32:12 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On Thu, Aug 6, 2015 at 12:20 PM, Dan Sommers wrote: > Common Lisp's format string is an entire DSL, but that DSL is like > printf in that the string describes the formatting and the remaining > arguments to the format function provide the data, rather than the > string naming local variables or containing expressions to be evaluated. Lots of languages have some sort of printf-like function (Python has %-formatting and .format() both), where the actual content comes from additional arguments. It's the magic of having the string *itself* stipulate where to grab stuff from that's under discussion here. ChrisA From steve at pearwood.info Thu Aug 6 04:48:10 2015 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 6 Aug 2015 12:48:10 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: <20150806024809.GS3737@ando.pearwood.info> On Wed, Aug 05, 2015 at 05:13:41PM -0700, Andrew Barnert via Python-ideas wrote: > Guido's specific inspiration was Swift, which is about as > "applicationy" a language as you can get. Swift is also barely more than a year old. While it's a very exciting looking language, it's not one which has a proven long-term record. I know that everything coming from Apple is cool, but other languages have had automatic variable interpolation for a long time, e.g. PHP and Ruby, and Python has resisted joining them. 
While it's good to reconsider design decisions, I wonder, what has changed? -- Steve From dan at tombstonezero.net Thu Aug 6 05:02:04 2015 From: dan at tombstonezero.net (Dan Sommers) Date: Thu, 6 Aug 2015 03:02:04 +0000 (UTC) Subject: [Python-ideas] String interpolation for all literal strings References: <55C25C74.50008@trueblade.com> Message-ID: On Thu, 06 Aug 2015 12:32:12 +1000, Chris Angelico wrote: > On Thu, Aug 6, 2015 at 12:20 PM, Dan Sommers wrote: >> Common Lisp's format string is an entire DSL, but that DSL is like >> printf in that the string describes the formatting and the remaining >> arguments to the format function provide the data, rather than the >> string naming local variables or containing expressions to be >> evaluated. > Lots of languages have some sort of printf-like function (Python has > %-formatting and .format() both), where the actual content comes from > additional arguments. It's the magic of having the string *itself* > stipulate where to grab stuff from that's under discussion here. Yes, I agree. :-) Perhaps I should have said something like, "...that DSL *remains* like printf...." I tried to make the argument that non-shelly languages should stay away from that magic, but it apparently didn't come out the way I wanted it to. From steve at pearwood.info Thu Aug 6 05:18:46 2015 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 6 Aug 2015 13:18:46 +1000 Subject: [Python-ideas] Briefer string format In-Reply-To: <55BED537.8020000@trueblade.com> References: <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> Message-ID: <20150806031843.GT3737@ando.pearwood.info> On Sun, Aug 02, 2015 at 10:43:03PM -0400, Eric V. 
Smith wrote:

> As I pointed out earlier, it's not exactly str(eval(s)). Also, what's
> your concern with the suggested approach? There are no security concerns
> as there would be with eval-ing arbitrary strings.

This comment has been sitting at the back of my mind for days, and I
suddenly realised why. That's not correct, there are security concerns.
They're not entirely new concerns, but the new syntax makes it easier to
fall into the security hole.

Here's an example of shell injection in PHP:

    <?php
    $file=$_GET['filename'];
    system("rm $file");
    ?>

https://www.owasp.org/index.php/Command_Injection

With the new syntax, Python's example will be:

    os.system(f"rm {file}")

or even

    os.system("rm \{file}")

if Eric's second proposal goes ahead. Similarly for SQL injection and
other command injection attacks.

It is true that the same issues can occur today, for example:

    os.system("rm %s" % file)

but it's easier to see the possibility of an injection with an explicit
interpolation operator than the proposed implicit one.

We can teach people to avoid the risk of command injection attacks by
avoiding interpolation, but the proposed syntax makes it easier to use
interpolation without noticing. Especially with the proposed \{} syntax,
any string literal could do runtime interpolation, and the only way to
know whether it does or not is to inspect the entire string carefully.
Passing a literal is no longer safe, as string literals will no longer
just be literals, they will be runtime expressions.

Bottom line: the new syntax will make it easier for command injection to
remain unnoticed. Convenience cuts both ways. Making the use of string
interpolation easier also makes the *misuse* of string interpolation
easier.
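
To make the risk concrete, here is a minimal sketch that is not part of the original message. It builds the command text without executing anything. The function names are hypothetical, and shlex.quote or an argument list are shown only as commonly used mitigations, not as something proposed in this thread.

```python
import shlex

def unsafe_command(filename):
    # Same shape as os.system("rm %s" % file): the value is pasted into
    # the shell command text unchanged.
    return "rm %s" % filename

def quoted_command(filename):
    # shlex.quote wraps the value so a POSIX shell sees one argument.
    return "rm -- " + shlex.quote(filename)

def argv_command(filename):
    # An argument list (as accepted by subprocess.run) never passes
    # through a shell, so there is nothing to inject into.
    return ["rm", "--", filename]

hostile = "notes.txt; echo pwned"

unsafe_tokens = shlex.split(unsafe_command(hostile))
quoted_tokens = shlex.split(quoted_command(hostile))
argv_tokens = argv_command(hostile)
```

In the unsafe form the hostile value spreads over several words, and a real shell would additionally treat the ";" as a command separator. In the other two forms it stays a single file name argument, whichever interpolation syntax produced the string.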
--
Steve

From ncoghlan at gmail.com Thu Aug 6 05:26:34 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 6 Aug 2015 13:26:34 +1000
Subject: [Python-ideas] proposal: "python -m foo" should bind sys.modules['foo']
In-Reply-To: <20150806000753.GA32566@cskk.homeip.net>
References: <20150806000753.GA32566@cskk.homeip.net>
Message-ID:

On 6 August 2015 at 10:07, Cameron Simpson wrote:
> I suspect "How Reloading Will Work" would need to track both module.__name__
> and module.__spec__.name to reattach the module to both entries in
> sys.modules.

Conveniently, the fact that reloading rewrites the global namespace of
the existing module, rather than creating the new module, means that
the dual references won't create any new problems relating to multiple
references - we already hit those issues due to the fact that modules
refer directly to each other from their module namespaces.

> I'd like to have a go at addressing just the change I outline above, in the
> interests of just getting it done. Is that too narrow a change or PEP topic?

PEPs can be used for quite small things if we want to check for edge
cases, and the interaction between __main__ and the rest of the import
system is a truly fine source of those :)

> Are there specific other things I should be considering/addressing that
> might be affected by my suggestion?

Using __spec__.name for pickling: http://bugs.python.org/issue19702

Proposed runpy refactoring to reduce the special casing for __main__:
http://bugs.python.org/issue19982

> Also, where do I find the source for runpy to peruse?

It's a standard library module:
https://hg.python.org/cpython/file/default/Lib/runpy.py

"_run_module_as_main" is the module level function that powers the
"-m" switch.
Actually *implementing* this change should be as simple as changing the line:

    main_globals = sys.modules["__main__"].__dict__

to instead be:

    main_module = sys.modules["__main__"]
    sys.modules[mod_spec.name] = main_module
    main_globals = main_module.__dict__

The PEP is mainly useful to more widely *advertise* the semantic
change, since having the module start being accessible under both
names has the potential to cause problems.

In particular, I'll upgrade the pickle issue to something that *needs*
to be addressed before this change can be made, as there will be
programs that are working today because they'll be dual importing the
main module, and then pickling objects from the properly imported one,
which then unpickle correctly in other processes (even if __main__ is
different). Preventing the dual import without also fixing the pickle
compatibility issue when pickling __main__ objects would thus have the
potential to break currently working code.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Thu Aug 6 05:57:19 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 6 Aug 2015 13:57:19 +1000
Subject: [Python-ideas] Running scripts with relative imports directly, was: Re: proposal: "python -m foo"
In-Reply-To: <85twsdjo1y.fsf@benfinney.id.au>
References: <20150805054630.GA66989@cskk.homeip.net>
 <20150805182320.052ab9d7@x230> <20150806023210.18b5387d@x230>
 <85twsdjo1y.fsf@benfinney.id.au>
Message-ID:

On 6 August 2015 at 09:47, Ben Finney wrote:
> Paul Sokolovsky
> writes:
>
>> I find it quite a sign of problem, because if one accepts that one
>> can't just run a script right away, but needs to do extra steps, then
>> well, that enterprisey niche is pretty crowded and there're more
>> choices how to make it more complicated.
>
> Python's BDFL has spoken of running modules with relative import as
> top-level scripts:
>
> I'm -1 on this and on any other proposed twiddlings of the __main__
> machinery.
The only use case seems to be running scripts that happen > to be living inside a module's directory, which I've always seen as > an antipattern. To make me change my mind you'd have to convince me > that it isn't. > > > > He doesn't describe (that I can find) what makes him think it's an > antipattern, so I'm not clear on how he expects to be convinced it's a > valid pattern. It's an anti-pattern because doing it fundamentally confuses the import system's internal state: https://www.python.org/dev/peps/pep-0395/#why-are-my-imports-broken Relative imports from the main module just happen to be a situation where the failure is an obvious one rather than subtle state corruption. > Nonetheless, that appears to be the hurdle you'd need to > confront. This came up more recently during the PEP 420 discussions, when the requirement to include __init__.py to explicitly mark package directories was eliminated. This means there's no longer any way for the interpreter to reliably infer from the filesystem layout precisely where in the module hierarchy you intended a module to live. See https://www.python.org/dev/peps/pep-0420/#discussion for references. However, one of the subproposals from PEP 395 still offers a potential fix: https://www.python.org/dev/peps/pep-0395/#id24 That proposes to allow explicit relative imports at the command line, such that Paul's example could be correctly invoked as: python3 -m ..pkg.foo It would also be possible to provide a syntactic shorthand for submodules of top level packages: python3 -m .foo The key here is that the interpreter is being explicitly told that the current directory is inside a package, as well as how far down in the package hierarchy it lives, and can adjust the way it sets sys.path[0] accordingly before proceeding on to import "pkg.foo" as __main__. That should be a relatively uncomplicated addition to runpy._run_module_as_main that could be rolled into Cameron's PEP plans. 
Steps required:

* count leading dots in the supplied mod_name
* remove "leading_dots-1" trailing directory names from sys.path[0]
* strip the leading dots from mod_name before continuing with the rest
  of the function
* in the special case of only 1 leading dot, remove the final
  directory segment from sys.path[0] and prepend it to mod_name with a
  dot separator

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Thu Aug 6 06:18:58 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 6 Aug 2015 14:18:58 +1000
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To:
References: <55C25C74.50008@trueblade.com>
Message-ID:

On 6 August 2015 at 07:24, Oscar Benjamin wrote:
> On 5 August 2015 at 19:56, Eric V. Smith wrote:
>>
>> In the "Briefer string format" thread, Guido suggested [1] in passing
>> that it would have been nice if all literal strings had always supported
>> string interpolation.
>>
>> I've come around to this idea as well, and I'm going to propose it for
>> inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider
>> either modifying it or creating a new (and very similar) PEP.
>>
>> The concept would be that all strings are scanned for \{ and } pairs. If
>> any are found, then they'd be interpreted in the same way as the other
>> discussion on "f-strings". That is, the expression between the \{ and }
>> would be extracted and searched for conversion characters and format
>> specifiers. The expression would be evaluated, converted if needed, have
>> its __format__ method called, and the resulting string inserted back in
>> to the original string.
>
> I strongly dislike this idea. One of the things I like about Python is
> the fact that a string literal is just a string literal. I don't want
> to have to scan through a large string and try to work out if it
> really is just a literal or a dynamic context-dependent expression. I
> would hold this objection if the proposal was a limited form of
> variable interpolation (akin to .format) but if any string literal can
> embed arbitrary expressions then I *really* don't like that idea.

I'm in this camp as well. We already suffer from the problem that,
unlike tuples, numbers and strings, lists, dictionary and set
"literals" are actually formally displays that provide a shorthand for
runtime procedural code, rather than literals that can potentially be
fully resolved at compile time.

This means there are *fundamentally* different limitations on what we
can do with them. In particular, we can take literals, constant fold
them, do various other kinds of things with them, because we *know*
they're not dependent on runtime state - we know everything we need to
know about them at compile time.

This is an absolute of Python: string literals are constants, not
arbitrary code execution constructs. Our own peephole optimizer
assumes this, AST manipulation code assumes this, people reading code
assume this, people teaching Python assume this.

I already somewhat dislike the idea of having a "string display" be
introduced by something as subtle as a prefix character, but so long
as it gets its own AST node independent of the existing "I'm a
constant" string node, I can live with it. There's at least a marker
right up front to say to readers "unlike other strings, this one may
depend on runtime state". If the prefix was an exclamation mark to
further distinguish it from the alphabetical prefix characters, I'd be
even happier :)

Dropping the requirement for the prefix *loses* expressiveness from
the language, because runtime dependent strings would no longer be
clearly distinguished from the genuine literals. Having at least f"I
may be runtime dependent!" as an indicator, and preferably !"I may be
runtime dependent!"
instead, permits a clean simple syntax for explicit interpolation, and
dropping the prefix saves only one character at writing time, while
making every single string literal potentially runtime dependent at
reading time.

Editors and IDEs can also be updated far more easily, since existing
strings can continue to be marked up as is, while prefixed strings
can potentially be highlighted differently to indicate that they may
contain arbitrary code (and should also be scanned for name references
and type compatibility with string interpolation).

Regards,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Thu Aug 6 06:35:03 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 6 Aug 2015 14:35:03 +1000
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To:
References: <55C25C74.50008@trueblade.com>
Message-ID:

On 6 August 2015 at 14:18, Nick Coghlan wrote:
> I'm in this camp as well. We already suffer from the problem that,
> unlike tuples, numbers and strings, lists, dictionary and set
> "literals" are actually formally displays that provide a shorthand for
> runtime procedural code, rather than literals that can potentially be
> fully resolved at compile time.

Sorry, I had tuples in the wrong category there - they're their own
unique snowflake, with a literal for the empty tuple, and an n-ary
operator for larger tuples.

The types with actual syntactic literals are strings, bytes, integers,
floats and complex numbers (with an implied zero real component):
https://docs.python.org/3/reference/lexical_analysis.html#literals

The types with procedural displays are lists, sets, and dictionaries:
https://docs.python.org/3/reference/expressions.html#displays-for-lists-sets-and-dictionaries

One of the key things I'll personally be looking for with Eric's PEP
are the proposed changes to the lexical analysis and expressions
section of the language reference.
With either f-strings or bang-strings (my suggested alternate colour for the bikeshed, which is exactly the same as f-strings, but would use "!" as the prefix instead of "f" to more clearly emphasise the distinction from the subtle effects of "u", "b" and "r"), those changes will be relatively straightforward - it will go in as a new kind of expression. If the proposal is to allow arbitrary code execution inside *any* string, then everything except raw strings will need to be moved out of the literals section and into the expressions section. That's a *lot* of churn in the language definition just to save typing one prefix character to explicitly request string interpolation. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Aug 6 07:23:23 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Aug 2015 15:23:23 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> Message-ID: On 6 August 2015 at 06:03, Guido van Rossum wrote: > We should look a bit more into how this proposal interacts with regular > expressions (where \{ can be used to avoid the special meaning of {..}). I > think \(..) would be more cumbersome than \{..}, since () is more common in > regular expressions than {}. Pondering the fact that "\N{GREEK LETTER ALPHA}", "{ref}".format_map(data), f"\{ref}" and string.Template("${ref}") all overload on "{}" as their parenthetical pair gave me an idea. Since we're effectively defining a "string display" (which will hopefully remain clearly independent of normal string literals), what if we were to bake internationalisation and localisation directly into this PEP, such that, by default, these new strings would be flagged for translation, and translations could change the *order* in which subexpressions were displayed, but not the *content* of those subexpressions? 
If we went down that path, then string.Template would provide the most
appropriate inspiration for the spelling, with "$" as the escape
character rather than "\". For regular expressions, the only
compatibility issue would be needing to double up on "$$" when
matching against the end of the input data.

Using "!" rather than "f" as the prefix, we could take advantage of
the existing (and currently redundant in Python 3) "u" prefix to mean
"untranslated":

    !"Message: $msg" <-- translated and interpolated text string (user messages)
    !u"Message: $msg" <-- untranslated and interpolated text string (debugging, logging)
    !b"Message: $msg" <-- untranslated and binary interpolated byte sequence
    !r"Message: $msg" <-- disables "\" escapes, but not "$" escapes

The format strings after the ":" for the !b"${data:0.2f}" case would
be defined in terms of bytes.__mod__ rather than str.format.

The reason I really like this idea is that combining automatic
interpolation with translation will help encourage folks to write
Python programs that are translatable by default, and perhaps have to
go back in later and mark some strings as untranslated, rather than
the status quo, where a lot of programs tend to be written on the
assumption they'll never be translated, so making them translatable
requires a significant investment of time to go through and build the
message catalog before translation can even begin.

Reviewing PEP 292, which introduced string.Template, further led me
to take a 15 year trip in the time machine to Ka-Ping Yee's original
PEP 215: https://www.python.org/dev/peps/pep-0215/

That has a couple of nice refinements over the subsequent simpler PEP
292 interpolation syntax, in that it allows "$obj.attr.attr",
"$data[key]" and "$f(arg)" without requiring curly braces.

Regards,
Nick.
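
For reference, a short sketch of how the existing string.Template (PEP 292) already behaves, not part of the original message. The message texts and the two "catalogue" entries are made up for illustration. The proposed "!" prefix does not exist, so only today's library call is shown.

```python
from string import Template

# "$name" and "${name}" are placeholders; "$$" is a literal dollar sign.
priced = Template("Message: $msg (cost: $$${amount})")
rendered = priced.substitute(msg="hello", amount=5)

# Two hypothetical catalogue entries for the same message: a translation
# may reorder the placeholders, but the substituted values are the same.
original = Template("$count files in $folder")
reordered = Template("In $folder: $count files")
values = {"count": 3, "folder": "/tmp"}
first = original.substitute(values)
second = reordered.substitute(values)

# PEP 292 placeholders stop at the identifier, so ".name" stays literal
# text here; PEP 215 would have read "$user.name" as attribute access.
attribute_like = Template("$user.name").substitute(user="guido")
```

The reordering case is the property the translation idea relies on: the template text can change per language while the set of substituted values stays fixed.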
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From python-ideas at mgmiller.net Thu Aug 6 07:28:35 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 05 Aug 2015 22:28:35 -0700 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: <55C2F083.5030503@mgmiller.net> Oscar and Nick bring up some important points. Still, I don't think it will be as dangerous in the long run as it might appear ahead of time. I say that *if* (and it's an important if), we can find a way to limit the syntax to the .format mini-language and not go the full monty, as a few of us worry. Also, remember the list of languages on wikipedia that have string interpolation? People have made this trade-off many times and appear happy with the feature, especially in dynamic languages. I remember a PyCon keynote a few years back. Guido said (paraphrasing...) "from a birds-eye view, perl, python, and ruby are all the same language. In the important parts anyway." Like the other two, Python is also used for shell-scripting tasks, and unfortunately, it's the only one of those without direct string interpolation, which has probably hindered its uptake in that area. It'd be useful everywhere though. So, let's not make perfect the enemy of pretty-damn awesome. I've been waiting for this feature for 15 years, from back around the turn of the century *cough*, when I traded in perl for python. ;) -Mike On 08/05/2015 09:18 PM, Nick Coghlan wrote: > On 6 August 2015 at 07:24, Oscar Benjamin wrote: >> On 5 August 2015 at 19:56, Eric V. Smith wrote: >>> >>> In the "Briefer string format" thread, Guido suggested [1] in passing >>> that it would have been nice if all literal strings had always supported >>> string interpolation. 
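One way to picture what "limit the syntax to the .format mini-language" would mean in practice is to look at what str.format accepts today. The sketch below is only an illustration using the existing method, not the proposed syntax; the names point and sizes are invented for the example.

```python
from types import SimpleNamespace

point = SimpleNamespace(x=3, y=4)
sizes = {"small": 1, "large": 10}

# Attribute access, indexing, conversions (!r) and format specs (:>3)
# are all part of the existing mini-language.
ok = "{p.x:>3} {s[large]!r}".format(p=point, s=sizes)

# Calls and operators are not: the whole text "len(s)" is treated as a
# keyword name, so the lookup fails rather than calling len().
try:
    "{len(s)}".format(s=sizes)
    rejected = False
except KeyError:
    rejected = True

print(repr(ok), rejected)
```

Under that restriction, anything beyond attribute and item lookup would have to be computed in ordinary code before being interpolated.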
From g.brandl at gmx.net Thu Aug 6 07:29:45 2015 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 6 Aug 2015 07:29:45 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: <20150806031843.GT3737@ando.pearwood.info> References: <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <20150806031843.GT3737@ando.pearwood.info> Message-ID: On 08/06/2015 05:18 AM, Steven D'Aprano wrote: > With the new syntax, Python's example will be: > > os.system(f"rm {file}") > > or even > > os.system("rm \{file}") > > if Eric's second proposal goes ahead. Similarly for SQL injection and > other command injection attacks. > > It is true that the same issues can occur today, for example: > > os.system("rm %s" % file) > > but it's easier to see the possibility of an injection with an explicit > interpolation operator than the proposed implicit one. Is it? Why? To me, the problem of injection is completely orthogonal to how exactly the string interpolation is performed. Also, there's nothing "implicit" about the new syntax. It does not magically interpolate where it feels like, or coerce objects to strings. It interpolates wherever you - explicitly - put the new syntax. cheers, Georg From njs at pobox.com Thu Aug 6 08:05:18 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 5 Aug 2015 23:05:18 -0700 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On Wed, Aug 5, 2015 at 9:35 PM, Nick Coghlan wrote: > my suggested alternate colour > for the bikeshed, which is exactly the same as f-strings, but would > use "!" 
as the prefix instead of "f" to more clearly emphasise the > distinction from the subtle effects of "u", "b" and "r" Well, this is very half-baked, perhaps quarter-baked or less, but throwing it out there... it's occurred to me that possibly the most plausible "sweet spot" for those who want macros in python (for the actually practically useful cases, like PonyORM [0], numexpr [1], dplyr-like syntax for pandas [2], ...) would be to steal a page from Rust [3] and define a new call syntax '!(...)'. It'd be exactly like regular function call syntax, except that: foo!(bar + 1, baz, quux=1) doesn't evaluate the arguments, it just passes their AST to foo, i.e. the above is sugar for something like foo(Call(args=[BinOp(Name("bar"), op=Add(), Num(1)), Name("baz")], keywords=[keyword(arg="quux", value=Num(1))])) So this way you get a nice syntactic marker at macro call sites. Obviously there are further extensions you could ring on this -- maybe you want to get fancy and use a different protocol for this like __macrocall__ instead of __call__ to reduce the chance of weird errors when accidentally leaving out the !, or define @!foo as providing a macro-decorator that gets the ast of the decorated object, etc. -- but that's the basic idea. I'm by no means prepared to mount a full defense / work out details / write a PEP of this idea this week, but since IMO ! really is the only obvious character to use for this, and now we seem to be talking about other uses for the ! character, I wanted to get it on the radar... Hey, maybe $ would make an even better string-interpolation sigil anyway?
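To get a rough feel for what a hypothetical foo!(bar + 1, baz, quux=1) would hand to foo, the existing ast module can parse the equivalent ordinary call. This is only a sketch: the "!(...)" call syntax does not exist, and the exact node classes depend on the Python version (the check on the keyword value below assumes ast.Constant, i.e. Python 3.8 or later, where the message above writes Num).

```python
import ast

# Parse the ordinary call. A macro call site would skip evaluation and pass
# something like this Call node to `foo` (hypothetical behaviour).
call = ast.parse("foo(bar + 1, baz, quux=1)", mode="eval").body

first_arg = call.args[0]    # BinOp node for `bar + 1`
second_arg = call.args[1]   # Name node for `baz`
keyword = call.keywords[0]  # keyword node for `quux=1`

print(ast.dump(call))
```

Nothing here is evaluated: bar and baz do not need to exist, which is the property a macro-style call would rely on.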
-n [0] http://ponyorm.com/ -- 'mydatabase.select!(o for o in Order if o.price < 100)' [1] https://github.com/pydata/numexpr -- 'eval_quickly!(sin(a) ** 2 / 2)', currently you have to put your code into strings and pass that [2] https://cran.r-project.org/web/packages/dplyr/vignettes/introduction.html mytable.filter!(height / weight > 1 and value > 100) -> mytable.select_rows((mytable.columns["height"] / mytable.columns["weight"] > 1) & (mytable.columns["value"] > 100), except with more opportunities for optimization (The reason dplyr can get away with the examples you see in that link is that R is weird and passes all function call arguments as lazily evaluated ast thunks) [3] https://doc.rust-lang.org/book/macros.html -- Nathaniel J. Smith -- http://vorpus.org From ncoghlan at gmail.com Thu Aug 6 08:21:32 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Aug 2015 16:21:32 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C2F083.5030503@mgmiller.net> References: <55C25C74.50008@trueblade.com> <55C2F083.5030503@mgmiller.net> Message-ID: On 6 August 2015 at 15:28, Mike Miller wrote: > Oscar and Nick bring up some important points. Still, I don't think it will > be as dangerous in the long run as it might appear ahead of time. I say > that *if* (and it's an important if), we can find a way to limit the syntax > to the .format mini-language and not go the full monty, as a few of us > worry. > > Also, remember the list of languages on wikipedia that have string > interpolation? People have made this trade-off many times and appear happy > with the feature, especially in dynamic languages. There isn't a specific practical reason to conflate the existing static string literal syntax with the proposed syntactic support for runtime data interpolation.
They're different things, and we can easily add the latter without simultaneously deciding to change the semantics of the former. Languages that *don't already have* static string literals as a separate concept wouldn't gain much from adding them - you can approximate them well by only having runtime data interpolation that you simply don't use in some cases. However, folks using those languages also don't have 20+ years of experience with strictly static string literals, and existing bodies of code that also assume that string literals are always true constants. Consider how implicit string interpolation might interact with gettext message extraction, for example, or that without a marker prefix, static type analysers are going to have to start scanning *every* string literal for embedded subexpressions to analyse, rather than being able to skip over the vast majority of existing strings which won't be using this 3.6+ only feature. If we add syntactic interpolation support in 3.6, and folks love it and say "wow, if only all strings behaved like this!", and find the explicit prefix marker to be a hindrance rather than a help when it comes to readability, *then* it makes sense to have the discussion about removing all string literals other than raw strings and implicitly replacing them with string displays. But given the significant implications for Python source code analysis, both by readers and by computers, it makes far more sense to me to just reject the notion of retrofitting implicit interpolation support entirely, and instead be clear that requesting data interpolation into an output string will always be a locally explicit operation. Cheers, Nick. 
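The point about analysers being able to skip unprefixed strings can be illustrated with the ast module. The sketch below assumes a Python where the "f" prefix discussed in this thread is available (it shipped in 3.6) and where plain literals parse to ast.Constant (3.8 or later); the helper name expression_kind is invented for the example.

```python
import ast

def expression_kind(source):
    """Return the AST node class name for a single expression."""
    return type(ast.parse(source, mode="eval").body).__name__

def has_interpolation(source):
    """True if any string in `source` is an interpolated (prefixed) string."""
    return any(isinstance(node, ast.JoinedStr) for node in ast.walk(ast.parse(source)))

# Without the prefix, braces are just characters in a constant.
plain = expression_kind('"total: {x}"')
# With the prefix, the parser produces a distinct node type.
prefixed = expression_kind('f"total: {x}"')

print(plain, prefixed)
```

Because the two cases are different node types, a tool only needs to look inside the prefixed strings for name references; the contents of unprefixed strings never have to be scanned.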
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From python-ideas at mgmiller.net Thu Aug 6 08:25:05 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 05 Aug 2015 23:25:05 -0700 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C2F083.5030503@mgmiller.net> References: <55C25C74.50008@trueblade.com> <55C2F083.5030503@mgmiller.net> Message-ID: <55C2FDC1.7060008@mgmiller.net> Here I go again, just stumbled across this. Apparently C# (an even more "appy" language) in the new version 6.0 went through this same discussion in the last year. Here's what they came up with, and it is very close to the ideas talked about here: http://davefancher.com/2014/12/04/c-6-0-string-interpolation/ https://msdn.microsoft.com/en-us/library/Dn961160.aspx TL;DR - Interesting, they started with this syntax: WriteLine("My name is \{name}"); Then moved to this one: WriteLine($"My name is {name}"); I suppose to match C#'s @strings. I think we're on the right track. -Mike On 08/05/2015 10:28 PM, Mike Miller wrote: > Oscar and Nick bring up some important points. Still, I don't think it will be > as dangerous in the long run as it might appear ahead of time. I say that *if* > (and it's an important if), we can find a way to limit the syntax to the .format > mini-language and not go the full monty, as a few of us worry. > > Also, remember the list of languages on wikipedia that have string > interpolation? People have made this trade-off many times and appear happy with > the feature, especially in dynamic languages. > > I remember a PyCon keynote a few years back. Guido said (paraphrasing...) > "from a birds-eye view, perl, python, and ruby are all the same language. In > the important parts anyway." > > Like the other two, Python is also used for shell-scripting tasks, and > unfortunately, it's the only one of those without direct string interpolation, > which has probably hindered its uptake in that area. 
It'd be useful everywhere > though. > > So, let's not make perfect the enemy of pretty-damn awesome. I've been waiting > for this feature for 15 years, from back around the turn of the century *cough*, > when I traded in perl for python. ;) > > -Mike > > > On 08/05/2015 09:18 PM, Nick Coghlan wrote: >> On 6 August 2015 at 07:24, Oscar Benjamin wrote: >>> On 5 August 2015 at 19:56, Eric V. Smith wrote: >>>> >>>> In the "Briefer string format" thread, Guido suggested [1] in passing >>>> that it would have been nice if all literal strings had always supported >>>> string interpolation. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From ncoghlan at gmail.com Thu Aug 6 08:27:27 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Aug 2015 16:27:27 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On 6 August 2015 at 16:05, Nathaniel Smith wrote: > I'm by no means prepared to mount a full defense / work out details / write > a PEP of this idea this week, but since IMO ! really is the only obvious > character to use for this, and now we seem to be talking about other uses > for the ! character, I wanted to get it on the radar... Fortunately, using "!" as a string prefix doesn't preclude using it for the case you describe, or even from offering a full compile time macro syntax as "!name(contents)". It's one of the main reasons I like it over "$" as the marker prefix - it fits as a general "compile time shenanigans are happening here" marker if we decide to go that way in the future, while "$" is both heavier visually and very specific to string interpolation. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Aug 6 08:47:23 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Aug 2015 16:47:23 +1000 Subject: [Python-ideas] Briefer string format In-Reply-To: <20150806031843.GT3737@ando.pearwood.info> References: <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <20150806031843.GT3737@ando.pearwood.info> Message-ID: On 6 August 2015 at 13:18, Steven D'Aprano wrote: > We can teach > people to avoid the risk of command injection attacks by avoiding > interpolation, but the proposed syntax makes it easier to use > interpolation without noticing. We actually aim to teach folks to avoid shell injection attacks by avoiding the shell: https://docs.python.org/3/library/subprocess.html#security-considerations If you invoke the shell in any kind of networked application, it's inevitable that you're eventually going to let a shell injection attack through (at which point you better hope you have something like SELinux or AppArmor configured to protect your system from your mistake). That said, this is also why I'm a fan of eventually allowing syntax like: !sh("sort $file > uniq > wc -l") !sql("select $col from $table") !html("$body") that eventually adapts whatever interpolation syntax we decide on here for format strings to other operations like shell commands and SQL queries. 
The more time I spend dealing with the practical realities of writing commercial software, the more convinced I become that the right way to do something and the easiest way to do something have to be the same way if we seriously expect people to consistently get it right (and yes, the PEP 466 & 476 discussions had a significant role to play in that change of heart, as did the Unicode changes between Python 2 & 3). When the current easiest way is wrong, the only way to reliably get people to do it right in the future is to provide an even easier way that automatically does the right thing by default (this also helps act as a forcing function that encourages folks to learn "how to do it right" in older versions, even if the new feature itself isn't available there). It's not a panacea (bad habits are hard to unlearn), but we can at least try to help stop particularly pernicious problems getting worse. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From njs at pobox.com Thu Aug 6 09:25:15 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 6 Aug 2015 00:25:15 -0700 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On Wed, Aug 5, 2015 at 11:27 PM, Nick Coghlan wrote: > > On 6 August 2015 at 16:05, Nathaniel Smith wrote: > > I'm by no means prepared to mount a full defense / work out details / write > > a PEP of this idea this week, but since IMO ! really is the only obvious > > character to use for this, and now we seem to be talking about other uses > > for the ! character, I wanted to get it on the radar... > > Fortunately, using "!" as a string prefix doesn't preclude using it > for the case you describe, or even from offering a full compile time > macro syntax as "!name(contents)".
> > It's one of the main reasons I like it over "$" as the marker prefix - > it fits as a general "compile time shenanigans are happening here" > marker if we decide to go that way in the future, while "$" is both > heavier visually and very specific to string interpolation. I guess it's a matter of taste -- string interpolation doesn't strike me as particularly compile-time-shenanigany in the way that macros are, given that you could right now implement a function f such that f("...") would work exactly like the proposed f"..." with no macros needed. But it's true that both can easily coexist; the only potential conflict is in the aesthetics. -n -- Nathaniel J. Smith -- http://vorpus.org From eric at trueblade.com Thu Aug 6 10:27:49 2015 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 6 Aug 2015 04:27:49 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150805155316.567e5d16@anarchist.wooz.org> References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> <20150805155316.567e5d16@anarchist.wooz.org> Message-ID: <55C31A85.4000906@trueblade.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 8/5/2015 3:53 PM, Barry Warsaw wrote: > On Aug 05, 2015, at 03:34 PM, Yury Selivanov wrote: > >> On 2015-08-05 2:56 PM, Eric V. Smith wrote: >>> The concept would be that all strings are scanned for \{ and } >>> pairs. > > I think it's a very interesting idea too, although the devil is in > the details. Since this will be operating on string literals, > they'd be scanned at compile time right? Yes, they'd be scanned at compile time. As the AST is being built, the string would be parsed and transformed into the AST for the appropriate function calls. > Agreed that raw strings probably shouldn't be scanned. Since it > may happen that some surprising behavior occurs (long after it's > past __future__), there should be some way to prevent scanning. 
To > me that either means r'' strings don't get scanned or f'' is > required. I've come around to raw strings not being scanned. > I'm still unclear on what the difference would be between f'' > strings and these currently mythical scanned-strings are, but I'll > wait for the PEP. Well, that's a not-fully-specified idea, as of now. >> Have you considered using '#{..}' syntax (used by Ruby and >> CoffeeScript)? >> >> '\{..}' feels unbalanced and weird. > > As it does for me. Let's see what particular color Eric reaches > for. I agree with Guido that we use \ to mean "something special happens with the next character". And we use braces for str.format. Although ${...} also tugs at my heart strings. Eric. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJVwxqFAAoJENxauZFcKtNxj/MH/1qIW9LN92KxGc16iCJ5enwx tXxzvu+6ki2iXphxN9AKm3l7XIR4QFGBkXEA2HBF5JaBpzp76/Ofvso98EfNKXk8 R7SfvfYXt3SPtySgjR0Gt/5eOt5VxAXYq9FTSfxz4EK/IGXyk8zoGpQsmFxvh05X lm239Q8wliuFiMzLPUWdwp1bfXdgpyQ+jw7AA5FGk6kMLzsGGX4OLGnJEhXOHIG9 sESJKhpHhuBBJ5pUZTpygaeSpMDLURH7M40MTEt/bWyYHCAWNxfgPxRp2ml18otJ dMlNL++BNuA3YFsq0UpYX61BQV37A7AiFfy+arA5HkSU+gU7tRQwzrqgHLLLKNY= =V+8l -----END PGP SIGNATURE----- From tjreedy at udel.edu Thu Aug 6 11:31:43 2015 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 6 Aug 2015 05:31:43 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> Message-ID: On 8/6/2015 1:23 AM, Nick Coghlan wrote: I prefer a symbol over an 'f' that is too similar to other prefix letters. > Pondering the fact that "\N{GREEK LETTER ALPHA}", > "{ref}".format_map(data), f"\{ref}" and string.Template("${ref}") all > overload on "{}" as their parenthetical pair gave me an idea. 
> > Since we're effectively defining a "string display" (which will > hopefully remain clearly independent of normal string literals), what > if we were to bake internationalisation and localisation directly into > this PEP, such that, by default, these new strings would be flagged > for translation, and translations could change the *order* in which > subexpressions were displayed, but not the *content* of those > subexpressions? For internationalising Idle's menu, lines like (!'file', [ (!'_New File', '<>'), (!'_Open...', '<>'), (!'Open _Module...', '<>'), (!'Class _Browser', '<>'), (!'_Path Browser', '<>'), ... + another 50 lines of menu definition are *much* easier to type, read, and proofread than (_('file'), [ (_('_New File'), '<>'), (_('_Open...'), '<>'), (_('Open _Module...'), '<>'), (_('Class _Browser'), '<>'), (_('_Path Browser'), '<>'),-- ... + 50 similar lines The obnoxiousness of the latter, which literally makes me dizzy to read, was half my opposition to 'preparing' Idle for a use that might or might not ever happen. If there were a switch to just ignore the ! prefix, leaving no runtime cost, then I would be even happier with adding the !s and telling people, 'ok, go ahead and prepare translations and Idle is ready to go'. Terry Jan Reedy From eric at trueblade.com Thu Aug 6 13:46:44 2015 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 06 Aug 2015 07:46:44 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C2FDC1.7060008@mgmiller.net> References: <55C25C74.50008@trueblade.com> <55C2F083.5030503@mgmiller.net> <55C2FDC1.7060008@mgmiller.net> Message-ID: <55C34924.90002@trueblade.com> On 08/06/2015 02:25 AM, Mike Miller wrote: > Here I go again, just stumbled across this. > > Apparently C# (an even more "appy" language) in the new version 6.0 went > through this same discussion in the last year. 
Here's what they came up > with, and it is very close to the ideas talked about here: > > http://davefancher.com/2014/12/04/c-6-0-string-interpolation/ > https://msdn.microsoft.com/en-us/library/Dn961160.aspx > > TL;DR - Interesting, they started with this syntax: > > WriteLine("My name is \{name}"); > > Then moved to this one: > > WriteLine($"My name is {name}"); > > I suppose to match C#'s @strings. I think we're on the right track. That's very interesting, thanks for the pointers. So they're basically doing what we described in the f-string thread, and what my PEP currently describes. They do some fancier things with the parser, though, relating to strings. They allow arbitrary expressions, and call expr.ToString with the format specifier, the equivalent of us calling expr.__format__. I'll have to investigate their usage of IFormattable. Maybe there's something we can learn from that. Eric. From ncoghlan at gmail.com Thu Aug 6 15:01:15 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Aug 2015 23:01:15 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On 6 August 2015 at 17:25, Nathaniel Smith wrote: > On Wed, Aug 5, 2015 at 11:27 PM, Nick Coghlan wrote: >> Fortunately, using "!" as a string prefix doesn't preclude using it >> for the case you describe, or even from offering a full compile time >> macro syntax as "!name(contents)". >> >> It's one of the main reasons I like it over "$" as the marker prefix - >> it fits as a general "compile time shenanigans are happening here" >> marker if we decide to go that way in the future, while "$" is both >> heavier visually and very specific to string interpolation. 
> > I guess it's a matter of taste -- string interpolation doesn't strike > me as particularly compile-time-shenanigany in the way that macros > are, given that you could right now implement a function f such that > > f("...") > > would work exactly like the proposed > > f"..." > > with no macros needed. But it's true that both can easily coexist; the > only potential conflict is in the aesthetics. You can write functions that work like the ones I described as well. However, they all have the same problem: * you can't restrict them to "literals only", so you run a much higher risk of code injection attacks * you can only implement them via stack walking, so name resolution doesn't work right. You can get at the locals and globals for the calling frame, but normal strings are opaque to the compiler, so lexical scoping doesn't trigger properly By contrast, the "compile time shenanigans" approach lets you: * restrict them to literals only, closing off the worst of the injection attack vectors * make the construct transparent to the compiler, allowing lexical scoping to work reliably Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eric at trueblade.com Thu Aug 6 15:43:39 2015 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 06 Aug 2015 09:43:39 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C31A85.4000906@trueblade.com> References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> <20150805155316.567e5d16@anarchist.wooz.org> <55C31A85.4000906@trueblade.com> Message-ID: <55C3648B.5030009@trueblade.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 08/06/2015 04:27 AM, Eric V. Smith wrote: >> Agreed that raw strings probably shouldn't be scanned. Since it >> may happen that some surprising behavior occurs (long after it's >> past __future__), there should be some way to prevent scanning. >> To me that either means r'' strings don't get scanned or f'' is >> required. 
> > I've come around to raw strings not being scanned. One advantage of the f-string approach is that you could interpolate raw strings if you wanted to: >>> x=42 >>> f"\b {x}" '\x08 42' >>> rf"\b {x}" '\\b 42' Eric. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAEBAgAGBQJVw2SLAAoJENxauZFcKtNx0hEIAKZg9urj8lLI11EDLcnNrcQN 6wFmILA6t4FxIRw9CHAJxvE02rrQhVgj/KzknSbMAilvb9PHI7Q7RTJ/yS0xbCc4 Mw+0nLCJMG/S3R7vrVyjroCO97FBlMCCyrZXGZlVh6/WFR4UnVFhqEIUO5i/kVbL 4fNc57wVY5ibfsu1NXkn0YmZqKEb6+t434wmb89bta5mYztG845CK+Vge+dT1zoi hIO05Vy9D+eUbWrVl+9sQAoZmZboemGyugRzKv6uZpTis5dyCeFxAWm4GQNtQe/G 3ICwUBTRKzvldkd5oc8ehi3bnGHUCTn8R4j4lPneO/S8pMn6vWsvkfFENHHSE/8= =gUZ0 -----END PGP SIGNATURE----- From ron3200 at gmail.com Thu Aug 6 16:15:46 2015 From: ron3200 at gmail.com (Ron Adam) Date: Thu, 06 Aug 2015 10:15:46 -0400 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <20150806031843.GT3737@ando.pearwood.info> Message-ID: On 08/06/2015 02:47 AM, Nick Coghlan wrote: > the > right way to do something and the easiest way to do something have to > be the same way Maybe this should be added to Python's Zen? "The right way to do something and the easiest way to do something should be the same way." Cheers, Ron From guido at python.org Thu Aug 6 16:28:00 2015 From: guido at python.org (Guido van Rossum) Date: Thu, 6 Aug 2015 16:28:00 +0200 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On Thu, Aug 6, 2015 at 6:18 AM, Nick Coghlan wrote: > On 6 August 2015 at 07:24, Oscar Benjamin > wrote: > > I strongly dislike this idea. 
One of the things I like about Python is > > the fact that a string literal is just a string literal. I don't want > > to have to scan through a large string and try to work out if it > > really is just a literal or a dynamic context-dependent expression. I > > would hold this objection if the proposal was a limited form of > > variable interpolation (akin to .format) but if any string literal can > > embed arbitrary expressions than I *really* don't like that idea. > > I'm in this camp as well. We already suffer from the problem that, > unlike tuples, numbers and strings, lists, dictionary and set > "literals" are actually formally displays that provide a shorthand for > runtime procedural code, rather than literals that can potentially be > fully resolved at compile time. > > This means there are *fundamentally* different limitations on what we > can do with them. In particular, we can take literals, constant fold > them, do various other kinds of things with them, because we *know* > they're not dependent on runtime state - we know everything we need to > know about them at compile time. > I don't buy this argument. We already arrange things so that (x, y) invokes a tuple constructor after loading x and y, while (1, 2) is loaded as a single constant. Syntactically, "xyzzy" remains a constant, while "the \{x} and the \{y}" becomes an expression that (among other things) loads the values of x and y. > This is an absolute of Python: string literals are constants, not > arbitrary code execution constructs. Our own peephole generator > assumes this, AST manipulation code assumes this, people reading code > assume this, people teaching Python assume this. > > I already somewhat dislike the idea of having a "string display" be > introduced by something as subtle as a prefix character, but so long > as it gets its own AST node independent of the existing "I'm a > constant" string node, I can live with it. 
There's at least a marker > right up front to say to readers "unlike other strings, this one may > depend on runtime state". If the prefix was an exclamation mark to > further distinguish it from the alphabetical prefix characters, I'd be > even happier :) > > Dropping the requirement for the prefix *loses* expressiveness from > the language, because runtime dependent strings would no longer be > clearly distinguished from the genuine literals. Having at least f"I > may be runtime dependent!" as an indicator, and preferably !"I may be > runtime dependent!" instead, permits a clean simple syntax for > explicit interpolation, and dropping the prefix saves only one > character at writing time, while making every single string literal > potentially runtime dependent at reading time. > Here you're just expressing the POV of someone coming from Python 3.5 (or earlier). To future generations, like to users of all those languages mentioned in the Wikipedia article, it'll be second nature to scan string literals for interpolations, and since most strings are short most readers won't even be aware that they're doing it. And if there's a long string (say some template) somewhere, you have to look carefully anyway to notice things like an embedded "+x+" somewhere, or a trailing method call (e.g. .strip()). > Editors and IDEs can also be updated far more easily, since existing > strings can continue to be marked up as is, while prefixed strings > can potentially be highlighted differently to indicate that they may > contain arbitrary code (and should also be scanned for name references > and type compatibility with string interpolation). > For an automated tool it's trivial to scan strings for \{. And yes, the part between \{ and } should be marked up differently (and probably the :format or !r/!s differently again). Also, your phrase "contain arbitrary code" still sounds like a worry about code injection. You might as well worry about code injection in function calls.
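The (x, y) versus (1, 2) comparison made in the reply above can be observed directly with compile() and dis. This is a sketch of current CPython behaviour, which is an implementation detail rather than a language guarantee; the helper name opnames is invented for the example.

```python
import dis

def opnames(source):
    """Bytecode operation names for a single expression (CPython-specific)."""
    code = compile(source, "<demo>", "eval")
    return [instruction.opname for instruction in dis.get_instructions(code)]

# A tuple of constants is stored as one constant in the code object.
folded = (1, 2) in compile("(1, 2)", "<demo>", "eval").co_consts

# A tuple of names is built at runtime after loading each name.
builds_at_runtime = "BUILD_TUPLE" in opnames("(x, y)")

print(folded, builds_at_runtime)
```

The same surface syntax already compiles to a constant in one case and to runtime code in the other, which is the precedent the reply appeals to for strings.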
-- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Thu Aug 6 17:10:46 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Aug 2015 01:10:46 +1000 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <20150806031843.GT3737@ando.pearwood.info> Message-ID: On 7 August 2015 at 00:15, Ron Adam wrote: > > > On 08/06/2015 02:47 AM, Nick Coghlan wrote: >> >> the >> right way to do something and the easiest way to do something have to >> be the same way > > > Maybe this should be added to Python's Zen? > > > "The right way to do something and the easiest way > to do something should be the same way." It's already there in my view: $ python -m this | grep 'obvious way' There should be one-- and preferably only one --obvious way to do it. When a particular approach is both easy and right, it rapidly becomes the obvious choice. Issues arise when the right way is harder than the wrong way, since the apparently obvious way is a bad idea, but the superior alternative isn't as clearly applicable to the problem. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Aug 6 17:20:20 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Aug 2015 01:20:20 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On 7 August 2015 at 00:28, Guido van Rossum wrote: > Also, your phrase "contain arbitrary code" still sounds like a worry about > code injection. You might as well worry about code injection in function > calls.
Sort of - it's more a matter of hanging around with functional programmers lately and hence paying more attention to the implications of expressions with side effects. At the moment, there's no need to even look inside a string for potential side effects, but that would change with implicit interpolation in the presence of mutable objects. I can't think of a good reason to include a mutating operation in an interpolated string, but there's nothing preventing it either, so it becomes another place requiring closer scrutiny during a code review. If interpolated strings are always prefixed, then longer strings lacking the prefix can often be skipped over as "no side effects here!" - the worst thing you're likely to miss in such cases is a typo. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Thu Aug 6 17:33:08 2015 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 6 Aug 2015 16:33:08 +0100 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On 6 August 2015 at 05:18, Nick Coghlan wrote: > On 6 August 2015 at 07:24, Oscar Benjamin wrote: >> On 5 August 2015 at 19:56, Eric V. Smith wrote: >>> >>> In the "Briefer string format" thread, Guido suggested [1] in passing >>> that it would have been nice if all literal strings had always supported >>> string interpolation. >>> >>> I've come around to this idea as well, and I'm going to propose it for >>> inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider >>> either modifying it or creating a new (and very similar) PEP. >>> >>> The concept would be that all strings are scanned for \{ and } pairs. If >>> any are found, then they'd be interpreted in the same was as the other >>> discussion on "f-strings". That is, the expression between the \{ and } >>> would be extracted and searched for conversion characters and format >>> specifiers. 
The expression would be evaluated, converted if needed, have >>> its __format__ method called, and the resulting string inserted back in >>> to the original string. >> >> I strongly dislike this idea. One of the things I like about Python is >> the fact that a string literal is just a string literal. I don't want >> to have to scan through a large string and try to work out if it >> really is just a literal or a dynamic context-dependent expression. I >> would hold this objection if the proposal was a limited form of >> variable interpolation (akin to .format) but if any string literal can >> embed arbitrary expressions than I *really* don't like that idea. > > I'm in this camp as well. We already suffer from the problem that, > unlike tuples, numbers and strings, lists, dictionary and set > "literals" are actually formally displays that provide a shorthand for > runtime procedural code, rather than literals that can potentially be > fully resolved at compile time. > > This means there are *fundamentally* different limitations on what we > can do with them. In particular, we can take literals, constant fold > them, do various other kinds of things with them, because we *know* > they're not dependent on runtime state - we know everything we need to > know about them at compile time. > > This is an absolute of Python: string literals are constants, not > arbitrary code execution constructs. Our own peephole generator > assumes this, AST manipulation code assumes this, people reading code > assume this, people teaching Python assume this. > > I already somewhat dislike the idea of having a "string display" be > introduced by something as subtle as a prefix character, but so long > as it gets its own AST node independent of the existing "I'm a > constant" string node, I can live with it. There's at least a marker > right up front to say to readers "unlike other strings, this one may > depend on runtime state". 
If the prefix was an exclamation mark to > further distinguish it from the alphabetical prefix characters, I'd be > even happier :) > > Dropping the requirement for the prefix *loses* expressiveness from > the language, because runtime dependent strings would no longer be > clearly distinguished from the genuine literals. Having at least f"I > may be runtime dependent!" as an indicator, and preferably !"I may be > runtime dependent!" instead, permits a clean simple syntax for > explicit interpolation, and dropping the prefix saves only one > character at writing time, while making every single string literal > potentially runtime dependent at reading time. > > Editors and IDEs can also be updated far more easily, since existing > strings can be continue to be marked up as is, while prefixed strings > can potentially be highlighted differently to indicate that they may > contain arbitrary code (and should also be scanned for name references > and type compatibility with string interpolation). > > Regards, > Nick. I'm with Nick here. I think of string literals as just that - *literals* and this proposal breaks that. I had a vague discomfort with the f-string proposal, but I couldn't work out why, and the convenience outweighed the disquiet. But it was precisely this point - that f-strings aren't literals, whereas all of the *other* forms of (prefixed or otherwise) strings are. I'm still inclined in favour of the f-string proposal, because of the convenience (I have never really warmed to the verbosity of "a {}".format("message") even though I use it all the time). But I'm definitely against the idea of making unprefixed string notation no longer a literal (heck, I even had to stop myself saying "unprefixed string literals" there - that's how ingrained the idea that "..." is a literal is). 
Paul

From p.f.moore at gmail.com Thu Aug 6 17:37:28 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 6 Aug 2015 16:37:28 +0100
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C34924.90002@trueblade.com>
References: <55C25C74.50008@trueblade.com> <55C2F083.5030503@mgmiller.net> <55C2FDC1.7060008@mgmiller.net> <55C34924.90002@trueblade.com>
Message-ID:

On 6 August 2015 at 12:46, Eric V. Smith wrote:
>> TL;DR - Interesting, they started with this syntax:
>>
>>     WriteLine("My name is \{name}");
>>
>> Then moved to this one:
>>
>>     WriteLine($"My name is {name}");
>>
>> I suppose to match C#'s @strings. I think we're on the right track.
>
> That's very interesting, thanks for the pointers. So they're basically
> doing what we described in the f-string thread, and what my PEP
> currently describes. They do some fancier things with the parser,
> though, relating to strings.
>
> They allow arbitrary expressions, and call expr.ToString with the format
> specifier, the equivalent of us calling expr.__format__. I'll have to
> investigate their usage of IFormattable. Maybe there's
> something we can learn from that.

They also appear to have backed away from allowing interpolation without an explicit prefix (disclaimer - I didn't read the articles).

Paul

From random832 at fastmail.us Thu Aug 6 18:26:14 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Thu, 06 Aug 2015 12:26:14 -0400
Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals
In-Reply-To: <55C25C74.50008@trueblade.com>
References: <55C25C74.50008@trueblade.com>
Message-ID: <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com>

On Wed, Aug 5, 2015, at 14:56, Eric V. Smith wrote:
> Because strings containing \{ are currently valid

Which raises the question of why. (And as long as we're talking about things to deprecate in string literals, how about \v?)
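For readers unsure what "currently valid" means here, the following is a minimal sketch of how CPython treats an unrecognized escape such as \{ compared with a recognized one such as \v. The helper name parse_literal is made up for this illustration. It assumes an interpreter that still accepts unrecognized escapes (recent versions emit a warning for them, which the helper silences, and a future version may reject them outright).

```python
import ast
import warnings


def parse_literal(source_text):
    """Evaluate the source text of one string literal.

    Warnings about unrecognized escape sequences are silenced so the
    example behaves the same on interpreters that warn about them.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(source_text)


# "\{" is not a recognized escape, so the backslash stays in the string.
unknown = parse_literal(r'"\{"')

# "\v" is a recognized escape: a single vertical-tab character.
vertical_tab = parse_literal(r'"\v"')

print(repr(unknown), len(unknown))            # two characters: backslash, brace
print(repr(vertical_tab), len(vertical_tab))  # one character: vertical tab
```

Because the backslash is kept, existing code may already depend on "\{" meaning a literal backslash followed by a brace, which is why giving \{ a new meaning in every string is a compatibility question.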
From barry at python.org Thu Aug 6 20:00:41 2015
From: barry at python.org (Barry Warsaw)
Date: Thu, 6 Aug 2015 14:00:41 -0400
Subject: [Python-ideas] String interpolation for all literal strings
References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> <20150805155316.567e5d16@anarchist.wooz.org> <55C31A85.4000906@trueblade.com>
Message-ID: <20150806140041.7992bd7f@anarchist.wooz.org>

On Aug 06, 2015, at 04:27 AM, Eric V. Smith wrote:

>Although ${...} also tugs at my heart strings.

Now you're in PEP 292 territory, so of course I like that. :)

-Barry

From barry at python.org Thu Aug 6 20:16:16 2015
From: barry at python.org (Barry Warsaw)
Date: Thu, 6 Aug 2015 14:16:16 -0400
Subject: [Python-ideas] String interpolation for all literal strings
References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com>
Message-ID: <20150806141616.73ea712e@anarchist.wooz.org>

On Aug 06, 2015, at 03:23 PM, Nick Coghlan wrote:

>If we went down that path, then string.Template would provide the most
>appropriate inspiration for the spelling, with "$" as the escape
>character rather than "\". For regular expressions, the only
>compatibility issue would be needing to double up on "$$" when
>matching against the end of the input data.

Well, you've pretty much reinvented flufl.i18n :) except of course I had to use _() as a marker because I couldn't use a special prefix. (There are a few knock-on advantages to using a function for this too, such as translation contexts, which become important for applications that are more sophisticated than simple command line scripts.)
Having used this library in lots of code myself *and* interacted with actual translators from the Mailman project, I really do think this approach is the easiest both to code in and to get high-quality, less error-prone translations. The only slightly uncomfortable bit in practice is that you can sometimes have local variables that appear to be unused because they only exist to support interpolation. This sometimes causes false positives with pyflakes, for example.

flufl.i18n doesn't support arbitrary expressions; it really is just built on top of string.Template. But TBH, I think arbitrary expressions, and even format strings, are overkill (and possibly dangerous) for an i18n application. Dangerous because any additional noise that has to be copied verbatim by translators is going to lead to errors in the catalog. Much better to leave any conversion or expression evaluation to the actual code rather than the string. The translated string should *only* interpolate - that's really all the power you need to add!

>That has a couple of nice refinements over the subsequent simpler PEP
>292 interpolation syntax, in that it allows "$obj.attr.attr",
>"$data[key]" and "$f(arg)" without requiring curly braces.

flufl.i18n also adds attribute chasing by using a customized dict subclass that parses and interprets dots in the key.

One other important note about translation contexts. It's very important to use .safe_substitute() because you absolutely do not want typos in the catalog to break your application.

Cheers,
-Barry
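To illustrate the point about .safe_substitute(), here is a minimal sketch using only string.Template from the standard library. The catalog entry, the misspelled placeholder $usre, and the values are hypothetical; this is not flufl.i18n's actual code.

```python
from string import Template

# Hypothetical catalog entry in which the translator mistyped "$user" as "$usre".
translated = Template("Bienvenue, $usre ! Vous avez $count messages.")
values = {"user": "Anne", "count": 3}

# substitute() raises KeyError for a placeholder it cannot resolve, so a typo
# in the catalog would surface as an exception in the running application.
try:
    translated.substitute(values)
    substitute_failed = False
except KeyError as exc:
    substitute_failed = True
    print("substitute() failed on placeholder:", exc)

# safe_substitute() leaves unknown placeholders untouched instead of raising.
result = translated.safe_substitute(values)
print(result)
```

With safe_substitute() the user sees a stray "$usre" in the output, which is ugly but far less harmful than a crash caused by a translation file.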
From barry at python.org Thu Aug 6 20:27:41 2015
From: barry at python.org (Barry Warsaw)
Date: Thu, 6 Aug 2015 14:27:41 -0400
Subject: [Python-ideas] String interpolation for all literal strings
References: <55C25C74.50008@trueblade.com>
Message-ID: <20150806142741.3bebf7b3@anarchist.wooz.org>

On Aug 06, 2015, at 11:01 PM, Nick Coghlan wrote:

>* you can't restrict them to "literals only", so you run a much higher risk
>of code injection attacks

In an i18n context you do sometimes need to pass in non-literals. Restricting this thing to literals only doesn't really reduce the attack surface significantly, and does close off an important use case.

>* you can only implement them via stack walking, so name resolution doesn't
>work right. You can get at the locals and globals for the calling frame, but
>normal strings are opaque to the compiler, so lexical scoping doesn't trigger
>properly

In practice, you need sys._getframe(2) to make it work, although flufl.i18n does allow you to specify a different depth. In practice you could probably drop that for the most part. (ISTR an obscure use case for depth>2 but can't remember the details.)

Really, the only nasty bit about flufl.i18n's implementation is the use of sys._getframe(). Fortunately, it's a bit of ugliness that's buried in the implementation and never really seen by users. If there were a better way of getting at globals and locals, one that was independent of the Python implementation, that would clean up this little wart.

Cheers,
-Barry
From mertz at gnosis.cx Thu Aug 6 20:57:00 2015
From: mertz at gnosis.cx (David Mertz)
Date: Thu, 6 Aug 2015 11:57:00 -0700
Subject: [Python-ideas] Briefer string format
In-Reply-To: <55AC2EDF.7040205@mgmiller.net>
References: <55AC2EDF.7040205@mgmiller.net>
Message-ID:

I've followed all the posts in this thread, and although my particular opinion has little significance, I'm definitely -1 on this idea (or actually -1000).

To my mind, we have already gone vastly too far in proliferating near synonyms for templating strings. Right now, I can type:

    >>> "My name is %(first)s %(last)s" % locals()

Or:

    >>> "My name is {first} {last}".format(**locals())

Or:

    >>> string.Template("My name is $first $last").substitute(**locals())

And they all mean the same thing, with pretty much the same capabilities. I REALLY don't want a 4th or 5th way to spell the same thing... let alone one with weird semantics with lots of edge cases that are almost impossible to teach.

I really DO NOT want to spell the same thing as f"..." or !"...", let alone have every single string magically become a runtime evaluated complex object like "My name is \{first}".

Yes, I know the oddball edge cases each style supports are slightly different... but that's exactly the problem. It's yet another thing to address an ever-so-slightly different case, where the actual differences are impossible to explain to students; and where there's frankly nothing you can't do with just a couple extra characters using str.format() right now.

Yours, David...

On Sun, Jul 19, 2015 at 4:12 PM, Mike Miller wrote:

> Have long wished python could format strings easily like bash or perl do,
> ...
> and then it hit me:
>
>     csstext += f'{nl}{selector}{space}{{{nl}'
>
> (This script included whitespace vars to provide a minification option.)
> > I've seen others make similar suggestions, but to my knowledge they didn't > include this pleasing brevity aspect. > > -Mike > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Aug 6 21:02:26 2015 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 6 Aug 2015 14:02:26 -0500 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> Message-ID: On Wed, Aug 5, 2015 at 8:58 PM, Terry Reedy wrote: > On 8/5/2015 3:34 PM, Yury Selivanov wrote: > > '\{..}' feels unbalanced and weird. >> > > Escape both. The closing } is also treated specially, and not inserted > into the string. The compiler scans linearly from left to right, but human > eyes are not so constrained. > > s = "abc\{kjljid some long expression jk78738}def" > > versus > > s = "abc\{kjljid some long expression jk78738\}def" > > and how about > > s = "abc\{kjljid some {long} expression jk78738\}def" +1: escape \{both\}. Use cases where this is (as dangerous as other string interpolation methods): * Shell commands that should be shlex-parsed/quoted * (inappropriately, programmatically) writing code with manually-added quotes ' and doublequotes " * XML,HTML,CSS,SQL, textual query language injection * Convenient, but dangerous and IMHO much better handled by e.g. 
MarkupSafe, a DOM builder, a query ORM layer Docs / Utils: * [ ] ENH: AST scanner for these (before i do __futre__ import) * [ ] DOC: About string interpolation, in general > > > > > > -- > Terry Jan Reedy > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Aug 6 21:25:02 2015 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 6 Aug 2015 14:25:02 -0500 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> Message-ID: On Thu, Aug 6, 2015 at 2:02 PM, Wes Turner wrote: > > > On Wed, Aug 5, 2015 at 8:58 PM, Terry Reedy wrote: > >> On 8/5/2015 3:34 PM, Yury Selivanov wrote: >> >> '\{..}' feels unbalanced and weird. >>> >> >> Escape both. The closing } is also treated specially, and not inserted >> into the string. The compiler scans linearly from left to right, but human >> eyes are not so constrained. >> >> s = "abc\{kjljid some long expression jk78738}def" >> >> versus >> >> s = "abc\{kjljid some long expression jk78738\}def" >> >> and how about >> >> s = "abc\{kjljid some {long} expression jk78738\}def" > > > +1: escape \{both\}. > > Use cases where this is (as dangerous as other string interpolation > methods): > > * Shell commands that should be shlex-parsed/quoted > * (inappropriately, programmatically) writing > code with manually-added quotes ' and doublequotes " > * XML,HTML,CSS,SQL, textual query language injection > * Convenient, but dangerous and IMHO much better handled > by e.g. 
MarkupSafe, a DOM builder, a query ORM layer > > Docs / Utils: > > * [ ] ENH: AST scanner for these (before i do __futre__ import) > * [ ] DOC: About string interpolation, in general > BTW here's a PR to add subprocess compat to sarge (e.g. for sarge.run) * https://bitbucket.org/vinay.sajip/sarge/pull-requests/1/enh-add-call-check_call-check_output * https://sarge.readthedocs.org/en/latest/overview.html#why-not-just-use-subprocess * https://cwe.mitre.org/top25/ * #1: https://cwe.mitre.org/top25/#CWE-89 SQL Injection * #2: https://cwe.mitre.org/top25/#CWE-78 OS Command injection * .... > > >> >> >> >> >> >> -- >> Terry Jan Reedy >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Thu Aug 6 21:37:33 2015 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Thu, 6 Aug 2015 15:37:33 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C25C74.50008@trueblade.com> References: <55C25C74.50008@trueblade.com> Message-ID: <55C3B77D.6020608@gmail.com> Eric, On 2015-08-05 2:56 PM, Eric V. Smith wrote: > I've come around to this idea as well, and I'm going to propose it for > inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider > either modifying it or creating a new (and very similar) PEP. While reading this thread, a few messages regarding i18n and ways to have it with new strings caught my attention. I'm not a big fan of having all string literals "scanned", so I'll illustrate my idea on f-strings. What if we introduce f-strings in the following fashion: 1. ``f'string {var}'`` is equivalent to ``'string {var}'.format(**locals())`` -- no new formatting syntax. 2. 
there is a 'sys.set_format_hook()' function that allows to set a global formatting hook for all f-strings: # pseudo-code def i18n(str, vars): if current_lang != 'en': str = gettext(str, current_lang) return str.format(vars) sys.set_format_hook(i18n) This would allow much more convenient way not only to format strings, but also to integrate various i18n frameworks: f'Welcome, {user}' instead of _('Welcome, {user}') Yury From eric at trueblade.com Thu Aug 6 21:44:21 2015 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 06 Aug 2015 15:44:21 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> Message-ID: <55C3B915.2060404@trueblade.com> On 08/06/2015 03:02 PM, Wes Turner wrote: > > > On Wed, Aug 5, 2015 at 8:58 PM, Terry Reedy > wrote: > > On 8/5/2015 3:34 PM, Yury Selivanov wrote: > > '\{..}' feels unbalanced and weird. > > > Escape both. The closing } is also treated specially, and not > inserted into the string. The compiler scans linearly from left to > right, but human eyes are not so constrained. > > s = "abc\{kjljid some long expression jk78738}def" > > versus > > s = "abc\{kjljid some long expression jk78738\}def" > > and how about > > s = "abc\{kjljid some {long} expression jk78738\}def" > > > +1: escape \{both\}. > > Use cases where this is (as dangerous as other string interpolation > methods): > > * Shell commands that should be shlex-parsed/quoted > * (inappropriately, programmatically) writing > code with manually-added quotes ' and doublequotes " > * XML,HTML,CSS,SQL, textual query language injection > * Convenient, but dangerous and IMHO much better handled > by e.g. MarkupSafe, a DOM builder, a query ORM layer > > Docs / Utils: > > * [ ] ENH: AST scanner for these (before i do __futre__ import) > * [ ] DOC: About string interpolation, in general I don't understand what you're trying to say. 
os.system("cp \{cmd}") is no better or worse than: os.system("cp " + cmd) Yes, there are lots of opportunities in the world for injection attacks. This proposal doesn't change that. I don't see how escaping the final } changes anything. Eric. From guido at python.org Thu Aug 6 22:02:00 2015 From: guido at python.org (Guido van Rossum) Date: Thu, 6 Aug 2015 22:02:00 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55AC2EDF.7040205@mgmiller.net> Message-ID: Unfortunately, all spellings that require calling locals() are wrong. On Thu, Aug 6, 2015 at 8:57 PM, David Mertz wrote: > I've followed all the posts in this thread, and although my particular > opinion has little significance, I'm definitely -1 on this idea (or > actually -1000). > > To my mind is that we have already gone vastly too far in proliferating > near synonyms for templating strings. Right now, I can type: > > >>> "My name is %(first)s %(last)s" % (**locals()) > > Or: > > >>> "My name is {first} {last}".format(**locals()) > > Or: > > >>> string.Template("My name is $first $last").substitute(**locals()) > > And they all mean the same thing, with pretty much the same capabilities. > I REALLY don't want a 4th or 5th way to spell the same thing... let alone > one with weird semantics with lots of edge cases that are almost impossible > to teach. > > I really DO NOT want to spell the same thing as f"..." or !"...", let > alone have every single string magically become a runtime evaluated complex > object like "My name is \{first}". > > Yes, I know the oddball edge cases each style supports are slightly > different... but that's exactly the problem. It's yet another thing to > address an ever-so-slightly different case, where the actual differences > are impossible to explain to students; and where there's frankly nothing > you can't do with just a couple extra characters using str.format() right > now. > > Yours, David... 
> > > On Sun, Jul 19, 2015 at 4:12 PM, Mike Miller > wrote: > >> Have long wished python could format strings easily like bash or perl do, >> ... >> and then it hit me: >> >> csstext += f'{nl}{selector}{space}{{{nl}' >> >> (This script included whitespace vars to provide a minification option.) >> >> I've seen others make similar suggestions, but to my knowledge they didn't >> include this pleasing brevity aspect. >> >> -Mike >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Aug 6 22:35:57 2015 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 6 Aug 2015 15:35:57 -0500 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55AC2EDF.7040205@mgmiller.net> Message-ID: On Aug 6, 2015 3:03 PM, "Guido van Rossum" wrote: > > Unfortunately, all spellings that require calling locals() are wrong. Is this where the potential source of surprising error is? * Explicit / Implicit locals() * To me, the practicality of finding '%' and .format is more important than the convenience of an additional syntax with implicit scope, but is that beside the point? 
> > On Thu, Aug 6, 2015 at 8:57 PM, David Mertz wrote: >> >> I've followed all the posts in this thread, and although my particular opinion has little significance, I'm definitely -1 on this idea (or actually -1000). >> >> To my mind is that we have already gone vastly too far in proliferating near synonyms for templating strings. Right now, I can type: >> >> >>> "My name is %(first)s %(last)s" % (**locals()) >> >> Or: >> >> >>> "My name is {first} {last}".format(**locals()) >> >> Or: >> >> >>> string.Template("My name is $first $last").substitute(**locals()) >> >> And they all mean the same thing, with pretty much the same capabilities. I REALLY don't want a 4th or 5th way to spell the same thing... let alone one with weird semantics with lots of edge cases that are almost impossible to teach. >> >> I really DO NOT want to spell the same thing as f"..." or !"...", let alone have every single string magically become a runtime evaluated complex object like "My name is \{first}". >> >> Yes, I know the oddball edge cases each style supports are slightly different... but that's exactly the problem. It's yet another thing to address an ever-so-slightly different case, where the actual differences are impossible to explain to students; and where there's frankly nothing you can't do with just a couple extra characters using str.format() right now. >> >> Yours, David... >> >> >> On Sun, Jul 19, 2015 at 4:12 PM, Mike Miller wrote: >>> >>> Have long wished python could format strings easily like bash or perl do, ... >>> and then it hit me: >>> >>> csstext += f'{nl}{selector}{space}{{{nl}' >>> >>> (This script included whitespace vars to provide a minification option.) >>> >>> I've seen others make similar suggestions, but to my knowledge they didn't >>> include this pleasing brevity aspect. 
>>> >>> -Mike >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> >> >> -- >> Keeping medicines from the bloodstreams of the sick; food >> from the bellies of the hungry; books from the hands of the >> uneducated; technology from the underdeveloped; and putting >> advocates of freedom in prisons. Intellectual property is >> to the 21st century what the slave trade was to the 16th. >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Aug 6 23:00:58 2015 From: guido at python.org (Guido van Rossum) Date: Thu, 6 Aug 2015 23:00:58 +0200 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> Message-ID: On Thu, Aug 6, 2015 at 9:02 PM, Wes Turner wrote: > > On Wed, Aug 5, 2015 at 8:58 PM, Terry Reedy wrote: > >> On 8/5/2015 3:34 PM, Yury Selivanov wrote: >> >> '\{..}' feels unbalanced and weird. >>> >> >> Escape both. The closing } is also treated specially, and not inserted >> into the string. The compiler scans linearly from left to right, but human >> eyes are not so constrained. 
>> >> s = "abc\{kjljid some long expression jk78738}def" >> >> versus >> >> s = "abc\{kjljid some long expression jk78738\}def" >> >> and how about >> >> s = "abc\{kjljid some {long} expression jk78738\}def" > > > +1: escape \{both\}. > That looks worse to me. In my eyes, the construct has two parts: the \ and the {...}. (Similar to \N{...}, whose parts are \N and {...}.) Most of the time the expression is short and sweet -- either something like \{width} or \{obj.width}, or perhaps a simple expression like \{width(obj)}. Adding an extra \ does nothing to enhance readability. Giving long or obfuscated expressions that *could* be written using some proposed feature to argue against it is a long-standing rhetoric strategy, similar to "strawman". -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Thu Aug 6 23:03:01 2015 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 7 Aug 2015 07:03:01 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On 6 August 2015 at 16:05, Nathaniel Smith wrote: > On Wed, Aug 5, 2015 at 9:35 PM, Nick Coghlan wrote: > > use "!" as the prefix instead of "f" to more clearly emphasise the > > distinction from the subtle effects of "u", "b" and "r" > > Hey, maybe $ would make an even better string-interpolation sigil anyway? > +1 for $"..." being an interpolated string. The syntax just makes sense. Doesn't prevent us from using $ elsewhere, but it does set a precedent that it should be used in interpolation/substitution-style contexts. +0 for !"..." being an interpolated string. It's not particularly obvious to me, but I do like the def foo!(ast) syntax, and symmetry with that wouldn't be bad. Although I wouldn't mind def foo$(ast) either - $ stands out more, and this could be considered a substitution-style context. 
-1000 on unprefixed string literals becoming interpolated. But the prefix should be able to be used with raw strings somehow ... r$"..."? $r"..."? Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Thu Aug 6 23:43:53 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Thu, 06 Aug 2015 23:43:53 +0200 Subject: [Python-ideas] fork In-Reply-To: References: <20150801173628.29BAB873FE@smtp04.mail.de> <55BFA0B3.1010702@mail.de> <6A8EA952-ED98-4C26-9A40-54BE54367849@yahoo.com> <55C0FFDD.5020002@mail.de> Message-ID: <55C3D519.90206@mail.de> On 06.08.2015 17:52, Xavier Combelle wrote: > >> One quick comment: from my experience (mostly with other >> languages that are very different from Python, so I can't promise >> how well it applies here...), implicit futures without implicit >> laziness or even an explicit delay mechanism are not as useful as >> they look at first glance. Code that forks off 8 Fibonacci calls, >> but waits for each one's result before forking off the next one, >> might as well have just stayed sequential. And if you're going to >> use the result by forking off another job, then it's actually >> more convenient to use explicit futures like the ones in the stdlib. >> >> One slightly bigger idea: If you really want to pursue your >> implicit-as-possible design further, you might want to consider >> making the decorators replace the function with an object whose >> __call__ method just implicitly submits it to the pool. > > I added two new decorators for this. But they don't work with the > @ syntax. It seems like a well-known issue of Python: > > _pickle.PicklingError: Can't pickle 0x7f8eaeb09730>: it's not the same object as __main__.fib_fork > > Would be great if somebody could fix that. > > > Sorry but I don't follow you have you any example that fail ? I fixed that, well, halfhearted: https://github.com/srkunze/fork/blob/2359265/fork.py#L47 and the following 3 lines. 
Remove those lines, and the tests using @cpu_bound_fork will fail. The reason for this is that the pickle module is only capable of pickling module-level-named objects. Do you have a better fix? I would rather see that fixed in the Python internal decorator implementation than by me. Cheers, Sven From barry at python.org Thu Aug 6 23:53:51 2015 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Aug 2015 17:53:51 -0400 Subject: [Python-ideas] String interpolation for all literal strings References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> Message-ID: <20150806175351.0f4c8001@anarchist.wooz.org> On Aug 06, 2015, at 03:37 PM, Yury Selivanov wrote: >What if we introduce f-strings in the following fashion: > >1. ``f'string {var}'`` is equivalent to >``'string {var}'.format(**locals())`` -- no new formatting >syntax. You really do want to include globals too, with locals overriding them. >2. there is a 'sys.set_format_hook()' function that allows >to set a global formatting hook for all f-strings: > > # pseudo-code > def i18n(str, vars): > if current_lang != 'en': > str = gettext(str, current_lang) > return str.format(vars) > > sys.set_format_hook(i18n) > >This would allow much more convenient way not only to format >strings, but also to integrate various i18n frameworks: > > f'Welcome, {user}' instead of _('Welcome, {user}') I don't think you want this to be a process-global hook since different modules may be using different i18n systems. Cheers, -Barry From srkunze at mail.de Fri Aug 7 00:08:02 2015 From: srkunze at mail.de (Sven R.
Kunze) Date: Fri, 07 Aug 2015 00:08:02 +0200 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: <55C3DAC2.5070406@mail.de> I am somehow +0 on this. It seems like a crazy useful idea. However, it's maybe too much magic for Python? I have to admit that I dislike the \{...} syntax. Looks awkward as does escaping almost always. It's a personal taste but it seems there are others agreeing on that. This said, I would prefer f'...' in order to retain the nice {...} look. Regards, Sven From wes.turner at gmail.com Fri Aug 7 00:15:49 2015 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 6 Aug 2015 17:15:49 -0500 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C3B915.2060404@trueblade.com> References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> <55C3B915.2060404@trueblade.com> Message-ID: On Thu, Aug 6, 2015 at 2:44 PM, Eric V. Smith wrote: > On 08/06/2015 03:02 PM, Wes Turner wrote: > > > > > > On Wed, Aug 5, 2015 at 8:58 PM, Terry Reedy > > wrote: > > > > On 8/5/2015 3:34 PM, Yury Selivanov wrote: > > > > '\{..}' feels unbalanced and weird. > > > > > > Escape both. The closing } is also treated specially, and not > > inserted into the string. The compiler scans linearly from left to > > right, but human eyes are not so constrained. > > > > s = "abc\{kjljid some long expression jk78738}def" > > > > versus > > > > s = "abc\{kjljid some long expression jk78738\}def" > > > > and how about > > > > s = "abc\{kjljid some {long} expression jk78738\}def" > > > > > > +1: escape \{both\}. 
> > > > Use cases where this is (as dangerous as other string interpolation > > methods): > > > > * Shell commands that should be shlex-parsed/quoted > > * (inappropriately, programmatically) writing > > code with manually-added quotes ' and doublequotes " > > * XML,HTML,CSS,SQL, textual query language injection > > * Convenient, but dangerous and IMHO much better handled > > by e.g. MarkupSafe, a DOM builder, a query ORM layer > > > > Docs / Utils: > > > > * [ ] ENH: AST scanner for these (before i do __futre__ import) > > * [ ] DOC: About string interpolation, in general > > I don't understand what you're trying to say. > > os.system("cp \{cmd}") > > is no better or worse than: > > os.system("cp " + cmd) > All wrong (without appropriate escaping): os.system("cp thisinthemiddleofmy\{cmd}.tar") os.system("cp thisinthemiddleofmy\{cmd\}.tar") os.system("cp " + cmd) os.exec* os.spawn* Okay: subprocess.call(('cp', 'thisinthemiddleofmy\{cmd\}.tar')) # shell=True=Dangerous sarge.run('cp thisinthemiddleofmy{0!s}.tar', cmd) > > Yes, there are lots of opportunities in the world for injection attacks. > This proposal doesn't change that. I don't see how escaping the final } > changes anything. > > Eric. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From yselivanov.ml at gmail.com Fri Aug 7 00:21:07 2015 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Thu, 6 Aug 2015 18:21:07 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150806175351.0f4c8001@anarchist.wooz.org> References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> Message-ID: <55C3DDD3.50807@gmail.com> Barry, On 2015-08-06 5:53 PM, Barry Warsaw wrote: > On Aug 06, 2015, at 03:37 PM, Yury Selivanov wrote: > >> What if we introduce f-strings in the following fashion: >> >> 1. ``f'string {var}'`` is equivalent to >> ``'string {var}'.format(**locals())`` -- no new formatting >> syntax. > You really do want to include globals too, with locals overriding them. Right, I should have written 'format(**globals(), **locals())', but in reality I hope we can make compile.c to inline vars statically. >> 2. there is a 'sys.set_format_hook()' function that allows >> to set a global formatting hook for all f-strings: >> >> # pseudo-code >> def i18n(str, vars): >> if current_lang != 'en': >> str = gettext(str, current_lang) >> return str.format(vars) >> >> sys.set_format_hook(i18n) >> >> This would allow much more convenient way not only to format >> strings, but also to integrate various i18n frameworks: >> >> f'Welcome, {user}' instead of _('Welcome, {user}') > I don't think you want this to be a process-global hook since different > modules may be using a different i18n systems. I agree this might be an issue. Not sure how widespread the practice of using multiple systems in one project is, though. 
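To make the hook idea above concrete, here is a minimal sketch. It assumes a hypothetical module-level registry: `set_format_hook`, `render`, and `CATALOG` are illustrative names only, and no `sys.set_format_hook()` exists in Python. The quoted pseudo-code calls `str.format(vars)`; the sketch unpacks the mapping with `**` so that named fields resolve.

```python
# Hypothetical process-wide hook; this is NOT a real sys API.
_format_hook = None


def set_format_hook(hook):
    """Install (or clear, with None) the global formatting hook."""
    global _format_hook
    _format_hook = hook


def render(template, variables):
    """Format template with variables, routing through the hook if one is set."""
    if _format_hook is not None:
        return _format_hook(template, variables)
    return template.format(**variables)


# Made-up translation catalog standing in for a real gettext lookup.
CATALOG = {"Welcome, {user}": "Bienvenue, {user}"}


def i18n(template, variables):
    """Translate first, while the placeholders are still present, then format."""
    translated = CATALOG.get(template, template)
    return translated.format(**variables)


plain = render("Welcome, {user}", {"user": "Ann"})
set_format_hook(i18n)
translated = render("Welcome, {user}", {"user": "Ann"})
set_format_hook(None)
```

Because the registry is a single global, every caller in the process sees the same hook, which is exactly the concern raised about modules that use different i18n systems.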
Just some ideas off the top of my head on how this can be tackled (this is an off-topic for this thread, but it might result in something interesting): - we can have a convention of setting/unsetting the global callback per http request / rendering block / etc - we can pass the full module name (or module object) to the callback as an extra argument; this way it's possible to design a mechanism to "target" different i18n frameworks for different "parts" of the application - the idea can be extended to provide a more elaborate and standardized i18n API, so that different systems use it and can co-exist without conflicting with each other - during rendering of an f-string we can check if globals() have a '__format_hook__' name defined in it; this way it's possible to have a per-module i18n system Anyways, it would be nice if we can make i18n a little bit easier and standardized in python. That would help with adding i18n in existing projects, that weren't designed with it in mind from start. Yury From eric at trueblade.com Fri Aug 7 00:24:31 2015 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 6 Aug 2015 18:24:31 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> <55C3B915.2060404@trueblade.com> Message-ID: <55C3DE9F.5050204@trueblade.com> On 8/6/2015 6:15 PM, Wes Turner wrote: > > > On Thu, Aug 6, 2015 at 2:44 PM, Eric V. Smith > wrote: > > On 08/06/2015 03:02 PM, Wes Turner wrote: > > > > > > On Wed, Aug 5, 2015 at 8:58 PM, Terry Reedy > > >> wrote: > > > > On 8/5/2015 3:34 PM, Yury Selivanov wrote: > > > > '\{..}' feels unbalanced and weird. > > > > > > Escape both. The closing } is also treated specially, and not > > inserted into the string. The compiler scans linearly from left to > > right, but human eyes are not so constrained. 
> > > > s = "abc\{kjljid some long expression jk78738}def" > > > > versus > > > > s = "abc\{kjljid some long expression jk78738\}def" > > > > and how about > > > > s = "abc\{kjljid some {long} expression jk78738\}def" > > > > > > +1: escape \{both\}. > > > > Use cases where this is (as dangerous as other string interpolation > > methods): > > > > * Shell commands that should be shlex-parsed/quoted > > * (inappropriately, programmatically) writing > > code with manually-added quotes ' and doublequotes " > > * XML,HTML,CSS,SQL, textual query language injection > > * Convenient, but dangerous and IMHO much better handled > > by e.g. MarkupSafe, a DOM builder, a query ORM layer > > > > Docs / Utils: > > > > * [ ] ENH: AST scanner for these (before i do __futre__ import) > > * [ ] DOC: About string interpolation, in general > > I don't understand what you're trying to say. > > os.system("cp \{cmd}") > > is no better or worse than: > > os.system("cp " + cmd) > > > All wrong (without appropriate escaping): > > os.system("cp thisinthemiddleofmy\{cmd}.tar") > os.system("cp thisinthemiddleofmy\{cmd\}.tar") > os.system("cp " + cmd) > os.exec* > os.spawn* Not if you control cmd. I'm not sure of your point. As I said, there are opportunities for injection that exist before the interpolation proposals. > Okay: > > subprocess.call(('cp', 'thisinthemiddleofmy\{cmd\}.tar')) # > shell=True=Dangerous I know that. This proposal does not change any of this. Is any of this discussion of injections relevant to the interpolated string proposal? > sarge.run('cp thisinthemiddleofmy{0!s}.tar', cmd) Never heard of sarge. Eric. > > Yes, there are lots of opportunities in the world for injection attacks. > This proposal doesn't change that. I don't see how escaping the final } > changes anything. > > Eric. 
> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From wes.turner at gmail.com Fri Aug 7 00:27:37 2015 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 6 Aug 2015 17:27:37 -0500 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> Message-ID: On Thu, Aug 6, 2015 at 4:03 PM, Tim Delaney wrote: > On 6 August 2015 at 16:05, Nathaniel Smith wrote: > >> On Wed, Aug 5, 2015 at 9:35 PM, Nick Coghlan wrote: >> > use "!" as the prefix instead of "f" to more clearly emphasise the >> > distinction from the subtle effects of "u", "b" and "r" >> >> Hey, maybe $ would make an even better string-interpolation sigil anyway? >> > > +1 for $"..." being an interpolated string. The syntax just makes sense. > Doesn't prevent us from using $ elsewhere, but it does set a precedent that > it should be used in interpolation/substitution-style contexts. > > +0 for !"..." being an interpolated string. It's not particularly obvious > to me, but I do like the def foo!(ast) syntax, and symmetry with that > wouldn't be bad. Although I wouldn't mind def foo$(ast) either - $ stands > out more, and this could be considered a substitution-style context. > > -1000 on unprefixed string literals becoming interpolated. But the prefix > should be able to be used with raw strings somehow ... r$"..."? $r"..."? 
> \{cmd} -- https://en.wikipedia.org/wiki/LaTeX#Examples https://docs.python.org/2/library/string.html # str.__mod__ '%s' % cmd https://docs.python.org/2/library/string.html#template-strings # string.Template '$cmd' '${cmd}' https://docs.python.org/2/library/string.html#format-string-syntax # str.format {0} -- format([] {cmd!s} -- .format(**kwargs) #{cmd} -- ruby literals, coffeescript string interpolation {{cmd}} -- jinja2, mustache, handlebars, angular templates # Proposed syntax \{cmd} -- python glocal string [...], LaTeX \{cmd\} -- " > > Tim Delaney > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anthony at xtfx.me Fri Aug 7 00:56:34 2015 From: anthony at xtfx.me (C Anthony Risinger) Date: Thu, 6 Aug 2015 17:56:34 -0500 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C3DAC2.5070406@mail.de> References: <55C25C74.50008@trueblade.com> <55C3DAC2.5070406@mail.de> Message-ID: On Aug 6, 2015 5:08 PM, "Sven R. Kunze" wrote: > > I am somehow +0 on this. It seems like a crazy useful idea. However, it's maybe too much magic for Python? > > I have to admit that I dislike the \{...} syntax. Looks awkward as does escaping almost always. > > It's a personal taste but it seems there are others agreeing on that. > > This said, I would prefer f'...' in order to retain the nice {...} look. I also prefer the f'...' prefix denoting an explicit opt-in to context formatting and avoiding the backslash, but using a backslash to parallel with other escaping reasons makes sense too. I'm not sure it's going to matter much because anyone writing code professionally (or not) is going to be using an editor with syntax highlighting... 
even simple/basic editors have this feature since it's practically expected. When editing shellcode I have no problem seeing the variables within a long string. Even though I've been developing professionally in a half dozen languages for over a decade, I can still barely read unhighlighted code. Any editor would show embedded expressions as exactly that -- an expression and not a string. If you are writing code in a basic text editor nothing is going to help you parse except your brain, and IMO none of the proposals make that any better or worse than the effort required to parse the code around it. In the end I like the f'...' prefix simply because it conveys the intent of the developer. -- C Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Aug 7 00:58:15 2015 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 6 Aug 2015 17:58:15 -0500 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C3DE9F.5050204@trueblade.com> References: <55C25C74.50008@trueblade.com> <55C2655E.8040907@gmail.com> <55C3B915.2060404@trueblade.com> <55C3DE9F.5050204@trueblade.com> Message-ID: On Thu, Aug 6, 2015 at 5:24 PM, Eric V. Smith wrote: > On 8/6/2015 6:15 PM, Wes Turner wrote: > > > > > > On Thu, Aug 6, 2015 at 2:44 PM, Eric V. Smith > > wrote: > > > > On 08/06/2015 03:02 PM, Wes Turner wrote: > > > > > > > > > On Wed, Aug 5, 2015 at 8:58 PM, Terry Reedy > > > >> wrote: > > > > > > On 8/5/2015 3:34 PM, Yury Selivanov wrote: > > > > > > '\{..}' feels unbalanced and weird. > > > > > > > > > Escape both. The closing } is also treated specially, and not > > > inserted into the string. The compiler scans linearly from > left to > > > right, but human eyes are not so constrained. 
> > > > > > s = "abc\{kjljid some long expression jk78738}def" > > > > > > versus > > > > > > s = "abc\{kjljid some long expression jk78738\}def" > > > > > > and how about > > > > > > s = "abc\{kjljid some {long} expression jk78738\}def" > > > > > > > > > +1: escape \{both\}. > > > > > > Use cases where this is (as dangerous as other string interpolation > > > methods): > > > > > > * Shell commands that should be shlex-parsed/quoted > > > * (inappropriately, programmatically) writing > > > code with manually-added quotes ' and doublequotes " > > > * XML,HTML,CSS,SQL, textual query language injection > > > * Convenient, but dangerous and IMHO much better handled > > > by e.g. MarkupSafe, a DOM builder, a query ORM layer > > > > > > Docs / Utils: > > > > > > * [ ] ENH: AST scanner for these (before i do __futre__ import) > > > * [ ] DOC: About string interpolation, in general > > > > I don't understand what you're trying to say. > > > > os.system("cp \{cmd}") > > > > is no better or worse than: > > > > os.system("cp " + cmd) > > > > > > All wrong (without appropriate escaping): > > > > os.system("cp thisinthemiddleofmy\{cmd}.tar") > > os.system("cp thisinthemiddleofmy\{cmd\}.tar") > > os.system("cp " + cmd) > > os.exec* > > os.spawn* > > Not if you control cmd. I'm not sure of your point. As I said, there are > opportunities for injection that exist before the interpolation proposals. > > > Okay: > > > > subprocess.call(('cp', 'thisinthemiddleofmy\{cmd\}.tar')) # > > shell=True=Dangerous > > I know that. This proposal does not change any of this. Is any of this > discussion of injections relevant to the interpolated string proposal? > This discussion of is directly relevant to static and dynamic analysis "scanners" for e.g. CWE-89, CWE-78 https://cwe.mitre.org/data/definitions/78.html#Relationships It's just another syntax but there are downstream changes to tooling. - [ ] Manual review > > > sarge.run('cp thisinthemiddleofmy{0!s}.tar', cmd) > > Never heard of sarge. 
> Sarge handles threading, shell escaping, and | pipes (even w/ Windows) on top of subprocess. Something similar in the stdlib someday #ideas would be great [and would solve for the 'how do i teach this person to write a shell script python module to be called by a salt module?' use case]. > > Eric. > > > > > Yes, there are lots of opportunities in the world for injection > attacks. > > This proposal doesn't change that. I don't see how escaping the > final } > > changes anything. > > > > Eric. > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Aug 7 01:01:48 2015 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 6 Aug 2015 18:01:48 -0500 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <55C3DAC2.5070406@mail.de> Message-ID: On Thu, Aug 6, 2015 at 5:56 PM, C Anthony Risinger wrote: > On Aug 6, 2015 5:08 PM, "Sven R. Kunze" wrote: > > > > I am somehow +0 on this. It seems like a crazy useful idea. However, > it's maybe too much magic for Python? > > > > I have to admit that I dislike the \{...} syntax. Looks awkward as does > escaping almost always. > > > > It's a personal taste but it seems there are others agreeing on that. 
> > > > This said, I would prefer f'...' in order to retain the nice {...} look. > > I also prefer the f'...' prefix denoting an explicit opt-in to context > formatting and avoiding the backslash, but using a backslash to parallel > with other escaping reasons makes sense too. > > I'm not sure it's going to matter much because anyone writing code > professionally (or not) is going to be using an editor with syntax > highlighting... even simple/basic editors have this feature since it's > practically expected. When editing shellcode I have no problem seeing the > variables within a long string. > > Even though I've been developing professionally in a half dozen languages > for over a decade, I can still barely read unhighlighted code. Any editor > would show embedded expressions as exactly that -- an expression and not a > string. If you are writing code in a basic text editor nothing is going to > help you parse except your brain, and IMO none of the proposals make that > any better or worse than the effort required to parse the code around it. > > In the end I like the f'...' prefix simply because it conveys the intent > of the developer. > f'....{{cmd}}' r'....{{cmd}}' > -- > > C Anthony > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim.baker at python.org Fri Aug 7 01:08:00 2015 From: jim.baker at python.org (Jim Baker) Date: Thu, 6 Aug 2015 17:08:00 -0600 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150806142741.3bebf7b3@anarchist.wooz.org> References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: On Thu, Aug 6, 2015 at 12:27 PM, Barry Warsaw wrote: > ... 
> > Really, the only nasty bit about flufl.i18n's implementation is the use of > sys._getframe(). Fortunately, it's a bit of ugliness that's buried in the > implementation and never really seen by users. If there was a more better > way > of getting at globals and locals, that was Python-implementation > independent, > that would clean up this little wart. > Jython supports sys._getframe, and there's really no need to ever drop support for this function for future performance reasons, given that the work on Graal[1] for the Java Virtual Machine will eventually make such lookups efficient. But I agree that it's best to avoid sys._getframe when possible. - Jim [1] http://openjdk.java.net/projects/graal/ From greg.ewing at canterbury.ac.nz Fri Aug 7 01:31:43 2015 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Aug 2015 11:31:43 +1200 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55B17B81.3070000@canterbury.ac.nz> <85380dnvwp.fsf@benfinney.id.au> <85y4i4n02h.fsf@benfinney.id.au> <55B3E9B2.50709@trueblade.com> <55BD0555.6010204@trueblade.com> <20150801182545.GC25179@ando.pearwood.info> <55BD151D.6060702@trueblade.com> <55BEABD0.7000604@mgmiller.net> <55BED537.8020000@trueblade.com> <20150806031843.GT3737@ando.pearwood.info> Message-ID: <55C3EE5F.1090808@canterbury.ac.nz> Ron Adam wrote: > > Maybe this should be added to Python's Zen? > > "The right way to do something and the easiest way > to do something should be the same way." "Although the right way may not be right unless you're Dutch..."...
no, that doesn't really work.:-) -- Greg From steve at pearwood.info Fri Aug 7 07:12:01 2015 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 7 Aug 2015 15:12:01 +1000 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> Message-ID: <20150807051201.GZ3737@ando.pearwood.info> On Thu, Aug 06, 2015 at 12:26:14PM -0400, random832 at fastmail.us wrote: > On Wed, Aug 5, 2015, at 14:56, Eric V. Smith wrote: > > Because strings containing \{ are currently valid > > Which raises the question of why. Because \C is currently valid, for all values of C. The idea is that if you typo an escape, say \d for \f, you get an obvious backslash in your string which is easy to spot. Personally, I think that's a mistake. It leads to errors like this: filename = 'C:\some\path\something.txt' silently doing the wrong thing. If we're going to change the way escapes work, it's time to deprecate the misfeature that \C is a literal backslash followed by C. Outside of raw strings, a backslash should *only* be allowed in an escape sequence. Deprecating invalid escape sequences would then open the door to adding new, useful escapes. > (and as long as we're talking about > things to deprecate in string literals, how about \v?) Why would you want to deprecate a useful and long-standing escape sequence? Admittedly \v isn't as common as \t or \n, but it still has its uses, and is a standard escape familiar to anyone who uses C, C++, C#, Octave, Haskell, Javascript, etc. If we're going to make major changes to the way escapes work, I'd rather add new escapes, not take them away: \e escape \x1B, as supported by gcc and clang; the escaping rules from Haskell: http://book.realworldhaskell.org/read/characters-strings-and-escaping-rules.html \P platform-specific newline (e.g. 
\r\n on Windows, \n on POSIX) \U+xxxx Unicode code point U+xxxx (with four to six hex digits) It's much nicer to be able to write Unicode code points that (apart from the backslash) look like the standard Unicode notation U+0000 to U+10FFFF, rather than needing to pad to a full eight digits as the \U00xxxxxx syntax requires. -- Steve From rosuav at gmail.com Fri Aug 7 09:15:34 2015 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Aug 2015 17:15:34 +1000 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: <20150807051201.GZ3737@ando.pearwood.info> References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> Message-ID: On Fri, Aug 7, 2015 at 3:12 PM, Steven D'Aprano wrote: > On Thu, Aug 06, 2015 at 12:26:14PM -0400, random832 at fastmail.us wrote: >> On Wed, Aug 5, 2015, at 14:56, Eric V. Smith wrote: >> > Because strings containing \{ are currently valid >> >> Which raises the question of why. > > Because \C is currently valid, for all values of C. The idea is that if > you typo an escape, say \d for \f, you get an obvious backslash in your > string which is easy to spot. > > Personally, I think that's a mistake. It leads to errors like this: > > filename = 'C:\some\path\something.txt' > > silently doing the wrong thing. If we're going to change the way escapes > work, it's time to deprecate the misfeature that \C is a literal > backslash followed by C. Outside of raw strings, a backslash should > *only* be allowed in an escape sequence. I agree; plus, it means there's yet another thing for people to complain about when they switch to Unicode strings: path = "c:\users", "C:\Users" # OK on Py2 path = u"c:\users", u"C:\Users" # Fails Or equivalently, moving to Py3 and having those strings quietly become Unicode strings, and now having meaning on the \U and \u escapes. 
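The three cases under discussion can be checked directly on Python 3. This is only a sketch: `parse_literal` is a hypothetical helper, and it silences the warning that current Python 3 releases emit for unrecognised escapes so that the example runs quietly.

```python
import ast
import warnings


def parse_literal(source_text):
    """Return the value of a string literal, or the SyntaxError it raises."""
    with warnings.catch_warnings():
        # Unrecognised escapes such as \s currently only trigger a warning.
        warnings.simplefilter("ignore")
        try:
            return ast.literal_eval(source_text)
        except SyntaxError as exc:
            return exc


# Raw strings hold the *source text* of each literal, backslashes untouched.
kept = parse_literal(r"'C:\some\path'")      # \s and \p are not escapes
mangled = parse_literal(r"'C:\temp\new'")    # \t and \n are real escapes
broken = parse_literal(r"'C:\Users\demo'")   # \U starts a Unicode escape

print(repr(kept))
print(repr(mangled))
print(type(broken).__name__)
```

The first literal keeps its backslashes, the second silently gains a tab and a newline, and the third is rejected at compile time on Python 3.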
That said, though: It's now too late to change Python 2, which means that this is going to be yet another hurdle when people move (potentially large) Windows codebases to Python 3. IMO it's a good thing to trip people up immediately, rather than silently doing the wrong thing - but it is going to be another thing that people moan about when Python 3 starts complaining. First they have to add parentheses to print, then it's all those pointless (in their eyes) encode/decode calls, and now they have to go through and double all their backslashes as well! But the alternative is that some future version of Python adds a new escape code, and all their code starts silently doing weird stuff - or they change the path name and it goes haywire (changing from "c:\users\demo" to "c:\users\all users" will be a fun one to diagnose) - so IMO it's better to know about it early. > If we're going to make major changes to the way escapes work, I'd rather > add new escapes, not take them away: > > > \e escape \x1B, as supported by gcc and clang; Please, yes! Also supported by a number of other languages and commands (Pike, GNU echo, and some others that I don't recall (but not bind9, which has its own peculiarities)). > the escaping rules from Haskell: > > http://book.realworldhaskell.org/read/characters-strings-and-escaping-rules.html > > \P platform-specific newline (e.g. \r\n on Windows, \n on POSIX) Hmm. Not sure how useful this would be. Personally, I consider this to be a platform-specific encoding, on par with expecting b"\xc2\xa1" to display "¡", and as such, it should be kept to boundaries. Work with "\n" internally, and have input routines convert to that, and output routines optionally add "\r" before them all.
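A small sketch of that boundary approach, using the `newline` parameter of `io.TextIOWrapper` from the standard library. The helper names `encode_lines` and `decode_lines` are made up for illustration.

```python
import io


def encode_lines(text, newline):
    """Write text through a text layer that translates "\n" on output."""
    buf = io.BytesIO()
    wrapper = io.TextIOWrapper(buf, encoding="utf-8", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = buf.getvalue()
    wrapper.detach()  # leave the underlying buffer alone
    return data


def decode_lines(data):
    """Read bytes with universal newlines, so "\r\n" comes back as "\n"."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline=None)
    return wrapper.read()


windows_bytes = encode_lines("a\nb\n", "\r\n")
posix_bytes = encode_lines("a\nb\n", "\n")
round_trip = decode_lines(windows_bytes)
```

The program only ever handles "\n"; the platform convention is chosen where text is turned into bytes, and undone where bytes are turned back into text.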
> \U+xxxx Unicode code point U+xxxx (with four to six hex digits) > > It's much nicer to be able to write Unicode code points that (apart from > the backslash) look like the standard Unicode notation U+0000 to > U+10FFFF, rather than needing to pad to a full eight digits as the > \U00xxxxxx syntax requires. The problem is the ambiguity. How do you specify that "\U+101010" be a two-character string? "\U000101010" forces it by having exactly eight digits, but as soon as you allow variable numbers of digits, you run into problems. I suppose you could always pad to six for that: "\U+0101010" could know that it doesn't need a seventh digit. (Though what would ever happen if the Unicode consortium decides to drop support for UTF-16 and push for a true 32-bit character set, I don't know.) It is tempting, though - it both removes the need for two pointless zeroes, and broadly unifies the syntax for Unicode escapes, instead of having a massive boundary from "\u1234" to "\U00012345". ChrisA From ncoghlan at gmail.com Fri Aug 7 09:33:33 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Aug 2015 17:33:33 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150806142741.3bebf7b3@anarchist.wooz.org> References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: On 7 August 2015 at 04:27, Barry Warsaw wrote: > On Aug 06, 2015, at 11:01 PM, Nick Coghlan wrote: > >>* you can't restrict them to "literals only", so you run a much higher risk >>of code injection attacks > > In an i18n context you do sometimes need to pass in non-literals. Restricting > this thing to literals only doesn't really increase the attack vector > significantly, and does close off an important use case. flufl.i18n, gettext, etc wouldn't go away - my "allow i18n use as well" idea was just aimed at making interpolated strings easy to translate by default.
If f-strings are always eagerly interpolated prior to translation, then I can foresee a lot of complaints from folks asking why this doesn't work right: print(_(f"This is a translated message with {a} and {b} interpolated")) When you're mixing translation with interpolation, you really want the translation lookup to happen first, when the placeholders are still present in the format string: print(_("This is a translated message with {a} and {b} interpolated").format(a=a, b=b)) I've made the lookup explicit there, but of course sys._getframe() also allows it to be implicit. We could potentially make f-strings translation friendly by introducing a bit of indirection into the f-string design: an __interpolate__ builtin, along the lines of __import__. That system could further be designed so that, by default, "__interpolate__ = str.format", but a module could also do something like "from flufl.i18n import __interpolate__" to get translated f-strings in that module (preferably using the PEP 292/215 syntax, rather than adding yet another spelling for string interpolation). >>* you can only implement them via stack walking, so name resolution doesn't >>work right. You can get at the locals and globals for the calling frame, but >>normal strings are opaque to the compiler, so lexical scoping doesn't trigger >>properly > > In practice, you need sys._getframe(2) to make it work, although flufl.i18n > does allow you to specify a different depth. In practice you could probably > drop that for the most part. (ISTR an obscure use case for depth>2 but can't > remember the details.) > > Really, the only nasty bit about flufl.i18n's implementation is the use of > sys._getframe(). Fortunately, it's a bit of ugliness that's buried in the > implementation and never really seen by users. If there was a more better way > of getting at globals and locals, that was Python-implementation independent, > that would clean up this little wart.
sys._getframe() usage is what I meant by stack walking. It's not *really* based on walking the stack, but you're relying on poking around in runtime state to do dynamic scoping, rather than being able to do lexical analysis at compile time (and hence why static analysers get confused about apparently unused local variables). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Aug 7 09:49:05 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Aug 2015 17:49:05 +1000 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55AC2EDF.7040205@mgmiller.net> Message-ID: On 7 August 2015 at 06:35, Wes Turner wrote: > > On Aug 6, 2015 3:03 PM, "Guido van Rossum" wrote: >> >> Unfortunately, all spellings that require calling locals() are wrong. > > Is this where the potential source of surprising error is? Yes - it's what creates the temptation for people to use sys._getframe() to hide the locals() calls, and either approach hides the name references from lexical analysers (hence Barry's comment about false alarms regarding "unused locals" when scanning code that uses flufl.il8n). When people are being tempted to write code that is too clever for a computer to easily follow without executing it, that's cause for concern (being able to *write* such code is useful for exploratory purposes, but it's also the kind of thing that's desirable to factor out as a code base matures). 
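
To make the stack-walking point concrete, here is a minimal sketch of frame-based interpolation. It is not flufl.i18n's actual code; the interp() helper is a hypothetical name made up for this illustration, and sys._getframe() is a CPython implementation detail.

```python
import string
import sys


def interp(template):
    """Substitute $name placeholders from the caller's namespace (hypothetical helper)."""
    frame = sys._getframe(1)          # the frame that called interp()
    namespace = dict(frame.f_globals)
    namespace.update(frame.f_locals)  # locals shadow globals
    return string.Template(template).safe_substitute(namespace)


def greet():
    name = "world"  # only ever referenced from inside the string below
    return interp("Hello, $name")


print(greet())
```

Nothing in greet() mentions "name" outside the string literal, which is why a static analyser that does not execute the code has no way to tell that the variable is used.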
When it comes to name based string interpolation, the current
"correct" approach (which even computers can read) requires
duplicating the name references explicitly in constructs like:

    print("This interpolates {a} and {b}".format(a=a, b=b))

Which doesn't fare well for readability when compared to
sys._getframe() based implicit approaches like flufl.i18n's:

    print(_("This interpolates $a and $b"))

The f-string proposal provides a way to write the latter kind of
construct in a more explicit way that even computers can learn to read
(without executing it).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guido at python.org Fri Aug 7 10:12:46 2015
From: guido at python.org (Guido van Rossum)
Date: Fri, 7 Aug 2015 10:12:46 +0200
Subject: [Python-ideas] Briefer string format
In-Reply-To:
References: <55AC2EDF.7040205@mgmiller.net>
Message-ID:

On Thu, Aug 6, 2015 at 10:35 PM, Wes Turner wrote:

>
> On Aug 6, 2015 3:03 PM, "Guido van Rossum" wrote:
> >
> > Unfortunately, all spellings that require calling locals() are wrong.
>
> Is this where the potential source of surprising error is?
>
> * Explicit / Implicit locals()
>

This is a big deal because of the worry about code injection. A "classic"
format string given access to locals() (e.g. using s.format(**locals()))
always stirs worries about code injection if the string is a variable. The
proposed forms of string interpolation don't give access to locals *other
than the locals where the string "literal" itself exists*. This latter
access is no different from the access to locals in any expression. (The
same for globals(), of course.)

The other issue with explicit locals() is that to the people who would most
benefit from variable interpolation (typically relatively unsophisticated
users), it is magical boilerplate. (Worse, it's boilerplate that their more
experienced mentors will warn them against because of the code injection
worry.)

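
A small sketch of why s.format(**locals()) is worrying when s is not a literal. The Config class, the render() function and the secret value are all invented for this illustration. str.format does not run arbitrary code, but it does follow attribute and index lookups on whatever objects it is handed:

```python
class Config:
    """Stand-in for any object that happens to live in the local namespace."""

    def __init__(self):
        self.secret_key = "hunter2"


def render(user_supplied_template):
    config = Config()
    user = "alice"
    # Every local, including config, is exposed to the template string.
    return user_supplied_template.format(**locals())


print(render("Hello {user}"))               # the intended use
print(render("Hello {config.secret_key}"))  # a template the author never anticipated
```

With a string literal the set of names that can be reached is visible in the source; with a variable it is whatever the supplier of the string decides to ask for.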
> * To me, the practicality of finding '%' and .format is more important > than the convenience of an additional syntax with implicit scope, but is > that beside the point? > I'm not sure what your point is here. (Genuinely not sure -- this is not a rhetorical flourish.) Are you saying that you prefer the explicit formatting operation because it acts as a signal to the reader that formatting is taking place? Maybe in the end the f-string proposal is the right one -- it's minimally obtrusive and yet explicit, *and* backwards compatible? This isn't saying I'm giving up on always-interpolation; there seems to be at least an even split between languages that always interpolate (PHP?), languages that have a way to explicitly disable it (like single quotes in shell), and languages that require some sort of signal (like C#). -- --Guido van Rossum (python.org/~guido) From guido at python.org Fri Aug 7 10:49:14 2015 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Aug 2015 10:49:14 +0200 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: On Fri, Aug 7, 2015 at 9:33 AM, Nick Coghlan wrote: > We could potentially make f-strings translation friendly by > introducing a bit of indirection into the f-string design: an > __interpolate__ builtin, along the lines of __import__. > This seems interesting, but doesn't it require sys._getframe() or similar again? Translations may need to reorder variables. (Or even change the expressions? E.g. to access odd plurals?) The sys._getframe() requirement (if true) would kill this idea thoroughly for me. -- --Guido van Rossum (python.org/~guido)
From njs at pobox.com Fri Aug 7 11:03:35 2015 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 7 Aug 2015 02:03:35 -0700 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: On Fri, Aug 7, 2015 at 1:49 AM, Guido van Rossum wrote: > On Fri, Aug 7, 2015 at 9:33 AM, Nick Coghlan wrote: >> >> We could potentially make f-strings translation friendly by >> introducing a bit of indirection into the f-string design: an >> __interpolate__ builtin, along the lines of __import__. > > > This seems interesting, but doesn't it require sys._getframe() or similar > again? Translations may need to reorder variables. (Or even change the > expressions? E.g. to access odd plurals?) > > The sys._getframe() requirement (if true) would kill this idea thoroughly > for me. AFAICT sys._getframe is unneeded -- I understand Nick's suggestion to be that we desugar f"..." to: __interpolate__("...", locals(), globals()) with the reference to __interpolate__ resolved using the usual lookup rules (locals -> globals -> builtins). -n -- Nathaniel J. Smith -- http://vorpus.org From guido at python.org Fri Aug 7 11:37:27 2015 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Aug 2015 11:37:27 +0200 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: On Fri, Aug 7, 2015 at 11:03 AM, Nathaniel Smith wrote: > On Fri, Aug 7, 2015 at 1:49 AM, Guido van Rossum wrote: > > On Fri, Aug 7, 2015 at 9:33 AM, Nick Coghlan wrote: > >> > >> We could potentially make f-strings translation friendly by > >> introducing a bit of indirection into the f-string design: an > >> __interpolate__ builtin, along the lines of __import__. > > > > > > This seems interesting, but doesn't it require sys._getframe() or similar > > again?
Translations may need to reorder variables. (Or even change the > > expressions? E.g. to access odd plurals?) > > > > The sys._getframe() requirement (if true) would kill this idea thoroughly > > for me. > > AFAICT sys._getframe is unneeded -- I understand Nick's suggestion to > be that we desugar f"..." to: > > __interpolate__("...", locals(), globals()) > > with the reference to __interpolate__ resolved using the usual lookup > rules (locals -> globals -> builtins). > sys._getframe() or locals()+globals() makes little difference to me -- it still triggers worries that we now could be executing code read from the translation database. The nice thing about f"{...}" or "\{...}" is that we can allow arbitrary expressions inside {...} without worrying, since the expression is right there for us to see. The __interpolate__ idea invalidates that. -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Fri Aug 7 11:41:53 2015 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 7 Aug 2015 19:41:53 +1000 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> Message-ID: <20150807094153.GA3737@ando.pearwood.info> On Fri, Aug 07, 2015 at 05:15:34PM +1000, Chris Angelico wrote about deprecating \C giving a literal backslash C: [...] > That said, though: It's now too late to change Python 2, which means > that this is going to be yet another hurdle when people move > (potentially large) Windows codebases to Python 3. I don't think that changing string literals is an onerous task.
The hardest part is deciding what fix you're going to apply: - replace \ in Windows paths with / - escape your backslashes - use raw strings > or they change the path name and it goes > haywire (changing from "c:\users\demo" to "c:\users\all users" will be > a fun one to diagnose) - so IMO it's better to know about it early. "c:\users" is already broken in Python 3. SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-4: truncated \uXXXX escape [...] > > \P platform-specific newline (e.g. \r\n on Windows, \n on POSIX) > > Hmm. Not sure how useful this would be. Personally, I consider this to > be a platform-specific encoding, Of course it's platform-specific. That's what I said :-) > on par with expecting b"\xc2\xa1" to > display "¡", and as such, it should be kept to boundaries. This has nothing to do with bytes. \r and \n in Unicode strings give U+000D and U+000A respectively, \P would likewise be defined in terms of code points, not bytes. > Work with > "\n" internally, and have input routines convert to that, and output > routines optionally add "\r" before them all. That's fine as far as it goes, but sometimes you don't want automatic newline conversion. See the "newline" parameter to Python 3's open built-in. If I'm writing a file which the user has specified to use Windows end-of-line, I can't rely on Python automatically converting to \r\n because I might not actually be running on Windows, so I may disable universal newlines on output, and specify the end of line myself using the user's choice. One such choice being "whatever platform you're on, use the platform default". > > \U+xxxx Unicode code point U+xxxx (with four to six hex digits) > > > > It's much nicer to be able to write Unicode code points that (apart from > > the backslash) look like the standard Unicode notation U+0000 to > > U+10FFFF, rather than needing to pad to a full eight digits as the > > \U00xxxxxx syntax requires. > > The problem is the ambiguity.
How do you specify that "\U+101010" be a > two-character string? Hence Haskell's \& which acts as a separator: "\U+10101\&0" Or use implicit concatenation: "\U+10101" "0" Also, the C++ style "\U000101010" will continue to work. However, it's hard to read: you need to count the digits to see that there are *nine* digits and so only the first eight belong to the \U escape. [...] > (Though > what would ever happen if the Unicode consortium decides to drop > support for UTF-16 and push for a true 32-bit character set, I don't > know.) If that ever happens, it will be one of the signs of the Apocalypse. To quote Ghostbusters: Fire and brimstone coming down from the skies! Rivers and seas boiling! Forty years of darkness! Earthquakes, volcanoes... The dead rising from the grave! Human sacrifice, dogs and cats living together... and the Unicode Consortium breaking their stability guarantee. -- Steve From mal at egenix.com Fri Aug 7 11:55:39 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 07 Aug 2015 11:55:39 +0200 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> Message-ID: <55C4809B.9040201@egenix.com> On 07.08.2015 09:15, Chris Angelico wrote: > On Fri, Aug 7, 2015 at 3:12 PM, Steven D'Aprano wrote: >> On Thu, Aug 06, 2015 at 12:26:14PM -0400, random832 at fastmail.us wrote: >>> On Wed, Aug 5, 2015, at 14:56, Eric V. Smith wrote: >>>> Because strings containing \{ are currently valid >>> >>> Which raises the question of why. >> >> Because \C is currently valid, for all values of C. The idea is that if >> you typo an escape, say \d for \f, you get an obvious backslash in your >> string which is easy to spot. >> >> Personally, I think that's a mistake. It leads to errors like this: >> >> filename = 'C:\some\path\something.txt' >> >> silently doing the wrong thing. 
If we're going to change the way escapes >> work, it's time to deprecate the misfeature that \C is a literal >> backslash followed by C. Outside of raw strings, a backslash should >> *only* be allowed in an escape sequence. > > I agree; plus, it means there's yet another thing for people to > complain about when they switch to Unicode strings: > > path = "c:\users", "C:\Users" # OK on Py2 > path = u"c:\users", u"C:\Users" # Fails Um, Windows path names should always use the raw format: path = r"c:\users" Doesn't work with Unicode in Py2, though: path = ur"c:\users" on the plus side, you get a SyntaxError right away. > Or equivalently, moving to Py3 and having those strings quietly become > Unicode strings, and now having meaning on the \U and \u escapes. Same as above... use raw format in Py3: path = r"c:\users" (only now you get a raw Unicode string; this was changed in Py3 compared to Py2) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 07 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From rosuav at gmail.com Fri Aug 7 12:03:08 2015 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Aug 2015 20:03:08 +1000 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: <20150807094153.GA3737@ando.pearwood.info> References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> <20150807094153.GA3737@ando.pearwood.info> Message-ID: On Fri, Aug 7, 2015 at 7:41 PM, Steven D'Aprano wrote: > On Fri, Aug 07, 2015 at 05:15:34PM +1000, Chris Angelico wrote about > deprecating \C giving a literal backslash C: > > [...] >> That said, though: It's now too late to change Python 2, which means >> that this is going to be yet another hurdle when people move >> (potentially large) Windows codebases to Python 3. > > I don't think that changing string literals is an onerous task. The > hardest part is deciding what fix you're going to apply: > > - replace \ in Windows paths with / > - escape your backslashes > - use raw strings Right, which is what I'd recommend anyway. Hence my view that earlier breakage is better than subtle breakage later on. >> or they change the path name and it goes >> haywire (changing from "c:\users\demo" to "c:\users\all users" will be >> a fun one to diagnose) - so IMO it's better to know about it early. > > "c:\users" is already broken in Python 3. > > SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in > position 2-4: truncated \uXXXX escape I know. That's what I was saying - the current system means you get breakage when (a) you add a u prefix to the string, (b) you switch to Python 3, or (c) you change the path name to happen to include something that IS a recognized escape. Otherwise, it's lurking, pretending to work. > [...] >> > \P platform-specific newline (e.g. 
\r\n on Windows, \n on POSIX) >> >> Hmm. Not sure how useful this would be. Personally, I consider this to >> be a platform-specific encoding, > > Of course it's platform-specific. That's what I said :-) Of course it's platform-specific. What I mean is, it's on par with the encoding that LATIN SMALL LETTER A is "\x61". >> on par with expecting b"\xc2\xa1" to >> display "¡", and as such, it should be kept to boundaries. > > This has nothing to do with bytes. \r and \n in Unicode strings give > U+000D and U+000A respectively, \P would likewise be defined in terms of > code points, not bytes. Okay, perhaps a better comparison: It's on par with knowing that your terminal expects "\x1b[34m" to change color. It's a platform-specific piece of information, which belongs in the os module, not as a magic piece of string literal syntax. Can you take a .pyc file from Unix and put it onto a Windows system? If so, what should \P in a string literal do? >> Work with >> "\n" internally, and have input routines convert to that, and output >> routines optionally add "\r" before them all. > > That's fine as far as it goes, but sometimes you don't want automatic > newline conversion. See the "newline" parameter to Python 3's open > built-in. If I'm writing a file which the user has specified > to use Windows end-of-line, I can't rely on Python automatically > converting to \r\n because I might not actually be running on Windows, > so I may disable universal newlines on output, and specify the end of > line myself using the user's choice. One such choice being "whatever > platform you're on, use the platform default". Specifying the end-of-line should therefore be done in one of three ways: ("\n", "\r\n", os.linesep).
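
A minimal sketch of that three-way choice, assuming a hypothetical write_lines() helper that takes the caller's preferred line ending. It relies on the documented behaviour of open(): passing newline="" when writing turns off newline translation, so the chosen ending reaches the file unchanged on any platform.

```python
import os
import tempfile


def write_lines(path, lines, eol="\n"):
    """Write lines to path, ending each one with exactly eol (hypothetical helper)."""
    # newline="" disables translation on output, so "\n" is not rewritten
    # to os.linesep and eol is written verbatim.
    with open(path, "w", newline="") as f:
        for line in lines:
            f.write(line + eol)


with tempfile.TemporaryDirectory() as tmp:
    for eol in ("\n", "\r\n", os.linesep):
        path = os.path.join(tmp, "demo.txt")
        write_lines(path, ["first", "second"], eol)
        with open(path, "rb") as f:
            print(repr(eol), "->", f.read())
```

The third entry, os.linesep, is the "whatever platform you're on" option; the first two force a particular convention regardless of where the code runs.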
>> > \U+xxxx Unicode code point U+xxxx (with four to six hex digits) >> > >> > It's much nicer to be able to write Unicode code points that (apart from >> > the backslash) look like the standard Unicode notation U+0000 to >> > U+10FFFF, rather than needing to pad to a full eight digits as the >> > \U00xxxxxx syntax requires. >> >> The problem is the ambiguity. How do you specify that "\U+101010" be a >> two-character string? > > Hence Haskell's \& which acts as a separator: > > "\U+10101\&0" > > Or use implicit concatenation: > > "\U+10101" "0" > > Also, the C++ style "\U000101010" will continue to work. However, it's > hard to read: you need to count the digits to see that there are *nine* > digits and so only the first eight belong to the \U escape. True, the problem's exactly the same, and has the same solutions. +1 for this notation. > [...] >> (Though >> what would ever happen if the Unicode consortium decides to drop >> support for UTF-16 and push for a true 32-bit character set, I don't >> know.) > > If that ever happens, it will be one of the signs of the Apocalypse. To > quote Ghostbusters: > > Fire and brimstone coming down from the skies! Rivers and seas > boiling! Forty years of darkness! Earthquakes, volcanoes... The dead > rising from the grave! Human sacrifice, dogs and cats living > together... and the Unicode Consortium breaking their stability > guarantee. GASP! Next thing we know, Red Hat Enterprise Linux will have up-to-date software in it, and Windows will support UTF-8 everywhere! 
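
For reference, the fixed-width behaviour of the existing escapes can be checked directly in Python 3. This is only a sketch of today's \u and \U rules; the proposed \U+xxxx form does not exist, so it is not shown.

```python
# \U always consumes exactly eight hex digits, so the ninth digit here
# is an ordinary character following U+10101.
s = "\U000101010"
print(len(s))
print(hex(ord(s[0])), repr(s[1]))

# Implicit concatenation spells out the same two-character string.
t = "\U00010101" "0"
print(s == t)

# \u likewise stops after exactly four hex digits.
u = "\u12345"
print(len(u), hex(ord(u[0])), repr(u[1]))
```

Counting digits by eye is the price of the fixed width; the benefit is that there is never any doubt about where the escape ends.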
ChrisA From ncoghlan at gmail.com Fri Aug 7 11:50:36 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Aug 2015 19:50:36 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: On 7 August 2015 at 19:03, Nathaniel Smith wrote: > On Fri, Aug 7, 2015 at 1:49 AM, Guido van Rossum wrote: >> On Fri, Aug 7, 2015 at 9:33 AM, Nick Coghlan wrote: >>> >>> We could potentially make f-strings translation friendly by >>> introducing a bit of indirection into the f-string design: an >>> __interpolate__ builtin, along the lines of __import__. >> >> >> This seems interesting, but doesn't it require sys._getframe() or similar >> again? Translations may need to reorder variables. (Or even change the >> expressions? E.g. to access odd plurals?) >> >> The sys._getframe() requirement (if true) would kill this idea thoroughly >> for me. > > AFAICT sys._getframe is unneeded -- I understand Nick's suggestion to > be that we desugar f"..." to: > > __interpolate__("...", locals(), globals()) > > with the reference to __interpolate__ resolved using the usual lookup > rules (locals -> globals -> builtins). Not quite. 
While I won't be entirely clear on Eric's latest proposal
until the draft PEP is available, my understanding is that an f-string
like:

    f"This interpolates \{a} and \{b}"

would currently end up effectively being syntactic sugar for a
formatting operation like:

    "This interpolates " + format(a) + " and " + format(b)

While str.format itself probably doesn't provide a good signature for
__interpolate__, the essential information to be passed in to support
lossless translation would be an ordered series of:

* string literals
* (expression_str, value, format_str) substitution triples

Since the fastest string formatting operation we have is actually
still mod-formatting, let's suppose the default implementation of
__interpolate__ was semantically equivalent to:

    def __interpolate__(target, expressions, values, format_specs):
        return target % tuple(map(format, values, format_specs))

With that definition for default interpolation, the f-string above
would be translated at compile time to the runtime call:

    __interpolate__("This interpolates %s and %s", ("a", "b"), (a, b), ("", ""))

All of those except for the __interpolate__ lookup and the (a, b)
tuple would then be stored on the function object as constants.

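
As a runnable sketch of that hypothetical default (renamed default_interpolate here to stress that no such builtin exists), with one call using empty format specs and one using real ones:

```python
def default_interpolate(target, expressions, values, format_specs):
    """Format each value with its spec, then fill the %s slots in order.

    The expressions argument (the source text of each substitution) is
    unused by this default; it is there for alternative interpolators.
    """
    return target % tuple(map(format, values, format_specs))


a, b = 1.5, "spam"

# What the compiler would emit for the two-field example string
print(default_interpolate("This interpolates %s and %s", ("a", "b"), (a, b), ("", "")))

# The same call shape with non-empty format specs
print(default_interpolate("%s|%s", ("a", "b"), (a, b), (".2f", ">6")))
```

Because the values are passed through format() first, every slot in the target can be a plain %s regardless of the original format spec.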
An opt-in translation interpolator might then look like:

    import string

    # Assumes a gettext-style translation function bound to the name _
    def __interpolate__(target, expressions, values, format_specs):
        if not all(expr.isidentifier() for expr in expressions):
            raise ValueError("Only variable substitutions are permitted for i18n interpolation")
        if any(spec for spec in format_specs):
            raise ValueError("Format specifications are not permitted for i18n interpolation")
        # Rebuild the catalog key with ${name} placeholders
        catalog_str = target % tuple("${%s}" % expr for expr in expressions)
        translated = _(catalog_str)
        values = {k: v for k, v in zip(expressions, values)}
        return string.Template(translated).safe_substitute(values)

The string extractor for the i18n library providing that
implementation would also need to know to do the transformation from
f-string formatting to string.Template formatting when generating the
catalog strings.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From rosuav at gmail.com Fri Aug 7 12:09:57 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 7 Aug 2015 20:09:57 +1000
Subject: [Python-ideas] Briefer string format
In-Reply-To:
References: <55AC2EDF.7040205@mgmiller.net>
Message-ID:

On Fri, Aug 7, 2015 at 6:12 PM, Guido van Rossum wrote:
> Maybe in the end the f-string proposal is the right one -- it's minimally
> obtrusive and yet explicit, *and* backwards compatible? This isn't saying
> I'm giving up on always-interpolation; there seems to be at least an even
> split between languages that always interpolate (PHP?), languages that have
> a way to explicitly disable it (like single quotes in shell), and languages
> that require some sort of signal (like C#).

PHP, like shell languages, has "interpolated strings with $double
$quotes" and 'uninterpreted strings with single quotes'. At my last
PHP job, the style guide eschewed any use of double quoted strings,
but that job's style guide wasn't something I'd recommend, so that may
not be all that significant.
(Part of the problem was that one of the programmers used string interpolation in ways that killed readability, so I can understand the complaint.) ChrisA From guido at python.org Fri Aug 7 12:13:13 2015 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Aug 2015 12:13:13 +0200 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: On Fri, Aug 7, 2015 at 11:50 AM, Nick Coghlan wrote: > On 7 August 2015 at 19:03, Nathaniel Smith wrote: > > On Fri, Aug 7, 2015 at 1:49 AM, Guido van Rossum > wrote: > >> On Fri, Aug 7, 2015 at 9:33 AM, Nick Coghlan > wrote: > >>> > >>> We could potentially make f-strings translation friendly by > >>> introducing a bit of indirection into the f-string design: an > >>> __interpolate__ builtin, along the lines of __import__. > >> > >> > >> This seems interesting, but doesn't it require sys._getframe() or > similar > >> again? Translations may need to reorder variables. (Or even change the > >> expressions? E.g. to access odd plurals?) > >> > >> The sys._getframe() requirement (if true) would kill this idea > thoroughly > >> for me. > > > > AFAICT sys._getframe is unneeded -- I understand Nick's suggestion to > > be that we desugar f"..." to: > > > > __interpolate__("...", locals(), globals()) > > > > with the reference to __interpolate__ resolved using the usual lookup > > rules (locals -> globals -> builtins). > > Not quite. 
While I won't be entirely clear on Eric's latest proposal > until the draft PEP is available, my understanding is that an f-string > like: > > f"This interpolates \{a} and \{b}" > > would currently end up effectively being syntactic sugar for a > formatting operation like: > > "This interpolates " + format(a) + " and " + format(b) > > While str.format itself probably doesn't provide a good signature for > __interpolate__, the essential information to be passed in to support > lossless translation would be an ordered series of: > > * string literals > * (expression_str, value, format_str) substitution triples > > Since the fastest string formatting operation we have is actually > still mod-formatting, lets suppose the default implementation of > __interpolate__ was semantically equivalent to: > > def __interpolate__(target, expressions, values, format_specs): > return target % tuple(map(format, values, format_specs) > > With that definition for default interpolation, the f-string above > would be translated at compile time to the runtime call: > > __interpolate__("This interpolates %s and %s", ("a", "b"), (a, b), > ("", "")) > > All of those except for the __interpolate__ lookup and the (a, b) > tuple would then be stored on the function object as constants. 
> > An opt-in translation interpolator might then look like: > > def __interpolate__(target, expressions, values, format_spec): > if not all(expr.isidentifier() for expr in expressions): > raise ValueError("Only variable substitions are permitted > for il8n interpolation") > if any(spec for spec in format_specs): > raise ValueError("Format specifications are not permitted > for il8n interpolation") > catalog_str = target % tuple("${%s}" % expr for expr in > expressions) > translated = _(catalog_str) > values = {k:v for k, v in zip(expressions, values)} > return string.Template(translated).safe_substitute() > > The string extractor for the il8n library providing that > implementation would also need to know to do the transformation from > f-string formatting to string.Template formatting when generating the > catalog strings > OK, that sounds reasonable, except that translators need to control substitution order, so s % tuple(...) doesn't work. However, if we use s.format(...) we can use "This interpolates {0} and {1}", and then I'm satisfied. (Further details of the signature of __interpolate__ TBD.) -- --Guido van Rossum (python.org/~guido) From wes.turner at gmail.com Fri Aug 7 13:14:26 2015 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 7 Aug 2015 06:14:26 -0500 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55AC2EDF.7040205@mgmiller.net> Message-ID: On Aug 7, 2015 3:13 AM, "Guido van Rossum" wrote: > > On Thu, Aug 6, 2015 at 10:35 PM, Wes Turner wrote: >> >> >> On Aug 6, 2015 3:03 PM, "Guido van Rossum" wrote: >> > >> > Unfortunately, all spellings that require calling locals() are wrong. >> >> Is this where the potential source of surprising error is? >> >> * Explicit / Implicit locals() > > This is a big deal because of the worry about code injection. A "classic" format string given access to locals() (e.g.
using s.format(**locals())) always stirs worries about code injection if the string is a variable. The proposed forms of string interpolation don't give access to locals *other than the locals where the string "literal" itself exists*. This latter access is no different from the access to locals in any expression. (The same for globals(), of course.) > > The other issue with explicit locals() is that to the people who would most benefit from variable interpolation (typically relatively unsophisticated users), it is magical boilerplate. (Worse, it's boilerplate that their more experienced mentors will warn them against because of the code injection worry.) >> >> * To me, the practicality of finding '%' and .format is more important than the convenience of an additional syntax with implicit scope, but is that beside the point? > > I'm not sure what your point is here. (Genuinely not sure -- this is not a rhetorical flourish.) Are you saying that you prefer the explicit formatting operation because it acts as a signal to the reader that formatting is taking place? I should prefer str.format() when I reach for str.__mod__() because it's more likely that under manual review I'll notice or grep ".format(" than "%", sheerly by character footprint. > > Maybe in the end the f-string proposal is the right one -- it's minimally obtrusive and yet explicit, *and* backwards compatible? This isn't saying I'm giving up on always-interpolation; there seems to be at least an even split between languages that always interpolate (PHP?), languages that have a way to explicitly disable it (like single quotes in shell), and languages that require some sort of signal (like C#). A convenient but often dangerous syntactical shortcut (because it is infeasible to track more than 7+-2 glocal variables in mind at once). * Jinja2 autoescaping w/ LaTeX code is much easier w/ different operators. * f'... {Cmd}"' * r'... 
{Cmd}"' 0 / O > > -- > --Guido van Rossum (python.org/~guido) From eric at trueblade.com Fri Aug 7 13:52:19 2015 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 7 Aug 2015 07:52:19 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: <55C49BF3.20100@trueblade.com> On 8/7/2015 6:13 AM, Guido van Rossum wrote: > On Fri, Aug 7, 2015 at 11:50 AM, Nick Coghlan > wrote: > > On 7 August 2015 at 19:03, Nathaniel Smith > wrote: > > On Fri, Aug 7, 2015 at 1:49 AM, Guido van Rossum > wrote: > >> On Fri, Aug 7, 2015 at 9:33 AM, Nick Coghlan > wrote: > >>> > >>> We could potentially make f-strings translation friendly by > >>> introducing a bit of indirection into the f-string design: an > >>> __interpolate__ builtin, along the lines of __import__. > >> > >> > >> This seems interesting, but doesn't it require sys._getframe() or similar > >> again? Translations may need to reorder variables. (Or even change the > >> expressions? E.g. to access odd plurals?) > >> > >> The sys._getframe() requirement (if true) would kill this idea thoroughly > >> for me. > > > > AFAICT sys._getframe is unneeded -- I understand Nick's suggestion to > > be that we desugar f"..." to: > > > > __interpolate__("...", locals(), globals()) > > > > with the reference to __interpolate__ resolved using the usual lookup > > rules (locals -> globals -> builtins). > > Not quite.
While I won't be entirely clear on Eric's latest proposal > until the draft PEP is available, my understanding is that an f-string > like: > > f"This interpolates \{a} and \{b}" > > would currently end up effectively being syntactic sugar for a > formatting operation like: > > "This interpolates " + format(a) + " and " + format(b) > > While str.format itself probably doesn't provide a good signature for > __interpolate__, the essential information to be passed in to support > lossless translation would be an ordered series of: > > * string literals > * (expression_str, value, format_str) substitution triples > > Since the fastest string formatting operation we have is actually > still mod-formatting, lets suppose the default implementation of > __interpolate__ was semantically equivalent to: > > def __interpolate__(target, expressions, values, format_specs): > return target % tuple(map(format, values, format_specs) > > With that definition for default interpolation, the f-string above > would be translated at compile time to the runtime call: > > __interpolate__("This interpolates %s and %s", ("a", "b"), (a, > b), ("", "")) > > All of those except for the __interpolate__ lookup and the (a, b) > tuple would then be stored on the function object as constants. 
> > An opt-in translation interpolator might then look like: > > def __interpolate__(target, expressions, values, format_spec): > if not all(expr.isidentifier() for expr in expressions): > raise ValueError("Only variable substitions are permitted > for il8n interpolation") > if any(spec for spec in format_specs): > raise ValueError("Format specifications are not permitted > for il8n interpolation") > catalog_str = target % tuple("${%s}" % expr for expr in > expressions) > translated = _(catalog_str) > values = {k:v for k, v in zip(expressions, values)} > return string.Template(translated).safe_substitute() > > The string extractor for the il8n library providing that > implementation would also need to know to do the transformation from > f-string formatting to string.Template formatting when generating the > catalog strings > > > OK, that sounds reasonable, except that translators need to control > substitution order, so s % tuple(...) doesn't work. However, if we use > s.format(...) we can use "This interpolates {0} and {1}", and then I'm > satisfied. (Further details of the signature of __interpolate__ TBD.) The example from C# is interesting. Look at IFormattable: https://msdn.microsoft.com/en-us/library/Dn961160.aspx https://msdn.microsoft.com/en-us/library/system.iformattable.aspx >From http://roslyn.codeplex.com/discussions/570292: """ When it is converted to the type IFormattable, the result of the string interpolation is an object that stores a compiler-constructed format string along with an array storing the evaluated expressions. The object's implementation of IFormattable.ToString(string format, IFormatProvider formatProvider) is an invocation of String.Format(IFormatProviders provider, String format, params object args[]) By taking advantage of the conversion from an interpolated string expression to IFormattable, the user can cause the formatting to take place later in a selected locale. 
See the section System.Runtime.CompilerServices.FormattedString for details.
"""

So (reverting to Python syntax, with the f-string syntax), in addition
to converting directly to a string, there's a way to go from:

    f'abc{expr1:spec1}def{expr2:spec2}ghi'

to:

    ('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, value-of-expr2))

The general idea is that you now have access to an i18n-able string, and
the values of the embedded expressions as they were evaluated "in situ"
where the f-string literal was present in the source code.

You can imagine the f-string above evaluating to a call to:

    __interpolate__('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, value-of-expr2))

The default implementation of __interpolate__ would be:

    def __interpolate__(fmt_str, values):
        return fmt_str.format(*values)

Then you could hook this on a per-module (or global, I guess) basis to
do the i18n of fmt_str.

I don't see the need to separate out the format specifiers (spec1 and
spec2) from the generated format string. They belong to the type of
values of the evaluated expressions, so you can just embed them in the
generated fmt_str.

Eric.
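[Editor's sketch, not part of the original message.] The hook described above can be tried out with ordinary functions. This is only an illustration under stated assumptions: the names default_interpolate, translating_interpolate and CATALOG are hypothetical, and no such hook exists in Python itself. It shows why a format string with positional indices lets a translation reorder the fields.

```python
def default_interpolate(fmt_str, values):
    """Default behaviour sketched in the message: plain str.format."""
    return fmt_str.format(*values)

# Hypothetical translation catalog: source format string -> translated one.
# A real project would use gettext or similar; a dict keeps the sketch small.
CATALOG = {
    "{0} affected {1}": "{1} was affected by {0}",
}

def translating_interpolate(fmt_str, values):
    """Hooked variant: translate the format string, then format as usual."""
    translated = CATALOG.get(fmt_str, fmt_str)
    return translated.format(*values)

a, b = "the storm", "the harvest"
# What f'{a} affected {b}' would hand to the hook under this proposal:
call_args = ("{0} affected {1}", (a, b))

print(default_interpolate(*call_args))      # the storm affected the harvest
print(translating_interpolate(*call_args))  # the harvest was affected by the storm
```

Because the values are passed separately and addressed by index, the translated string is free to use them in any order, and format specifiers embedded in fmt_str still apply to the right value.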
From ncoghlan at gmail.com Fri Aug 7 13:55:49 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Aug 2015 21:55:49 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: On 7 August 2015 at 20:13, Guido van Rossum wrote: > On Fri, Aug 7, 2015 at 11:50 AM, Nick Coghlan wrote: >> An opt-in translation interpolator might then look like: >> >> def __interpolate__(target, expressions, values, format_spec): >> if not all(expr.isidentifier() for expr in expressions): >> raise ValueError("Only variable substitions are permitted >> for il8n interpolation") >> if any(spec for spec in format_specs): >> raise ValueError("Format specifications are not permitted >> for il8n interpolation") >> catalog_str = target % tuple("${%s}" % expr for expr in >> expressions) >> translated = _(catalog_str) >> values = {k:v for k, v in zip(expressions, values)} >> return string.Template(translated).safe_substitute() >> >> The string extractor for the il8n library providing that >> implementation would also need to know to do the transformation from >> f-string formatting to string.Template formatting when generating the >> catalog strings > > > OK, that sounds reasonable, except that translators need to control > substitution order, so s % tuple(...) doesn't work. However, if we use > s.format(...) we can use "This interpolates {0} and {1}", and then I'm > satisfied. (Further details of the signature of __interpolate__ TBD.) If we do go down this path of making it possible to override the interpolation behaviour, I agree we should reserve judgment on a signature for __interpolate__ However, the concept sketch *does* handle the reordering problem by using mod-formatting to create a PEP 292 translation string and then using name based formatting on that. 
To work through an example where the "translation" is from active to
passive voice in English rather than between languages:

    f"\{a} affected \{b}"
    -> __interpolate__("%s affected %s", ("a", "b"), (a, b), ("", ""))
    -> "${a} affected ${b}" # catalog_str
    -> "${b} was affected by ${a}" # translated

The reconstructed values mapping passed to
string.Template.safe_substitute() ends up containing {"a":a, "b":b},
so it is able to handle the field reordering because the final
substitution is name based.

The filtering on the passed in expressions and format specifications
serves to ensure that that particular i18n interpolator is only used
with human-translator-friendly PEP 292 compatible translation strings
(the message extractor would also be able to check that statically).

I considered a few other signatures (like an ordered dict, or a tuple
of 3-tuples, or assuming the formatting would be done with
str.format_map), but they ended up being more complicated for the two
example cases I was exploring.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From eric at trueblade.com Fri Aug 7 14:12:00 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Fri, 7 Aug 2015 08:12:00 -0400
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C49BF3.20100@trueblade.com>
References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com>
Message-ID: <55C4A090.9060300@trueblade.com>

On 8/7/2015 7:52 AM, Eric V.
Smith wrote:
> So (reverting to Python syntax, with the f-string syntax), in addition
> to converting directly to a string, there's a way to go from:
>
> f'abc{expr1:spec1}def{expr2:spec2}ghi'
>
> to:
>
> ('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, value-of-expr2))
>
> The general idea is that you now have access to an i18n-able string, and
> the values of the embedded expressions as they were evaluated "in situ"
> where the f-string literal was present in the source code.
>
> You can imagine the f-string above evaluating to a call to:
>
> __interpolate__('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1,
> value-of-expr2))
>
> The default implementation of __interpolate__ would be:
>
> def __interpolate__(fmt_str, values):
>     return fmt_str.format(*values)

I should add that it's unfortunate that this builds a string for
str.format() to use. The f-string AST generator goes through a lot of
hassle to parse the f-string and extract the parts. For it to then build
another string that str.format would have to immediately parse again
seems like a waste.

My current implementation of f-strings would take the original f-string
above and convert it to:

    ''.join(['abc', expr1.__format__('spec1'), 'def',
             expr2.__format__('spec2'), 'ghi'])

Which avoids re-parsing anything: it's just normal function calls.
Making __interpolate__ take a tuple of literals and a tuple of (value,
fmt_str) tuples seems like a giant hassle to internationalize, but it
would be more efficient in the normal case.

Eric.

From ncoghlan at gmail.com Fri Aug 7 14:18:39 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 7 Aug 2015 22:18:39 +1000
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C49BF3.20100@trueblade.com>
References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com>
Message-ID:

On 7 August 2015 at 21:52, Eric V.
Smith wrote: > The general idea is that you now have access to an i18n-able string, and > the values of the embedded expressions as they were evaluated "in situ" > where the f-string literal was present in the source code. > Y > ou can imagine the f-string above evaluating to a call to: > > __interpolate__('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, > value-of-expr2)) > > The default implementation of __interpolate__ would be: > > def __interpolate__(fmt_str, values): > return fmt_str.format(*values) > > Then you could hook this on a per-module (or global, I guess) basis to > do the i18n of fmt_str. > > I don't see the need to separate out the format specifies (spec1 and > spec2) from the generated format string. They belong to the type of > values of the evaluated expressions, so you can just embed them in the > generated fmt_str. Right, when I wrote my concept sketch, I forgot about string.Formatter.parse (https://docs.python.org/3/library/string.html#string.Formatter.parse) for iterating over a fully constructed format string. With the format string containing indices rather than the original expressions, we'd still want to pass in the text of those as another tuple, though. 
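[Editor's sketch, not part of the original message.] For readers who have not used string.Formatter.parse, the snippet below shows what it yields for a format string of the shape discussed in this thread. The variable names fmt_str and parsed are placeholders chosen for the illustration; the "!r" conversion is added only to show the fourth tuple element.

```python
import string

# A format string of the kind the compiler would generate, with positional
# indices in place of the original expressions.
fmt_str = "abc{0:spec1}def{1!r:spec2}ghi"

# parse() yields (literal_text, field_name, format_spec, conversion) tuples.
parsed = list(string.Formatter().parse(fmt_str))

for literal_text, field_name, format_spec, conversion in parsed:
    print(repr(literal_text), field_name, repr(format_spec), conversion)
```

Two details matter for any interpolator built on this: field_name comes back as a string such as '0', not an integer, and the trailing literal 'ghi' is reported with field_name set to None, so that case has to be skipped when emitting substitution fields.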
With that signature (the format string, plus tuples of the expression
texts and their values), the default interpolator would look like:

    def __interpolate__(format_str, expressions, values):
        return format_str.format(*values)

And a custom PEP 292 based (for translators) i18n interpolator might look like:

    import string

    def _format_to_template(format_str, expressions):
        if not all(expr.isidentifier() for expr in expressions):
            raise ValueError("Only variable substitutions permitted for i18n")
        parsed_format = string.Formatter().parse(format_str)
        template_parts = []
        for literal_text, field_name, format_spec, conversion in parsed_format:
            template_parts.append(literal_text)
            if field_name is None:
                # Trailing literal text with no substitution field
                continue
            if format_spec:
                raise ValueError("Format specifiers not permitted for i18n")
            if conversion:
                raise ValueError("Conversion specifiers not permitted for i18n")
            # field_name is a positional index into the expressions tuple
            template_parts.append("${" + expressions[int(field_name)] + "}")
        return "".join(template_parts)

    def __interpolate__(format_str, expressions, values):
        catalog_str = _format_to_template(format_str, expressions)
        # _ is the usual gettext-style translation lookup
        translated = _(catalog_str)
        values = {k: v for k, v in zip(expressions, values)}
        return string.Template(translated).safe_substitute(values)

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Fri Aug 7 14:31:50 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 7 Aug 2015 22:31:50 +1000
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C4A090.9060300@trueblade.com>
References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com> <55C4A090.9060300@trueblade.com>
Message-ID:

On 7 August 2015 at 22:12, Eric V. Smith wrote:
> On 8/7/2015 7:52 AM, Eric V.
Smith wrote: >> So (reverting to Python syntax, with the f-string syntax), in addition >> to converting directly to a string, there's a way to go from: >> >> f'abc{expr1:spec1}def{expr2:spec2}ghi' >> >> to: >> >> ('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, value-of-expr2)) >> >> The general idea is that you now have access to an i18n-able string, and >> the values of the embedded expressions as they were evaluated "in situ" >> where the f-string literal was present in the source code. >> Y >> ou can imagine the f-string above evaluating to a call to: >> >> __interpolate__('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, >> value-of-expr2)) >> >> The default implementation of __interpolate__ would be: >> >> def __interpolate__(fmt_str, values): >> return fmt_str.format(*values) > > I should add that it's unfortunate that this builds a string for > str.format() to use. The f-string ast generator goes through a lot of > hassle to parse the f-string and extract the parts. For it to then build > another string that str.format would have to immediately parse again > seems like a waste. > > My current implementation of f-strings would take the original f-string > above and convert it to: > > ''.join(['abc', expr1.__format__('spec1'), 'def', > expr2.__format__(spec2), 'ghi']) > > Which avoids re-parsing anything: it's just normal function calls. > Making __interpolate__ take a tuple of literals and a tuple of (value, > fmt_str) tuples seems like giant hassle to internationalize, but it > would be more efficient in the normal case. Perhaps we could use a variant of the string.Formatter.parse iterator format: https://docs.python.org/3/library/string.html#string.Formatter.parse ? 
If the first arg was a pre-parsed format_iter rather than a format
string, then the default interpolator might look something like:

    _converter = string.Formatter().convert_field

    def __interpolate__(format_iter, expressions, values):
        template_parts = []
        # field_num, rather than field_name, for speed reasons
        for literal_text, field_num, format_spec, conversion in format_iter:
            template_parts.append(literal_text)
            if field_num is not None:
                value = values[field_num]
                if conversion:
                    value = _converter(value, conversion)
                field_str = format(value, format_spec)
                template_parts.append(field_str)
        return "".join(template_parts)

My last i18n example called string.Formatter.parse() anyway, so it
could readily be adapted to this model.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From wes.turner at gmail.com Fri Aug 7 14:38:15 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Fri, 7 Aug 2015 07:38:15 -0500
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To:
References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com> <55C4A090.9060300@trueblade.com>
Message-ID:

On Fri, Aug 7, 2015 at 7:31 AM, Nick Coghlan wrote:

> On 7 August 2015 at 22:12, Eric V. Smith wrote:
> > On 8/7/2015 7:52 AM, Eric V. Smith wrote:
> >> So (reverting to Python syntax, with the f-string syntax), in addition
> >> to converting directly to a string, there's a way to go from:
> >>
> >> f'abc{expr1:spec1}def{expr2:spec2}ghi'
> >>
> >> to:
> >>
> >> ('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, value-of-expr2))
> >>
> >> The general idea is that you now have access to an i18n-able string, and
> >> the values of the embedded expressions as they were evaluated "in situ"
> >> where the f-string literal was present in the source code.
> >> Y > >> ou can imagine the f-string above evaluating to a call to: > >> > >> __interpolate__('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, > >> value-of-expr2)) > >> > >> The default implementation of __interpolate__ would be: > >> > >> def __interpolate__(fmt_str, values): > >> return fmt_str.format(*values) > > > > I should add that it's unfortunate that this builds a string for > > str.format() to use. The f-string ast generator goes through a lot of > > hassle to parse the f-string and extract the parts. For it to then build > > another string that str.format would have to immediately parse again > > seems like a waste. > > > > My current implementation of f-strings would take the original f-string > > above and convert it to: > > > > ''.join(['abc', expr1.__format__('spec1'), 'def', > > expr2.__format__(spec2), 'ghi']) > > > > Which avoids re-parsing anything: it's just normal function calls. > > Making __interpolate__ take a tuple of literals and a tuple of (value, > > fmt_str) tuples seems like giant hassle to internationalize, but it > > would be more efficient in the normal case. > > Perhaps we could use a variant of the string.Formatter.parse iterator > format: > https://docs.python.org/3/library/string.html#string.Formatter.parse > ? 
> > If the first arg was a pre-parsed format_iter rather than a format > string, then the default interpolator might look something like: > > _converter = string.Formatter().convert_field > > def __interpolate__(format_iter, expressions, values): > template_parts = [] > # field_num, rather than field_name, for speed reasons > for literal_text, field_num, format_spec, conversion in > format_iter: > template_parts.append(literal_text) > if field_num is not None: > value = values[field_num] > if conversion: > value = _converter(value, conversion) > field_str = format(value, format_spec) > template_parts.append(field_str) > return "".join(template_parts) > Would __interpolate__ then be an operator / protocol, or just a method of an r-string? Benefits / (other use cases): * implicit/explicit [shell,shlex,[SQL, SPARQL]] quoting (e.g. "" + repr(x)[1:-1] + "") > > My last il8n example called string.Formatter.parse() anyway, so it > could readily be adapted to this model. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Aug 7 14:40:20 2015 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 7 Aug 2015 07:40:20 -0500 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com> <55C4A090.9060300@trueblade.com> Message-ID: On Fri, Aug 7, 2015 at 7:38 AM, Wes Turner wrote: > > > On Fri, Aug 7, 2015 at 7:31 AM, Nick Coghlan wrote: > >> On 7 August 2015 at 22:12, Eric V. Smith wrote: >> > On 8/7/2015 7:52 AM, Eric V. 
Smith wrote: >> >> So (reverting to Python syntax, with the f-string syntax), in addition >> >> to converting directly to a string, there's a way to go from: >> >> >> >> f'abc{expr1:spec1}def{expr2:spec2}ghi' >> >> >> >> to: >> >> >> >> ('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, value-of-expr2)) >> >> >> >> The general idea is that you now have access to an i18n-able string, >> and >> >> the values of the embedded expressions as they were evaluated "in situ" >> >> where the f-string literal was present in the source code. >> >> Y >> >> ou can imagine the f-string above evaluating to a call to: >> >> >> >> __interpolate__('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, >> >> value-of-expr2)) >> >> >> >> The default implementation of __interpolate__ would be: >> >> >> >> def __interpolate__(fmt_str, values): >> >> return fmt_str.format(*values) >> > >> > I should add that it's unfortunate that this builds a string for >> > str.format() to use. The f-string ast generator goes through a lot of >> > hassle to parse the f-string and extract the parts. For it to then build >> > another string that str.format would have to immediately parse again >> > seems like a waste. >> > >> > My current implementation of f-strings would take the original f-string >> > above and convert it to: >> > >> > ''.join(['abc', expr1.__format__('spec1'), 'def', >> > expr2.__format__(spec2), 'ghi']) >> > >> > Which avoids re-parsing anything: it's just normal function calls. >> > Making __interpolate__ take a tuple of literals and a tuple of (value, >> > fmt_str) tuples seems like giant hassle to internationalize, but it >> > would be more efficient in the normal case. >> >> Perhaps we could use a variant of the string.Formatter.parse iterator >> format: >> https://docs.python.org/3/library/string.html#string.Formatter.parse >> ? 
>> >> If the first arg was a pre-parsed format_iter rather than a format >> string, then the default interpolator might look something like: >> >> _converter = string.Formatter().convert_field >> >> def __interpolate__(format_iter, expressions, values): >> template_parts = [] >> # field_num, rather than field_name, for speed reasons >> for literal_text, field_num, format_spec, conversion in >> format_iter: >> template_parts.append(literal_text) >> if field_num is not None: >> value = values[field_num] >> if conversion: >> value = _converter(value, conversion) >> field_str = format(value, format_spec) >> template_parts.append(field_str) >> return "".join(template_parts) >> > > Would __interpolate__ then be an operator / protocol, > or just a method of an r-string? > Similar to pandas.DataFrame.pipe: * http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pipe.html * https://github.com/pydata/pandas/pull/10253 > > Benefits / (other use cases): > * implicit/explicit [shell,shlex,[SQL, SPARQL]] quoting (e.g. "" + > repr(x)[1:-1] + "") > > > >> >> My last il8n example called string.Formatter.parse() anyway, so it >> could readily be adapted to this model. >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wes.turner at gmail.com Fri Aug 7 14:44:29 2015 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 7 Aug 2015 07:44:29 -0500 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com> <55C4A090.9060300@trueblade.com> Message-ID: On Fri, Aug 7, 2015 at 7:40 AM, Wes Turner wrote: > > > On Fri, Aug 7, 2015 at 7:38 AM, Wes Turner wrote: > >> >> >> On Fri, Aug 7, 2015 at 7:31 AM, Nick Coghlan wrote: >> >>> On 7 August 2015 at 22:12, Eric V. Smith wrote: >>> > On 8/7/2015 7:52 AM, Eric V. Smith wrote: >>> >> So (reverting to Python syntax, with the f-string syntax), in addition >>> >> to converting directly to a string, there's a way to go from: >>> >> >>> >> f'abc{expr1:spec1}def{expr2:spec2}ghi' >>> >> >>> >> to: >>> >> >>> >> ('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, value-of-expr2)) >>> >> >>> >> The general idea is that you now have access to an i18n-able string, >>> and >>> >> the values of the embedded expressions as they were evaluated "in >>> situ" >>> >> where the f-string literal was present in the source code. >>> >> Y >>> >> ou can imagine the f-string above evaluating to a call to: >>> >> >>> >> __interpolate__('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, >>> >> value-of-expr2)) >>> >> >>> >> The default implementation of __interpolate__ would be: >>> >> >>> >> def __interpolate__(fmt_str, values): >>> >> return fmt_str.format(*values) >>> > >>> > I should add that it's unfortunate that this builds a string for >>> > str.format() to use. The f-string ast generator goes through a lot of >>> > hassle to parse the f-string and extract the parts. For it to then >>> build >>> > another string that str.format would have to immediately parse again >>> > seems like a waste. 
>>> > >>> > My current implementation of f-strings would take the original f-string >>> > above and convert it to: >>> > >>> > ''.join(['abc', expr1.__format__('spec1'), 'def', >>> > expr2.__format__(spec2), 'ghi']) >>> > >>> > Which avoids re-parsing anything: it's just normal function calls. >>> > Making __interpolate__ take a tuple of literals and a tuple of (value, >>> > fmt_str) tuples seems like giant hassle to internationalize, but it >>> > would be more efficient in the normal case. >>> >>> Perhaps we could use a variant of the string.Formatter.parse iterator >>> format: >>> https://docs.python.org/3/library/string.html#string.Formatter.parse >>> ? >>> >>> If the first arg was a pre-parsed format_iter rather than a format >>> string, then the default interpolator might look something like: >>> >>> _converter = string.Formatter().convert_field >>> >>> def __interpolate__(format_iter, expressions, values): >>> template_parts = [] >>> # field_num, rather than field_name, for speed reasons >>> for literal_text, field_num, format_spec, conversion in >>> format_iter: >>> template_parts.append(literal_text) >>> if field_num is not None: >>> value = values[field_num] >>> if conversion: >>> value = _converter(value, conversion) >>> field_str = format(value, format_spec) >>> template_parts.append(field_str) >>> return "".join(template_parts) >>> >> >> Would __interpolate__ then be an operator / protocol, >> or just a method of an r-string? >> > > Similar to pandas.DataFrame.pipe: > > * > http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pipe.html > * https://github.com/pydata/pandas/pull/10253 > What is the benefit of this additional syntax over: str.format(**glocals_lookup_proxy) str.formatg(**kwargs_override) ? > > > >> >> Benefits / (other use cases): >> * implicit/explicit [shell,shlex,[SQL, SPARQL]] quoting (e.g. 
"" + >> repr(x)[1:-1] + "") >> >> >> >>> >>> My last il8n example called string.Formatter.parse() anyway, so it >>> could readily be adapted to this model. >>> >>> Cheers, >>> Nick. >>> >>> -- >>> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Fri Aug 7 15:03:44 2015 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 7 Aug 2015 08:03:44 -0500 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> Message-ID: On Fri, Aug 7, 2015 at 2:15 AM, Chris Angelico wrote: > On Fri, Aug 7, 2015 at 3:12 PM, Steven D'Aprano > wrote: > > On Thu, Aug 06, 2015 at 12:26:14PM -0400, random832 at fastmail.us wrote: > >> On Wed, Aug 5, 2015, at 14:56, Eric V. Smith wrote: > >> > Because strings containing \{ are currently valid > >> > >> Which raises the question of why. > > > > Because \C is currently valid, for all values of C. The idea is that if > > you typo an escape, say \d for \f, you get an obvious backslash in your > > string which is easy to spot. > > > > Personally, I think that's a mistake. It leads to errors like this: > > > > filename = 'C:\some\path\something.txt' > > > > silently doing the wrong thing. If we're going to change the way escapes > > work, it's time to deprecate the misfeature that \C is a literal > > backslash followed by C. Outside of raw strings, a backslash should > > *only* be allowed in an escape sequence. 
> > I agree; plus, it means there's yet another thing for people to > complain about when they switch to Unicode strings: > > path = "c:\users", "C:\Users" # OK on Py2 > path = u"c:\users", u"C:\Users" # Fails > So this doesn't work? path = pathilb.Path(u"c:\users") # SEC: path concatenation is often in conjunction with user-supplied input - [ ] docs for these - [ ] to/from r'rawstring' (DOC: encode/decode) > > Or equivalently, moving to Py3 and having those strings quietly become > Unicode strings, and now having meaning on the \U and \u escapes. > > That said, though: It's now too late to change Python 2, which means > that this is going to be yet another hurdle when people move > (potentially large) Windows codebases to Python 3. IMO it's a good > thing to trip people up immediately, rather than silently doing the > wrong thing - but it is going to be another thing that people moan > about when Python 3 starts complaining. First they have to add > parentheses to print, then it's all those pointless (in their eyes) > encode/decode calls, and now they have to go through and double all > their backslashes as well! But the alternative is that some future > version of Python adds a new escape code, and all their code starts > silently doing weird stuff - or they change the path name and it goes > haywire (changing from "c:\users\demo" to "c:\users\all users" will be > a fun one to diagnose) - so IMO it's better to know about it early. > > > If we're going to make major changes to the way escapes work, I'd rather > > add new escapes, not take them away: > > > > > > \e escape \x1B, as supported by gcc and clang; > > Please, yes! Also supported by a number of other languages and > commands (Pike, GNU echo, and some others that I don't recall (but not > bind9, which has its own peculiarities)). > > > the escaping rules from Haskell: > > > > > http://book.realworldhaskell.org/read/characters-strings-and-escaping-rules.html > > > > \P platform-specific newline (e.g. 
\r\n on Windows, \n on POSIX) > > Hmm. Not sure how useful this would be. Personally, I consider this to > be a platform-specific encoding, on par with expecting b"\xc2\xa1" to > display "?", and as such, it should be kept to boundaries. Work with > "\n" internally, and have input routines convert to that, and output > routines optionally add "\r" before them all. > > > \U+xxxx Unicode code point U+xxxx (with four to six hex digits) > > > > It's much nicer to be able to write Unicode code points that (apart from > > the backslash) look like the standard Unicode notation U+0000 to > > U+10FFFF, rather than needing to pad to a full eight digits as the > > \U00xxxxxx syntax requires. > > The problem is the ambiguity. How do you specify that "\U+101010" be a > two-character string? "\U000101010" forces it by having exactly eight > digits, but as soon as you allow variable numbers of digits, you run > into problems. I suppose you could always pad to six for that: > "\U+0101010" could know that it doesn't need a seventh digit. (Though > what would ever happen if the Unicode consortium decides to drop > support for UTF-16 and push for a true 32-bit character set, I don't > know.) It is tempting, though - it both removes the need for two > pointless zeroes, and broadly unifies the syntax for Unicode escapes, > instead of having a massive boundary from "\u1234" to "\U00012345". > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rosuav at gmail.com Fri Aug 7 15:12:05 2015 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Aug 2015 23:12:05 +1000 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> Message-ID: On Fri, Aug 7, 2015 at 11:03 PM, Wes Turner wrote: > So this doesn't work? > > path = pathilb.Path(u"c:\users") > # SEC: path concatenation is often in conjunction with user-supplied > input If you try it, you'll see. You get an instant SyntaxError, because \u introduces a Unicode codepoint (eg \u0303) in a Unicode string. In a bytes string, it's meaningless, and therefore is the same thing as "\\u". ChrisA From wes.turner at gmail.com Fri Aug 7 15:40:18 2015 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 7 Aug 2015 08:40:18 -0500 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> Message-ID: On Fri, Aug 7, 2015 at 8:12 AM, Chris Angelico wrote: > On Fri, Aug 7, 2015 at 11:03 PM, Wes Turner wrote: > > So this doesn't work? > > > > path = pathilb.Path(u"c:\users") > > # SEC: path concatenation is often in conjunction with user-supplied > > input > > If you try it, you'll see. You get an instant SyntaxError, because \u > introduces a Unicode codepoint (eg \u0303) in a Unicode string. In a > bytes string, it's meaningless, and therefore is the same thing as > "\\u". > Thanks for the heads \up. This might be good for the pathlib docs and test cases? 
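The behaviour described above can be checked with a short sketch. It assumes Python 3, where every string literal is Unicode, and it runs compile() on the source text so that the invalid literal never has to appear in a real file. PureWindowsPath is used only so the example behaves the same on any platform; the variable names are illustrative.

```python
from pathlib import PureWindowsPath

# The literal from the thread, held as source text so this file stays valid.
# The text being compiled is the single line:  path = "c:\users"
bad_source = 'path = "c:\\users"'

try:
    compile(bad_source, "<example>", "exec")
    error = None
except SyntaxError as exc:
    # In a Unicode literal, \u must be followed by four hex digits.
    error = exc

# Three spellings that avoid the problem and name the same path.
raw = PureWindowsPath(r"c:\users")       # raw string: backslashes are literal
escaped = PureWindowsPath("c:\\users")   # doubled backslash
forward = PureWindowsPath("c:/users")    # forward slashes are accepted too

print(type(error).__name__)
print(raw == escaped == forward)
```

A raw string or forward slashes are probably the least error-prone choices when a path is written as a literal.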
| Src: https://hg.python.org/cpython/file/tip/Lib/pathlib.py
| Tst: https://hg.python.org/cpython/file/tip/Lib/test/test_pathlib.py
| Doc: https://hg.python.org/cpython/file/tip/Doc/library/pathlib.rst

- [ ] DOC: warning
- [ ] DOC: versionadded

> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From eric at trueblade.com Fri Aug 7 15:55:23 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Fri, 07 Aug 2015 09:55:23 -0400
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C4A090.9060300@trueblade.com>
References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com> <55C4A090.9060300@trueblade.com>
Message-ID: <55C4B8CB.1030601@trueblade.com>

On 08/07/2015 08:12 AM, Eric V. Smith wrote:
> On 8/7/2015 7:52 AM, Eric V. Smith wrote:
>> So (reverting to Python syntax, with the f-string syntax), in addition
>> to converting directly to a string, there's a way to go from:
>>
>> f'abc{expr1:spec1}def{expr2:spec2}ghi'
>>
>> to:
>>
>> ('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1, value-of-expr2))
>>
>> The general idea is that you now have access to an i18n-able string, and
>> the values of the embedded expressions as they were evaluated "in situ"
>> where the f-string literal was present in the source code.
>>
>> You can imagine the f-string above evaluating to a call to:
>>
>> __interpolate__('abc{0:spec1}def{1:spec2}ghi', (value-of-expr1,
>> value-of-expr2))
>>
>> The default implementation of __interpolate__ would be:
>>
>> def __interpolate__(fmt_str, values):
>>     return fmt_str.format(*values)
>
> I should add that it's unfortunate that this builds a string for
> str.format() to use.
The f-string ast generator goes through a lot of
> hassle to parse the f-string and extract the parts. For it to then build
> another string that str.format would have to immediately parse again
> seems like a waste.
>
> My current implementation of f-strings would take the original f-string
> above and convert it to:
>
> ''.join(['abc', expr1.__format__('spec1'), 'def',
>          expr2.__format__('spec2'), 'ghi'])
>
> Which avoids re-parsing anything: it's just normal function calls.
> Making __interpolate__ take a tuple of literals and a tuple of (value,
> fmt_str) tuples seems like a giant hassle to internationalize, but it
> would be more efficient in the normal case.

If we do implement __interpolate__ as something like we're describing
here, it again brings up the question of concatenating adjacent strings
and f-strings. When I'm just calling expr.__format__ and joining the
results, you can't tell if I'm turning:

f'{a}' ':' f'{b}'

into multiple calls to join or not. But if we used __interpolate__, it
would make a difference if I called:

__interpolate__('{0}:{1}', (a, b))
or
''.join([__interpolate__('{0}', (a,)), ':', __interpolate__('{0}', (b,))])

Eric.

From ncoghlan at gmail.com Fri Aug 7 16:35:15 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 8 Aug 2015 00:35:15 +1000
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To: <55C4B8CB.1030601@trueblade.com>
References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com> <55C4A090.9060300@trueblade.com> <55C4B8CB.1030601@trueblade.com>
Message-ID:

On 7 August 2015 at 23:55, Eric V. Smith wrote:
> If we do implement __interpolate__ as something like we're describing
> here, it again brings up the question of concatenating adjacent strings
> and f-strings. When I'm just calling expr.__format__ and joining the
> results, you can't tell if I'm turning:
>
> f'{a}' ':' f'{b}'
>
> into multiple calls to join or not.
But if we used __interpolate__, it
> would make a difference if I called:
>
> __interpolate__('{0}:{1}', (a, b))
> or
> ''.join([__interpolate__('{0}', (a,)), ':', __interpolate__('{0}', (b,))])

This is part of why I'd still like interpolated strings to be a
clearly distinct thing from normal string literals - whichever
behaviour we choose would be confusing to at least some users some of
the time.

Implicit concatenation is fine for things that are actually constants,
but the idea of implicitly concatenating essentially arbitrary
subexpressions (as f-strings are) remains strange to me, even when we
know the return type will be a string object.

As such, I think the behaviour of bytes vs str literals sets a useful
precedent here, even though that particular example is forced by the
type conflict:

>>> b"asd" "asd"
  File "<stdin>", line 1
SyntaxError: cannot mix bytes and nonbytes literals

Regards,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From barry at python.org Fri Aug 7 17:43:32 2015
From: barry at python.org (Barry Warsaw)
Date: Fri, 7 Aug 2015 11:43:32 -0400
Subject: [Python-ideas] Briefer string format
References: <55AC2EDF.7040205@mgmiller.net>
Message-ID: <20150807114332.4c2cc11f@anarchist.wooz.org>

On Aug 07, 2015, at 10:12 AM, Guido van Rossum wrote:

>This is a big deal because of the worry about code injection. A "classic"
>format string given access to locals() (e.g. using s.format(**locals()))
>always stirs worries about code injection if the string is a variable. The
>proposed forms of string interpolation don't give access to locals *other
>than the locals where the string "literal" itself exists*. This latter
>access is no different from the access to locals in any expression. (The
>same for globals(), of course.)

I took a look at the Mailman trunk. It's definitely the case that the
majority of the uses of flufl.i18n's string interpolation are with
in-place literals.
A few examples of where a variable is passed in instead: * An error notification where some other component calculates the error message and is passed to a generic reporting function. The error message may be composed from several literal bits and pieces. * Translate a template read from a data file. I'd put this in the camp of consenting adults. It's useful and rare, so if I saw non-literals in a code review, I'd question it, but probably not disallow it. I'd want to spend extra time reviewing the code to be assured it's not a vector for code injections. >The other issue with explicit locals() is that to the people who would most >benefit from variable interpolation (typically relatively unsophisticated >users), it is magical boilerplate. (Worse, it's boilerplate that their more >experienced mentors will warn them against because of the code injection >worry.) Which is why I think it can't be implicit for all strings. E.g. in an i18n context, seeing _('$person did $something') is a very explicit marker. >I'm not sure what your point is here. (Genuinely not sure -- this is not a >rhetorical flourish.) Are you saying that you prefer the explicit >formatting operation because it acts as a signal to the reader that >formatting is taking place? Although I didn't say it, I'd answer this question "yes". Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From ron3200 at gmail.com Fri Aug 7 17:56:19 2015 From: ron3200 at gmail.com (Ron Adam) Date: Fri, 07 Aug 2015 11:56:19 -0400 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55AC2EDF.7040205@mgmiller.net> Message-ID: On 08/07/2015 04:12 AM, Guido van Rossum wrote: > Maybe in the end the f-string proposal is the right one -- it's minimally > obtrusive and yet explicit, *and* backwards compatible? 
This isn't saying
> I'm giving up on always-interpolation; there seems to be at least an even
> split between languages that always interpolate (PHP?), languages that have
> a way to explicitly disable it (like single quotes in shell), and languages
> that require some sort of signal (like C#).

I think one of the advantages of f-strings is that they are explicitly
created in the context where the scope is defined. That scope includes
non-locals too. So locals and globals together are a narrower selection
than the defined static scope. Non-locals can't mask globals if they
aren't included.

So "...".format(*locals(), **globals()) is not the same as when the names
are explicitly supplied as keywords.

If it is opened up to dynamic scope, all bets are off. That hasn't been
suggested, but when functions use locals and globals as arguments, I think
that is the security concern.

One of the questions I have is: will there be a way to create an f-string
other than by a literal? So far, I think no, because it's not an object,
but a protocol. f"..." ---> "...".format(...). That doesn't mean we
can't have a function to do that. Currently the best way would be to do
eval('f"..."'), but it wouldn't be exactly the same because eval does not
include the non-local part of the scope. It seems that hasn't been an
issue for other things, so maybe it's not an issue here as well.

If all strings get scanned, I think it could complicate how strings act in
various contexts. For example, when a string is used as an f-string and
then gets used again as a template or pattern. That suggests there should
be a way to turn that scanning off for some strings. (?)

So far I'm -1 on all strings, but +.25 on the explicit f-string. (Still
waiting to see the PEP before I give it a full +1.)
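The scoping point can be illustrated with a minimal sketch, assuming current CPython behaviour: locals() in a nested function reports only the names that function binds (plus any closure variables it actually references), so a name that appears only inside the format string is not found. The function and variable names below are made up for the illustration.

```python
def outer():
    greeting = "hello"          # lives in the enclosing (non-local) scope

    def inner_locals():
        name = "world"
        # 'greeting' is mentioned only inside the string, so it is not a
        # closure variable of this function and locals() does not hold it.
        try:
            return "{greeting}, {name}".format(**locals())
        except KeyError as exc:
            return "missing name: {}".format(exc.args[0])

    def inner_explicit():
        name = "world"
        # Supplying the names explicitly pulls 'greeting' in via the closure.
        return "{greeting}, {name}".format(greeting=greeting, name=name)

    return inner_locals(), inner_explicit()


print(outer())
```

The explicit form sees the whole static scope because the names are resolved by the compiler, not looked up in a dictionary at run time.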
Cheers, Ron From barry at python.org Fri Aug 7 18:09:58 2015 From: barry at python.org (Barry Warsaw) Date: Fri, 7 Aug 2015 12:09:58 -0400 Subject: [Python-ideas] String interpolation for all literal strings References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: <20150807120958.392fa5db@anarchist.wooz.org> On Aug 07, 2015, at 05:33 PM, Nick Coghlan wrote: >When you're mixing translation with interpolation, you really want the >translation lookup to happen first, when the placeholders are still >present in the format string: It just doesn't work otherwise. >sys._getframe() usage is what I meant by stack walking. It's not >*really* based on walking the stack, but you're relying on poking >around in runtime state to do dynamic scoping, rather than being able >to do lexical analysis at compile time (and hence why static analysers >get confused about apparently unused local variables). Sure, yep. One other word about i18n based on experience. The escape format *really* matters. Keep in mind that we've always had positional interpolation, via '%(foo)s', but we found that to be very highly error prone. I can't tell you how many times a translator would accidentally leave off the trailing 's', thus breaking the translation. It's exactly the reason for string.Template -- $-strings are familiar to almost all translators, and really hard to screw up. I fear that something like \{ (and especially if \} is required) will be as error prone as %(foo)s. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From barry at python.org Fri Aug 7 18:16:54 2015 From: barry at python.org (Barry Warsaw) Date: Fri, 7 Aug 2015 12:16:54 -0400 Subject: [Python-ideas] String interpolation for all literal strings References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: <20150807121654.6b04e313@anarchist.wooz.org> On Aug 07, 2015, at 07:50 PM, Nick Coghlan wrote: >Not quite. While I won't be entirely clear on Eric's latest proposal >until the draft PEP is available, my understanding is that an f-string >like: > > f"This interpolates \{a} and \{b}" > >would currently end up effectively being syntactic sugar for a >formatting operation like: > > "This interpolates " + format(a) + " and " + format(b) Don't think of it this way, because this can't be translated. For i18n to work, translators *must* have access to the entire string. In some natural languages, fragments make no sense. Keep this in mind while you're writing your multilingual application. :) >With that definition for default interpolation, the f-string above >would be translated at compile time to the runtime call: > > __interpolate__("This interpolates %s and %s", ("a", "b"), (a, b), ("", > "")) You need named placeholders in order to allow for parameter reordering. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From barry at python.org Fri Aug 7 18:18:21 2015 From: barry at python.org (Barry Warsaw) Date: Fri, 7 Aug 2015 12:18:21 -0400 Subject: [Python-ideas] String interpolation for all literal strings References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> Message-ID: <20150807121821.095c44d0@anarchist.wooz.org> On Aug 07, 2015, at 12:13 PM, Guido van Rossum wrote: >OK, that sounds reasonable, except that translators need to control >substitution order, so s % tuple(...) doesn't work. However, if we use >s.format(...) we can use "This interpolates {0} and {1}", and then I'm >satisfied. (Further details of the signature of __interpolate__ TBD.) That doesn't work either, but this does: "This interpolates {apples} and {oranges}". Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From eric at trueblade.com Fri Aug 7 18:31:31 2015 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 07 Aug 2015 12:31:31 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150807121821.095c44d0@anarchist.wooz.org> References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <20150807121821.095c44d0@anarchist.wooz.org> Message-ID: <55C4DD63.1050907@trueblade.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 08/07/2015 12:18 PM, Barry Warsaw wrote: > On Aug 07, 2015, at 12:13 PM, Guido van Rossum wrote: > >> OK, that sounds reasonable, except that translators need to >> control substitution order, so s % tuple(...) doesn't work. >> However, if we use s.format(...) we can use "This interpolates >> {0} and {1}", and then I'm satisfied. (Further details of the >> signature of __interpolate__ TBD.) 
> > That doesn't work either, but this does: > > "This interpolates {apples} and {oranges}". I think it would, because you could say this, in some language where the order had to be reversed: "This interpolates {1} and {0}" Now I'll grant you that it reduces usability. But it does provide the needed functionality. But I can't see how we'd automatically generate useful names from expressions, as opposed to just numbering the fields. That is, unless we go back from general expressions to just identifiers. Or, use something like Nick's suggestion of also passing in the text of the expressions, so we could map identifier-only expressions to their indexes so we could build up yet another string. Eric. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAEBAgAGBQJVxN1jAAoJENxauZFcKtNxXN0H/iYO8koEg/pqJ9wFQEN/10Sd Kp9xp0GHj0bHU9uPqzcJEoWPExOpRW5vUqswU+YwtrRg9uuWcvfaASoI1VI1bR29 ABg7R6zYJoxCLluaMo7eHyWQMnbTOAI0Ubm/TNJdvyAcBX+DL5zNNmtXTTr2ti1H uWo6xfjvGNv4RgGqL96GuPd+KL3ceuWmlapJrVPUT5QA2/nf8qYl9BSvHCY/VxR7 SzGhwOO4yMUOO5VXNLWYZiNvKEFHX9GSHvQcAIqymzY+MDGRt2aIOxz0b9x3jexH MqbRiVUlzsJObKVjWl2Ejc0yfp3trbYXJasCRMtoyE4VsWc8CNjTncnVgXw/41Q= =7SDr -----END PGP SIGNATURE----- From mal at egenix.com Fri Aug 7 18:33:42 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 07 Aug 2015 18:33:42 +0200 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150807121654.6b04e313@anarchist.wooz.org> References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <20150807121654.6b04e313@anarchist.wooz.org> Message-ID: <55C4DDE6.7040201@egenix.com> On 07.08.2015 18:16, Barry Warsaw wrote: > On Aug 07, 2015, at 07:50 PM, Nick Coghlan wrote: > >> Not quite. 
While I won't be entirely clear on Eric's latest proposal >> until the draft PEP is available, my understanding is that an f-string >> like: >> >> f"This interpolates \{a} and \{b}" I like the general idea (we had a similar discussion on this topic a few years ago, only using i"18n" strings as syntax), but I *really* don't like the "f" prefix on strings. f-words usually refer to things you typically don't want in your code. f-strings are really no better, IMO, esp. when combined with the u prefix. Can the prefix character please be reconsidered before adding it to the language ? Some other options: i"nternationalization" (or i"18n") t"ranslate" l"ocalization" (or l"10n") Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 07 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From python at mrabarnett.plus.com Fri Aug 7 18:43:45 2015 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 7 Aug 2015 17:43:45 +0100 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: <20150807051201.GZ3737@ando.pearwood.info> References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> Message-ID: <55C4E041.3030804@mrabarnett.plus.com> On 2015-08-07 06:12, Steven D'Aprano wrote: > On Thu, Aug 06, 2015 at 12:26:14PM -0400, random832 at fastmail.us wrote: >> On Wed, Aug 5, 2015, at 14:56, Eric V. Smith wrote: >> > Because strings containing \{ are currently valid >> >> Which raises the question of why. > > Because \C is currently valid, for all values of C. The idea is that if > you typo an escape, say \d for \f, you get an obvious backslash in your > string which is easy to spot. > > Personally, I think that's a mistake. It leads to errors like this: > > filename = 'C:\some\path\something.txt' > > silently doing the wrong thing. If we're going to change the way escapes > work, it's time to deprecate the misfeature that \C is a literal > backslash followed by C. Outside of raw strings, a backslash should > *only* be allowed in an escape sequence. > > Deprecating invalid escape sequences would then open the door to adding > new, useful escapes. > > >> (and as long as we're talking about >> things to deprecate in string literals, how about \v?) > > Why would you want to deprecate a useful and long-standing escape > sequence? Admittedly \v isn't as common as \t or \n, but it still has > its uses, and is a standard escape familiar to anyone who uses C, C++, > C#, Octave, Haskell, Javascript, etc. 
> > If we're going to make major changes to the way escapes work, I'd rather > add new escapes, not take them away: > > > \e escape \x1B, as supported by gcc and clang; > > the escaping rules from Haskell: > > http://book.realworldhaskell.org/read/characters-strings-and-escaping-rules.html > > \P platform-specific newline (e.g. \r\n on Windows, \n on POSIX) > > \U+xxxx Unicode code point U+xxxx (with four to six hex digits) > > It's much nicer to be able to write Unicode code points that (apart from > the backslash) look like the standard Unicode notation U+0000 to > U+10FFFF, rather than needing to pad to a full eight digits as the > \U00xxxxxx syntax requires. > Some other languages, such as Perl, have \x{...}, so that would be \x{10FFF}. From alex.gronholm at nextday.fi Fri Aug 7 18:51:57 2015 From: alex.gronholm at nextday.fi (=?UTF-8?B?QWxleCBHcsO2bmhvbG0=?=) Date: Fri, 07 Aug 2015 19:51:57 +0300 Subject: [Python-ideas] Making concurrent.futures.Futures awaitable Message-ID: <55C4E22D.101@nextday.fi> There's an open issue for adding support for awaiting for concurrent.futures.Futures here: http://bugs.python.org/issue24383 This is about writing code like this: async def handler(self): result = await some_blocking_api.do_something_cpu_heavy() await self.write(result) As it stands, without this feature, some boilerplate is required: from asyncio import wrap_future async def handler(self): result = await wrap_future(some_blocking_api.do_something_cpu_heavy()) await self.write(result) I wrote a patch (with tests by Yury Selivanov) that adds __await__() to concurrent.futures.Future and augments the asyncio Task class to handle concurrent Futures. 
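The boilerplate form above can be sketched as a runnable example. This is only an illustration under stated assumptions: cpu_heavy() is a stand-in for the hypothetical some_blocking_api call, a ThreadPoolExecutor supplies the concurrent.futures.Future, and the example uses asyncio.run() and asyncio.get_running_loop(), which need Python 3.7 or later. For comparison it also shows loop.run_in_executor(), which submits the call and wraps the resulting future in one step.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def cpu_heavy(n):
    # Stand-in for the blocking call: sum of squares below n.
    return sum(i * i for i in range(n))


async def handler(executor):
    # executor.submit() returns a concurrent.futures.Future, which cannot
    # be awaited directly; wrap_future() adapts it for the event loop.
    future = executor.submit(cpu_heavy, 1000)
    wrapped = await asyncio.wrap_future(future)

    # run_in_executor() does the submit and the wrapping in a single call.
    loop = asyncio.get_running_loop()
    direct = await loop.run_in_executor(executor, cpu_heavy, 1000)
    return wrapped, direct


def main():
    with ThreadPoolExecutor(max_workers=2) as executor:
        return asyncio.run(handler(executor))


if __name__ == "__main__":
    print(main())
```

In both cases the work runs in a worker thread while the coroutine only waits for the result.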
My arguments on why we should add this:

  * it eliminates the boilerplate code, reducing complexity
  * it also makes concurrent Futures work with "yield from" style
    non-native coroutines
  * it does not interfere with any existing functionality
  * standard library components should work with each other

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org Fri Aug 7 19:08:10 2015
From: guido at python.org (Guido van Rossum)
Date: Fri, 7 Aug 2015 19:08:10 +0200
Subject: [Python-ideas] Making concurrent.futures.Futures awaitable
In-Reply-To: <55C4E22D.101@nextday.fi>
References: <55C4E22D.101@nextday.fi>
Message-ID:

FWIW, I am against this (as Alex already knows), for the same reasons I
didn't like Nick's proposal. Fuzzing the difference between threads and
asyncio tasks is IMO asking for problems -- people will stop understanding
what they are doing and then be bitten when they least need it.

The example code should be written using loop.run_in_executor(). (This
requires that do_something_cpu_heavy() be refactored into a function that
does the work and a wrapper that creates the concurrent.futures.Future.)

On Fri, Aug 7, 2015 at 6:51 PM, Alex Grönholm wrote:

> There's an open issue for adding support for awaiting for
> concurrent.futures.Futures here:
> http://bugs.python.org/issue24383
>
> This is about writing code like this:
>
> async def handler(self):
>
>     result = await some_blocking_api.do_something_cpu_heavy()
>
>     await self.write(result)
>
>
> As it stands, without this feature, some boilerplate is required:
>
> from asyncio import wrap_future
>
> async def handler(self):
>
>     result = await wrap_future(some_blocking_api.do_something_cpu_heavy())
>
>     await self.write(result)
>
>
> I wrote a patch (with tests by Yury Selivanov) that adds __await__() to
> concurrent.futures.Future
> and augments the asyncio Task class to handle concurrent Futures.
> > My arguments on why we should add this: > > - it eliminates the boilerplate code, reducing complexity > - it also makes concurrent Futures work with "yield from" style > non-native coroutines > - it does not interfere with any existing functionality > - standard library components should work with each other > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Fri Aug 7 19:08:45 2015 From: barry at python.org (Barry Warsaw) Date: Fri, 7 Aug 2015 13:08:45 -0400 Subject: [Python-ideas] String interpolation for all literal strings References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> <55C3DDD3.50807@gmail.com> Message-ID: <20150807130845.7da4d24b@anarchist.wooz.org> On Aug 06, 2015, at 06:21 PM, Yury Selivanov wrote: >Anyways, it would be nice if we can make i18n a little bit >easier and standardized in python. That would help with >adding i18n in existing projects, that weren't designed with >it in mind from start. Agreed. I have to say while I like the direction of trying to marry interpolation and translation, there are a few things about f-strings that bother me in this context. We won't know for sure until the PEP is written, but in brief: * Interpolation marker syntax. I've mentioned this before, but the reason why I wrote string.Template and adopted it for i18n is because $-strings are very familiar to translators, many of whom aren't even programmers. $-strings are pretty difficult to mess up. Anything with leading and trailing delimiters will cause problems, especially if there are multiple characters in the delimiters. 
(Yes, I know string.Template supports ${foo} but that is a compromise for
the rare cases where disambiguation of where the placeholder ends is needed.
Avoid these if possible in an i18n context.)

* Arbitrary expressions

These just add complexity. Remember that translators have to copy the
placeholder verbatim into the translated string, so any additional noise will
lead to broken translations, or worse, broken expressions (possibly also
leading to security vulnerabilities or privacy leaks!). I personally think
arbitrary expressions are overkill and unnecessary for interpolation, but if
they're adopted in the final PEP, I would just urge i18n'ers to avoid them at
all costs.

* Literals only

I've described elsewhere that accepting non-literals is useful in some cases.
If this limitation is adopted, it just means in the few cases where
non-literals are needed, the programmer will have to resort to less convenient
"under-the-hood" calls to get the strings translated. Maybe that's
acceptable.

* Global state

Most command line scripts have a single translation context, i.e. the locale
of the user's environment. But other applications, e.g. servers, can have
stacks of multiple translation contexts. As an example, imagine a Mailman
server needing to send two notifications, one to the original poster and
another to the list administrator. Those notifications are in different
languages. flufl.i18n actually implements a stack of translation contexts so
you can push the language for the poster, send the notification, then push the
context for the admin and send that notification (yes, these are context
managers). Then when you're all done, those contexts pop off the stack and
you're left with the default context.

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From barry at python.org Fri Aug 7 19:16:38 2015 From: barry at python.org (Barry Warsaw) Date: Fri, 7 Aug 2015 13:16:38 -0400 Subject: [Python-ideas] String interpolation for all literal strings References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <20150807121821.095c44d0@anarchist.wooz.org> <55C4DD63.1050907@trueblade.com> Message-ID: <20150807131638.10291d55@anarchist.wooz.org> On Aug 07, 2015, at 12:31 PM, Eric V. Smith wrote: >I think it would, because you could say this, in some language where >the order had to be reversed: >"This interpolates {1} and {0}" I think you'll find this rather error prone for translators to get right. They generally need some semantic clues to help understand how to translate the source string. Numbered placeholders will be confusing. >Now I'll grant you that it reduces usability. But it does provide the >needed functionality. > >But I can't see how we'd automatically generate useful names from >expressions, as opposed to just numbering the fields. That is, unless >we go back from general expressions to just identifiers. Or, use >something like Nick's suggestion of also passing in the text of the >expressions, so we could map identifier-only expressions to their >indexes so we could build up yet another string. Right. Again, *if* we're trying to marry i18n and interpolation, I would greatly prefer to ditch general expressions and just use identifiers. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From eric at trueblade.com Fri Aug 7 21:15:22 2015 From: eric at trueblade.com (Eric V. 
Smith) Date: Fri, 7 Aug 2015 15:15:22 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C4DDE6.7040201@egenix.com> References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <20150807121654.6b04e313@anarchist.wooz.org> <55C4DDE6.7040201@egenix.com> Message-ID: <55C503CA.1010602@trueblade.com> On 8/7/2015 12:33 PM, M.-A. Lemburg wrote: > On 07.08.2015 18:16, Barry Warsaw wrote: >> On Aug 07, 2015, at 07:50 PM, Nick Coghlan wrote: >> >>> Not quite. While I won't be entirely clear on Eric's latest proposal >>> until the draft PEP is available, my understanding is that an f-string >>> like: >>> >>> f"This interpolates \{a} and \{b}" > > I like the general idea (we had a similar discussion on this > topic a few years ago, only using i"18n" strings as syntax), but > I *really* don't like the "f" prefix on strings. > > f-words usually refer to things you typically don't want in your > code. f-strings are really no better, IMO, esp. when combined > with the u prefix. There would never be a reason to use "fu" as a prefix. "u" is only needed for python 2.x compatibility, and this feature is only for 3.6+. > Can the prefix character please be reconsidered before adding it > to the language ? > > Some other options: > > i"nternationalization" (or i"18n") > t"ranslate" > l"ocalization" (or l"10n") Well, if we generalize this to something more than just literal string formatting, maybe so. Until then, for the explicit version (as opposed to the "all strings" version), I like "f". When I'm done with that PEP we can start arguing about it. Eric. 
From yselivanov.ml at gmail.com Fri Aug 7 21:17:42 2015 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Fri, 7 Aug 2015 15:17:42 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150807130845.7da4d24b@anarchist.wooz.org> References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> <55C3DDD3.50807@gmail.com> <20150807130845.7da4d24b@anarchist.wooz.org> Message-ID: <55C50456.6010705@gmail.com> On 2015-08-07 1:08 PM, Barry Warsaw wrote: > * Arbitrary expressions > > These just add complexity. Remember than translators have to copy the > placeholder verbatim into the translated string, so any additional noise will > lead to broken translations, or worse, broken expressions (possibly also > leading to security vulnerabilities or privacy leaks!). I personally think > arbitrary expressions are overkill and unnecessary for interpolation, but if > they're adopted in the final PEP, I would just urge i18n'ers to avoid them at > all costs. Yes. And overall I think that sum = a + b print(f'the sum is {sum}') is more pythonic (readability, explicitness etc) than this: print(f'the sum is {a + b}') And that's just a trivial example. Yury From emile at fenx.com Fri Aug 7 22:34:32 2015 From: emile at fenx.com (Emile van Sebille) Date: Fri, 7 Aug 2015 13:34:32 -0700 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C50456.6010705@gmail.com> References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> <55C3DDD3.50807@gmail.com> <20150807130845.7da4d24b@anarchist.wooz.org> <55C50456.6010705@gmail.com> Message-ID: On 8/7/2015 12:17 PM, Yury Selivanov wrote: > Yes. And overall I think that > > sum = a + b > print(f'the sum is {sum}') > > is more pythonic (readability, explicitness etc) than this: > > print(f'the sum is {a + b}') except for your choice of 'sum' I'd agree. 
Otherwise shadowing builtins doesn't do any of these. Emile From ron3200 at gmail.com Fri Aug 7 22:42:26 2015 From: ron3200 at gmail.com (Ron Adam) Date: Fri, 07 Aug 2015 16:42:26 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com> Message-ID: On 08/07/2015 08:18 AM, Nick Coghlan wrote: > With the format string containing indices rather than the original > expressions, we'd still want to pass in the text of those as another > tuple, though. > > With that signature the default interpolator would look like: > > def __interpolate__(format_str, expressions, values): > return format_str.format(*values) While reading this discussion, I was thinking of what it would look like if it was reduced to a minimal pattern that would still resemble the concept being discussed without any magic. To do that, each part could be handled separately. def _(value, fmt=''): return ('{:%s}' % fmt).format(value) And then the expression becomes the very non-magical and obvious... 'abc' + _(expr1) + 'def' + _(expr2) + 'ghi' It nearly mirrors the proposed f-strings in how it reads. f"abc{expr1}def{expr2}ghi" Yes, it's a bit longer, but I thought it was interesting. It would also be easy to explain. There aren't any format specifiers in this example, but if they were present, they would be in the same order as you would see them in a format string. Cheers, Ron From ron3200 at gmail.com Fri Aug 7 23:40:27 2015 From: ron3200 at gmail.com (Ron Adam) Date: Fri, 07 Aug 2015 17:40:27 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com> Message-ID: On 08/07/2015 04:42 PM, Ron Adam wrote: > > def _(value, fmt=''): > return ('{:%s}' % fmt).format(value) Hmmm, I notice that this can be rewritten as...
_ = format 'abc' + _(expr1) + 'def' + _(expr2) + 'ghi' What surprised me is the docs say... format(format_string, *args, **kwargs) But this works... >>> format(123, '^15') '      123      ' But this doesn't.... >>> format('^15', 123) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: must be str, not int Am I missing something, or do the docs need to be changed? Cheers, Ron From eric at trueblade.com Fri Aug 7 23:54:28 2015 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 7 Aug 2015 17:54:28 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com> Message-ID: <55C52914.9060604@trueblade.com> On 8/7/2015 5:40 PM, Ron Adam wrote: > > > On 08/07/2015 04:42 PM, Ron Adam wrote: >> >> def _(value, fmt=''): >> return ('{:%s}' % fmt).format(value) > > Hmmm, I notice that this can be rewritten as... > > _ = format > 'abc' + _(expr1) + 'def' + _(expr2) + 'ghi' > > > What surprised me is the docs say... > > format(format_string, *args, **kwargs) Where do you see that? https://docs.python.org/3/library/functions.html#format Says: format(value[, format_spec]) Eric. From Nikolaus at rath.org Sat Aug 8 00:24:04 2015 From: Nikolaus at rath.org (Nikolaus Rath) Date: Fri, 07 Aug 2015 15:24:04 -0700 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150807130845.7da4d24b@anarchist.wooz.org> (Barry Warsaw's message of "Fri, 7 Aug 2015 13:08:45 -0400") References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> <55C3DDD3.50807@gmail.com> <20150807130845.7da4d24b@anarchist.wooz.org> Message-ID: <87h9oarb3v.fsf@thinkpad.rath.org> On Aug 07 2015, Barry Warsaw wrote: > * Literals only > > I've described elsewhere that accepting non-literals is useful in some > cases.
Are you saying you don't want f-strings, but you want something that looks like a function (but is actually a special form because it has access to the local context)? E.g. f(other_fn()) would perform literal interpolation on the result of other_fn()? I think that would be a very bad idea. It introduces something that looks like a function but isn't and it opens the door to a new class of injection vulnerabilities (every time you return a string it could potentially be used for interpolation at some point). Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F ?Time flies like an arrow, fruit flies like a Banana.? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 997 bytes Desc: not available URL: From wes.turner at gmail.com Sat Aug 8 00:33:01 2015 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 7 Aug 2015 17:33:01 -0500 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <87h9oarb3v.fsf@thinkpad.rath.org> References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> <55C3DDD3.50807@gmail.com> <20150807130845.7da4d24b@anarchist.wooz.org> <87h9oarb3v.fsf@thinkpad.rath.org> Message-ID: On Fri, Aug 7, 2015 at 5:24 PM, Nikolaus Rath wrote: > On Aug 07 2015, Barry Warsaw < > barry-+ZN9ApsXKcEdnm+yROfE0A at public.gmane.org> wrote: > > * Literals only > > > > I've described elsewhere that accepting non-literals is useful in some > > cases. > > Are you saying you don't want f-strings, but you want something that > looks like a function (but is actually a special form because it has > access to the local context)? E.g. f(other_fn()) would perform literal > interpolation on the result of other_fn()? > > I think that would be a very bad idea. 
It introduces something that > looks like a function but isn't and it opens the door to a new class of > injection vulnerabilities (every time you return a string it could > potentially be used for interpolation at some point). > glocals(), format_from(), lookup() (e.g. salt map.jinja stack of dicts) Contexts: * [Python-ideas] String interpolation for all literal strings * 'this should not be a {cmd}'.format(cmd=cmd) * 'this should not be a {cmd}'.format(globals() + locals() + {'cmd':cmd'}) * 'this should not be a \{cmd}' * f'this should not be a \{cmd}' * [Python-ideas] Briefer string format * [Python-ideas] Make non-meaningful backslashes illegal in string literals * u'C:\users' breaks because \u is an escape sequence * How does this interact with string interpolation (e.g. **when**, in the functional composition from string to string (with parameters), do these escape sequences get eval'd? * See: MarkupSafe (Jinja2) Justification: * "how are the resources shared relevant to these discussions?" * TL;DR * string interpolation is often dangerous (OS Command Injection and SQL Injection are the #1 and #2 according to the CWE/SANS 2011 Top 25) * string interpolation is already hard to review (because there are many ways to do it) * it's a functional composition of an AST? * Shared a number of seemingly tangential links (in python-ideas) in regards to proposals to add an additional string interpolation syntax with implicit local then global context / scope tentatively called 'f-strings'. * Bikeshedded on the \{syntax} ({{because}} {these} \{are\} more readable) * Bikeshedded on the name 'f-string', because of visual disambiguability from 'r-string' (for e.g. raw strings (and e.g. ``re``)) * Is there an AST scanner to find these? * Because a grep expression for ``f"`` or ``f'`` is not that helpful. 
* Especially as compared to ``grep ".format("`` Use Cases: ---------- As a developer, I want to: * grep, grep for string interpolations * include parameters in strings (and escape them appropriateyl) * The safer thing to do is should *usually* (often) be tokenized and e.g. quoted and serialized out * OS Commands, HTML DOM, SQL parse tree, SPARQL parse tree, CSV, TSV, (*injection* vectors with user supplied input and non-binary string-based data representation formats) * "Explicit is better than implicit" -- Zen of Python * Where are the values of these variables set? With *non* f-strings (str.format, str.__mod__) the context is explicit; and I regard that as a feature of Python. * If what is needed is a shorthand way to say * ``glocals(**kwargs) / gl()`` * ``lookup_from({}, locals(), globals())``, * ``.formatlookup(`` or ``.formatl(`` and/or not add a backwards-incompatible shortcut which is going to require additional review (as I am reviewing things that are commands or queries). * These are usually trees of tokens which are serialized for a particular context; and they are difficult because we often don't think of them in the same terms as say the Python AST; because we think we can just use string concatenation here (when there should/could be typed objects with serialization methods e.g * __str__ * __str_shell__ * __str_sql__(_, with_keywords=SQLVARIANT_KEYWORDS) With this form, the proposed f-string method would be: * __interpolate__ * [ ] Manual review * Which variables/expressions are defined or referenced here, syntax checker? * There are 3 other string interpolation syntaxes. * ``glocals(**kwargs) / gl()`` * **AND THEN**, "so I can just string-concatenate these now?" * Again, MarkupSafe __attr * Types and serialization over concatenation > > Best, > -Nikolaus > > -- > GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F > Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F > > ?Time flies like an arrow, fruit flies like a Banana.? 
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Sat Aug 8 00:36:17 2015 From: barry at python.org (Barry Warsaw) Date: Fri, 7 Aug 2015 18:36:17 -0400 Subject: [Python-ideas] String interpolation for all literal strings References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> <55C3DDD3.50807@gmail.com> <20150807130845.7da4d24b@anarchist.wooz.org> <87h9oarb3v.fsf@thinkpad.rath.org> Message-ID: <20150807183617.290b1ce9@anarchist.wooz.org> On Aug 07, 2015, at 03:24 PM, Nikolaus Rath wrote: >On Aug 07 2015, Barry Warsaw wrote: >> * Literals only >> >> I've described elsewhere that accepting non-literals is useful in some >> cases. > >Are you saying you don't want f-strings, but you want something that >looks like a function (but is actually a special form because it has >access to the local context)? E.g. f(other_fn()) would perform literal >interpolation on the result of other_fn()? Maybe I misunderstood the non-literal discussion. For translations, you will usually operate on literal strings, but sometimes you want to operate on a string via a variable. E.g. print(_('These are $apples and $oranges')) vs. print(_(as_the_saying_goes)) Nothing magical there. I guess if we're talking about a string prefix to do all the magic, the latter doesn't make any sense, except that you couldn't pass an f-string into a function that did the latter, because you'd want to defer interpolation until the call site, not at the f-string definition site. Or maybe the translateable string comes from a file and isn't ever a literal. That makes me think that we have to make sure there's a way to access the interpolation programmatically. 
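Editor's sketch of the deferred case described above, where the translatable string arrives in a variable and interpolation happens only at the call site. This is a minimal illustration using string.Template; the CATALOG dictionary and the translate_and_interpolate helper are hypothetical stand-ins, not part of gettext or of any proposal in this thread.

```python
from string import Template

# Hypothetical message catalog; a real program would use gettext instead.
CATALOG = {
    "These are $apples and $oranges": "Voici des $apples et des $oranges",
}


def translate_and_interpolate(message, **values):
    """Look up the translation first, then substitute the $-placeholders."""
    translated = CATALOG.get(message, message)
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(translated).safe_substitute(values)


# The string is held in a variable, so nothing is interpolated until the call.
as_the_saying_goes = "These are $apples and $oranges"
result = translate_and_interpolate(
    as_the_saying_goes, apples="pommes", oranges="oranges"
)
print(result)
```

Because the lookup happens before substitution, the translator only ever sees the $-placeholders, never the values.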
Cheers, -Barry From random832 at fastmail.us Sat Aug 8 01:05:30 2015 From: random832 at fastmail.us (random832 at fastmail.us) Date: Fri, 07 Aug 2015 19:05:30 -0400 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: <20150807051201.GZ3737@ando.pearwood.info> References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> Message-ID: <1438988730.134996.350742881.7868D43C@webmail.messagingengine.com> On Fri, Aug 7, 2015, at 01:12, Steven D'Aprano wrote: > > (and as long as we're talking about > > things to deprecate in string literals, how about \v?) > > Why would you want to deprecate a useful and long-standing escape > sequence? Because it doesn't do anything useful and no-one uses it. http://prog21.dadgum.com/76.html http://prog21.dadgum.com/103.html > Admittedly \v isn't as common as \t or \n, but it still has > its uses, and is a standard escape familiar to anyone who uses C, C++, > C#, Octave, Haskell, Javascript, etc. I challenge you to find *one* use in the wild. Just one. Everyone does it because everyone else does it, but it's not useful to any real users. Meanwhile, on the subject of _adding_ one, how about \e? [or \E. Both printf(1) and terminfo actually support both, and \E is more "canonical" for termcap/terminfo usage.]
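Editor's sketch for context on the \e suggestion: Python string literals have no \e escape, so the ESC character is normally spelled \x1b or \033. The bold() helper below is a hypothetical illustration and assumes a terminal that understands ANSI SGR sequences.

```python
# Python has no \e escape; ESC (0x1B) is usually written as \x1b or \033.
ESC = "\x1b"
assert ESC == "\033" == chr(27)


def bold(text):
    """Wrap text in the ANSI SGR codes for bold on/off (assumes an ANSI terminal)."""
    return f"{ESC}[1m{text}{ESC}[0m"


# repr() shows the control characters instead of sending them to the terminal.
print(repr(bold("hi")))
```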
From random832 at fastmail.us Sat Aug 8 01:15:22 2015 From: random832 at fastmail.us (random832 at fastmail.us) Date: Fri, 07 Aug 2015 19:15:22 -0400 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: <20150807051201.GZ3737@ando.pearwood.info> References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> Message-ID: <1438989322.136729.350749585.472B95D3@webmail.messagingengine.com> On Fri, Aug 7, 2015, at 01:12, Steven D'Aprano wrote: > \P platform-specific newline (e.g. \r\n on Windows, \n on POSIX) There are not actually a whole hell of a lot of situations that are otherwise cross-platform where it's _actually_ appropriate to use \r\n on Windows. How about unicode character names? Say what you will about \xA0 \u00A0 vs \U000000A0 (and incidentally are we ever going to deprecate octal escapes? Or at least make them fixed-width like all the others), but you can't really beat \N{NO-BREAK SPACE} for clarity. Of course, you'd want a fixed set rather than Perl's insanity with user-defined ones, loose ones, and short ones. From abarnert at yahoo.com Sat Aug 8 02:24:53 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 7 Aug 2015 17:24:53 -0700 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150807120958.392fa5db@anarchist.wooz.org> References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <20150807120958.392fa5db@anarchist.wooz.org> Message-ID: <35E9AF81-F149-40F1-90ED-B98615B3D905@yahoo.com> On Aug 7, 2015, at 09:09, Barry Warsaw wrote: > One other word about i18n based on experience. The escape format *really* > matters. Keep in mind that we've always had positional interpolation, via > '%(foo)s', but we found that to be very highly error prone. 
I can't tell you > how many times a translator would accidentally leave off the trailing 's', > thus breaking the translation. It's exactly the reason for string.Template -- > $-strings are familiar to almost all translators, and really hard to screw up. > I fear that something like \{ (and especially if \} is required) will be as > error prone as %(foo)s. Besides the familiarity issue, there's also a tools issue. I've worked on more than one project where we outsourced translation to companies who had (commercial or in-house) tools that recognized $var, ${var}, %s (possibly with the extended format that allows you to put position numbers in), and %1 (the last being a Microsoft thing) but nothing else. I don't know why so many of their tools are so crappy, or why they waste money on them when there are better (and often free) alternatives, but it is an argument in favor of $. From ncoghlan at gmail.com Sat Aug 8 02:49:44 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 8 Aug 2015 10:49:44 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150807131638.10291d55@anarchist.wooz.org> References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <20150807121821.095c44d0@anarchist.wooz.org> <55C4DD63.1050907@trueblade.com> <20150807131638.10291d55@anarchist.wooz.org> Message-ID: On 8 Aug 2015 03:17, "Barry Warsaw" wrote: > > On Aug 07, 2015, at 12:31 PM, Eric V. Smith wrote: > > >Now I'll grant you that it reduces usability. But it does provide the > >needed functionality. > > > >But I can't see how we'd automatically generate useful names from > >expressions, as opposed to just numbering the fields. That is, unless > >we go back from general expressions to just identifiers. Or, use > >something like Nick's suggestion of also passing in the text of the > >expressions, so we could map identifier-only expressions to their > >indexes so we could build up yet another string. > > Right. 
Again, *if* we're trying to marry i18n and interpolation, I would > greatly prefer to ditch general expressions and just use identifiers. I think we're all losing track of what's being proposed and what we'd like to make easy (I know I am), so I'm going to sit on my hands in relation to this discussion until Eric has had a chance to draft his PEP (I leave for a business trip to the US tomorrow, so I may actually stick to that kind of commitment for once!). Once Eric's draft is done, we can create a competing PEP that centres the i18n use case by building on the syntax in Ka-Ping Yee's original PEP 215 (which also inspired the string.Template syntax in PEP 292) but using the enhanced syntactic interpolation machinery from Eric's proposal. (MAL's suggestion of "i-strings" as the prefix is also interesting to me, as that would work with either "interpolated string" or "i18n string" as the mnemonic) Regards, Nick. From ron3200 at gmail.com Sat Aug 8 03:04:55 2015 From: ron3200 at gmail.com (Ron Adam) Date: Fri, 07 Aug 2015 21:04:55 -0400 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C52914.9060604@trueblade.com> References: <55C25C74.50008@trueblade.com> <20150806142741.3bebf7b3@anarchist.wooz.org> <55C49BF3.20100@trueblade.com> <55C52914.9060604@trueblade.com> Message-ID: On 08/07/2015 05:54 PM, Eric V. Smith wrote: > On 8/7/2015 5:40 PM, Ron Adam wrote: >> > >> > >> >On 08/07/2015 04:42 PM, Ron Adam wrote: >>> >> >>> >> def _(value, fmt=''): >>> >> return ('{:%s}' % fmt).format(value) >> > >> >Hmmm, I notice that this can be rewritten as... >> > >> > _ = format >> > 'abc' + _(expr1) + 'def' + _(expr2) + 'ghi' >> > >> > >> >What surprised me is the docs say... >> > >> > format(format_string, *args, **kwargs) > Where do you see that? > > https://docs.python.org/3/library/functions.html#format > > Says: format(value[, format_spec]) Here...
https://docs.python.org/3/library/string.html But it was the method I was looking at, not the function. So I think it's fine. I wonder if methods should be listed as .method_name instead of just methods name. But I suppose it's not needed. Cheers, Ron From ron3200 at gmail.com Sat Aug 8 03:52:52 2015 From: ron3200 at gmail.com (Ron Adam) Date: Fri, 07 Aug 2015 21:52:52 -0400 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> Message-ID: On 08/06/2015 12:26 PM, random832 at fastmail.us wrote: > On Wed, Aug 5, 2015, at 14:56, Eric V. Smith wrote: >> >Because strings containing \{ are currently valid > Which raises the question of why. (and as long as we're talking about > things to deprecate in string literals, how about \v?) (In the below consider x as any character.) In most languages if \x is not a valid escape character, then an error is raised. In regular expressions when \x is not a valid escape character, they just makes it x. \s ---> s \h ---> h In Python it's \ + x. \s --> \\s \h --> \\h Personally I think if \x is not a valid escape character it should raise an error. But since it's a major change in python, I think it would need to be done in a major release, possibly python4. Currently if a new escape characters needs to be added, it involve the risk of breaking currently working code. It can be handled but it's not what I think is the best approach. It would be better if we could make escape codes work only if they are valid, and raise an error if they are not. Then when/if any new escape codes are added, it's not as much of a backwards compatible problem. That means '\ ' would raise an error, and would need to be '\\ ' or r'\ '. But we probably need to wait until a major release to do this. 
I'd be for it, but I understand why a lot of people would not like it. It would mean they may need to go back and repair possibly a lot(?) of code they already written. It's not pleasant to have a customers upset when programs break. Cheers, Ron From 4kir4.1i at gmail.com Sat Aug 8 03:54:20 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Sat, 08 Aug 2015 04:54:20 +0300 Subject: [Python-ideas] Briefer string format References: <55AC2EDF.7040205@mgmiller.net> Message-ID: <87k2t6zgs3.fsf@gmail.com> Guido van Rossum writes: > On Thu, Aug 6, 2015 at 10:35 PM, Wes Turner > wrote: > >> >> On Aug 6, 2015 3:03 PM, "Guido van Rossum" wrote: >> > >> > Unfortunately, all spellings that require calling locals() are wrong. >> >> Is this where the potential source of surprising error is? >> >> * Explicit / Implicit locals() >> > This is a big deal because of the worry about code injection. A "classic" > format string given access to locals() (e.g. using s.format(**locals())) > always stirs worries about code injection if the string is a variable. The > proposed forms of string interpolation don't give access to locals *other > than the locals where the string "literal" itself exists*. This latter > access is no different from the access to locals in any expression. (The > same for globals(), of course.) > > The other issue with explicit locals() is that to the people who would most > benefit from variable interpolation (typically relatively unsophisticated > users), it is magical boilerplate. (Worse, it's boilerplate that their more > experienced mentors will warn them against because of the code injection > worry.) Googling e.g., "python locals code injection" yields nothing specific: http://stackoverflow.com/questions/2515450/injecting-variables-into-the-callers-scope http://stackoverflow.com/questions/13312240/is-a-string-formatter-that-pulls-variables-from-its-calling-scope-bad-practice Could you provide an example what is wrong with "{a}{b}".format(**vars())? 
Is it correct to say that there is nothing wrong with it as long as the string is always a *literal*? From rosuav at gmail.com Sat Aug 8 04:21:46 2015 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 8 Aug 2015 12:21:46 +1000 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: <1438989322.136729.350749585.472B95D3@webmail.messagingengine.com> References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> <1438989322.136729.350749585.472B95D3@webmail.messagingengine.com> Message-ID: On Sat, Aug 8, 2015 at 9:15 AM, wrote: > On Fri, Aug 7, 2015, at 01:12, Steven D'Aprano wrote: >> \P platform-specific newline (e.g. \r\n on Windows, \n on POSIX) > > There are not actually a whole hell of a lot of situations that are > otherwise cross-platform where it's _actually_ appropriate to use \r\n > on Windows. > > How about unicode character names? Say what you will about \xA0 \u00A0 > vs \U000000A0 (and incidentally are we ever going to deprecate octal > escapes? Or at least make them fixed-width like all the others), but you > can't really beat \N{NO-BREAK SPACE} for clarity. Of course, you'd want > a fixed set rather than Perl's insanity with user-defined ones, loose > ones, and short ones. Not sure what you're saying here. Python already has those. >>> ACUTE = "\N{COMBINING ACUTE ACCENT}" >>> print("Libe{0}re{0}e, de{0}livre{0}e!".format(ACUTE)) Libérée, délivrée! They do get just a _tad_ verbose, though. Are you suggesting adding short forms for them, something like: >>> print("Libe\N{ACUTE}re\N{ACUTE}e, de\N{ACUTE}livre\N{ACUTE}e!") ? Because that might be nice, but then someone has to decide what the short forms mean. We can always define our own local aliases the way I did up above; it'd be nice if constant folding could make this as simple as the \N escapes are, but that's a micro-optimization.
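Editor's sketch of the local-alias approach described above, written as a small script rather than an interactive session. It uses only the standard unicodedata module; the variable names are illustrative.

```python
import unicodedata

# A named escape and a runtime lookup by name give the same character.
ACUTE = "\N{COMBINING ACUTE ACCENT}"
assert ACUTE == "\u0301" == unicodedata.lookup("COMBINING ACUTE ACCENT")

# Build the word with the local alias, as in the message above.
word = "Libe{0}re{0}e".format(ACUTE)

# NFC normalization folds each "e" + combining accent into one code point.
composed = unicodedata.normalize("NFC", word)
print(len(word), len(composed))

# The reverse direction: recover the official name of a character.
print(unicodedata.name("\xa0"))
```

The alias costs one format call at run time, which is the overhead the message refers to; the \N escape itself is resolved when the literal is compiled.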
ChrisA From random832 at fastmail.us Sat Aug 8 04:51:12 2015 From: random832 at fastmail.us (random832 at fastmail.us) Date: Fri, 07 Aug 2015 22:51:12 -0400 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> <1438989322.136729.350749585.472B95D3@webmail.messagingengine.com> Message-ID: <1439002272.180487.350829633.321B878F@webmail.messagingengine.com> On Fri, Aug 7, 2015, at 22:21, Chris Angelico wrote: > Not sure what you're saying here. Python already has those. Er, so it does. I tried it in the interactive interpreter (it turns out, on python 2.7, with what was therefore a byte string literal, which I didn't realize when I tried it), and it didn't work, and then I searched online to figure out where I remembered it from and it seemed to be a perl thing. From Nikolaus at rath.org Sat Aug 8 05:18:10 2015 From: Nikolaus at rath.org (Nikolaus Rath) Date: Fri, 07 Aug 2015 20:18:10 -0700 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <20150807183617.290b1ce9@anarchist.wooz.org> (Barry Warsaw's message of "Fri, 7 Aug 2015 18:36:17 -0400") References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> <55C3DDD3.50807@gmail.com> <20150807130845.7da4d24b@anarchist.wooz.org> <87h9oarb3v.fsf@thinkpad.rath.org> <20150807183617.290b1ce9@anarchist.wooz.org> Message-ID: <87r3nepix9.fsf@vostro.rath.org> On Aug 07 2015, Barry Warsaw wrote: > On Aug 07, 2015, at 03:24 PM, Nikolaus Rath wrote: > >>On Aug 07 2015, Barry Warsaw >> >> wrote: >>> * Literals only >>> >>> I've described elsewhere that accepting non-literals is useful in some >>> cases. 
>> >>Are you saying you don't want f-strings, but you want something that >>looks like a function (but is actually a special form because it has >>access to the local context)? E.g. f(other_fn()) would perform literal >>interpolation on the result of other_fn()? That should have been "perform string interpolation", not "perform literal interpolation". > > Maybe I misunderstood the non-literal discussion. For translations, you > will usually operate on literal strings, but sometimes you want to operate on > a string via a variable. E.g. > > print(_('These are $apples and $oranges')) > > vs. > > print(_(as_the_saying_goes)) > > Nothing magical there. > > I guess if we're talking about a string prefix to do all the magic, > the latter doesn't make any sense, except that you couldn't pass an > f-string into a function that did the latter, because you'd want to > defer interpolation until the call site, not at the f-string > definition site. Or maybe the translateable string comes from a file > and isn't ever a literal. That makes me think that we have to make > sure there's a way to access the interpolation programmatically. Er, but that already exists. There is %, there is format, and there is string.Template. So I'm a little confused about what exactly you are arguing for (or against)? The one issue that would make sense in this context is to *combine* string interpolation and translation (as I believe Nick suggested), i.e. any literal of the form f"what a {quality} idea" would first be passed to a translation routine and then be subject to string interpolation. In that case it would also make sense to restrict interpolation to variables rather than arbitrary expressions (so that translators are less likely to break things). Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F “Time flies like an arrow, fruit flies like a Banana.”
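Editor's sketch of the three existing programmatic mechanisms named in the message above (%, str.format and string.Template), applied to the same example sentence. The variable names are illustrative only.

```python
from string import Template

quality = "great"

# 1. printf-style formatting with a mapping.
by_percent = "what a %(quality)s idea" % {"quality": quality}

# 2. str.format with a keyword argument.
by_format = "what a {quality} idea".format(quality=quality)

# 3. string.Template with $-placeholders (PEP 292).
by_template = Template("what a $quality idea").substitute(quality=quality)

print(by_percent)
print(by_format)
print(by_template)
```

In all three the format string is an ordinary value, so it can come from a variable, a file or a translation catalog, which is what distinguishes them from a literal-only prefix.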
From steve at pearwood.info Sat Aug 8 05:45:39 2015 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 8 Aug 2015 13:45:39 +1000 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals In-Reply-To: <1438988730.134996.350742881.7868D43C@webmail.messagingengine.com> References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com> <20150807051201.GZ3737@ando.pearwood.info> <1438988730.134996.350742881.7868D43C@webmail.messagingengine.com> Message-ID: <20150808034539.GE3737@ando.pearwood.info> On Fri, Aug 07, 2015 at 07:05:30PM -0400, random832 at fastmail.us wrote: > On Fri, Aug 7, 2015, at 01:12, Steven D'Aprano wrote: > > > (and as long as we're talking about > > > things to deprecate in string literals, how about \v?) > > > > Why would you want to deprecate a useful and long-standing escape > > sequence? > > Because it doesn't do anything useful and no-one uses it. [...] > I challenge you to find *one* use in the wild. Just one. I'll take that challenge. Here are SEVEN uses for \v in the real world: (1) Microsoft Word uses \v as a non-breaking end-of-paragraph marker. https://support.microsoft.com/en-au/kb/59096 (2) Similarly, it's also used in pptx files, for the same purpose. (3) .mer files use \v as embedded newlines within a single field. http://fmforums.com/topic/83079-exporting-to-mer-for-indesign/ (4) Similarly Filemaker can use \v as the end of line separator. (5) Quote: "In the medical industry, VT is used as the start of frame character in the MLLP/LLP/HLLP protocols that are used to frame HL-7 data." Source: http://stackoverflow.com/a/29479184 (6) Raster3D to Postscript conversion: http://manpages.ubuntu.com/manpages/natty/man1/r3dtops.1.html (7) Generating Tektronix 4010/4014 print files: http://odl.sysworks.biz/disk$cddoc04mar21/decw$book/d33vaaa8.p137.decw$book > Everyone does it because everyone else does it, but it's not useful to > any real users. 
Provided that we dismiss those who use \v as "not real users", you are
correct.

--
Steve

From abarnert at yahoo.com Sat Aug 8 06:56:20 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 7 Aug 2015 21:56:20 -0700
Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals
In-Reply-To:
References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com>
Message-ID:

On Aug 7, 2015, at 18:52, Ron Adam wrote:
>
>> On 08/06/2015 12:26 PM, random832 at fastmail.us wrote:
>>> On Wed, Aug 5, 2015, at 14:56, Eric V. Smith wrote:
>>> >Because strings containing \{ are currently valid
>> Which raises the question of why. (and as long as we're talking about
>> things to deprecate in string literals, how about \v?)
>
> (In the below consider x as any character.)
>
> In most languages if \x is not a valid escape character, then an error is raised.

Which most languages? In C, sh, perl, and most of their respective
descendants, it means x. (Perl also goes out of its way to guarantee that
if x is a punctuation character, it will never mean anything but x in any
future version, either in strings or in regexps, so it's always safe to
unnecessarily escape punctuation instead of remembering the rules for
what punctuation to escape.)

The only language I can think of off the top of my head that raises an
error is Haskell.

I like the Haskell behavior better than the C/perl behavior, especially
given the backward compatibility issues with Python up to 3.5 if it
switched, but I don't think it's what most languages do.
From random832 at fastmail.us Sat Aug 8 07:31:07 2015 From: random832 at fastmail.us (random832 at fastmail.us) Date: Sat, 08 Aug 2015 01:31:07 -0400 Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals Message-ID: <1439011867.262533.350876681.3FE4519E@webmail.messagingengine.com> On Fri, Aug 7, 2015, at 23:45, Steven D'Aprano wrote: > > I challenge you to find *one* use in the wild. Just one. > > I'll take that challenge. Here are SEVEN uses for \v in the real world I should have better defined what I meant by "use". It has to be A) The actual \v escape, actually used, in a string literal in source code. I was asking for real-world uses of _the escape_. I'd guess about half of the ones you named actually are embodied in source code in a C-derived language somewhere - but probably not in Python. B) For the actual vertical tab function, rather than some other purpose the byte is being repurposed for. For other functions, the \v spelling obfuscates rather than illuminating. C) Code merely used for parsing string literal formats that themselves define \v, naturally, don't count. Mentioned for completeness, since otherwise these would technically satisfy the other two while accomplishing nothing useful. The vertical tab function is clearly defined as moving the cursor down (or the paper up) one or more lines to a predetermined position - e.g. a multiple of six lines, just as a tab conventionally takes you to the next multiple of eight columns, or a position that has been programmed into the device by use of other control mechanisms: VT - LINE TABULATION Notation: (C0) Representation: 00/11 VT causes the active presentation position to be moved in the presentation component to the corresponding character position on the line at which the following line tabulation stop is set. 
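For reference, here is a small sketch of how Python 3 itself treats the character that \v produces. It only uses built-in str methods; the variable name vt is made up for the example, and nothing here says anything about how often the escape is used in real code.

```python
vt = "\v"  # the escape under discussion

# \v is just another spelling of U+000B (LINE TABULATION).
print(vt == "\x0b", ord(vt))

# str methods treat it as whitespace and as a line boundary.
print(vt.isspace())
print("a\vb".split())
print("a\vb".splitlines())
```

So the str type gives the character whitespace and line-break semantics, but nothing in the language implements the "move to the next vertical tab stop" function quoted above; that is left to whatever device or program consumes the output.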
From random832 at fastmail.us Sat Aug 8 07:52:53 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Sat, 08 Aug 2015 01:52:53 -0400
Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals
In-Reply-To:
References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com>
Message-ID: <1439013173.274371.350876753.1641011C@webmail.messagingengine.com>

On Sat, Aug 8, 2015, at 00:56, Andrew Barnert via Python-ideas wrote:
> Which most languages? In C, sh, perl, and most of their respective
> descendants, it means x.

In C it is undefined behavior. Many compilers will provide a warning,
even for extensions they do define such as \e. C incidentally provides
\u at a lower level than string literals (they can appear anywhere in
source code), and it may not specify most ASCII characters, even in
string literals.

In POSIX sh, there is no support for any special backslash escape.
Backslash before _any_ character outside of quotes makes that character
literal - that is, \n is n, not newline. I wouldn't really regard this
as the same kind of context. For completeness, I will note that inside
double quotes, backslash before any character other than those it is
required to escape (such as ` " or $) includes the backslash in the
result. Inside single quotes, backslash has no special meaning at all.

In POSIX echo, the behavior is implementation-defined. Some existing
implementations include the backslash like Python.

In POSIX printf, the behavior is unspecified. Some existing
implementations include the backslash.

In ksh $'strings', it means the literal character, no backslash.

In bash $'strings', it includes the backslash.
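To put Python next to the languages surveyed above, the following sketch evaluates two string literals from their source text. The helper name parse_literal is made up for the example. It assumes an interpreter in which an unrecognised escape is still only a warning (newer versions warn at compile time, and a later version may reject such literals outright), so the warning is silenced explicitly.

```python
import ast
import warnings


def parse_literal(source_text):
    """Evaluate a string literal given as source text.

    Warnings are silenced because newer Python versions warn about
    unrecognised escape sequences such as \\{ when compiling them.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(source_text)


unknown = parse_literal(r"'\{'")  # backslash before a non-escape character
known = parse_literal(r"'\v'")    # recognised escape: vertical tab

# The unknown escape keeps its backslash: two characters, '\' and '{'.
print(len(unknown), unknown == "\\" + "{")
# The known escape collapses to a single character, U+000B.
print(len(known), known == "\x0b")
```

In other words Python behaves like the "include the backslash" implementations listed above, rather than like C (x alone) or Haskell (an error).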
From Steve.Dower at microsoft.com Sat Aug 8 05:26:11 2015 From: Steve.Dower at microsoft.com (Steve Dower) Date: Sat, 8 Aug 2015 03:26:11 +0000 Subject: [Python-ideas] Briefer string format In-Reply-To: <87k2t6zgs3.fsf@gmail.com> References: <55AC2EDF.7040205@mgmiller.net> , <87k2t6zgs3.fsf@gmail.com> Message-ID: > Could you provide an example what is wrong with "{a}{b}".format(**vars())? >>> ["{a}{b}".format(**vars()) for _ in range(1)] Comprehensions have their own scope. This needs to be a compile-time transform into a normal variable lookup. Cheers, Steve Top-posted from my Windows Phone ________________________________ From: Akira Li Sent: ?8/?7/?2015 18:55 To: python-ideas at python.org Subject: Re: [Python-ideas] Briefer string format Guido van Rossum writes: > On Thu, Aug 6, 2015 at 10:35 PM, Wes Turner > wrote: > >> >> On Aug 6, 2015 3:03 PM, "Guido van Rossum" wrote: >> > >> > Unfortunately, all spellings that require calling locals() are wrong. >> >> Is this where the potential source of surprising error is? >> >> * Explicit / Implicit locals() >> > This is a big deal because of the worry about code injection. A "classic" > format string given access to locals() (e.g. using s.format(**locals())) > always stirs worries about code injection if the string is a variable. The > proposed forms of string interpolation don't give access to locals *other > than the locals where the string "literal" itself exists*. This latter > access is no different from the access to locals in any expression. (The > same for globals(), of course.) > > The other issue with explicit locals() is that to the people who would most > benefit from variable interpolation (typically relatively unsophisticated > users), it is magical boilerplate. (Worse, it's boilerplate that their more > experienced mentors will warn them against because of the code injection > worry.) 
Googling e.g., "python locals code injection" yields nothing specific:

http://stackoverflow.com/questions/2515450/injecting-variables-into-the-callers-scope
http://stackoverflow.com/questions/13312240/is-a-string-formatter-that-pulls-variables-from-its-calling-scope-bad-practice

Could you provide an example what is wrong with "{a}{b}".format(**vars())?

Is it correct to say that there is nothing wrong with it as long as the
string is always a *literal*?
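A runnable sketch of the scope problem raised in the reply at the top of this message. Two things here are assumptions rather than part of the thread: a lambda is used in place of the list comprehension, because how much a comprehension's locals() can see has varied between Python versions while a nested function always has its own namespace; and the second function uses the f-string syntax under discussion, which requires Python 3.6 or later. The function names are made up for the example.

```python
def with_vars():
    a, b = "spam", "eggs"
    # The lambda has its own (empty) local namespace. The names only occur
    # inside a string, so the compiler never makes them visible there, and
    # str.format cannot find them.
    return (lambda: "{a}{b}".format(**vars()))()


def with_fstring():
    a, b = "spam", "eggs"
    # Here a and b are ordinary variable references that the compiler can
    # see, so they are captured as closure variables.
    return (lambda: f"{a}{b}")()


try:
    with_vars()
except KeyError as exc:
    print("format(**vars()) failed, missing key:", exc)

print(with_fstring())
```

This is the sense in which the lookup "needs to be a compile-time transform": only the compiler knows which enclosing names the string refers to.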
From ncoghlan at gmail.com Sat Aug 8 10:12:33 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 8 Aug 2015 18:12:33 +1000
Subject: [Python-ideas] Making concurrent.futures.Futures awaitable
In-Reply-To:
References: <55C4E22D.101@nextday.fi>
Message-ID:

On 8 August 2015 at 03:08, Guido van Rossum wrote:
> FWIW, I am against this (as Alex already knows), for the same reasons I
> didn't like Nick's proposal. Fuzzing the difference between threads and
> asyncio tasks is IMO asking for problems -- people will stop understanding
> what they are doing and then be bitten when they least need it.

I'm against concurrent.futures offering native asyncio support as well
- that dependency already goes the other way, from asyncio down to
concurrent.futures by way of the loop's pool executor.

The only aspect of my previous suggestions I'm still interested in is
a name and signature change from "loop.run_in_executor(executor,
callable)" to "loop.call_in_background(callable, *, executor=None)".
Currently, the recommended way to implement a blocking call like
Alex's example is this:

    import asyncio

    async def handler(self):
        loop = asyncio.get_event_loop()
        result = await loop.run_in_executor(None, some_blocking_api.some_blocking_call)
        await self.write(result)

I now see four concrete problems with this specific method name and
signature:

    * we don't run functions, we call them
    * we do run event loops, but this call doesn't start an event loop
      running
    * "executor" only suggests "background call" to folks that already
      know how concurrent.futures works
    * we require the explicit "None" boilerplate to say "use the
      default executor", rather than using the more idiomatic approach of
      accepting an alternate executor as an optional keyword-only argument

With the suggested change to the method name and signature, the same
example would instead look like:

    async def handler(self):
        loop = asyncio.get_event_loop()
        result = await loop.call_in_background(some_blocking_api.some_blocking_call)
        await self.write(result)

That should make sense to anyone reading the handler, even if they
know nothing about concurrent.futures - the precise mechanics of how
the event loop goes about handing off the call to a background thread
or process is something they can explore later, they don't need to
know about it in order to locally reason about this specific handler.

It also means that event loops would be free to implement their
*default* background call functionality using something other than
concurrent.futures, and only switch to the latter if an executor was
specified explicitly.
There are still some open questions about whether it makes sense to allow callables to indicate whether or not they expect to be IO bound or CPU bound, and hence allow event loop implementations to opt to dispatch the latter to a process pool by default (I saw someone suggest that recently, and I find the idea intriguing), but I think that's a separate question from dispatching a given call for parallel execution, with the result being awaited via a particular event loop. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Sat Aug 8 11:31:28 2015 From: guido at python.org (Guido van Rossum) Date: Sat, 8 Aug 2015 11:31:28 +0200 Subject: [Python-ideas] Making concurrent.futures.Futures awaitable In-Reply-To: References: <55C4E22D.101@nextday.fi> Message-ID: +1on the name change. On Aug 8, 2015 10:12 AM, "Nick Coghlan" wrote: > On 8 August 2015 at 03:08, Guido van Rossum wrote: > > FWIW, I am against this (as Alex already knows), for the same reasons I > > didn't like Nick's proposal. Fuzzing the difference between threads and > > asyncio tasks is IMO asking for problems -- people will stop > understanding > > what they are doing and then be bitten when they least need it. > > I'm against concurrent.futures offering native asyncio support as well > - that dependency already goes the other way, from asyncio down to > concurrent.futures by way of the loop's pool executor. > > The only aspect of my previous suggestions I'm still interested in is > a name and signature change from "loop.run_in_executor(executor, > callable)" to "loop.call_in_background(callable, *, executor=None)". 
> > Currently, the recommended way to implement a blocking call like > Alex's example is this: > > from asyncio import get_event_loop > > async def handler(self): > loop = asyncio.get_event_loop() > result = await loop.run_in_executor(None, > some_blocking_api.some_blocking_call) > await self.write(result) > > I now see four concrete problems with this specific method name and > signature: > > * we don't run functions, we call them > * we do run event loops, but this call doesn't start an event loop > running > * "executor" only suggests "background call" to folks that already > know how concurrent.futures works > * we require the explicit "None" boilerplate to say "use the > default executor", rather than using the more idiomatic approach of > accepting an alternate executor as an optional keyword only argument > > With the suggested change to the method name and signature, the same > example would instead look like: > > async def handler(self): > loop = asyncio.get_event_loop() > result = await > loop.call_in_background(some_blocking_api.some_blocking_call) > await self.write(result) > > That should make sense to anyone reading the handler, even if they > know nothing about concurrent.futures - the precise mechanics of how > the event loop goes about handing off the call to a background thread > or process is something they can explore later, they don't need to > know about it in order to locally reason about this specific handler. > > It also means that event loops would be free to implement their > *default* background call functionality using something other than > concurrent.futures, and only switch to the latter if an executor was > specified explicitly. 
>
> There are still some open questions about whether it makes sense to
> allow callables to indicate whether or not they expect to be IO bound
> or CPU bound, and hence allow event loop implementations to opt to
> dispatch the latter to a process pool by default (I saw someone
> suggest that recently, and I find the idea intriguing), but I think
> that's a separate question from dispatching a given call for parallel
> execution, with the result being awaited via a particular event loop.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
>

From cs at zip.com.au Sat Aug 8 11:49:48 2015
From: cs at zip.com.au (Cameron Simpson)
Date: Sat, 8 Aug 2015 19:49:48 +1000
Subject: [Python-ideas] PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules
Message-ID: <20150808094948.GA65914@cskk.homeip.net>

I was recently bitten by the fact that the command:

  python -m foo

pulls in the module and attaches it as sys.modules['__main__'], but not to
sys.modules['foo']. Should the program also:

  import foo

it pulls in the same module code, but binds a completely independent separate
instance of it to sys.modules['foo'].

This is counterintuitive; it is a natural expectation that "python -m foo"
imports "foo" in a normal fashion. If the program modifies items in "foo",
those modifications are not reflected in "__main__", since these are two
distinct modules.

I propose that "python -m foo" imports foo as normal, binding it to
sys.modules["__main__"] as at present, but that it also binds the module to
sys.modules["foo"]. This will remove the disconnect between "python -m foo"
and a program's internal "import foo".

For people who are concerned that the module's .__name__ is "__main__", note
that the module's resolved "official" name is present in .__spec__.name as
described in PEP 451.
There are two recent discussion threads on this in python-list at:

  https://mail.python.org/pipermail/python-list/2015-August/694905.html

and in python-ideas at:

  https://mail.python.org/pipermail/python-ideas/2015-August/034947.html

Please give them a read and give this PEP your thoughts.

The raw text of the PEP is below. It feels uncontroversial to me, but then
it would:-)

It is visible on the web here:

  https://www.python.org/dev/peps/pep-0499/

and I've made a public repository to track the text as it evolves here:

  https://bitbucket.org/cameron_simpson/pep-0499/

Cheers,
Cameron Simpson

PEP: 499
Title: ``python -m foo`` should bind ``sys.modules['foo']`` in addition to ``sys.modules['__main__']``
Version: $Revision$
Last-Modified: $Date$
Author: Cameron Simpson
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 07-Aug-2015
Python-Version: 3.6

Abstract
========

When a module is used as a main program on the Python command line,
such as by:

    python -m module.name ...

it is easy to accidentally end up with two independent instances
of the module if that module is again imported within the program.
This PEP proposes a way to fix this problem.

When a module is invoked via Python's -m option the module is bound
to ``sys.modules['__main__']`` and its ``.__name__`` attribute is set to
``'__main__'``.
This enables the standard "main program" boilerplate code at the
bottom of many modules, such as::

    if __name__ == '__main__':
        sys.exit(main(sys.argv))

However, when the above command line invocation is used it is a
natural inference to presume that the module is actually imported
under its official name ``module.name``,
and therefore that if the program again imports that name
then it will obtain the same module instance.

The actuality is that the module was imported only as ``'__main__'``.
Another import will obtain a distinct module instance, which can
lead to confusing bugs.
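The duplicate-instance effect described above can be imitated in a single process. The sketch below is an illustration only and is not what ``runpy`` does: it executes one source file twice under two names with ``importlib``, then patches one copy and checks the other. The file name ``foo_demo.py``, the stand-in name ``__main__demo`` and the helper ``load_copy`` are all made up for the example.

```python
import importlib.util
import pathlib
import tempfile


def load_copy(name, path):
    """Execute the source at `path` as a fresh module object called `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / "foo_demo.py"
        path.write_text("settings = {'debug': False}\n")

        # Stands in for the instance bound to sys.modules['__main__'].
        as_main = load_copy("__main__demo", str(path))
        # Stands in for the instance created by a later `import foo`.
        as_foo = load_copy("foo_demo", str(path))

        # "Monkey patch" the module through its imported name.
        as_foo.settings["debug"] = True

        # Same source, two objects: the patch never reaches the main copy.
        return as_main is as_foo, as_main.settings["debug"]


print(demo())
```

Both values come back false: the two module objects are distinct, and the change made through the imported name is invisible to the copy that plays the role of the running program.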
Proposal ======== It is suggested that to fix this situation all that is needed is a simple change to the way the ``-m`` option is implemented: in addition to binding the module object to ``sys.modules['__main__']``, it is also bound to ``sys.modules['module.name']``. Nick Coghlan has suggested that this is as simple as modifying the ``runpy`` module's ``_run_module_as_main`` function as follows:: main_globals = sys.modules["__main__"].__dict__ to instead be:: main_module = sys.modules["__main__"] sys.modules[mod_spec.name] = main_module main_globals = main_module.__dict__ Considerations and Prerequisites ================================ Pickling Modules ---------------- Nick has mentioned `issue 19702`_ which proposes (quoted from the issue): - runpy will ensure that when __main__ is executed via the import system, it will also be aliased in sys.modules as __spec__.name - if __main__.__spec__ is set, pickle will use __spec__.name rather than __name__ to pickle classes, functions and methods defined in __main__ - multiprocessing is updated appropriately to skip creating __mp_main__ in child processes when __main__.__spec__ is set in the parent process The first point above covers this PEP's specific proposal. Background ========== `I tripped over this issue`_ while debugging a main program via a module which tried to monkey patch a named module, that being the main program module. Naturally, the monkey patching was ineffective as it imported the main module by name and thus patched the second module instance, not the running module instance. However, the problem has been around as long as the ``-m`` command line option and is encountered regularly, if infrequently, by others. In addition to `issue 19702`_, the discrepancy around `__main__` is alluded to in PEP 451 and a similar proposal (predating PEP 451) is described in PEP 395 under `Fixing dual imports of the main module`_. References ========== .. _issue 19702: http://bugs.python.org/issue19702 .. 
_I tripped over this issue: https://mail.python.org/pipermail/python-list/2015-August/694905.html .. _Fixing dual imports of the main module: https://www.python.org/dev/peps/pep-0395/#fixing-dual-imports-of-the-main-module Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: From srkunze at mail.de Sat Aug 8 12:28:33 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Sat, 08 Aug 2015 12:28:33 +0200 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C50456.6010705@gmail.com> References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> <55C3DDD3.50807@gmail.com> <20150807130845.7da4d24b@anarchist.wooz.org> <55C50456.6010705@gmail.com> Message-ID: <55C5D9D1.9030109@mail.de> On 07.08.2015 21:17, Yury Selivanov wrote: > Yes. And overall I think that > > sum = a + b > print(f'the sum is {sum}') > > is more pythonic (readability, explicitness etc) than this: > > print(f'the sum is {a + b}') > I have to admit I like shorter one more. It is equally well readable and explicit AND it is shorter. As long as people do not abuse expressions to a degree of unreadability (which should be covered by code reviews when it comes to corporal code), I am fine with exposing more possibilities. From rosuav at gmail.com Sat Aug 8 12:30:25 2015 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 8 Aug 2015 20:30:25 +1000 Subject: [Python-ideas] PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules In-Reply-To: <20150808094948.GA65914@cskk.homeip.net> References: <20150808094948.GA65914@cskk.homeip.net> Message-ID: On Sat, Aug 8, 2015 at 7:49 PM, Cameron Simpson wrote: > The raw text of the PEP is below. 
It feels uncontroversial to me, but then > it would:-) > I'm not sure that it'll be uncontroversial, but I agree with it :) The risk that I see (as I mentioned in the previous thread, but reiterating for those who just came in) is that it becomes possible to import something whose __name__ is not what you imported. Currently, you can "import math" and see that math.__name__ is "math", or "import urllib.parse" and, as you'd expect, urllib.parse.__name__ is "urllib.parse". In the few cases where it isn't exactly what you imported, it's the canonical name for it - for instance, os.path.__name__ is posixpath on my system. The change proposed here means that the canonical name for the module you're running as the main file is now "__main__", and not whatever else it would have been. Consequences for pickle/multiprocessing/Windows are mentioned in the PEP. Are there any other places where a module's name is checked? ChrisA From alex.gronholm at nextday.fi Sat Aug 8 14:47:46 2015 From: alex.gronholm at nextday.fi (=?UTF-8?B?QWxleCBHcsO2bmhvbG0=?=) Date: Sat, 08 Aug 2015 15:47:46 +0300 Subject: [Python-ideas] Making concurrent.futures.Futures awaitable In-Reply-To: References: <55C4E22D.101@nextday.fi> Message-ID: <55C5FA72.7030303@nextday.fi> 08.08.2015, 11:12, Nick Coghlan kirjoitti: > On 8 August 2015 at 03:08, Guido van Rossum wrote: >> FWIW, I am against this (as Alex already knows), for the same reasons I >> didn't like Nick's proposal. Fuzzing the difference between threads and >> asyncio tasks is IMO asking for problems -- people will stop understanding >> what they are doing and then be bitten when they least need it. > I'm against concurrent.futures offering native asyncio support as well > - that dependency already goes the other way, from asyncio down to > concurrent.futures by way of the loop's pool executor. Nobody is suggesting that. The __await__ support suggested for concurrent Futures is generic and has no ties whatsoever to asyncio. 
> The only aspect of my previous suggestions I'm still interested in is
> a name and signature change from "loop.run_in_executor(executor,
> callable)" to "loop.call_in_background(callable, *, executor=None)".
That name and argument placement would be better, but are you
suggesting that the ability to pass along extra arguments should be
removed? The original method was bad enough in that it only supported
positional and not keyword arguments, forcing users to pass partial()
objects as callables.
> Currently, the recommended way to implement a blocking call like
> Alex's example is this:
>
>     import asyncio
>
>     async def handler(self):
>         loop = asyncio.get_event_loop()
>         result = await loop.run_in_executor(None, some_blocking_api.some_blocking_call)
>         await self.write(result)
>
> I now see four concrete problems with this specific method name and signature:
>
>     * we don't run functions, we call them
>     * we do run event loops, but this call doesn't start an event loop running
>     * "executor" only suggests "background call" to folks that already
>       know how concurrent.futures works
>     * we require the explicit "None" boilerplate to say "use the
>       default executor", rather than using the more idiomatic approach of
>       accepting an alternate executor as an optional keyword only argument
>
> With the suggested change to the method name and signature, the same
> example would instead look like:
>
>     async def handler(self):
>         loop = asyncio.get_event_loop()
>         result = await loop.call_in_background(some_blocking_api.some_blocking_call)
>         await self.write(result)
Am I the only one who's bothered by the fact that you have to get a
reference to the event loop first?
Wouldn't this be better: async def handler(self): result = await asyncio.call_in_background(some_blocking_api.some_blocking_call) await self.write(result) The call_in_background() function would return an awaitable object that is recognized by the asyncio Task class, which would then submit the function to the default executor of the event loop. > That should make sense to anyone reading the handler, even if they > know nothing about concurrent.futures - the precise mechanics of how > the event loop goes about handing off the call to a background thread > or process is something they can explore later, they don't need to > know about it in order to locally reason about this specific handler. > > It also means that event loops would be free to implement their > *default* background call functionality using something other than > concurrent.futures, and only switch to the latter if an executor was > specified explicitly. Do you mean background calls that don't return objects compatible with concurrent.futures.Futures? Can you think of a use case for this? > > There are still some open questions about whether it makes sense to > allow callables to indicate whether or not they expect to be IO bound > or CPU bound, What do you mean by this? > and hence allow event loop implementations to opt to > dispatch the latter to a process pool by default Bad idea! The semantics are too different and process pools have too many limitations. > (I saw someone > suggest that recently, and I find the idea intriguing), but I think > that's a separate question from dispatching a given call for parallel > execution, with the result being awaited via a particular event loop. > > Cheers, > Nick. 
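A minimal sketch of what a loop-free helper along these lines could look like today, built only on existing asyncio calls. Everything about call_in_background here is hypothetical: no such function exists in asyncio, this version is a plain coroutine function rather than the special awaitable recognised by Task that is described above, and forwarding extra positional and keyword arguments through functools.partial is just one possible answer to the question about extra arguments. It assumes Python 3.7 or later for asyncio.run() and asyncio.get_running_loop().

```python
import asyncio
import functools
import time


async def call_in_background(func, *args, executor=None, **kwargs):
    """Hypothetical helper: run a blocking callable in an executor.

    executor=None selects the running loop's default executor, so the
    caller needs neither the explicit None nor a loop reference.
    """
    loop = asyncio.get_running_loop()
    bound = functools.partial(func, *args, **kwargs)
    return await loop.run_in_executor(executor, bound)


def some_blocking_call(delay, *, reply="done"):
    time.sleep(delay)  # stands in for blocking I/O
    return reply


async def handler():
    return await call_in_background(some_blocking_call, 0.01, reply="finished")


print(asyncio.run(handler()))
```

One cost of this shape is that a callable which itself takes a keyword argument named executor can no longer be called directly; it would have to be wrapped in partial() by the caller after all.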
>

From ron3200 at gmail.com Sat Aug 8 15:09:50 2015
From: ron3200 at gmail.com (Ron Adam)
Date: Sat, 08 Aug 2015 09:09:50 -0400
Subject: [Python-ideas] Make non-meaningful backslashes illegal in string literals
In-Reply-To:
References: <55C25C74.50008@trueblade.com> <1438878374.3655303.349546561.115FD27C@webmail.messagingengine.com>
Message-ID:

On 08/08/2015 12:56 AM, Andrew Barnert via Python-ideas wrote:
> On Aug 7, 2015, at 18:52, Ron
> Adam wrote:
>>>
>>>>> On 08/06/2015 12:26
>>>>> PM, random832 at fastmail.us wrote:
>>>>>>> On Wed, Aug 5, 2015, at 14:56, Eric V. Smith wrote:
>>>>>>>>> Because strings containing \{ are currently valid
>>>>> Which raises the question of why. (and as long as we're talking
>>>>> about things to deprecate in string literals, how about \v?)
>>>
>>> (In the below consider x as any character.)
>>>
>>> In most languages if \x is not a valid escape character, then an
>>> error is raised.
> Which most languages? In C, sh, perl, and most of their respective
> descendants, it means x. (Perl also goes out of its way to guarantee
> that if x is a punctuation character, it will never mean anything but x
> in any future version, either in strings or in regexps, so it's always
> safe to unnecessarily escape punctuation instead of remembering the
> rules for what punctuation to escape.)

Actually this is what I thought, but when looking up what other
languages do in this case, it was either not documented or suggested it
raised an error. Apparently in C, it is supposed to raise an error, but
compilers have supported echoing the escaped character instead.

From https://en.wikipedia.org/wiki/Escape_sequences_in_C

--------------------
Non-standard escape sequences

A sequence such as \z is not a valid escape sequence according to the C
standard as it is not found in the table above. The C standard requires
such "invalid" escape sequences to be diagnosed (i.e., the compiler must
print an error message).
Notwithstanding this fact, some compilers may define additional escape sequences, with implementation-defined semantics. An example is the \e escape sequence, which has 1B as the hexadecimal value in ASCII, represents the escape character, and is supported in GCC,[1] clang and tcc. --------------------- > The only language I can think of off the top my head that raises an > error is Haskell. > I like the Haskell behavior better than the C/perl behavior, especially > given the backward compatibility issues with Python up to 3.5 if it > switched, but I don't think it's what most languages do. I like the Haskell behaviour as well. Cheers, Ron From stefan_ml at behnel.de Sat Aug 8 18:49:30 2015 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 08 Aug 2015 18:49:30 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: <55AC3425.5010509@mgmiller.net> References: <55AC2EDF.7040205@mgmiller.net> <55AC3425.5010509@mgmiller.net> Message-ID: Mike Miller schrieb am 20.07.2015 um 01:35: > csstext += '{nl}{key}{space}{{{nl}'.format(**locals()) > > This looks a bit better if you ignore the right half, but it is longer and not > as simple as one might hope. It is much longer still if you type out the > variables needed as kewword params! The '{}' option is not much improvement > either. > > csstext += '{nl}{key}{space}{{{nl}'.format(nl=nl, key=key, ... # uggh > csstext += '{}{}{}{{{}'.format(nl, key, space, nl) > > I've long wished python could format strings easily like bash or perl do, ... > and then it hit me: > > csstext += f'{nl}{key}{space}{{{nl}' > > An "f-formatted" string could automatically format with the locals dict. Not > yet sure about globals, and unicode only suggested for now. Perhaps could be > done directly to avoid the .format() function call, which adds some overhead > and tends to double the length of the line? Is this an actual use case that people *commonly* run into? 
I understand that the implicit name lookups here are safe and all that, but I cannot recall ever actually using locals() for string formatting. The above looks magical to me. It's completely unclear that string interpolation is happening behind my back here, unless I already know it. I think it's ok to have a "b" string prefix produce a special kind of string and expect people to guess that and look up what it does if they don't know (and syntax like a string prefix is difficult enough to look up already). Having an "f" prefix interpolate the string with names from the current namespace is way beyond what I would expect a string prefix to do. I'd prefer not seeing a "cool feature" added just "because it's cool". If it additionally is magic, it's usually not a good idea. Stefan From jonathan at slenders.be Sat Aug 8 19:23:19 2015 From: jonathan at slenders.be (Jonathan Slenders) Date: Sat, 8 Aug 2015 19:23:19 +0200 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C5D9D1.9030109@mail.de> References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> <55C3DDD3.50807@gmail.com> <20150807130845.7da4d24b@anarchist.wooz.org> <55C50456.6010705@gmail.com> <55C5D9D1.9030109@mail.de> Message-ID: Why don't we allow any possible expression to be used in the context of a decorator? E.g. this is not possible. @a + b def function(): pass While these are: @a(b + c) @a.b @a.b.c def function(): pass I guess there we also had a discussion about whether or not to limit the grammar, and I guess we had a reason. I don't like the idea to give the user too much freedom in f-string. A simple expression like addition, ok. But no comprehension, lambdas, etc... It's impossible to go back if this turns out badly, but we can always add more freedom later on. One more coments after reading the PEP: - I don't like that double braces are replaced by a single brace. 
Why not keep backslash \{ \} for the literals. In the PEP we have '{...}' for variables. (Instead of '\{...}') So that works fine.

Jonathan

2015-08-08 12:28 GMT+02:00 Sven R. Kunze :

> On 07.08.2015 21:17, Yury Selivanov wrote:
>
>> Yes. And overall I think that
>>
>>     sum = a + b
>>     print(f'the sum is {sum}')
>>
>> is more pythonic (readability, explicitness etc) than this:
>>
>>     print(f'the sum is {a + b}')
>
> I have to admit I like the shorter one more. It is equally well readable and explicit AND it is shorter.
>
> As long as people do not abuse expressions to a degree of unreadability (which should be covered by code reviews when it comes to corporate code), I am fine with exposing more possibilities.

From eric at trueblade.com Sat Aug 8 23:34:03 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Sat, 8 Aug 2015 17:34:03 -0400
Subject: [Python-ideas] String interpolation for all literal strings
In-Reply-To:
References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> <55C3DDD3.50807@gmail.com> <20150807130845.7da4d24b@anarchist.wooz.org> <55C50456.6010705@gmail.com> <55C5D9D1.9030109@mail.de>
Message-ID: <55C675CB.7010304@trueblade.com>

On 8/8/2015 1:23 PM, Jonathan Slenders wrote:
> Why don't we allow any possible expression to be used in the context of a decorator? E.g. this is not possible.
>
>     @a + b
>     def function():
>         pass
>
> While these are:
>
>     @a(b + c)
>     @a.b
>     @a.b.c
>     def function():
>         pass
>
> I guess there we also had a discussion about whether or not to limit the grammar, and I guess we had a reason.
>
> I don't like the idea to give the user too much freedom in f-string.
A > simple expression like addition, ok. But no comprehension, lambdas, > etc... It's impossible to go back if this turns out badly, but we can > always add more freedom later on. Yes, there's been a fair amount of discussion on this. The trick would be finding a place in the grammar that allows enough, but not too much expressiveness. I personally think it should just be a code review item. Is there really anything wrong with: >>> msg = 'apple' >>> f'The sign said "{msg.upper()}".' 'The sign said "APPLE".' > One more coments after reading the PEP: > - I don't like that double braces are replaced by a single brace. Why > not keep backslash \{ \} for the literals. In the PEP we have '{...}' > for variables. (Instead of '\{...}') So that works fine. I kept the double braces to maximize compatibility with str.format. Eric. From rosuav at gmail.com Sun Aug 9 01:16:48 2015 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 9 Aug 2015 09:16:48 +1000 Subject: [Python-ideas] String interpolation for all literal strings In-Reply-To: <55C675CB.7010304@trueblade.com> References: <55C25C74.50008@trueblade.com> <55C3B77D.6020608@gmail.com> <20150806175351.0f4c8001@anarchist.wooz.org> <55C3DDD3.50807@gmail.com> <20150807130845.7da4d24b@anarchist.wooz.org> <55C50456.6010705@gmail.com> <55C5D9D1.9030109@mail.de> <55C675CB.7010304@trueblade.com> Message-ID: On Sun, Aug 9, 2015 at 7:34 AM, Eric V. Smith wrote: > Yes, there's been a fair amount of discussion on this. The trick would > be finding a place in the grammar that allows enough, but not too much > expressiveness. I personally think it should just be a code review item. > Is there really anything wrong with: > >>>> msg = 'apple' >>>> f'The sign said "{msg.upper()}".' > 'The sign said "APPLE".' Not in my opinion. I know it's always possible to make something _more_ powerful later on, and it's hard to make it _less_ powerful, but in this case, I'd be happy to see this with the full power of an expression. 
Anything that you could put after "lambda:" should be valid here, which according to Grammar/grammar is called "test".

ChrisA

From cs at zip.com.au Sun Aug 9 01:18:44 2015
From: cs at zip.com.au (Cameron Simpson)
Date: Sun, 9 Aug 2015 09:18:44 +1000
Subject: [Python-ideas] PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules
In-Reply-To:
References:
Message-ID: <20150808231844.GA53936@cskk.homeip.net>

On 08Aug2015 20:30, Chris Angelico wrote:
>On Sat, Aug 8, 2015 at 7:49 PM, Cameron Simpson wrote:
>> The raw text of the PEP is below. It feels uncontroversial to me, but then it would:-)
>
>I'm not sure that it'll be uncontroversial, but I agree with it :)
>
>The risk that I see (as I mentioned in the previous thread, but reiterating for those who just came in) is that it becomes possible to import something whose __name__ is not what you imported. Currently, you can "import math" and see that math.__name__ is "math", or "import urllib.parse" and, as you'd expect, urllib.parse.__name__ is "urllib.parse". In the few cases where it isn't exactly what you imported, it's the canonical name for it - for instance, os.path.__name__ is posixpath on my system. The change proposed here means that the canonical name for the module you're running as the main file is now "__main__", and not whatever else it would have been.

I think I take the line that as of PEP 451 the canonical name for a module is .__spec__.name. The module's .__name__ normally matches that, but obviously in the case of "python -m" it does not.

As you point out, suddenly a module can appear somewhere other than sys.modules['__main__'] where that difference shows.

Let's ask the associated question: who introspects module.__name__ and expects it to be the canonical name? For what purpose?

I'm of the opinion that those cases are few, and that they should in any case be updated to consult .__spec__.name these days (with, I suppose, fallback for older Python versions).
I think that is the case even without the change suggested by PEP 499.

Cheers,
Cameron Simpson

From ncoghlan at gmail.com Sun Aug 9 02:22:01 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 9 Aug 2015 10:22:01 +1000
Subject: [Python-ideas] Making concurrent.futures.Futures awaitable
In-Reply-To: <55C5FA72.7030303@nextday.fi>
References: <55C4E22D.101@nextday.fi> <55C5FA72.7030303@nextday.fi>
Message-ID:

On 8 Aug 2015 22:48, "Alex Grönholm" wrote:
>
> That name and argument placement would be better, but are you suggesting that the ability to pass along extra arguments should be removed? The original method was bad enough in that it only supported positional and not keyword arguments, forcing users to pass partial() objects as callables.

That's a deliberate design decision in many of asyncio's APIs to improve the introspection capabilities and to clearly separate concerns between "interacting with the event loop" and "the operation being dispatched for execution".

>> With the suggested change to the method name and signature, the same
>> example would instead look like:
>>
>>     async def handler(self):
>>         loop = asyncio.get_event_loop()
>>         result = await loop.call_in_background(some_blocking_api.some_blocking_call)
>>         await self.write(result)
>
> Am I the only one who's bothered by the fact that you have to get a reference to the event loop first?
> Wouldn't this be better:
>
>     async def handler(self):
>         result = await asyncio.call_in_background(some_blocking_api.some_blocking_call)
>         await self.write(result)

That was my original suggestion a few weeks ago, but after playing with it for a while, I came to agree with Guido that hiding the event loop in this case likely wasn't helpful to the conceptual learning process.

Outside higher level frameworks that place more constraints on your code, you really can't get very far with asyncio without becoming comfortable with interacting with the event loop directly.
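For reference, the existing spelling discussed in this thread, loop.run_in_executor, can be sketched roughly as below. This is only an illustration, not the proposed API: call_in_background is a name suggested in the thread and is deliberately not used here. The names blocking_call and handler are hypothetical, and asyncio.run and asyncio.get_running_loop assume Python 3.7 or later, which postdates this discussion.

```python
import asyncio
import time


def blocking_call():
    # Hypothetical stand-in for some_blocking_api.some_blocking_call.
    time.sleep(0.1)
    return "done"


async def handler():
    # Get the loop explicitly, as in the thread's example.
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default executor (a thread pool).
    result = await loop.run_in_executor(None, blocking_call)
    return result


result = asyncio.run(handler())
print(result)
```

Extra positional arguments can be passed after the callable; keyword arguments still need functools.partial, which is the limitation raised earlier in the thread.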
I gave a demo using the current spelling as a lightning talk at PyCon Australia last weekend: https://www.youtube.com/watch?v=_pfJZfdwkgI The only part of that demo I really wasn't happy with was the "run_in_executor" call - the rest all felt good for the level asyncio operates at, while still allowing higher level third party APIs that hide more of the underlying machinery (like the event loop itself, as well as the use of partial function application). > > > The call_in_background() function would return an awaitable object that is recognized by the asyncio Task class, which would then submit the function to the default executor of the event loop. > >> That should make sense to anyone reading the handler, even if they >> know nothing about concurrent.futures - the precise mechanics of how >> the event loop goes about handing off the call to a background thread >> or process is something they can explore later, they don't need to >> know about it in order to locally reason about this specific handler. >> >> It also means that event loops would be free to implement their >> *default* background call functionality using something other than >> concurrent.futures, and only switch to the latter if an executor was >> specified explicitly. > > Do you mean background calls that don't return objects compatible with concurrent.futures.Futures? A background call already returns an asyncio awaitable, not a concurrent.futures.Future object. > Can you think of a use case for this? Yes, third party event loops like Twisted may have their own background call mechanism that they'd prefer to use by default, rather than the concurrent.futures model. >> There are still some open questions about whether it makes sense to >> allow callables to indicate whether or not they expect to be IO bound >> or CPU bound, > > What do you mean by this? There was a thread on the idea recently, but I don't have a link handy. 
Indicating CPU vs IO bound directly wouldn't work (that's context dependent), but allowing callables to explicitly indicate "recommended", "supported", "incompatible" for process pools could be interesting. >> and hence allow event loop implementations to opt to >> dispatch the latter to a process pool by default > > Bad idea! The semantics are too different and process pools have too many limitations. Yes, that's why I find it an intriguing notion to allow callables to explicitly indicate whether or not they're compatible with them. Cheers, Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-ideas at mgmiller.net Sun Aug 9 02:48:39 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Sat, 08 Aug 2015 17:48:39 -0700 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55AC2EDF.7040205@mgmiller.net> <55AC3425.5010509@mgmiller.net> Message-ID: <55C6A367.1010706@mgmiller.net> On 08/08/2015 09:49 AM, Stefan Behnel wrote: > Mike Miller schrieb am 20.07.2015 um 01:35: >> csstext += '{nl}{key}{space}{{{nl}'.format(**locals()) >> >> This looks a bit better if you ignore the right half, but it is longer and not >> >> csstext += '{nl}{key}{space}{{{nl}'.format(nl=nl, key=key, ... # uggh >> csstext += '{}{}{}{{{}'.format(nl, key, space, nl) > > Is this an actual use case that people *commonly* run into? I understand > that the implicit name lookups here are safe and all that, but I cannot > recall ever actually using locals() for string formatting. There are several ways to accomplish that line. If you look below it there two alternatives, that are suboptimal as well. > The above looks magical to me. It's completely unclear that string > ... > I'd prefer not seeing a "cool feature" added just "because it's cool". If > it additionally is magic, it's usually not a good idea. > Direct string interpolation is a widely desired feature, something the neckbeards of old, hipsters, and now suits have all agreed on. 
Since Python uses both " and ' for strings, there isn't an obvious way to separate normal strings from interpolated ones like shell languages do. That leaves, 1. interpolating all strings, or instead 2. marking those we want interpolated. Marking them appears to be the more popular solution here, the last detail is whether it should be f'', i'', or $'', etc. Letters are easier to read perhaps. The implementation is straightforward also. Since the feature will take about 30 seconds to learn, and pay back with a billion keystrokes saved, I'd argue it's a good tradeoff. -Mike From abarnert at yahoo.com Sun Aug 9 07:12:02 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 8 Aug 2015 22:12:02 -0700 Subject: [Python-ideas] PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules In-Reply-To: <20150808231844.GA53936@cskk.homeip.net> References: <20150808231844.GA53936@cskk.homeip.net> Message-ID: <75C9216C-5A2A-4DBD-8F57-361E47D67C02@yahoo.com> On Aug 8, 2015, at 16:18, Cameron Simpson wrote: > I think I take the line that as of PEP 451 the conanical name for a module is .__spec__.name. The module's .__name__ normally matches that, but obviously in the case of "python -m" it does not. > > As you point out, suddenly a module can appear somewhere other than sys.modules['__main__'] where that difference shows. > > Let's ask the associated question: who introspects module.__name__ and expects it to be the cononical name? For what purpose? I'd think the first place to look is code that deals directly with module objects and/or sys.modules--graphical debuggers, plugin frameworks, bridges (a la AppScript or PyObjC), etc. 
Especially since many of them want to retain compatibility with 3.3, if not 3.2, and to share as much code as possible with a 2.x version Of course you're probably right that there aren't too many such things, and they're also presumably written by people who know what they're doing and wouldn't have too much trouble adapting them for 3.6+ if needed. From joejev at gmail.com Sun Aug 9 09:05:42 2015 From: joejev at gmail.com (Joseph Jevnik) Date: Sun, 9 Aug 2015 03:05:42 -0400 Subject: [Python-ideas] PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules In-Reply-To: <75C9216C-5A2A-4DBD-8F57-361E47D67C02@yahoo.com> References: <20150808231844.GA53936@cskk.homeip.net> <75C9216C-5A2A-4DBD-8F57-361E47D67C02@yahoo.com> Message-ID: If I have a package that defines both a __main__ and a __init__, then your change would bind the __main__ to the name instead of the __init__. That seems incorrect. On Sun, Aug 9, 2015 at 1:12 AM, Andrew Barnert via Python-ideas < python-ideas at python.org> wrote: > On Aug 8, 2015, at 16:18, Cameron Simpson wrote: > > I think I take the line that as of PEP 451 the conanical name for a > module is .__spec__.name. The module's .__name__ normally matches that, but > obviously in the case of "python -m" it does not. > > > > As you point out, suddenly a module can appear somewhere other than > sys.modules['__main__'] where that difference shows. > > > > Let's ask the associated question: who introspects module.__name__ and > expects it to be the cononical name? For what purpose? > > I'd think the first place to look is code that deals directly with module > objects and/or sys.modules--graphical debuggers, plugin frameworks, bridges > (a la AppScript or PyObjC), etc. 
Especially since many of them want to > retain compatibility with 3.3, if not 3.2, and to share as much code as > possible with a 2.x version > > Of course you're probably right that there aren't too many such things, > and they're also presumably written by people who know what they're doing > and wouldn't have too much trouble adapting them for 3.6+ if needed. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Sun Aug 9 10:00:33 2015 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 09 Aug 2015 10:00:33 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: <55C6A367.1010706@mgmiller.net> References: <55AC2EDF.7040205@mgmiller.net> <55AC3425.5010509@mgmiller.net> <55C6A367.1010706@mgmiller.net> Message-ID: Mike Miller schrieb am 09.08.2015 um 02:48: > On 08/08/2015 09:49 AM, Stefan Behnel wrote: >> Mike Miller schrieb am 20.07.2015 um 01:35: >>> csstext += '{nl}{key}{space}{{{nl}'.format(**locals()) >>> >>> This looks a bit better if you ignore the right half, but it is longer >>> and not >>> >>> csstext += '{nl}{key}{space}{{{nl}'.format(nl=nl, key=key, ... # uggh >>> csstext += '{}{}{}{{{}'.format(nl, key, space, nl) >> >> Is this an actual use case that people *commonly* run into? I understand >> that the implicit name lookups here are safe and all that, but I cannot >> recall ever actually using locals() for string formatting. > > There are several ways to accomplish that line. If you look below it there > two alternatives, that are suboptimal as well. > >> The above looks magical to me. It's completely unclear that string >> ... >> I'd prefer not seeing a "cool feature" added just "because it's cool". If >> it additionally is magic, it's usually not a good idea. 
> > Direct string interpolation is a widely desired feature, something the > neckbeards of old, hipsters, and now suits have all agreed on. But how common is it, really? Almost all of the string formatting that I've used lately is either for logging (no help from this proposal here) or requires some kind of translation/i18n *before* the formatting, which is not helped by this proposal either. Meaning, in almost all cases, the formatting will use some more or less simple variant of this pattern: result = process("string with {a} and {b}").format(a=1, b=2) which commonly collapses into result = translate("string with {a} and {b}", a=1, b=2) by wrapping the concrete use cases in appropriate helper functions. I've seen Nick Coghlan's proposal for an implementation backed by a global function, which would at least catch some of these use cases. But it otherwise seems to me that this is a huge sledge hammer solution for a niche problem. Stefan From abarnert at yahoo.com Sun Aug 9 12:27:30 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 9 Aug 2015 03:27:30 -0700 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55AC2EDF.7040205@mgmiller.net> <55AC3425.5010509@mgmiller.net> <55C6A367.1010706@mgmiller.net> Message-ID: On Aug 9, 2015, at 01:00, Stefan Behnel wrote: > > Mike Miller schrieb am 09.08.2015 um 02:48: >>> On 08/08/2015 09:49 AM, Stefan Behnel wrote: >>> Mike Miller schrieb am 20.07.2015 um 01:35: >>>> csstext += '{nl}{key}{space}{{{nl}'.format(**locals()) >>>> >>>> This looks a bit better if you ignore the right half, but it is longer >>>> and not >>>> >>>> csstext += '{nl}{key}{space}{{{nl}'.format(nl=nl, key=key, ... # uggh >>>> csstext += '{}{}{}{{{}'.format(nl, key, space, nl) >>> >>> Is this an actual use case that people *commonly* run into? I understand >>> that the implicit name lookups here are safe and all that, but I cannot >>> recall ever actually using locals() for string formatting. 
>>
>> There are several ways to accomplish that line. If you look below it there two alternatives, that are suboptimal as well.
>>
>>> The above looks magical to me. It's completely unclear that string
>>> ...
>>> I'd prefer not seeing a "cool feature" added just "because it's cool". If it additionally is magic, it's usually not a good idea.
>>
>> Direct string interpolation is a widely desired feature, something the neckbeards of old, hipsters, and now suits have all agreed on.
>
> But how common is it, really? Almost all of the string formatting that I've used lately is either for logging (no help from this proposal here) or requires some kind of translation/i18n *before* the formatting, which is not helped by this proposal either.

There are also text-based protocols, file formats, etc. I use string formatting quite a bit for those, and this proposal would help there.

Also, do you really never need formatting in log messages? Do you only use highly structured log formats? I'm always debug-logging things like "restored position {}-{}x{}-{} not in current desktop bounds {}x{}" or "file '{}' didn't exist, creating" and so on, and this proposal would help there as well.

But you're right that i18n is a bigger problem than it appears. I went back through some of my Swift code, and some of the blog posts others have written about how nifty string interpolation is, and I remembered something really obvious that I'd forgotten: Most user-interface strings in Cocoa[Touch] apps are in the interface-builder objects, or at least explicitly in strings files, not in the source code. (I don't know if this is similar for C# 6's similar feature, since I haven't done much C# since .NET was still called .NET, but I wouldn't be surprised.)

So you don't have to i18n source-code strings very often. But when you do, you have to revert to the clunky ObjC way of creating an NSLocalizedString with %@ placeholders and calling stringWithFormat: on it.
In Python, user-interface strings are very often in the source code, so you'd have to revert to the 3.5 style all the time. Which isn't nearly as clunky as the ObjC style, but still... Now I have to go back and reread Nick's posts to see if his translated-and-interpolated string protocol makes sense and would be easy to use for at least GNU gettext and Cocoa (even without his quasi-associated semi-proposal for sort-of-macros), because without that, I'm no longer sure this is a good idea. If the feature helps tremendously for non-i18n user interface strings, but then you have to throw it away to i18n your code, that could just discourage people from writing programs that work outside the US. (I suspect Nick and others already made this argument, and better, so apologies if I'm being slow here.) From cs at zip.com.au Sun Aug 9 12:34:49 2015 From: cs at zip.com.au (Cameron Simpson) Date: Sun, 9 Aug 2015 20:34:49 +1000 Subject: [Python-ideas] PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules In-Reply-To: References: Message-ID: <20150809103449.GA17997@cskk.homeip.net> On 09Aug2015 03:05, Joseph Jevnik wrote: >If I have a package that defines both a __main__ and a __init__, then your >change would bind the __main__ to the name instead of the __init__. That >seems incorrect. Yes. Yes it does. I just did a quick test package named "testmod" via "python -m testmod" and: - __init__.py has the __name__ "testmod" - __main__.py has the __name__ "__main__" in both python 2.7 and python 3.4. Since my test script reports: % python3.4 -m testmod __init__.py: /Users/cameron/rc/python/testmod/__init__.py testmod __main__.py: /Users/cameron/rc/python/testmod/__main__.py __main__ % python2.7 -m testmod ('__init__.py:', '/Users/cameron/rc/python/testmod/__init__.pyc', 'testmod') ('__main__.py:', '/Users/cameron/rc/python/testmod/__main__.py', '__main__') would it be enough to say that this change should only apply if the module is not a package? 
I'll do some more fiddling to see exactly what happens in packages when I import pieces of them, too. Cheers, Cameron Simpson From stefan_ml at behnel.de Sun Aug 9 13:01:11 2015 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 09 Aug 2015 13:01:11 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55AC2EDF.7040205@mgmiller.net> <55AC3425.5010509@mgmiller.net> <55C6A367.1010706@mgmiller.net> Message-ID: Andrew Barnert via Python-ideas schrieb am 09.08.2015 um 12:27: > do you really never need formatting in log messages? Do you only use > highly structured log formats? I'm always debug-logging things like > "restored position {}-{}x{}-{} not in current desktop bounds {}x{}" or > "file '{}' didn't exist, creating" and so on, and this proposal would > help there as well. Sure, I use formatting there. But the formatting is intentionally done *after* checking that the output passes the current log level. The proposal is about providing a way to format a string literal *before* anyone can do something else with it. So it won't help for logging. Especially not for debug logging. Also, take another look at your examples. They use positional formatting, not named formatting. This proposal requires the use of named formatting and only applies to the exact case where the names or expressions used in the template match the names used for (local/global) variables. As soon as the expressions become non-trivial or the variable names become longer (in order to be descriptive), having to use the same lengthy names and expressions in the template, or at least having to assign them to new local variables before-hand only to make them available for string formatting, will quickly get in the way more than it helps. 
With .format(), I can (and usually will) just say output = "writing {filename} ...".format( filename=self.build_printable_relative_filename(filename)) rather than having to say printable_filename = self.build_printable_relative_filename(filename) output = f"writing {printable_filename} ..." # magic happening here del printable_filename # not used anywhere else As soon as you leave the cosy little niche where *all* values are prepared ahead of time and stored in beautiful local variables with tersely short and well-chosen names that make your string template as readable as your code, this feature is not for you any more. Stefan From abarnert at yahoo.com Sun Aug 9 14:26:57 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 9 Aug 2015 05:26:57 -0700 Subject: [Python-ideas] Briefer string format In-Reply-To: References: <55AC2EDF.7040205@mgmiller.net> <55AC3425.5010509@mgmiller.net> <55C6A367.1010706@mgmiller.net> Message-ID: <673633DC-81C7-4C07-AC92-F4B17390492D@yahoo.com> On Aug 9, 2015, at 04:01, Stefan Behnel wrote: > > Andrew Barnert via Python-ideas schrieb am 09.08.2015 um 12:27: >> do you really never need formatting in log messages? Do you only use >> highly structured log formats? I'm always debug-logging things like >> "restored position {}-{}x{}-{} not in current desktop bounds {}x{}" or >> "file '{}' didn't exist, creating" and so on, and this proposal would >> help there as well. > > Sure, I use formatting there. But the formatting is intentionally done > *after* checking that the output passes the current log level. Maybe it's just me, but 90%+ of my debug log messages are really only dev log messages for the current cycle/sprint/whatever, and I strip them out before pushing. 
So as long as they're not getting in the way of performance for the feature I'm working on in the environment I'm working in, there's no reason not to write them as quick&dirty as possible, forcing me to decide which ones will actually be useful in debugging user problems, and clean up and profile them as I do so.

And quite often, it turns out that wasting time on a str.format call even when debug logging is turned off really doesn't have any measurable impact anyway, so I may end up using it in the main string or in one of the %s arguments anyway, if it's more readable that way.

> The proposal is about providing a way to format a string literal *before* anyone can do something else with it. So it won't help for logging. Especially not for debug logging.
>
> Also, take another look at your examples. They use positional formatting, not named formatting.

Yes, but that's because with simple messages in 3.5, positional formatting is more convenient. Under the proposal, that would change. Compare:

    "file '{}' didn't exist, creating".format(fname)
    "file '{fname}' didn't exist, creating".format(fname=fname)
    "file '{fname}' didn't exist, creating".format(**vars())
    f"file '{fname}' didn't exist, creating"

The second version has me repeating the name three times, while the third forces me to think about which scope to pass in, and is still more verbose and more error-prone than the first. But the last one doesn't have either of those problems. Hence the attraction.

And of course there's nothing forcing you to use it all the time; when it's not appropriate (and it won't always be), str.format is still there.
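The four spellings compared above can be checked side by side. The following is a minimal sketch, assuming Python 3.6 or later (where the f prefix from the proposal is available); the function name render_all and the sample file name are hypothetical and exist only for illustration.

```python
def render_all(fname):
    # 1. Positional placeholder.
    positional = "file '{}' didn't exist, creating".format(fname)
    # 2. Named placeholder: the name appears three times.
    named = "file '{fname}' didn't exist, creating".format(fname=fname)
    # 3. Named placeholder filled from the current scope; inside a
    #    function, vars() is the local namespace, which holds fname.
    from_scope = "file '{fname}' didn't exist, creating".format(**vars())
    # 4. The proposed f-string form.
    interpolated = f"file '{fname}' didn't exist, creating"
    return positional, named, from_scope, interpolated


results = render_all("settings.ini")
print(results[3])
```

All four produce the same string, so the difference between them is purely one of how much the reader and writer have to repeat or look up.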
From python-ideas at mgmiller.net Sun Aug 9 21:41:12 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Sun, 09 Aug 2015 12:41:12 -0700
Subject: [Python-ideas] Briefer string format
In-Reply-To:
References: <55AC2EDF.7040205@mgmiller.net> <55AC3425.5010509@mgmiller.net> <55C6A367.1010706@mgmiller.net>
Message-ID: <55C7ACD8.9030201@mgmiller.net>

On 08/09/2015 04:01 AM, Stefan Behnel wrote:
> Sure, I use formatting there. But the formatting is intentionally done *after* checking that the output passes the current log level. The proposal is about providing a way to format a string literal *before* anyone can do something else with it. So it won't help for logging. Especially not for debug logging.

This discussion reminds me of a debate I had with a co-worker last year. I think I argued "your" side on that one, Stefan. He insisted on writing log lines like this:

    log.debug('File "{filename}" has {lines} lines.'.format(filename=filename, lines=lines))  # etc

He said this form was most readable, because you could ignore the right side. While I said we should log like this, not only because it's shorter, but also because the formatting doesn't happen unless the log level is reached:

    log.debug('File "%s" has %s lines.', filename, lines)

I also argued on performance grounds, but when I tried to prove it in real applications the difference was almost nothing, perhaps because the logger has to check a few things before deciding to format the string. Logging from a tight loop probably would create more overhead, but we've rarely done that. So performance didn't turn out to be a good reason to choose in most cases. In a tight loop you could still use isEnabledFor(level) for example.

This experience did inform my original feature request, the result is now shorter, more readable, and the performance hit is negligible:

    log.debug(f'File "{filename}" has {lines} lines.')

Also, my coworker and I would be able to move on to the next argument.
;) Another feature request would be to have logging support .format syntax like it does printf syntax, anyone know why that never happened? > > output = "writing {filename} ...".format( > filename=self.build_printable_relative_filename(filename)) > > rather than having to say > > printable_filename = self.build_printable_relative_filename(filename) > output = f"writing {printable_filename} ..." # magic happening here > del printable_filename # not used anywhere else This is where I disagree. Because if I have an important variable that I am using and bothering to log, such as a filename, I undoubtedly am going to use it again soon, to do an operation with it. So, I'll want to keep that variable around to use it again, rather than doing a recalculation. Cheers, -Mike From stefan_ml at behnel.de Sun Aug 9 22:26:46 2015 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 09 Aug 2015 22:26:46 +0200 Subject: [Python-ideas] Briefer string format In-Reply-To: <673633DC-81C7-4C07-AC92-F4B17390492D@yahoo.com> References: <55AC2EDF.7040205@mgmiller.net> <55AC3425.5010509@mgmiller.net> <55C6A367.1010706@mgmiller.net> <673633DC-81C7-4C07-AC92-F4B17390492D@yahoo.com> Message-ID: Andrew Barnert via Python-ideas schrieb am 09.08.2015 um 14:26: > And of course there's nothing forcing you to use it all the time; when > it's not appropriate (and it won't always be), str.format is still > there. Yes, I think that's what I dislike most about it. It's only a special purpose feature that forces me to learn two things instead of one in order to use it. Or actually more than two. I have to learn how to use it, I have to understand the limitations and learn to detect when I reach them (especially in terms of code style), and know how to transform my code to make it work again afterwards. 
One of the obvious quick comments on reddit was this: https://xkcd.com/927/ Stefan From ron3200 at gmail.com Sun Aug 9 23:20:10 2015 From: ron3200 at gmail.com (Ron Adam) Date: Sun, 09 Aug 2015 17:20:10 -0400 Subject: [Python-ideas] Outside the box string formatting idea Message-ID: While the discussion of string formatting has focused on the concept of "format strings", it seems to me they are really format expressions. The expressions in this case are a bit like comprehensions in that they have special syntax/rules that is only valid in a small limited context. It may be possible to do that in this case as an actual expression rather than as a string and still get most of the benefits. Shorter more compressed isn't always better when it comes to readability. Consider this as a trial balloon. If there is clear consensus against it, no problem. It can be added to the rejected ideas section of PEP-498. Below are some examples to compare from the PEP-498. I strongly suggest posting the examples into idle or a syntax highlighting editor to really see what difference it makes in readability. (it won't run, but the highlighting will work.) I used % here, but that's just a place holder at this point, some other symbol could be used ... if this idea has any merit. Actually it could even be reduced to a parentheses with items separated by spaces. Examples from PEP: f'My name is {name}, my age next year is {age+1}, my anniversary is {anniversary:%A, %B %d, %Y}.' (% 'My name is ' name ', my age next year is ' (age+1) ', my anniversary is ' {anniversary ':%A, %B %d, %Y'} '.') # The {expres fmt_spec} sytnax is only valid in (% ...) expressions. f'He said his name is {name!r}.' 
(% 'He said his name is ' {name '!r'} '.') f'abc{expr1:spec1}{expr2!r:spec2}def{expr3:!s}ghi' (% 'abc' {expr1 'spec1'} {expr2 '!r:spec2'} 'def' expr3 'ghi') f'result={foo()}' (% 'result =' foo()) x = 10 y = 'hi' result = a' 'b' f'{x}' 'c' f'str<{y:^4}>' 'd' 'e' result = (% 'a' 'b' x 'c' '<' {y '^4'} '>' 'd' 'e') result = (% 'ab' x 'c<' {y '^4'} '>de') >>> f'{{k:v for k, v in [(1, 2), (3, 4)}}' '{k:v for k, v in [(1, 2), (3, 4)}' (% {k:v for k, v in [(1, 2), (3, 4)}) This evaluates to a dictionary first. Most expressions naturally evaluate from the inside out, left to right. So there's no conflict. It just converts the dict comprehension to a string. The features allowed in (% ...) 1. Implicit concatenation is allowed for expressions separated by a space. But only in an explicit and limited context. 2. Expressions and format specs can be combined in braces separated by a space. Expressions without format specs don't need them. 3. It maintains the separation of strings and expressions. It does require typing more quotes than a f'...' format string, but the separation of items by only spaces along with syntax highlighting makes it much more readable I think. It's a run time expression, and not an object that can be bound to a name, just like f-strings. It's result can though, that is the same as f-strings. It removes the apparent controversy of evaluating strings. Which f-strings really don't do, because they are compiled to something very similar to what this does. It's just much more obvious with this suggestion that that isn't a problem. Complex expressions need ()'s around them in the format expression to ensure it's a single value, but the parser may be able to tell a + b, from a b. This is the explicit run time expression would do what a f-string does. Reduce repetition of names, allow text and expressions to be interspersed in the order they are used with as little extra noise as possible. 
The f-strings go one step further and eliminate the extra parentheses and quotes. But at the cost of having them run together and loosing the syntax highlighting of expressions in an editor. It also will make it harder for external syntax checkers to work. It probably would need some ironing out of some the edge cases though. A {expr} evaluates as a set. Just use expr, or {expr ''}. Things like that. I think this is as close to an f-string as we could get and still keep the expressions out of the strings. Cheers, Ron From alexander.belopolsky at gmail.com Mon Aug 10 00:06:19 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 9 Aug 2015 18:06:19 -0400 Subject: [Python-ideas] Outside the box string formatting idea In-Reply-To: References: Message-ID: On Sun, Aug 9, 2015 at 5:20 PM, Ron Adam wrote: > (% 'My name is ' name ', my age next year is ' (age+1) This reminds me Javascript's automatic string promotion: $ node > name = 'Bob' 'Bob' > age = 5 5 > 'My name is ' + name + ', my age next year is ' + (age+1) 'My name is Bob, my age next year is 6' -1 From python-ideas at mgmiller.net Mon Aug 10 00:14:01 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Sun, 09 Aug 2015 15:14:01 -0700 Subject: [Python-ideas] Outside the box string formatting idea In-Reply-To: References: Message-ID: <55C7D0A9.1010305@mgmiller.net> My take: On 08/09/2015 02:20 PM, Ron Adam wrote: > It may be possible to do that in this case as an actual expression rather than > as a string and still get most of the benefits. Shorter more compressed isn't > always better when it comes to readability. Really the idea here is brevity. The long-form versions are still available if they would be better in a particular instance. > (% 'result =' foo()) I found these interesting, reminds me of polish notation. However these would need enhancements to syntax as it would be currently invalid. f'' is likely easier to implement w/o syntax changes. 
Also, it doesn't look much like python. Cheers, -Mike From cs at zip.com.au Mon Aug 10 00:48:41 2015 From: cs at zip.com.au (Cameron Simpson) Date: Mon, 10 Aug 2015 08:48:41 +1000 Subject: [Python-ideas] PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules In-Reply-To: <20150809103449.GA17997@cskk.homeip.net> References: <20150809103449.GA17997@cskk.homeip.net> Message-ID: <20150809224841.GA14193@cskk.homeip.net> On 09Aug2015 20:34, Cameron Simpson wrote: >On 09Aug2015 03:05, Joseph Jevnik wrote: >>If I have a package that defines both a __main__ and a __init__, then your >>change would bind the __main__ to the name instead of the __init__. That >>seems incorrect. > >Yes. Yes it does. [...] >would it be enough to say that this change should only apply if the module is >not a package? I append the code for my testmod below, being an __init__.py and a __main__.py. A run shows: % python3.4 -m testmod __init__.py: /Users/cameron/rc/python/testmod/__init__.py testmod testmod __main__.py: /Users/cameron/rc/python/testmod/__main__.py __main__ testmod.__main__ __main__ testmod (4 lines, should your mailer fold the output.) It seems to me that Python already does the "right thing" for packages, and it is only non-package modules which need the change proposed by the PEP. Comments please? Code below. 
Cheers, Cameron Simpson testmod/__init__.py: #!/usr/bin/python print('__init__.py:', __file__, __name__, __spec__.name) testmod/__main__.py: #!/usr/bin/python import pprint import sys print('__main__.py:', __file__, __name__, __spec__.name) for modname, mod in sorted(sys.modules.items()): rmod = repr(mod) if 'testmod' in modname or 'testmod' in rmod: print(modname, rmod) From ron3200 at gmail.com Mon Aug 10 00:59:08 2015 From: ron3200 at gmail.com (Ron Adam) Date: Sun, 09 Aug 2015 18:59:08 -0400 Subject: [Python-ideas] Outside the box string formatting idea In-Reply-To: References: Message-ID: On 08/09/2015 06:06 PM, Alexander Belopolsky wrote: > On Sun, Aug 9, 2015 at 5:20 PM, Ron Adam wrote: >> >(% 'My name is ' name ', my age next year is ' (age+1) > This reminds me Javascript's automatic string promotion: > > $ node >> >name = 'Bob' > 'Bob' >> >age = 5 > 5 >> >'My name is ' + name + ', my age next year is ' + (age+1) > 'My name is Bob, my age next year is 6' It would only do that in a very narrow context. I'm not suggesting it be done outside of string format expressions. Cheers, Ron From ron3200 at gmail.com Mon Aug 10 01:08:54 2015 From: ron3200 at gmail.com (Ron Adam) Date: Sun, 09 Aug 2015 19:08:54 -0400 Subject: [Python-ideas] Outside the box string formatting idea In-Reply-To: <55C7D0A9.1010305@mgmiller.net> References: <55C7D0A9.1010305@mgmiller.net> Message-ID: On 08/09/2015 06:14 PM, Mike Miller wrote: > My take: > > On 08/09/2015 02:20 PM, Ron Adam wrote: >> It may be possible to do that in this case as an actual expression rather >> than >> as a string and still get most of the benefits. Shorter more compressed >> isn't >> always better when it comes to readability. > > Really the idea here is brevity. The long-form versions are still > available if they would be better in a particular instance. > > > (% 'result =' foo()) > > I found these interesting, reminds me of polish notation. 
However these > would need enhancements to syntax as it would be currently invalid. f'' > is likely easier to implement w/o syntax changes. Also, it doesn't look > much like python. There are actually too parts... (% ...) Handles implicit concatination and string conversion. And {expr format} handles the formatting, inside (% ...) context only. So these only work in string format expressions, just like special syntax for comprehensions only works in comprehensions. As I suggested, I think it's the closest you can get and still not put the expressions into the strings. Comma's could be used to separate things, but it's not that much of a stretch to go from .... 'a' 'b' --> 'ab' to a = 'a' b = 'b' (% a b) --> 'ab' But we could have... (% a, b) If that seems more pythonic. Cheers, Ron From joejev at gmail.com Mon Aug 10 01:33:38 2015 From: joejev at gmail.com (Joseph Jevnik) Date: Sun, 9 Aug 2015 19:33:38 -0400 Subject: [Python-ideas] PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules In-Reply-To: <20150809224841.GA14193@cskk.homeip.net> References: <20150809103449.GA17997@cskk.homeip.net> <20150809224841.GA14193@cskk.homeip.net> Message-ID: I would be okay if this change did not affect execution of a package with the python -m flag. I was only concerned because a __main__ in a package is common and wanted to make sure you had addressed it. On Sun, Aug 9, 2015 at 6:48 PM, Cameron Simpson wrote: > On 09Aug2015 20:34, Cameron Simpson wrote: > >> On 09Aug2015 03:05, Joseph Jevnik wrote: >> >>> If I have a package that defines both a __main__ and a __init__, then >>> your >>> change would bind the __main__ to the name instead of the __init__. That >>> seems incorrect. >>> >> >> Yes. Yes it does. >> > [...] > >> would it be enough to say that this change should only apply if the >> module is not a package? >> > > I append the code for my testmod below, being an __init__.py and a > __main__.py. 
A run shows: > > % python3.4 -m testmod > __init__.py: /Users/cameron/rc/python/testmod/__init__.py testmod testmod > __main__.py: /Users/cameron/rc/python/testmod/__main__.py __main__ > testmod.__main__ > __main__ '/Users/cameron/rc/python/testmod/__main__.py'> > testmod '/Users/cameron/rc/python/testmod/__init__.py'> > > (4 lines, should your mailer fold the output.) > > It seems to me that Python already does the "right thing" for packages, > and it is only non-package modules which need the change proposed by the > PEP. > > Comments please? > > Code below. > > Cheers, > Cameron Simpson > > testmod/__init__.py: > #!/usr/bin/python > print('__init__.py:', __file__, __name__, __spec__.name) > > testmod/__main__.py: > #!/usr/bin/python > import pprint > import sys > print('__main__.py:', __file__, __name__, __spec__.name) > for modname, mod in sorted(sys.modules.items()): > rmod = repr(mod) > if 'testmod' in modname or 'testmod' in rmod: > print(modname, rmod) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob.cliffe at btinternet.com Mon Aug 10 03:11:52 2015 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Mon, 10 Aug 2015 02:11:52 +0100 Subject: [Python-ideas] Outside the box string formatting idea In-Reply-To: References: <55C7D0A9.1010305@mgmiller.net> Message-ID: <55C7FA58.6030302@btinternet.com> On 10/08/2015 00:08, Ron Adam wrote: > > > > There are actually too parts... > > (% ...) Handles implicit concatination and string conversion. > > And {expr format} handles the formatting, inside (% ...) context only. > > So these only work in string format expressions, just like special > syntax for comprehensions only works in comprehensions. > > As I suggested, I think it's the closest you can get and still not put > the expressions into the strings. > > Comma's could be used to separate things, but it's not that much of a > stretch to go from .... 
> > 'a' 'b' --> 'ab' > > to > a = 'a' > b = 'b' > (% a b) --> 'ab' > > But we could have... > > (% a, b) If that seems more pythonic. > How does this gain over def f(*args): return ''.join(args) a='a' b='b' f(a, b) Rob Cliffe From ron3200 at gmail.com Mon Aug 10 04:31:49 2015 From: ron3200 at gmail.com (Ron Adam) Date: Sun, 09 Aug 2015 22:31:49 -0400 Subject: [Python-ideas] Outside the box string formatting idea In-Reply-To: <55C7FA58.6030302@btinternet.com> References: <55C7D0A9.1010305@mgmiller.net> <55C7FA58.6030302@btinternet.com> Message-ID: On 08/09/2015 09:11 PM, Rob Cliffe wrote: >> Comma's could be used to separate things, but it's not that much of a >> stretch to go from .... >> >> 'a' 'b' --> 'ab' >> >> to >> a = 'a' >> b = 'b' >> (% a b) --> 'ab' >> >> But we could have... >> >> (% a, b) If that seems more pythonic. >> > How does this gain over > > def f(*args): return ''.join(args) > a='a' > b='b' > f(a, b) To make that work in the same way you would also need to add a way to handle string conversion of expressions and formatting. The main gain is the removal of some of the syntax elements and repetive method or function calls that would be required in more complex situations. The point was to boil it down to the minimum that could be done and still not move the expression into the string. Lets look at this example... (% 'My name is ' name ', my age next year is ' (age+1) ', my anniversary is ' {anniversary ':%A, %B %d, %Y'} '.') Using functions and method calls it might become... >>> f = lambda *args: ''.join(args) >>> _ = format >>> import datetime >>> name = 'Fred' >>> age = 50 >>> anniversary = datetime.date(1991, 10, 12) >>> f('My name is ', _(name), ', my age next year is ', _(age+1), ... ', my anniversary is ', _(anniversary, ':%A, %B %d, %Y'), '.') 'My name is Fred, my age next year is 51, my anniversary is :Saturday, October 12, 1991.' That isn't that much different and works today. 
A special format expression would remove some of the syntax elements and make it a standardised and cleaner looking solution. Because it requires the extra steps to define and rename the join and format functions to something shorter, it's not a standardised solution. Having it look the same in many programs is valuable. Also Nicks improvements of combining translation with it might be doable as well. Even with commas, it's still may be enough. Than again if everyone like the expressions in strings, it doesn't matter. Cheers, Ron From vito.detullio at gmail.com Mon Aug 10 07:40:20 2015 From: vito.detullio at gmail.com (Vito De Tullio) Date: Mon, 10 Aug 2015 07:40:20 +0200 Subject: [Python-ideas] Outside the box string formatting idea References: <55C7D0A9.1010305@mgmiller.net> <55C7FA58.6030302@btinternet.com> Message-ID: Ron Adam wrote: >> How does this gain over >> >> def f(*args): return ''.join(args) >> a='a' >> b='b' >> f(a, b) > > To make that work in the same way you would also need to add a way to > handle string conversion of expressions and formatting. > Lets look at this example... > > (% 'My name is ' name ', my age next year is ' (age+1) > ', my anniversary is ' {anniversary ':%A, %B %d, %Y'} '.') > > > Using functions and method calls it might become... > > >>> f = lambda *args: ''.join(args) > >>> _ = format > >>> import datetime > >>> name = 'Fred' > >>> age = 50 > >>> anniversary = datetime.date(1991, 10, 12) > > >>> f('My name is ', _(name), ', my age next year is ', _(age+1), > ... ', my anniversary is ', _(anniversary, ':%A, %B %d, %Y'), '.') > > 'My name is Fred, my age next year is 51, my anniversary is :Saturday, > October 12, 1991.' > That isn't that much different and works today. A special format > expression would remove some of the syntax elements and make it a > standardised and cleaner looking solution. what about a slightly different "f"? 
def f(*args): f_args = [] for arg in args: if isinstance(arg, tuple): f_args.append(format(*arg)) else: f_args.append(format(arg)) return ''.join(f_args) import datetime name = 'Fred'; age = 50; anniversary = datetime.date(1991, 10, 12) print(f('My name is ', name, ', my age next year is ', age+1, ', my anniversary is ', (anniversary, ':%A, %B %d, %Y'), '.')) My name is Fred, my age next year is 51, my anniversary is :Saturday, October 12, 1991. -- By ZeD From cs at zip.com.au Mon Aug 10 12:13:02 2015 From: cs at zip.com.au (Cameron Simpson) Date: Mon, 10 Aug 2015 20:13:02 +1000 Subject: [Python-ideas] PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules In-Reply-To: <75C9216C-5A2A-4DBD-8F57-361E47D67C02@yahoo.com> References: <75C9216C-5A2A-4DBD-8F57-361E47D67C02@yahoo.com> Message-ID: <20150810101302.GA50157@cskk.homeip.net> On 08Aug2015 22:12, Andrew Barnert wrote: >On Aug 8, 2015, at 16:18, Cameron Simpson wrote: >> I think I take the line that as of PEP 451 the conanical name for a module is .__spec__.name. The module's .__name__ normally matches that, but obviously in the case of "python -m" it does not. >> >> As you point out, suddenly a module can appear somewhere other than sys.modules['__main__'] where that difference shows. >> >> Let's ask the associated question: who introspects module.__name__ and expects it to be the cononical name? For what purpose? > >I'd think the first place to look is code that deals directly with module objects and/or sys.modules--graphical debuggers, plugin frameworks, bridges (a la AppScript or PyObjC), etc. Especially since many of them want to retain compatibility with 3.3, if not 3.2, and to share as much code as possible with a 2.x version > >Of course you're probably right that there aren't too many such things, and they're also presumably written by people who know what they're doing and wouldn't have too much trouble adapting them for 3.6+ if needed. One might hope. 
So I've started with the stdlib in two passes: looking for .__name__ associated with "mod", and looking for __main__ not in the standard boilerplate (__name__ == '__main__'). Obviously all this code is unfamiliar to me so anyone with deeper understanding who wants to look is most welcome. Pass 1 with this command: find . -type f -name \*.py | xargs fgrep .__name__ /dev/null | grep mod to look for module-related code using .__name__. Of course a lot of it is reporting, but there are some interesting relevant bits. doctest: This refers to module.__name__ quite a lot. The _normalize_module() function uses __name__ instead of __spec__.name. _from_module() tests whether an object is defined in a particular module based on __name__; I'm (naively) surprised that this can't use "is", but it looks like an object's __module__ attribute is a string, which I imagine avoids circular references. _get_test() uses __name__ instead of __spec__.name, though only as a fallback if there is no __file__. SkipDocTestCase.shortDescription() uses __name__. importlib: mostly seems fine according to my shallow understanding? inspect: getmodule() seems correct (uses __name__ but seems correctish) - this does seem to be a grope around in the available places looking for a match function, and feels unreliable anyway. modulefinder: this does look like it could use __spec__.name more widely, or as an adjunct to __name__. scan_code() looks like another "grope around" function trying to infer structure from the pieces sitting about :-) pdb: Pdb.do_whatis definitely reports using .__name__. Not necessarily incorrect. pkgutil: get_loader() uses .__name__, probably ought to be __spec__.name. pydoc: also probably should upgrade to .__spec__.name. unittest: TestLoader.discover seems to rely on __name__ instead of __spec__.name while constructing a pathname; definitely seems like it needs updating for PEP 451. It also looks up __name__ in sys.builtin_module_names to reject constructing a pathname.
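As a rough illustration of the difference being surveyed here, the following sketch shows __name__ and __spec__.name diverging when a module is run as __main__. It uses runpy.run_module() as a stand-in for "python -m"; the module name demo_mod and the temporary directory are made up for the demonstration and are not part of the PEP.

```python
import os
import runpy
import sys
import tempfile

# Hypothetical module name, written to a throwaway directory for the demo.
with tempfile.TemporaryDirectory() as tmpdir:
    with open(os.path.join(tmpdir, "demo_mod.py"), "w") as fh:
        # The module records both names as seen from inside itself.
        fh.write("names = (__name__, __spec__.name)\n")

    sys.path.insert(0, tmpdir)
    try:
        # run_name="__main__" approximates what "python -m demo_mod" does.
        module_globals = runpy.run_module("demo_mod", run_name="__main__")
    finally:
        sys.path.remove(tmpdir)

names = module_globals["names"]
print(names)
```

Code that compares .__name__ against a module's import name would see '__main__' here, whereas .__spec__.name still carries the name the module was located under.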
Pass 2 with this command: find . -type f -name \*.py | xargs fgrep __main__ | grep -v 'if *__name__ *== *["'\'']__main__' looking for __main__ but discarding the boilerplate. I'm actually striking out here. Since this PEP doesn't change __name__ == '__main__' I've not found anything here that looks like it would stop working. Even runpy, cursory though my look at it is, is going forward: setting __name__ to '__main__' instead of working backwards. Further thoughts? Cheers, Cameron Simpson From cs at zip.com.au Mon Aug 10 12:49:58 2015 From: cs at zip.com.au (Cameron Simpson) Date: Mon, 10 Aug 2015 20:49:58 +1000 Subject: [Python-ideas] PEP-499: "python -m foo" should bind to both "__main__" and "foo" in sys.modules In-Reply-To: References: Message-ID: <20150810104958.GA39060@cskk.homeip.net> On 09Aug2015 19:33, Joseph Jevnik wrote: >> On 09Aug2015 20:34, Cameron Simpson wrote: >>> On 09Aug2015 03:05, Joseph Jevnik wrote: >>>> If I have a package that defines both a __main__ and a __init__, then >>>> your change would bind the __main__ to the name instead of the __init__. >>>> That seems incorrect. >>> >>> Yes. Yes it does. [...] >>> would it be enough to say that this change should only apply if the >>> module is not a package? >I would be okay if this change did not affect execution of a package with >the python -m flag. I was only concerned because a __main__ in a package is >common and wanted to make sure you had addressed it. Good point.
Please see if this update states your issue fairly and addresses it: https://bitbucket.org/cameron_simpson/pep-0499/commits/3efcd9b54e238a1ff7f5c5df805df139d6cb5a30 Cheers, Cameron Simpson From jonathan at slenders.be Mon Aug 10 13:49:18 2015 From: jonathan at slenders.be (Jonathan Slenders) Date: Mon, 10 Aug 2015 13:49:18 +0200 Subject: [Python-ideas] String interpolation for all literal strings Message-ID: 2015-08-07 9:33 GMT+02:00 Nick Coghlan : > If f-strings are always eagerly interpolated > prior to translation, then I can foresee a lot of complaints from > folks asking why this doesn't work right: > > print(_(f"This is a translated message with {a} and {b} interpolated")) > What if we decide to lazy-interpolate f-strings? A creation of such an f-string, creates an f-string object. (Which has a proper __repr__, so that it's transparent for those that don't care.) Calling the __str__ method of the f-string triggers the interpolation. This should give enough freedom to implement a working ugettext. (But I still don't see any reason to go beyond supporting more than just is simple variable name between the curly braces...) -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-ideas at mgmiller.net Mon Aug 10 22:31:58 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Mon, 10 Aug 2015 13:31:58 -0700 Subject: [Python-ideas] PEP 501 - i18n with marked strings Message-ID: <55C90A3E.1080906@mgmiller.net> Hi, I haven't done i18n recently, so bare with me. I'm not sure about bolting this on to "format strings", in that it feels like an orthogonal concept. However, what if we had i18n strings as well, not instead of: i'Hello there, $name {age}.' and that they were complimentary to f'', each handling their different duties: fi'Hello there, $name {age}.' Different syntax would probably be needed for each, is that correct? Since each have different requirements, e.g. 
Barry's concerns about format strings being too powerful for non-developers, while also making a project vulnerable to arbitrary code. Perhaps PEP 498 with non-arbitrary strings (but attribute/keys support) would allow the syntax to be unified. -Mike From pmiscml at gmail.com Tue Aug 11 00:32:06 2015 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Tue, 11 Aug 2015 01:32:06 +0300 Subject: [Python-ideas] Running scripts with relative imports directly, was: Re: proposal: "python -m foo" In-Reply-To: References: <20150805054630.GA66989@cskk.homeip.net> <20150805182320.052ab9d7@x230> <20150806023210.18b5387d@x230> <85twsdjo1y.fsf@benfinney.id.au> Message-ID: <20150811013206.677159b7@x230> Hello, On Thu, 6 Aug 2015 13:57:19 +1000 Nick Coghlan wrote: [] > > "I'm -1 on this and on any other proposed twiddlings of the > > __main__ machinery. The only use case seems to be running scripts > > that happen to be living inside a module's directory, which I've > > always seen as an antipattern. To make me change my mind you'd have > > to convince me that it isn't." > > > > > > > > He doesn't describe (that I can find) what makes him think it's an > > antipattern, so I'm not clear on how he expects to be convinced > > it's a valid pattern. While Nick's PEP-0395 lists enough things in Python import system which may confuse casual users (and which thus should raise alarm for all Python advocates, who think it's nice, easy-to-use language), I'd like to elaborate on my particular usecase. So, when you start a new "python library", you probably start it as a single-file module. And it's easy to play with it for both you and your users - just make another file in the same dir as your module, add "import my_module" to it, voila. At some point, you may decide that library is too big for a single file, and needs splitting. Which of course means converting a module to a package. 
And of course, when you import "utils", you want to be sure it imports your utils, not something else, which means using relative imports. But then suddenly, you no longer can drop your test scripts in the same directory where code is (like you did it before), but need to drop it level up, which may be not always convenient. And it would be one thing if it required extra step to run scripts located inside package dir (casual-user-in-me hunch feeling would be that PYTHONPATH needs to be set to ..), but we talk about not being able to do it at all. > It's an anti-pattern because doing it fundamentally confuses the > import system's internal state: > https://www.python.org/dev/peps/pep-0395/#why-are-my-imports-broken Excellent PEP, Nick, it took some time to read thru it and references, but it answered all my questions. After reading it, it's hard to disagree that namespace packages support, a simplification and clarification in itself, conflicted and blocked an automagic way to resolve imports confusion for the user. But then indeed a logical solution is to give user's power to explicitly resolve this issue, if implicit no longer can work - by letting -m accept relative module paths, as you show below. It's also a discovery for me that -m's functionality appears to be handled largely on stdlib side using runpy module, and that's the main purpose of that module. I'll look into prototyping relative import support when I have time. And Nick, if you count votes for reviving PEP-395, +1 for that, IMHO, it's much worthy work than e.g. yet another (3rd!) string templating variant (still adhoc and limited, e.g. not supporting control statements). > > Relative imports from the main module just happen to be a situation > where the failure is an obvious one rather than subtle state > corruption. > > > Nonetheless, that appears to be the hurdle you'd need to > > confront. 
> > This came up more recently during the PEP 420 discussions, when the > requirement to include __init__.py to explicitly mark package > directories was eliminated. This means there's no longer any way for > the interpreter to reliably infer from the filesystem layout precisely > where in the module hierarchy you intended a module to live. See > https://www.python.org/dev/peps/pep-0420/#discussion for references. > > However, one of the subproposals from PEP 395 still offers a potential > fix: https://www.python.org/dev/peps/pep-0395/#id24 > > That proposes to allow explicit relative imports at the command line, > such that Paul's example could be correctly invoked as: > > python3 -m ..pkg.foo > > It would also be possible to provide a syntactic shorthand for > submodules of top level packages: > > python3 -m .foo > > The key here is that the interpreter is being explicitly told that the > current directory is inside a package, as well as how far down in the > package hierarchy it lives, and can adjust the way it sets sys.path[0] > accordingly before proceeding on to import "pkg.foo" as __main__. > > That should be a relatively uncomplicated addition to > runpy._run_module_as_main that could be rolled into Cameron's PEP > plans. Steps required: > > * count leading dots in the supplied mod_name > * remove "leading_dots-1" trailing directory names from sys.path[0] > * strip the leading dots from mod_name before continuing with the rest > of the function > * in the special case of only 1 leading dot, remove the final > directory segment from sys.path[0] and prepend it to mod_name with a > dot separator > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Best regards, Paul mailto:pmiscml at gmail.com From srkunze at mail.de Tue Aug 11 09:36:25 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 11 Aug 2015 09:36:25 +0200 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: <55C90A3E.1080906@mgmiller.net> References: <55C90A3E.1080906@mgmiller.net> Message-ID: <55C9A5F9.1030501@mail.de> Also bare with me but couldn't i18n not just be another format spec? i'Hello there, {name:i18n} {age}.' On 10.08.2015 22:31, Mike Miller wrote: > Hi, > > I haven't done i18n recently, so bare with me. I'm not sure about > bolting this on to "format strings", in that it feels like an > orthogonal concept. > > However, what if we had i18n strings as well, not instead of: > > i'Hello there, $name {age}.' > > and that they were complimentary to f'', each handling their different > duties: > > fi'Hello there, $name {age}.' > > Different syntax would probably be needed for each, is that correct? > Since each have different requirements, e.g. Barry's concerns about > format strings being too powerful for non-developers, while also > making a project vulnerable to arbitrary code. > > Perhaps PEP 498 with non-arbitrary strings (but attribute/keys > support) would allow the syntax to be unified. 
> > -Mike > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From encukou at gmail.com Tue Aug 11 10:35:33 2015 From: encukou at gmail.com (Petr Viktorin) Date: Tue, 11 Aug 2015 10:35:33 +0200 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: <55C9A5F9.1030501@mail.de> References: <55C90A3E.1080906@mgmiller.net> <55C9A5F9.1030501@mail.de> Message-ID: On Tue, Aug 11, 2015 at 9:36 AM, Sven R. Kunze wrote: > Also bare with me but couldn't i18n not just be another format spec? > > i'Hello there, {name:i18n} {age}.' Usually it's not the substitutions that you need to translate, but the surrounding text. From ned at nedbatchelder.com Tue Aug 11 12:43:30 2015 From: ned at nedbatchelder.com (Ned Batchelder) Date: Tue, 11 Aug 2015 06:43:30 -0400 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: <55C90A3E.1080906@mgmiller.net> References: <55C90A3E.1080906@mgmiller.net> Message-ID: <55C9D1D2.1070703@nedbatchelder.com> On 8/10/15 4:31 PM, Mike Miller wrote: > Hi, > > I haven't done i18n recently, so bare with me. I'm not sure about > bolting this on to "format strings", in that it feels like an > orthogonal concept. > > However, what if we had i18n strings as well, not instead of: > > i'Hello there, $name {age}.' > You haven't said what this would *mean*, precisely. I18n tends to get very involved, and is often specific to the larger framework that you are using. Most people have adopted conventions that mean the syntax is already quite simple for strings to be localized: _("Hello there, {name}").format(name=name) What would i"" bring to the table? --Ned. 
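For reference, the _("...").format(...) convention mentioned in the message above can be sketched with the standard library's gettext module. This is only an illustration under assumptions: NullTranslations is used as a stand-in catalog that returns the message unchanged, whereas a real project would load its own catalog (for example via gettext.translation), and the value of name is made up.

```python
import gettext

# Stand-in catalog (an assumption for this sketch): NullTranslations
# returns each message unchanged instead of looking up a translation.
_ = gettext.NullTranslations().gettext

name = "Fred"  # illustrative value only

# The template is translated first, then the value is interpolated,
# so translators see the placeholder rather than the substituted text.
greeting = _("Hello there, {name}").format(name=name)
print(greeting)
```

The order matters: the lookup key is the untranslated template, which is why the variable name ends up written once in the template and twice in the .format() call.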
From encukou at gmail.com Tue Aug 11 12:50:14 2015 From: encukou at gmail.com (Petr Viktorin) Date: Tue, 11 Aug 2015 12:50:14 +0200 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: <55C9D1D2.1070703@nedbatchelder.com> References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> Message-ID: On Tue, Aug 11, 2015 at 12:43 PM, Ned Batchelder wrote: > On 8/10/15 4:31 PM, Mike Miller wrote: >> >> Hi, >> >> I haven't done i18n recently, so bare with me. I'm not sure about bolting >> this on to "format strings", in that it feels like an orthogonal concept. >> >> However, what if we had i18n strings as well, not instead of: >> >> i'Hello there, $name {age}.' >> > You haven't said what this would *mean*, precisely. I18n tends to get very > involved, and is often specific to the larger framework that you are using. > Most people have adopted conventions that mean the syntax is already quite > simple for strings to be localized: > > _("Hello there, {name}").format(name=name) > > What would i"" bring to the table? Not having to repeat the variable name three times. From ned at nedbatchelder.com Tue Aug 11 13:14:31 2015 From: ned at nedbatchelder.com (Ned Batchelder) Date: Tue, 11 Aug 2015 07:14:31 -0400 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> Message-ID: <55C9D917.7060801@nedbatchelder.com> On 8/11/15 6:50 AM, Petr Viktorin wrote: > On Tue, Aug 11, 2015 at 12:43 PM, Ned Batchelder wrote: >> On 8/10/15 4:31 PM, Mike Miller wrote: >>> Hi, >>> >>> I haven't done i18n recently, so bare with me. I'm not sure about bolting >>> this on to "format strings", in that it feels like an orthogonal concept. >>> >>> However, what if we had i18n strings as well, not instead of: >>> >>> i'Hello there, $name {age}.' >>> >> You haven't said what this would *mean*, precisely. 
I18n tends to get very >> involved, and is often specific to the larger framework that you are using. >> Most people have adopted conventions that mean the syntax is already quite >> simple for strings to be localized: >> >> _("Hello there, {name}").format(name=name) >> >> What would i"" bring to the table? > Not having to repeat the variable name three times. That's what f"" does. I don't understand what i"" adds to it. --Ned. From encukou at gmail.com Tue Aug 11 13:23:46 2015 From: encukou at gmail.com (Petr Viktorin) Date: Tue, 11 Aug 2015 13:23:46 +0200 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: <55C9D917.7060801@nedbatchelder.com> References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <55C9D917.7060801@nedbatchelder.com> Message-ID: On Tue, Aug 11, 2015 at 1:14 PM, Ned Batchelder wrote: > On 8/11/15 6:50 AM, Petr Viktorin wrote: >> >> On Tue, Aug 11, 2015 at 12:43 PM, Ned Batchelder >> wrote: >>> >>> On 8/10/15 4:31 PM, Mike Miller wrote: >>>> >>>> Hi, >>>> >>>> I haven't done i18n recently, so bare with me. I'm not sure about >>>> bolting >>>> this on to "format strings", in that it feels like an orthogonal >>>> concept. >>>> >>>> However, what if we had i18n strings as well, not instead of: >>>> >>>> i'Hello there, $name {age}.' >>>> >>> You haven't said what this would *mean*, precisely. I18n tends to get >>> very >>> involved, and is often specific to the larger framework that you are >>> using. >>> Most people have adopted conventions that mean the syntax is already >>> quite >>> simple for strings to be localized: >>> >>> _("Hello there, {name}").format(name=name) >>> >>> What would i"" bring to the table? >> >> Not having to repeat the variable name three times. > > That's what f"" does. I don't understand what i"" adds to it. 
Well, if you want to get the equivalent of:

    _("Hello there, {name}").format(name=name)

you can't use:

    _(f"Hello there, {name}")

because then the `_` function would get the substituted string. The translation database only contains "Hello there, {name}", not "Hello there, Ned"; you need to pass the former to `_`.

In other words, if f was a function instead of a prefix, you want to call f(_("string")), not _(f("string")).

The i"" would allow specifying a translation function, which is typically custom but project- (or at least module-) global.

From stephen at xemacs.org Tue Aug 11 14:41:11 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 11 Aug 2015 21:41:11 +0900
Subject: [Python-ideas] PEP 501 - i18n with marked strings
In-Reply-To:
References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <55C9D917.7060801@nedbatchelder.com>
Message-ID: <87614mxajc.fsf@uwakimon.sk.tsukuba.ac.jp>

Petr Viktorin writes:

> Well, if you want to get the equivalent of:
>
>     _("Hello there, {name}").format(name=name)
>
> you can't use:
>
>     _(f"Hello there, {name}")

This is the "eager vs lazy" interpolation issue that also affects the logging use case, right?

From jonathan at slenders.be Tue Aug 11 15:04:26 2015
From: jonathan at slenders.be (Jonathan Slenders)
Date: Tue, 11 Aug 2015 15:04:26 +0200
Subject: [Python-ideas] PEP 501 - i18n with marked strings
In-Reply-To: <87614mxajc.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <55C9D917.7060801@nedbatchelder.com> <87614mxajc.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

Is there actually any issue with lazy interpolation? If not, I think it's very neat.

2015-08-11 14:41 GMT+02:00 Stephen J.
Turnbull : > Petr Viktorin writes: > > > Well, if you want to get the equivalent of: > > > > _("Hello there, {name}").format(name=name) > > > > you can't use: > > > > _(f"Hello there, {name}") > > This is the "eager vs lazy" interpolation issue that also affects the > logging use case, right? > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Tue Aug 11 15:05:37 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 11 Aug 2015 15:05:37 +0200 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: <55C90A3E.1080906@mgmiller.net> References: <55C90A3E.1080906@mgmiller.net> Message-ID: <55C9F321.2030904@egenix.com> On 10.08.2015 22:31, Mike Miller wrote: > Hi, > > I haven't done i18n recently, so bare with me. I'm not sure about bolting this on to "format > strings", in that it feels like an orthogonal concept. > > However, what if we had i18n strings as well, not instead of: > > i'Hello there, $name {age}.' > > and that they were complimentary to f'', each handling their different duties: > > fi'Hello there, $name {age}.' > > Different syntax would probably be needed for each, is that correct? Since each have different > requirements, e.g. Barry's concerns about format strings being too powerful for non-developers, > while also making a project vulnerable to arbitrary code. > > Perhaps PEP 498 with non-arbitrary strings (but attribute/keys support) would allow the syntax to be > unified. IMO, having just one string literal interpolation standard is better than having two and since i"" fits both needs, I'm +1 on i"" and -0 on f"". 
The only problem I see with i"" is that you may want to use formatting only in some cases, without triggering the translation machinery which may be active in a module. I guess it's fine to fallback to the standard .format() or %-approach for those few situations, though. In all other use cases, having the literal strings already prepared for translation in a Python module is a huge win: just drop a translation hook into the module and you're good to go :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 11 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From jonathan at slenders.be Tue Aug 11 15:10:29 2015 From: jonathan at slenders.be (Jonathan Slenders) Date: Tue, 11 Aug 2015 15:10:29 +0200 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: <55C9F321.2030904@egenix.com> References: <55C90A3E.1080906@mgmiller.net> <55C9F321.2030904@egenix.com> Message-ID: -1 on any approach that uses a translation hook. Many frameworks have their own way of translating things. So that should definitely not be a global. 2015-08-11 15:05 GMT+02:00 M.-A. Lemburg : > On 10.08.2015 22:31, Mike Miller wrote: > > Hi, > > > > I haven't done i18n recently, so bare with me. I'm not sure about > bolting this on to "format > > strings", in that it feels like an orthogonal concept. > > > > However, what if we had i18n strings as well, not instead of: > > > > i'Hello there, $name {age}.' 
> > > > and that they were complimentary to f'', each handling their different > duties: > > > > fi'Hello there, $name {age}.' > > > > Different syntax would probably be needed for each, is that correct? > Since each have different > > requirements, e.g. Barry's concerns about format strings being too > powerful for non-developers, > > while also making a project vulnerable to arbitrary code. > > > > Perhaps PEP 498 with non-arbitrary strings (but attribute/keys support) > would allow the syntax to be > > unified. > > IMO, having just one string literal interpolation standard is better > than having two and since i"" fits both needs, I'm +1 on i"" and > -0 on f"". > > The only problem I see with i"" is that you may want to use > formatting only in some cases, without triggering the translation > machinery which may be active in a module. I guess it's fine to > fallback to the standard .format() or %-approach for those > few situations, though. > > In all other use cases, having the literal strings already > prepared for translation in a Python module is a huge win: > just drop a translation hook into the module and you're > good to go :-) > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source (#1, Aug 11 2015) > >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ > >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ > >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ > ________________________________________________________________________ > > ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: > > eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 > D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg > Registered at Amtsgericht Duesseldorf: HRB 46611 > http://www.egenix.com/company/contact/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Tue Aug 11 15:54:28 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 11 Aug 2015 15:54:28 +0200 Subject: [Python-ideas] fork In-Reply-To: <991E71A1-FDB3-4BE5-B607-B74850903DCD@yahoo.com> References: <991E71A1-FDB3-4BE5-B607-B74850903DCD@yahoo.com> Message-ID: <20150811135429.48F0C8016E@smtp04.mail.de> Hi everybody, I finally managed to implement all the tiny little details of fork that were important from my perspective (cf. https://pypi.python.org/pypi/xfork). An interesting piece of code is the iterative evaluation of OperationFuture using generators to avoid stack overflows. The only thing I am not satisfied with is exception handling. In spite of preserving the original traceback, when the ResultEvaluationError is thrown is unfortunately up to the evalutor. Maybe, somebody here has a better idea or compromise here. Co-workers proposed using function scopes as the ultimate evaluation scope. That is when a function returns a ResultProxy, it gets evaluated. However, I have absolutely no idea how to do this as I couldn't find any __returned__ hook or something. I learned from writing this module and some key insights I would like to share: 1) Pickle not working with decorated functions 2) One 'traceback' is not like another. There are different concepts in Python with the same name. 3) Tracebacks are not really first-class, thus customizing them is hard/impossible. 4) contextlib.contextmanager only creates decorators/context managers with parameters but what if you have none? @decorator() looks weird. 
5) Generators can be used for operation evaluation to avoid the stack limit
6) Python is awesome: despite the above obstacles, I managed to hammer out a short and comprehensible implementation for fork.

It would be great if experts here could fix 1) - 4). 1) - 3) have corresponding StackOverflow threads.

@_Andrew_ I am going to address your questions shortly after this.

Best,
Sven

From mal at egenix.com Tue Aug 11 16:06:13 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 11 Aug 2015 16:06:13 +0200
Subject: [Python-ideas] PEP 501 - i18n with marked strings
In-Reply-To:
References: <55C90A3E.1080906@mgmiller.net> <55C9F321.2030904@egenix.com>
Message-ID: <55CA0155.8050303@egenix.com>

On 11.08.2015 15:10, Jonathan Slenders wrote:
> -1 on any approach that uses a translation hook. Many frameworks have their
> own way of translating things. So that should definitely not be a global.

The module global approach is only one way to define a __interpolate__ function. As I understand the PEP, the compiler would simply translate the literal into a regular function call, which then is subject to the usual scoping rules in Python.

It would therefore be possible to override the builtin in a local scope to e.g. address things like context or per-session based i18n.

You could e.g. pass in a ${context} variable to the string, so that your __interpolate__ function can then directly access the required translation context. Alternatively, the __interpolate__ function could inspect the call stack to automatically find the needed context variable.
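A rough sketch of the call-stack variant might look like the following. Everything here is an assumption made for illustration: `translate` is a hypothetical stand-in and does not reproduce the `__interpolate__` signature from the PEP, `CATALOGS` is a made-up per-context table, and the local variable name `context` is arbitrary. It relies on `sys._getframe`, which is a CPython implementation detail.

```python
import sys

# Hypothetical per-context catalogs; a real project would use its own lookup.
CATALOGS = {
    "fr": {"Hello there, {name}": "Bonjour, {name}"},
    "de": {"Hello there, {name}": "Hallo, {name}"},
}


def translate(template, **values):
    """Walk up the call stack until a local variable named `context` is found."""
    frame = sys._getframe(1)  # start at the caller's frame (CPython-specific)
    while frame is not None:
        if "context" in frame.f_locals:
            catalog = CATALOGS.get(frame.f_locals["context"], {})
            break
        frame = frame.f_back
    else:
        # No frame defines `context`: fall back to the untranslated template.
        catalog = {}
    return catalog.get(template, template).format(**values)


def handle_request(context, name):
    # `context` is never passed to translate(); it is discovered on the stack.
    return translate("Hello there, {name}", name=name)


print(handle_request("de", "Ned"))
```

The lookup is implicit, so the call site stays short, at the cost of depending on frame inspection and on a naming convention for the context variable.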
I guess this particular use case could be made more elegant :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 11 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From srkunze at mail.de Tue Aug 11 16:33:04 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 11 Aug 2015 16:33:04 +0200 Subject: [Python-ideas] fork In-Reply-To: <991E71A1-FDB3-4BE5-B607-B74850903DCD@yahoo.com> References: <991E71A1-FDB3-4BE5-B607-B74850903DCD@yahoo.com> Message-ID: <20150811143305.A565780181@smtp04.mail.de> Am 05-Aug-2015 16:30:27 +0200 schrieb abarnert at yahoo.com: > What does that even mean? How would you not allow races? If you let people throw arbitrary tasks at a thread pool, with no restriction on mutable shared state, you've allowed races. Let me answer this in a more implicit way. Why do we need to mark global variables as such? I think the answer is clear: to mark side-effects (quoting the docs). Why are all variables thread-shared by default? I don't know, maybe efficiency reasons but that hardly apply to Python in the first place. > And how do you propose "not having them"? What would happen if all shared variables were thread-local by default and need to marked as shared if desired? I think the answer would also be very clear: to mark side-effects and to have people think about it explicitly. > And that's exactly the problem. 
> What makes concurrent code with shared state hard, more than anything else, is people who don't realize what's hard about it and write code that seems to work but doesn't.

Precisely because 'shared state' is hard, why is it the default?

> Making it easier for such people to write broken code without even realizing they're doing so is not a good thing.

That argument only applies when the broken code (using shared states) is the default.

As you can see, this thought experiment assumes that there could be another way to approach that situation. Whether, how, and when this can be done is a completely different matter. As usual, I leave that to the experts like you to figure out.

Best,
Sven

From barry at python.org Tue Aug 11 17:20:43 2015
From: barry at python.org (Barry Warsaw)
Date: Tue, 11 Aug 2015 11:20:43 -0400
Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings)
References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com>
Message-ID: <20150811112043.676c6038@anarchist.wooz.org>

On Aug 11, 2015, at 12:50 PM, Petr Viktorin wrote:

>Not having to repeat the variable name three times.

To me, this is really the crux of both the f-string and i-string proposals. It's also a more general issue to me because it's almost exactly the raison d'être for flufl.i18n.

The complicated examples of f-strings I've seen really give me the shudders. Maybe in practice it won't be so bad, but it's definitely true that if it can be done, someone will do it. So I expect to see "abuses" of them in the wild.
But the DRY argument is much more compelling to me, and currently I think the best way to reduce repetition in function arguments is through sys._getframe() and other such nasty tricks. I'd really much prefer to see this small annoyance fixed in a targeted way than add a hugely complicated new feature that reduces readability (IMHO). Which is why I like the scope() and similar ideas. Something like a built-in that provides you with a ChainMap of the current namespaces in effect. The tricky bit is that you still need something like _getframe()'s depth argument, or perhaps the object returned by scope() -or whatever it's called- would have links back to the namespaces of earlier call frames. I also don't know whether all of this makes sense for all the alternative implementations, but there's certainly a *logical* call stack for any particular point in a Python program. What's the simplest thing we can do to make this pain go away? A few extraneous locals really aren't that bad. They'll be rarely needed, and besides I already use such things when the alternative is a hideously long line of code. In any case, they're a small price to pay for keeping things simple. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From python-ideas at mgmiller.net Tue Aug 11 18:41:32 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Tue, 11 Aug 2015 09:41:32 -0700 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: References: <55C90A3E.1080906@mgmiller.net> <55C9A5F9.1030501@mail.de> Message-ID: <55CA25BC.1040704@mgmiller.net> Also, excuse me, I wrote "bare" when I meant "bear", and a few others along. ;) -Mike On 08/11/2015 01:35 AM, Petr Viktorin wrote: > On Tue, Aug 11, 2015 at 9:36 AM, Sven R. Kunze wrote: >> Also bare with me but couldn't i18n not just be another format spec? 
>> >> i'Hello there, {name:i18n} {age}.' > > Usually it's not the substitutions that you need to translate, but the > surrounding text. From srkunze at mail.de Tue Aug 11 19:12:49 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 11 Aug 2015 19:12:49 +0200 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: References: <55C90A3E.1080906@mgmiller.net> <55C9A5F9.1030501@mail.de> Message-ID: <55CA2D11.5080201@mail.de> I actually thought this was about a two-step process using lazy evaluation. This way {name:i18n} or {name:later} basically marks lazy evaluation. But as it seems, i'...' is more supposed to do all (translation + formatting) of this at once. My fault, sorry. On 11.08.2015 10:35, Petr Viktorin wrote: > On Tue, Aug 11, 2015 at 9:36 AM, Sven R. Kunze wrote: >> Also bare with me but couldn't i18n not just be another format spec? >> >> i'Hello there, {name:i18n} {age}.' > Usually it's not the substitutions that you need to translate, but the > surrounding text. From wes.turner at gmail.com Tue Aug 11 20:22:06 2015 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 11 Aug 2015 13:22:06 -0500 Subject: [Python-ideas] [Python-Dev] PEP-498: Literal String Formatting In-Reply-To: References: <55C55DC3.8040605@trueblade.com> <55C79A73.1030901@trueblade.com> <20150810172631.GN3737@ando.pearwood.info> <20150810143127.66c5f842@anarchist.wooz.org> <55CA1030.3060808@sdamon.com> Message-ID: On Tue, Aug 11, 2015 at 12:52 PM, Wes Turner wrote: > ... I'm now -1000 on this. > > ~"Make it hard to do wrong; or easy to do correctly" > > ... Here are these, (which should also not be used for porting shell > scripts to python): http://jinja.pocoo.org/docs/dev/templates/#expressions > So, again, I am -1000 on (both of these PEPs) because they are just another way of making it too easy to do the wrong thing. 
* #1 most prevalent security vulnerability: *1**CWE-89 : Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')* * ORM with parametrization, quoting, escaping and lists of reserved words * SQLAlchemy * #2 most prevalent security vulnerability: *2**CWE-78 : Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')* * Command preparation library (which builds a tuple() for exec) * Sarge, subprocess.call(shell=False=0) - [ ] DOC: (Something like this COULD/SHOULD be in the % and str.format docs as well) > > On Tue, Aug 11, 2015 at 12:48 PM, Wes Turner wrote: > >> >> On Tue, Aug 11, 2015 at 12:08 PM, Nick Coghlan >> wrote: >> >>> [off list] >>> >>> On 12 August 2015 at 01:28, Wes Turner wrote: >>> > >>> > On Aug 11, 2015 10:19 AM, "Wes Turner" wrote: >>> >> >>> >> >>> >> On Aug 11, 2015 10:10 AM, "Alexander Walters" < >>> tritium-list at sdamon.com> >>> >> wrote: >>> >> > >>> >> > This may seam like a simplistic solution to i18n, but why not just >>> add a >>> >> > method to string objects (assuming we implement f-strings) that >>> just returns >>> >> > the original, unprocessed string. If the string was not an >>> f-string, it >>> >> > just returns self. The gettext module can be modified, I think >>> trivially, >>> >> > to use the method instead of the string directly. >>> >> > >>> >> > Is this a horrible idea? >>> > >>> > - [ ] review all string interpolation (for "injection") >>> > * [ ] review every '%' >>> > * [ ] review every ".format()" >>> > * [ ] review every f-string (AND LOCALS AND GLOBALS) >>> > * every os.system, os.exec*, subprocess.Popen >>> > * every unclosed tag >>> > * every unescaped control character >>> > >>> > This would create work we don't need. >>> > >>> > Solution: __str_shell_ escapes, adds slashes, and quotes. __str__SQL__ >>> refs >>> > a global list of reserved words. 
>>> >>> Wes, we're not mind readers - I know you're trying to be concise to >>> save people time when reading, but these bullet-point-only posts are >>> *harder* to read than if you wrote out a full explanation of what you >>> meant. With this cryptic form, we have to try to guess the missing >>> pieces, which is slower and less certain than having them already >>> written out in the post. >>> >> >> ~"This is another way to make it easier to do the wrong thing; where a >> better solution (AND/OR DOCS ON ALL STRING INTERPOLATION) would be less >> likely to increase the ocurrence of CWE TOP 25 #1 and #2" >> >> printf is often dangerous and wrng because things aren't escaped (or >> scope is not controlled, or things are mutable) >> >> >> ~"Make it hard to do; or easy to do the right way" >> >> >>> >>> Regards, >>> Nick. >>> >>> -- >>> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Tue Aug 11 20:34:25 2015 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 11 Aug 2015 14:34:25 -0400 Subject: [Python-ideas] [Python-Dev] PEP-498: Literal String Formatting In-Reply-To: References: <55C55DC3.8040605@trueblade.com> <55C79A73.1030901@trueblade.com> <20150810172631.GN3737@ando.pearwood.info> <20150810143127.66c5f842@anarchist.wooz.org> <55CA1030.3060808@sdamon.com> Message-ID: <55CA4031.7090003@trueblade.com> Wes: Your objection is noted. Thanks. Eric. On 08/11/2015 02:22 PM, Wes Turner wrote: > > > On Tue, Aug 11, 2015 at 12:52 PM, Wes Turner > wrote: > > ... I'm now -1000 on this. > > ~"Make it hard to do wrong; or easy to do correctly" > > ... Here are these, (which should also not be used for porting shell > scripts to > python): http://jinja.pocoo.org/docs/dev/templates/#expressions > > > So, again, I am > -1000 on (both of these PEPs) > because they are just another way of making it too easy to do the wrong > thing. 
> > * #1 most prevalent security vulnerability: > *1* *CWE-89 : Improper > Neutralization of Special Elements used in an SQL Command ('SQL Injection')* > > > * ORM with parametrization, quoting, escaping and lists of reserved > words > * SQLAlchemy > > * #2 most prevalent security vulnerability: > *2* *CWE-78 : Improper > Neutralization of Special Elements used in an OS Command ('OS Command > Injection')* > > > * Command preparation library (which builds a tuple() for exec) > * Sarge, subprocess.call(shell=False=0) > > > - [ ] DOC: (Something like this COULD/SHOULD be in the % and str.format > docs as well) > > > > On Tue, Aug 11, 2015 at 12:48 PM, Wes Turner > wrote: > > > On Tue, Aug 11, 2015 at 12:08 PM, Nick Coghlan > > wrote: > > [off list] > > On 12 August 2015 at 01:28, Wes Turner > wrote: > > > > On Aug 11, 2015 10:19 AM, "Wes Turner" > wrote: > >> > >> > >> On Aug 11, 2015 10:10 AM, "Alexander Walters" > > >> wrote: > >> > > >> > This may seam like a simplistic solution to i18n, but why not just add a > >> > method to string objects (assuming we implement f-strings) that just returns > >> > the original, unprocessed string. If the string was not an f-string, it > >> > just returns self. The gettext module can be modified, I think trivially, > >> > to use the method instead of the string directly. > >> > > >> > Is this a horrible idea? > > > > - [ ] review all string interpolation (for "injection") > > * [ ] review every '%' > > * [ ] review every ".format()" > > * [ ] review every f-string (AND LOCALS AND GLOBALS) > > * every os.system, os.exec*, subprocess.Popen > > * every unclosed tag > > * every unescaped control character > > > > This would create work we don't need. > > > > Solution: __str_shell_ escapes, adds slashes, and quotes. __str__SQL__ refs > > a global list of reserved words. 
> > Wes, we're not mind readers - I know you're trying to be > concise to > save people time when reading, but these bullet-point-only > posts are > *harder* to read than if you wrote out a full explanation of > what you > meant. With this cryptic form, we have to try to guess the > missing > pieces, which is slower and less certain than having them > already > written out in the post. > > > ~"This is another way to make it easier to do the wrong thing; > where a better solution (AND/OR DOCS ON ALL STRING > INTERPOLATION) would be less likely to increase the ocurrence of > CWE TOP 25 #1 and #2" > > printf is often dangerous and wrng because things aren't escaped > (or scope is not controlled, or things are mutable) > > > ~"Make it hard to do; or easy to do the right way" > > > > Regards, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com > | Brisbane, Australia > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From rymg19 at gmail.com Tue Aug 11 20:37:10 2015 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Tue, 11 Aug 2015 13:37:10 -0500 Subject: [Python-ideas] [Python-Dev] PEP-498: Literal String Formatting In-Reply-To: References: <55C55DC3.8040605@trueblade.com> <55C79A73.1030901@trueblade.com> <20150810172631.GN3737@ando.pearwood.info> <20150810143127.66c5f842@anarchist.wooz.org> <55CA1030.3060808@sdamon.com> Message-ID: <47D74296-BA36-4956-B67A-126148201B47@gmail.com> Isn't it already like this? It's no harder than: Popen('%s a.c' % cc, shell=True) Heck, I used to do that when I started programming (I hadn't yet learned about injection stuff). If someone is uneducated about injection, they *will do it anyway*. The introduction of format strings (f-strings sounds like a certain word to me...) wouldn't make it any easier, really. 
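To make the comparison concrete, here is a small sketch of the pattern above next to two safer alternatives. The value of `untrusted` and the use of `sys.executable` as the child program are assumptions chosen so the example runs anywhere; the interpolated shell string is built but deliberately never executed.

```python
import shlex
import subprocess
import sys

# Hypothetical attacker-controlled value.
untrusted = "a.c; echo INJECTED"

# Risky pattern: once interpolated, a shell would treat ';' as a command separator.
# This string is only constructed here, never run.
shell_command = "%s %s" % ("cc", untrusted)

# Safer: pass an argument list (shell=False is the default), so the whole
# value reaches the child process as a single argument.
argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted]
result = subprocess.run(argv, capture_output=True, text=True, check=True)
print(result.stdout.strip())

# If a shell string is unavoidable, shlex.quote() escapes the value for POSIX shells.
quoted_command = "cc %s" % shlex.quote(untrusted)
print(quoted_command)
```

Neither alternative depends on which interpolation syntax built the string, which is the point being made above: the risk comes from handing an interpolated string to a shell, not from the formatting mechanism.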
URL: From wes.turner at gmail.com Tue Aug 11 21:03:40 2015 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 11 Aug 2015 14:03:40 -0500 Subject: [Python-ideas] [Python-Dev] PEP-498: Literal String Formatting In-Reply-To: References: <55C55DC3.8040605@trueblade.com> <55C79A73.1030901@trueblade.com> <20150810172631.GN3737@ando.pearwood.info> <20150810143127.66c5f842@anarchist.wooz.org> <55CA1030.3060808@sdamon.com> Message-ID: On Tue, Aug 11, 2015 at 1:22 PM, Wes Turner wrote: > > > On Tue, Aug 11, 2015 at 12:52 PM, Wes Turner wrote: > >> ... I'm now -1000 on this. >> >> ~"Make it hard to do wrong; or easy to do correctly" >> >> ... Here are these, (which should also not be used for porting shell >> scripts to python): >> http://jinja.pocoo.org/docs/dev/templates/#expressions >> > > So, again, I am > -1000 on (both of these PEPs) > because they are just another way of making it too easy to do the wrong > thing. > > * #1 most prevalent security vulnerability: > *1**CWE-89 : Improper > Neutralization of Special Elements used in an SQL Command ('SQL Injection')* > > * ORM with parametrization, quoting, escaping and lists of reserved > words > * SQLAlchemy > > * #2 most prevalent security vulnerability: > *2**CWE-78 : Improper > Neutralization of Special Elements used in an OS Command ('OS Command > Injection')* > > * Command preparation library (which builds a tuple() for exec) > * Sarge, subprocess.call(shell=False=0) > > > - [ ] DOC: (Something like this COULD/SHOULD be in the % and str.format > docs as well) > Maybe it would be helpful to think of string concatenation more in terms of compiling a template for serializable DOM(html,js,brython)/doctree(docutils,sphinx)/jinja nodes which have types (Path, CommandOption/Arg, [Tag, Attr]) and appropriate quoting, escaping, encoding, **and translation** rules according to a given output context. 
# because this is what could just not be: [os.system(f'echo "{cmd}") for cmd in cmds] os.system(f'echo2 '{cmd}') What is the target output format for this string concatenation, most of the time? > >> >> On Tue, Aug 11, 2015 at 12:48 PM, Wes Turner >> wrote: >> >>> >>> On Tue, Aug 11, 2015 at 12:08 PM, Nick Coghlan >>> wrote: >>> >>>> [off list] >>>> >>>> On 12 August 2015 at 01:28, Wes Turner wrote: >>>> > >>>> > On Aug 11, 2015 10:19 AM, "Wes Turner" wrote: >>>> >> >>>> >> >>>> >> On Aug 11, 2015 10:10 AM, "Alexander Walters" < >>>> tritium-list at sdamon.com> >>>> >> wrote: >>>> >> > >>>> >> > This may seam like a simplistic solution to i18n, but why not just >>>> add a >>>> >> > method to string objects (assuming we implement f-strings) that >>>> just returns >>>> >> > the original, unprocessed string. If the string was not an >>>> f-string, it >>>> >> > just returns self. The gettext module can be modified, I think >>>> trivially, >>>> >> > to use the method instead of the string directly. >>>> >> > >>>> >> > Is this a horrible idea? >>>> > >>>> > - [ ] review all string interpolation (for "injection") >>>> > * [ ] review every '%' >>>> > * [ ] review every ".format()" >>>> > * [ ] review every f-string (AND LOCALS AND GLOBALS) >>>> > * every os.system, os.exec*, subprocess.Popen >>>> > * every unclosed tag >>>> > * every unescaped control character >>>> > >>>> > This would create work we don't need. >>>> > >>>> > Solution: __str_shell_ escapes, adds slashes, and quotes. >>>> __str__SQL__ refs >>>> > a global list of reserved words. >>>> >>>> Wes, we're not mind readers - I know you're trying to be concise to >>>> save people time when reading, but these bullet-point-only posts are >>>> *harder* to read than if you wrote out a full explanation of what you >>>> meant. With this cryptic form, we have to try to guess the missing >>>> pieces, which is slower and less certain than having them already >>>> written out in the post. 
>>>> >>> >>> ~"This is another way to make it easier to do the wrong thing; where a >>> better solution (AND/OR DOCS ON ALL STRING INTERPOLATION) would be less >>> likely to increase the ocurrence of CWE TOP 25 #1 and #2" >>> >>> printf is often dangerous and wrng because things aren't escaped (or >>> scope is not controlled, or things are mutable) >>> >>> >>> ~"Make it hard to do; or easy to do the right way" >>> >>> >>>> >>>> Regards, >>>> Nick. >>>> >>>> -- >>>> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Tue Aug 11 21:05:59 2015 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 11 Aug 2015 14:05:59 -0500 Subject: [Python-ideas] [Python-Dev] PEP-498: Literal String Formatting In-Reply-To: <47D74296-BA36-4956-B67A-126148201B47@gmail.com> References: <55C55DC3.8040605@trueblade.com> <55C79A73.1030901@trueblade.com> <20150810172631.GN3737@ando.pearwood.info> <20150810143127.66c5f842@anarchist.wooz.org> <55CA1030.3060808@sdamon.com> <47D74296-BA36-4956-B67A-126148201B47@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 1:37 PM, Ryan Gonzalez wrote: > Isn't it already like this? It's no harder than: > > Popen('%s a.c' % cc, shell=True) > > Heck, I used to do that when I started programming (I hadn't yet learned > about injection stuff). > > If someone is uneducated about injection, they *will do it anyway*. The > introduction of format strings (f-strings sounds like a certain word to > me...) wouldn't make it any easier, really. > Well, exactly. So I/we must grep for shell=True, %, .format(, .format_globals(**kwargs), and f" or f' and update static analysis tools (to essentially re-AST string.Template with merge(globals, locals, kwargs)) > > On August 11, 2015 1:22:06 PM CDT, Wes Turner > wrote: > >> >> >> On Tue, Aug 11, 2015 at 12:52 PM, Wes Turner >> wrote: >> >>> ... I'm now -1000 on this. 
>>> >>> ~"Make it hard to do wrong; or easy to do correctly" >>> >>> ... Here are these, (which should also not be used for porting shell >>> scripts to python): >>> http://jinja.pocoo.org/docs/dev/templates/#expressions >>> >> >> So, again, I am >> -1000 on (both of these PEPs) >> because they are just another way of making it too easy to do the wrong >> thing. >> >> * #1 most prevalent security vulnerability: >> *1**CWE-89 : Improper >> Neutralization of Special Elements used in an SQL Command ('SQL Injection')* >> >> * ORM with parametrization, quoting, escaping and lists of reserved >> words >> * SQLAlchemy >> >> * #2 most prevalent security vulnerability: >> *2**CWE-78 : Improper >> Neutralization of Special Elements used in an OS Command ('OS Command >> Injection')* >> >> * Command preparation library (which builds a tuple() for exec) >> * Sarge, subprocess.call(shell=False=0) >> >> >> - [ ] DOC: (Something like this COULD/SHOULD be in the % and str.format >> docs as well) >> >> >>> >>> On Tue, Aug 11, 2015 at 12:48 PM, Wes Turner >>> wrote: >>> >>>> >>>> On Tue, Aug 11, 2015 at 12:08 PM, Nick Coghlan >>>> wrote: >>>> >>>>> [off list] >>>>> >>>>> On 12 August 2015 at 01:28, Wes Turner wrote: >>>>> > >>>>> > On Aug 11, 2015 10:19 AM, "Wes Turner" wrote: >>>>> >> >>>>> >> >>>>> >> On Aug 11, 2015 10:10 AM, "Alexander Walters" < >>>>> tritium-list at sdamon.com> >>>>> >> wrote: >>>>> >> > >>>>> >> > This may seam like a simplistic solution to i18n, but why not >>>>> just add a >>>>> >> > method to string objects (assuming we implement f-strings) that >>>>> just returns >>>>> >> > the original, unprocessed string. If the string was not an >>>>> f-string, it >>>>> >> > just returns self. The gettext module can be modified, I think >>>>> trivially, >>>>> >> > to use the method instead of the string directly. >>>>> >> > >>>>> >> > Is this a horrible idea? 
>>>>> > >>>>> > - [ ] review all string interpolation (for "injection") >>>>> > * [ ] review every '%' >>>>> > * [ ] review every ".format()" >>>>> > * [ ] review every f-string (AND LOCALS AND GLOBALS) >>>>> > * every os.system, os.exec*, subprocess.Popen >>>>> > * every unclosed tag >>>>> > * every unescaped control character >>>>> > >>>>> > This would create work we don't need. >>>>> > >>>>> > Solution: __str_shell_ escapes, adds slashes, and quotes. >>>>> __str__SQL__ refs >>>>> > a global list of reserved words. >>>>> >>>>> Wes, we're not mind readers - I know you're trying to be concise to >>>>> save people time when reading, but these bullet-point-only posts are >>>>> *harder* to read than if you wrote out a full explanation of what you >>>>> meant. With this cryptic form, we have to try to guess the missing >>>>> pieces, which is slower and less certain than having them already >>>>> written out in the post. >>>>> >>>> >>>> ~"This is another way to make it easier to do the wrong thing; where a >>>> better solution (AND/OR DOCS ON ALL STRING INTERPOLATION) would be less >>>> likely to increase the ocurrence of CWE TOP 25 #1 and #2" >>>> >>>> printf is often dangerous and wrng because things aren't escaped (or >>>> scope is not controlled, or things are mutable) >>>> >>>> >>>> ~"Make it hard to do; or easy to do the right way" >>>> >>>> >>>>> >>>>> Regards, >>>>> Nick. >>>>> >>>>> -- >>>>> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >>>>> >>>> >>>> >>> >> ------------------------------ >> >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> > -- > Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From liik.joonas at gmail.com Tue Aug 11 21:25:27 2015
From: liik.joonas at gmail.com (Joonas Liik)
Date: Tue, 11 Aug 2015 22:25:27 +0300
Subject: [Python-ideas] [Python-Dev] PEP-498: Literal String Formatting
In-Reply-To:
References: <55C55DC3.8040605@trueblade.com> <55C79A73.1030901@trueblade.com>
 <20150810172631.GN3737@ando.pearwood.info>
 <20150810143127.66c5f842@anarchist.wooz.org> <55CA1030.3060808@sdamon.com>
 <47D74296-BA36-4956-B67A-126148201B47@gmail.com>
Message-ID:

I would rather think of this as an opportunity to help avoid injection
vectors.

If there was a separate interpolation provider, then something like

    os.system('dosomething {a} {b} {c}'.format(...))

could be written as (!cmd here being a special type of f-string that does
command line escaping, borrowing syntax from another thread a few days
ago)

    os.system(!cmd'dosomething {a} {b} {c}')

This is both shorter and more resilient to injections.
Essentially it feels like you annotate a string as "this will be executed
on the command line" and the interpolation adapts.

This would make doing the right thing the same as doing the easy thing, and
this would be good overall, no?
I don't know about you, but I don't know by heart how to escape arbitrary
user input and deal with all of the corner cases.

Yes, you can do this more safely with Popen, but that is quite a bit more
effort.
Also, oftentimes there is no such alternative or it is very unwieldy (in SQL
land this happens more often).
-------------- next part --------------
An HTML attachment was scrubbed...
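The "interpolation provider" idea above can be approximated today without new syntax. The following is only a sketch under stated assumptions: `ShellFormatter` and `cmd` are hypothetical names invented here to stand in for the proposed `!cmd'...'` prefix, built on the standard `string.Formatter` and `shlex.quote`.

```python
import shlex
import string


class ShellFormatter(string.Formatter):
    """Hypothetical stand-in for the proposed !cmd'...' string prefix."""

    def format_field(self, value, format_spec):
        # Apply the ordinary format spec first, then quote the result so
        # the shell treats it as exactly one argument.
        formatted = super().format_field(value, format_spec)
        return shlex.quote(formatted)


def cmd(template, **values):
    """Interpolate `values` into `template`, shell-quoting each field."""
    return ShellFormatter().format(template, **values)


command = cmd('dosomething {a} {b} {c}',
              a='plain', b='two words', c='x; rm -rf /')
print(command)
```

Only the substituted fields are quoted; the literal text of the template is passed through unchanged, which matches the idea that the author of the template is trusted and the interpolated data is not.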
URL: From wes.turner at gmail.com Tue Aug 11 21:35:46 2015 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 11 Aug 2015 14:35:46 -0500 Subject: [Python-ideas] [Python-Dev] PEP-498: Literal String Formatting In-Reply-To: References: <55C55DC3.8040605@trueblade.com> <55C79A73.1030901@trueblade.com> <20150810172631.GN3737@ando.pearwood.info> <20150810143127.66c5f842@anarchist.wooz.org> <55CA1030.3060808@sdamon.com> <47D74296-BA36-4956-B67A-126148201B47@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 2:25 PM, Joonas Liik wrote: > I would rather think of this as an opportunity to help avoid injection > vectors. > you get an "F" grade/letter/mark every time you build an f-string without defining what the user-supplied input and destination outputs could/would be. > > if there was a separate.. . interpolation provider .. > then something like > > os.system('dosomething {a} {b} {c}'.format(...)) > > could be written as ( !cmd here being a special type of f-string that does > command line escaping, borrowing syntax from another thread a few days > ago..) > > os.sytem(!cmd'dosomething {a} {b} {c}') > sarge.run('do something {0} {1} {2}', a, b, c) is currently supported (and could/should be stdlib IMHO) https://sarge.readthedocs.org/en/latest/overview.html#why-not-just-use-subprocess . * (again, sorry) this adds ~subprocess compat to sarge: https://bitbucket.org/vinay.sajip/sarge/pull-requests/1/enh-add-call-check_call-check_output ("ENH: Add call, check_call, check_output, CalledProcessError, expect_returncode") > > This is both shorter and more resilient to injections. > Essentially it feels like you annotate a string as "this will be executed > on the command line" and the interpolation adapts. > > this would make doing the right thing the same as doing the easy thing and > this would be good overall, no? > I don't know about you, but i dont know by heart how to escape arbitrary > user input and deal with all of the corner cases. 
> So, IPython/Jupyter understands _repr_html_ (_repr_*_) methods, IDK why we couldn't have e.g. _repr_shell_path_, _repr_shell_cmdarg_, _repr_sql_sqlite_reserved_keywords_. Representing things for an output format which is expressed as a string but has control characters in order to separate data and code. > > yes, you can do this more safely with Popen.. but that is quite a bit more > effort. > also often times there is no such alternative or it is very unweildy (sql > land this happens more often) > POSIX exec accepts a tuple (and does not parse ';' or '--'). > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Tue Aug 11 21:56:22 2015 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 11 Aug 2015 14:56:22 -0500 Subject: [Python-ideas] [Python-Dev] PEP-498: Literal String Formatting In-Reply-To: References: <55C55DC3.8040605@trueblade.com> <55C79A73.1030901@trueblade.com> <20150810172631.GN3737@ando.pearwood.info> <20150810143127.66c5f842@anarchist.wooz.org> <55CA1030.3060808@sdamon.com> <47D74296-BA36-4956-B67A-126148201B47@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 2:35 PM, Wes Turner wrote: > > > On Tue, Aug 11, 2015 at 2:25 PM, Joonas Liik > wrote: > >> I would rather think of this as an opportunity to help avoid injection >> vectors. >> > > you get an "F" grade/letter/mark every time you build an f-string > without defining what the user-supplied input and destination outputs > could/would be. > A configuration object (passable as e.g. format(**conf)) more explicitly defines the scope (as variables that need to be - [ ] escaped - [ ] encoded - [ ] translated - [ ] concatenated - [ ] mutated or not mutated - [ ] formatted In an ordered idempotent sequence. 
lookup = partial[kwargs, locals, globals] merged = merge(globals, locals, kwargs) .formatg(**kwargs) .format(lookup(kwargs)) .formatl(**kwargs) uno = trans("one}") f"abc {uno}" ft"abc {uno}" eetcmf"abc {uno}" > > >> >> if there was a separate.. . interpolation provider .. >> then something like >> >> os.system('dosomething {a} {b} {c}'.format(...)) >> >> could be written as ( !cmd here being a special type of f-string that >> does command line escaping, borrowing syntax from another thread a few days >> ago..) >> >> os.sytem(!cmd'dosomething {a} {b} {c}') >> > > sarge.run('do something {0} {1} {2}', a, b, c) is currently supported > (and could/should be stdlib IMHO) > https://sarge.readthedocs.org/en/latest/overview.html#why-not-just-use-subprocess > . > > * (again, sorry) this adds ~subprocess compat to sarge: > https://bitbucket.org/vinay.sajip/sarge/pull-requests/1/enh-add-call-check_call-check_output > ("ENH: Add call, check_call, check_output, CalledProcessError, > expect_returncode") > > >> >> This is both shorter and more resilient to injections. >> Essentially it feels like you annotate a string as "this will be executed >> on the command line" and the interpolation adapts. >> > >> this would make doing the right thing the same as doing the easy thing >> and this would be good overall, no? >> I don't know about you, but i dont know by heart how to escape arbitrary >> user input and deal with all of the corner cases. >> > > So, IPython/Jupyter understands _repr_html_ (_repr_*_) methods, > IDK why we couldn't have e.g. _repr_shell_path_, _repr_shell_cmdarg_, > _repr_sql_sqlite_reserved_keywords_. > > Representing things for an output format which is expressed as a string > but has control characters > in order to separate data and code. > > >> >> yes, you can do this more safely with Popen.. but that is quite a bit >> more effort. 
>> also often times there is no such alternative or it is very unweildy (sql >> land this happens more often) >> > > POSIX exec accepts a tuple (and does not parse ';' or '--'). > > >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ron3200 at gmail.com Tue Aug 11 22:09:42 2015 From: ron3200 at gmail.com (Ron Adam) Date: Tue, 11 Aug 2015 16:09:42 -0400 Subject: [Python-ideas] Outside the box string formatting idea In-Reply-To: References: <55C7D0A9.1010305@mgmiller.net> <55C7FA58.6030302@btinternet.com> Message-ID: On 08/10/2015 01:40 AM, Vito De Tullio wrote: > what about a slightly different "f"? > > def f(*args): > f_args = [] > for arg in args: > if isinstance(arg, tuple): > f_args.append(format(*arg)) > else: > f_args.append(format(arg)) > return ''.join(f_args) > > import datetime > name = 'Fred'; age = 50; anniversary = datetime.date(1991, 10, 12) > print(f('My name is ', name, ', my age next year is ', age+1, > ', my anniversary is ', (anniversary, ':%A, %B %d, %Y'), '.')) > > > My name is Fred, my age next year is 51, my anniversary is :Saturday, > October 12, 1991. What makes this difficult is it's a trinary operation combining three kinds of data. If it was a pure binary operation it would be simple. So the possible relationships are... 1. (string with format codes) + values # % and .format() 2. string + (formated values) 3. (string with values) + format codes 4. (string with both format codes and values) #f-strings 5. string + format codes + values # trinary operation? #2 is interesting. It can already be done by calling format(value, fmt). But maybe it can be improved on. If it had a dedicated operator, it might be nice enough to fill most needs. 
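As a small illustration of relationship #2, here is a sketch that formats each value on its own with the built-in format(value, spec) and then joins the pieces. It reuses the names from the quoted example; the `parts` list is just one assumed way to assemble the result.

```python
import datetime

name = 'Fred'
age = 50
anniversary = datetime.date(1991, 10, 12)

# Each value is formatted separately, then concatenated with plain strings.
parts = [
    'My name is ', name,
    ', my age next year is ', format(age + 1, 'd'),
    ', my anniversary is ', format(anniversary, '%A, %B %d, %Y'),
    '.',
]
message = ''.join(parts)
print(message)
```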
One of the main complaints with .format(...) is the length of the name, and
it adds another level of parentheses.

It can be split into separate fill and format operators and let precedence
handle things to avoid conflicts with passing tuples.

A quick hack...

    class S:
        def __init__(self, value):
            self.value = value

        def __lshift__(self, other):
            """Fill left most {}."""
            return S(str(self.value).replace('{}', str(other), 1))

        def __rfloordiv__(self, obj):
            """Format object with fmt string."""
            return S(format(obj, self.value))

        def __str__(self):
            return str(self.value)


    import datetime
    name = 'Fred'
    age = 50
    anniversary = datetime.date(1991, 10, 12)

    print(S('My name is {}, my age next year is {}, my anniversary is {}.')
          << name << (age+1) << anniversary // S('%A, %B %d, %Y'))

    My name is Fred, my age next year is 51, my anniversary is Saturday, October 12, 1991.

If those methods were added to strings, then it could look like this...

    print('My name is {}, my age next year is {}, my anniversary is {}.'
          << name << (age+1) << anniversary // '%A, %B %d, %Y')

Other operators might be better, but those aren't that commonly used, and I
think they show the intents well.

Cheers,
   Ron

From ericfahlgren at gmail.com Tue Aug 11 22:32:03 2015
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Tue, 11 Aug 2015 13:32:03 -0700
Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with
 marked strings)
In-Reply-To: <20150811112043.676c6038@anarchist.wooz.org>
References: <55C90A3E.1080906@mgmiller.net>
 <55C9D1D2.1070703@nedbatchelder.com>
 <20150811112043.676c6038@anarchist.wooz.org>
Message-ID: <01c601d0d474$c94a77e0$5bdf67a0$@gmail.com>

From: Barry Warsaw [mailto:barry at python.org]
> But the DRY argument is much more compelling to me, and currently I think the
> best way to reduce repetition in function arguments is through sys._getframe()
> and other such nasty tricks.
> I'd really much prefer to see this small annoyance
> fixed in a targeted way than add a hugely complicated new feature that reduces
> readability (IMHO).
>
> Which is why I like the scope() and similar ideas. Something like a built-in that

This +100. I've been doing my own variants on the f-string for over a decade and
every time I type "sys._getframe()" I think to myself, "There's got to be a better way..."

Here's a stripped-down example from Real Life (gah!):

    import re
    import sys

    def formatSolverFunction(self, s, **kwds):
        # Resolve names in the caller's frame.
        localFrame = sys._getframe().f_back
        localDict = localFrame.f_locals
        localDict.update(kwds)  # Our **kwds override locals.
        globalDict = localFrame.f_globals

        #-----------------------------------------------------------------------
        def sub(match):
            # Split "expr:spec" into the expression and optional format spec.
            fmt = match.groups()[0].split(":", 1)
            val = eval(fmt[0], globalDict, localDict)

            if isinstance(val, SimulatableObject):
                val = val.id

            if len(fmt) > 1:
                try:
                    val = val.__format__(fmt[1])
                except ValueError as e:
                    ... error reporting ...
            return str(val)

        # Repeat until no substitutions remain.
        while True:
            s, n = re.subn(r"{([\w.:]*)}", sub, s)  # We don't support !s or !r.
            if n == 0: break
        return s

From srkunze at mail.de Tue Aug 11 23:00:26 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Tue, 11 Aug 2015 23:00:26 +0200
Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with
 marked strings)
In-Reply-To: <01c601d0d474$c94a77e0$5bdf67a0$@gmail.com>
References: <55C90A3E.1080906@mgmiller.net>
 <55C9D1D2.1070703@nedbatchelder.com>
 <20150811112043.676c6038@anarchist.wooz.org>
 <01c601d0d474$c94a77e0$5bdf67a0$@gmail.com>
Message-ID: <55CA626A.9050605@mail.de>

I like the i-string idea because it enables IDE support (syntax,
click-to-definition etc.). Everything else is just a custom workaround (not
saying they are bad, but I like a standardized syntax).

Furthermore, scope() might have some merit on its own.
:) On 11.08.2015 22:32, Eric Fahlgren wrote: > From: Barry Warsaw [mailto:barry at python.org] >> But the DRY argument is much more compelling to me, and currently I think the >> best way to reduce repetition in function arguments is through sys._getframe() >> and other such nasty tricks. I'd really much prefer to see this small annoyance >> fixed in a targeted way than add a hugely complicated new feature that reduces >> readability (IMHO). >> >> Which is why I like the scope() and similar ideas. Something like a built-in that > This +100. I've been doing my own variants on the f-string for over a decade and > every time I type "sys.__getframe()" I think to myself, "There's got to be a better way..." > > Here's a stripped-down example from Real Life (gah!): > >>>> def formatSolverFunction(self, s, **kwds): >>>> localFrame = sys._getframe().f_back >>>> localDict = localFrame.f_locals >>>> localDict.update(kwds) # Our **kwds override locals. >>>> globalDict = localFrame.f_globals >>>> >>>> #----------------------------------------------------------------------- >>>> def sub(match): >>>> fmt = match.groups()[0].split(":", 1) >>>> val = eval(fmt[0], globalDict, localDict) >>>> >>>> if isinstance(val, SimulatableObject): >>>> val = val.id >>>> >>>> if len(fmt) > 1: >>>> try: >>>> val = val.__format__(fmt[1]) >>>> except ValueError as e: >>>> ... error reporting ... >>>> return str(val) >>>> >>>> while True: >>>> s, n = re.subn(r"{([\w.:]*)}", sub, s) # We don't support !s or !r. >>>> if n == 0: break >>>> return s > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From srkunze at mail.de Tue Aug 11 23:26:56 2015 From: srkunze at mail.de (Sven R. 
Kunze) Date: Tue, 11 Aug 2015 23:26:56 +0200 Subject: [Python-ideas] Learning from the shell in supporting asyncio background calls In-Reply-To: References: Message-ID: <55CA68A0.7080501@mail.de> @Nick It seems like, we are not alone in our thinking that asyncio still needs many more convenience wrappers. https://mail.python.org/pipermail/python-list/2015-August/694859.html Same conclusion as yours and mine: "I think python's non blocking I/O is far from being something useful for developers till non-async code can invoke async code transparently. Duplicating all code/libs when you realize that something not fits asyncio is not a solution and even less a pythonic solution." On 12.07.2015 04:48, Nick Coghlan wrote: > On 11 July 2015 at 20:17, Nick Coghlan wrote: >> I'll sleep on that, and if I still like that structure in the morning, >> I'll look at revising my coroutine posts. > I've revised both of my asyncio posts to use this three part helper > API to work with coroutines and the event loop from the interactive > prompt: > > * run_in_foreground > * schedule_coroutine > * call_in_background > > I think the revised TCP echo client and server post is the better of > the two descriptions, since it uses actual network IO operations as > its underlying example, rather than toy background timers: > http://www.curiousefficiency.org/posts/2015/07/asyncio-tcp-echo-server.html > > As with most of the main asyncio API, "run" in this revised setup now > refers specifically to running the event loop. ("run_in_executor" is > still an anomaly, which I now believe might have been better named > "call_in_executor" to align with the call_soon, call_soon_threadsafe > and call_later callback management APIs, rather than the run_* event > loop invocation APIs) > > The foreground/background split is now intended to refer primarily to > "main thread in the main process" (e.g. 
the interactive prompt, the > GUI thread in a desktop application, the main server process in a > network application) vs "worker threads and processes" (whether > managed by the default executor, or another executor passed in > specifically to "call_in_background"). This is much closer in spirit > to the shell meaning. > > The connection that "call_in_background" has to asyncio over using > concurrent.futures directly is that, just like schedule_coroutine, > it's designed to be used in tandem with run_in_foreground (either > standalone, or in combination with asyncio.wait, or asyncio.wait_for) > to determine if the results are available yet. > > Both schedule_coroutine and call_in_background are deliberately > restricted in the kinds of objects they accept - unlike ensure_future, > schedule_coroutine will complain if given an existing future, while > call_in_background will complain immediately if given something that > isn't some kind of callable. > > Regards, > Nick. > From jonathan at slenders.be Wed Aug 12 00:37:01 2015 From: jonathan at slenders.be (Jonathan Slenders) Date: Wed, 12 Aug 2015 00:37:01 +0200 Subject: [Python-ideas] Learning from the shell in supporting asyncio background calls In-Reply-To: <55CA68A0.7080501@mail.de> References: <55CA68A0.7080501@mail.de> Message-ID: Honestly, I don't understand the issue. (Or maybe I'm too tired right now.) But I think asyncio is actually very well designed. We just have to keep in mind that code is either synchronous or asynchronous. Code that consumes too much CPU or does blocking calls does definitely not belong in an event driven system. By the way, the shell equivalent of "&" is definitely a thread, not a coroutine. And "fg" (foreground) equals "thread.join". You were also talking about a Repl. Not sure if this helps, but "ptpython", the REPL that I develop can be embedded into any application as a coroutine. There is no blocking I/O in the Repl. 
https://github.com/jonathanslenders/ptpython/blob/master/examples/asyncio-python-embed.py @nick: About the discussion you are referring to. For solving the producer/consumer problem, the answer is probably to use asyncio Queues (Have a look at the put and get method.) Jonathan 2015-08-11 23:26 GMT+02:00 Sven R. Kunze : > @Nick > > It seems like, we are not alone in our thinking that asyncio still needs > many more convenience wrappers. > > https://mail.python.org/pipermail/python-list/2015-August/694859.html > > Same conclusion as yours and mine: > > "I think python's non blocking I/O is far from being something useful for > developers till non-async code can invoke async code transparently. > Duplicating all code/libs when you realize that something not fits asyncio > is not a solution and even less a pythonic solution." > > > > On 12.07.2015 04:48, Nick Coghlan wrote: > >> On 11 July 2015 at 20:17, Nick Coghlan wrote: >> >>> I'll sleep on that, and if I still like that structure in the morning, >>> I'll look at revising my coroutine posts. >>> >> I've revised both of my asyncio posts to use this three part helper >> API to work with coroutines and the event loop from the interactive >> prompt: >> >> * run_in_foreground >> * schedule_coroutine >> * call_in_background >> >> I think the revised TCP echo client and server post is the better of >> the two descriptions, since it uses actual network IO operations as >> its underlying example, rather than toy background timers: >> >> http://www.curiousefficiency.org/posts/2015/07/asyncio-tcp-echo-server.html >> >> As with most of the main asyncio API, "run" in this revised setup now >> refers specifically to running the event loop. 
("run_in_executor" is >> still an anomaly, which I now believe might have been better named >> "call_in_executor" to align with the call_soon, call_soon_threadsafe >> and call_later callback management APIs, rather than the run_* event >> loop invocation APIs) >> >> The foreground/background split is now intended to refer primarily to >> "main thread in the main process" (e.g. the interactive prompt, the >> GUI thread in a desktop application, the main server process in a >> network application) vs "worker threads and processes" (whether >> managed by the default executor, or another executor passed in >> specifically to "call_in_background"). This is much closer in spirit >> to the shell meaning. >> >> The connection that "call_in_background" has to asyncio over using >> concurrent.futures directly is that, just like schedule_coroutine, >> it's designed to be used in tandem with run_in_foreground (either >> standalone, or in combination with asyncio.wait, or asyncio.wait_for) >> to determine if the results are available yet. >> >> Both schedule_coroutine and call_in_background are deliberately >> restricted in the kinds of objects they accept - unlike ensure_future, >> schedule_coroutine will complain if given an existing future, while >> call_in_background will complain immediately if given something that >> isn't some kind of callable. >> >> Regards, >> Nick. >> >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jonathan at slenders.be Wed Aug 12 00:59:26 2015 From: jonathan at slenders.be (Jonathan Slenders) Date: Wed, 12 Aug 2015 00:59:26 +0200 Subject: [Python-ideas] Learning from the shell in supporting asyncio background calls In-Reply-To: <55CA68A0.7080501@mail.de> References: <55CA68A0.7080501@mail.de> Message-ID: > "I think python's non blocking I/O is far from being something useful for > developers till non-async code can invoke async code transparently. > Duplicating all code/libs when you realize that something not fits asyncio > is not a solution and even less a pythonic solution." About this. I think I absolutely understand the difficulties, but in reality the "problem" is broader. Code is always written using certain paradigms. And Python allows a lot of these, so depending on the background, use case and interests of the author, he decides what paradigm to use. These days, we have functional code, reactive code, imperative, declarative, object oriented, event-driven, etc... All paradigms have different answers to questions like "Where do we keep our state?" (the variables), "Where do we do our I/O?", "how do we reuse our code?", "how do we write our logic?". Not everything is compatible. Mixing two styles can sometimes be very ugly and confusing. I guess the question is more about finding the right glue to put things together, without forcing code to fit into another paradigm. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at slenders.be Wed Aug 12 01:22:41 2015 From: jonathan at slenders.be (Jonathan Slenders) Date: Wed, 12 Aug 2015 01:22:41 +0200 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: <55CA2D11.5080201@mail.de> References: <55C90A3E.1080906@mgmiller.net> <55C9A5F9.1030501@mail.de> <55CA2D11.5080201@mail.de> Message-ID: Not exactly. 
Take this string for instance: f'hello {name}' And our FString implementation, very simple: class FString(str): def __init__(self, value, **kwargs): super().__init__(value.format(**self.kwargs)) self.value = value self.kwargs = kwargs What the above f-string should do is create an instance of that class. This is just a compiler detail. A preprocessor step. Like this: FString('hello {name}', name=str(name)) FString is just an str instance, it has the actual interpolated value, but it still contains the original uninterpolated string and all parameters (as strings as well.) Now, what gettext can do, if we would wrap this string in the underscore function, is take the "value" attribute from this string FString, translate that, and apply the interpolation again. This way, we are completely compatible with the format() call. There is no need at all for using globals/locals or _getframe(). The name bindings are static, this is lintable. Please tell me if I'm missing something. 2015-08-11 19:12 GMT+02:00 Sven R. Kunze : > I actually thought this was about a two-step process using lazy evaluation. > > This way {name:i18n} or {name:later} basically marks lazy evaluation. > > But as it seems, i'...' is more supposed to do all (translation + > formatting) of this at once. My fault, sorry. > > > On 11.08.2015 10:35, Petr Viktorin wrote: > >> On Tue, Aug 11, 2015 at 9:36 AM, Sven R. Kunze wrote: >> >>> Also bare with me but couldn't i18n not just be another format spec? >>> >>> i'Hello there, {name:i18n} {age}.' >>> >> Usually it's not the substitutions that you need to translate, but the >> surrounding text. >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alex.gronholm at nextday.fi Wed Aug 12 01:22:49 2015 From: alex.gronholm at nextday.fi (Alex Grönholm) Date: Wed, 12 Aug 2015 02:22:49 +0300 Subject: [Python-ideas] Making concurrent.futures.Futures awaitable In-Reply-To: References: <55C4E22D.101@nextday.fi> <55C5FA72.7030303@nextday.fi> Message-ID: <55CA83C9.6050900@nextday.fi> On 09.08.2015, 03:22, Nick Coghlan wrote: > > > On 8 Aug 2015 22:48, "Alex Grönholm" > wrote: > > > > That name and argument placement would be better, but are you > suggesting that the ability to pass along extra arguments should be > removed? The original method was bad enough in that it only supported > positional and not keyword arguments, forcing users to pass partial() > objects as callables. > > That's a deliberate design decision in many of asyncio's APIs to > improve the introspection capabilities and to clearly separate > concerns between "interacting with the event loop" and "the operation > being dispatched for execution". > While I won't pretend to understand what this means, I recognize that you've given it considerably more thought than I have. > > >> With the suggested change to the method name and signature, the same > >> example would instead look like: > >> > >> async def handler(self): > >> loop = asyncio.get_event_loop() > >> result = await > >> loop.call_in_background(some_blocking_api.some_blocking_call) > >> await self.write(result) > > > > Am I the only one who's bothered by the fact that you have to get a > reference to the event loop first? > > Wouldn't this be better: > > > > async def handler(self): > > > > result = await > asyncio.call_in_background(some_blocking_api.some_blocking_call) > > > > await self.write(result) > > That was my original suggestion a few weeks ago, but after playing > with it for a while, I came to agree with Guido that hiding the event > loop in this case likely wasn't helpful to the conceptual learning > process.
Outside higher level frameworks that place more constraints > on your code, you really can't get very far with asyncio without > becoming comfortable with interacting with the event loop directly. > As long as I can still write a high level framework where boilerplate is minimized in user code, I can "yield" on this issue. > > I gave a demo using the current spelling as a lightning talk at PyCon > Australia last weekend: > https://www.youtube.com/watch?v=_pfJZfdwkgI > > The only part of that demo I really wasn't happy with was the > "run_in_executor" call - the rest all felt good for the level asyncio > operates at, while still allowing higher level third party APIs that > hide more of the underlying machinery (like the event loop itself, as > well as the use of partial function application). > > > > > > The call_in_background() function would return an awaitable object > that is recognized by the asyncio Task class, which would then submit > the function to the default executor of the event loop. > > > >> That should make sense to anyone reading the handler, even if they > >> know nothing about concurrent.futures - the precise mechanics of how > >> the event loop goes about handing off the call to a background thread > >> or process is something they can explore later, they don't need to > >> know about it in order to locally reason about this specific handler. > >> > >> It also means that event loops would be free to implement their > >> *default* background call functionality using something other than > >> concurrent.futures, and only switch to the latter if an executor was > >> specified explicitly. > > > > Do you mean background calls that don't return objects compatible > with concurrent.futures.Futures? > > A background call already returns an asyncio awaitable, not a > concurrent.futures.Future object. > > > Can you think of a use case for this? 
> > Yes, third party event loops like Twisted may have their own > background call mechanism that they'd prefer to use by default, rather > than the concurrent.futures model. > What I don't get is why you say that this name and signature change would somehow enable event loops to implement an alternative mechanism for background calls. By event loops do you mean something like Twisted's reactors or just customized versions of asyncio event loops? To me, the former makes no sense at all and with the latter, I don't see how this name and signature change changes anything. Could they not already use whatever mechanism they please as long as it returns an awaitable (or iterable in the case of 3.4 or earlier) object, by having their custom implementation of run_in_executor()? > > >> There are still some open questions about whether it makes sense to > >> allow callables to indicate whether or not they expect to be IO bound > >> or CPU bound, > > > > What do you mean by this? > > There was a thread on the idea recently, but I don't have a link > handy. Indicating CPU vs IO bound directly wouldn't work (that's > context dependent), but allowing callables to explicitly indicate > "recommended", "supported", "incompatible" for process pools could be > interesting. > Yeah -- it'll be interesting to see where that goes. > > >> and hence allow event loop implementations to opt to > >> dispatch the latter to a process pool by default > > > > Bad idea! The semantics are too different and process pools have too > many limitations. > > Yes, that's why I find it an intriguing notion to allow callables to > explicitly indicate whether or not they're compatible with them. > > Cheers, > Nick > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Nikolaus at rath.org Wed Aug 12 04:33:18 2015 From: Nikolaus at rath.org (Nikolaus Rath) Date: Tue, 11 Aug 2015 19:33:18 -0700 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: <20150811112043.676c6038@anarchist.wooz.org> (Barry Warsaw's message of "Tue, 11 Aug 2015 11:20:43 -0400") References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> Message-ID: <8737zp8ccx.fsf@vostro.rath.org> On Aug 11 2015, Barry Warsaw wrote: > The complicated examples of f-strings I've seen really give me the shudders. > Maybe in practice it won't be so bad, but it's definitely true that if it can > be done, someone will do it. So I expect to see "abuses" of them in > the wild. [...] > Which is why I like the scope() and similar ideas. Something like a built-in > that provides you with a ChainMap of the current namespaces in effect. The > tricky bit is that you still need something like _getframe()'s depth argument, > or perhaps the object returned by scope() -or whatever it's called- would have > links back to the namespaces of earlier call frames. You mean instead of allowing expressions inside strings, you want to make it easier for functions to mess with their caller's scope?

    def test():
        x = 3
        print(x)  # --> 3
        increase_my_x()
        print(x)  # --> 4

    def increase_my_x():
        scope(depth=-1)['x'] += 1

Somehow I think the risk of abuse here is much higher than with expression strings. At least their effects are local. Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana." -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 997 bytes Desc: not available URL: From abarnert at yahoo.com Wed Aug 12 05:06:20 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 11 Aug 2015 20:06:20 -0700 Subject: [Python-ideas] fork In-Reply-To: <20150811135429.48F0C8016E@smtp04.mail.de> References: <991E71A1-FDB3-4BE5-B607-B74850903DCD@yahoo.com> <20150811135429.48F0C8016E@smtp04.mail.de> Message-ID: <4A86012D-89FB-4A2E-A0D4-48D9FF09FE33@yahoo.com> On Aug 11, 2015, at 06:54, Sven R. Kunze wrote: > > Co-workers proposed using function scopes as the ultimate evaluation scope. That is when a function returns a ResultProxy, it gets evaluated. However, I have absolutely no idea how to do this as I couldn't find any __returned__ hook or something. I'm not sure I completely understand what you're looking for here. If you just want a hook that gets called whenever a function returns, just write a decorator that calls the real function then does the hook thing:

    from functools import wraps

    def hookify(func):
        @wraps(func)  # wraps needs the wrapped function as its argument
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            do_hook_stuff()  # placeholder for whatever the hook should do
            return result
        return wrapper

(Or, if you want to hook both raising and returning, use a finally.) But I'm not sure what good that would do anyway. If you unwrap futures every time they're returned, they're not doing anything useful as futures in the first place; you might as well just return the values directly. From abarnert at yahoo.com Wed Aug 12 05:33:08 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 11 Aug 2015 20:33:08 -0700 Subject: [Python-ideas] fork In-Reply-To: <20150811143305.A565780181@smtp04.mail.de> References: <991E71A1-FDB3-4BE5-B607-B74850903DCD@yahoo.com> <20150811143305.A565780181@smtp04.mail.de> Message-ID: <0E21D339-0A01-4A0E-B2C7-2996518B4686@yahoo.com> On Aug 11, 2015, at 07:33, Sven R. Kunze wrote: > > On 05-Aug-2015 16:30:27 +0200, abarnert at yahoo.com wrote: > > > What does that even mean? How would you not allow races?
If you let people throw arbitrary tasks at a thread pool, with no restriction on mutable shared state, you've allowed races. > > Let me answer this in a more implicit way. > > Why do we need to mark global variables as such? > I think the answer is clear: to mark side-effects (quoting the docs). > > Why are all variables thread-shared by default? > I don't know, maybe efficiency reasons but that hardly apply to Python in the first place. First, are you suggesting that your idea doesn't make sense unless Python is first modified to not have shared variables? In that case, it doesn't seem like a very useful proposal, because it applies to some different language that isn't Python. And applying it to Python instead means you're still inviting race conditions. Pointing out that in a different language those races wouldn't exist is not really an answer to that. Second, the reason for the design is that that's what threads mean, by definition: things that are like processes except that they share the same heap and other global state. What's the point of a proposal that lets people select between threads and processes if its threads aren't actually threads? Finally, just making variables thread-local wouldn't help. You'd need a completely separate heap for each thread; otherwise, just passing a list to another thread means it can modify your values. And if you make a separate heap for each thread, what happens when you do x[0]=y if x is local and y shared, or vice-versa? You could build a whole shared-memory API and/or message-passing API a la the multiprocessing module, but if that's an acceptable solution, what's stopping you from using multiprocessing in the first place? (If you're going to say "not every message can be pickled", consider how you could deep-copy an object that can't be pickled.) Of course there's no reason that you couldn't implement something that's basically a process at the abstract level, but implemented with threads at the OS level.
And that could make both explicit shared memory and IPC simpler at least under the covers, and more efficient. And it could lead to a way to eliminate the GIL. And there could be other benefits as well. That's why people are exploring things like the recent subinterpreters thread, PyParallel, PyPy+STM, etc. If this were an easy problem, it would have been solved by now. (Well, it _has_ been solved for different classes of languages--pure-immutable languages can share with impunity; languages designed from ground up for message passing can get away with only message passing; etc. But that doesn't help for Python.) > > And that's exactly the problem. What makes concurrent code with shared state hard, more than anything else, is people who don't realize what's hard about it and write code that seems to work but doesn't. > > Precisely because 'shared state' is hard, why is it the default? The default is to write sequential code. You have to go out of your way to use threads. And when you do, you have to intentionally choose threads over processes or some kind of microthreads. It's only when you've chosen to use shared-memory threading as the design for your app that shared memory becomes the default. > > Making it easier for such people to write broken code without even realizing they're doing so is not a good thing. > > That argument only applies when the broken code (using shared states) is the default. But that is the default in Python, so your proposal would make it easier for such people to write broken code without even realizing they're doing so, so it's not a good thing. 
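To make the earlier point about passing a list to another thread concrete, here is a minimal sketch using only the standard library. The helper names `append_marker` and `run_in_thread` are made up for illustration and are not part of any proposal in this thread. The worker is joined before the results are read, so the output does not depend on timing; the sketch shows that the heap is shared, not a race itself.

```python
import copy
import threading


def append_marker(values):
    # Runs in a worker thread, but mutates whatever object it was handed.
    values.append("changed by worker")


def run_in_thread(func, arg):
    # Hypothetical helper: run func(arg) in a thread and wait for it to finish.
    worker = threading.Thread(target=func, args=(arg,))
    worker.start()
    worker.join()


# Threads share one heap, so the caller sees the worker's change.
shared = [1, 2, 3]
run_in_thread(append_marker, shared)

# Handing over a deep copy is roughly what a process boundary forces on you:
# the worker only ever touches its own copy.
isolated = [1, 2, 3]
run_in_thread(append_marker, copy.deepcopy(isolated))

print(shared)
print(isolated)
```

Copying every argument is essentially what multiprocessing does through pickling, which is why isolation of this kind comes at the cost of the sharing that makes threads attractive.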
From abarnert at yahoo.com Wed Aug 12 06:00:46 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 11 Aug 2015 21:00:46 -0700 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: References: <55C90A3E.1080906@mgmiller.net> <55C9A5F9.1030501@mail.de> <55CA2D11.5080201@mail.de> Message-ID: <3803DDD5-E236-4472-A0DB-0558942C37AD@yahoo.com> On Aug 11, 2015, at 16:22, Jonathan Slenders wrote: > > Not exactly. > > Take this string for instance: > f'hello {name}' > > And our FString implementation, very simple: > > class FString(str): > def __init__(self, value, **kwargs): > super().__init__(value.format(**self.kwargs)) > self.value = value > self.kwargs = kwargs > > What the above f-string should do is create an instance of that class. This is just a compiler detail. A preprocessor step. Like this: > > FString('hello {name}', name=str(name)) > > FString is just an str instance, it has the actual interpolated value, but it still contains the original uninterpolated string and all parameters (as strings as well.) So you want every i18n string to interpolate the string, ignore that, look up the raw string, re-interpolate that, and somehow modify the string to hold the new l10n+interpolated value instead? Besides the performance cost of interpolating every string twice for no reason, and the possibility of irrelevant errors popping up while doing so, it's also impossible given that strings are immutable. (That also means you have to use a __new__ rather than __init__, by the way, but that's just a minor quibble.) Also, what happens if the translated string uses a variable that the original string didn't? For example, maybe your English string uses {Salutation} and {Last Name}, but your Chinese string has no need for a salutation, and your Icelandic string only uses the first name. People don't do that very often, but that's partly because many i18n systems are too inflexible to handle it. 
Python's str.format makes it relatively easy to build something that is flexible enough. Nick's proposal and Barry's both are. This one isn't. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Wed Aug 12 06:02:55 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 11 Aug 2015 21:02:55 -0700 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: <8737zp8ccx.fsf@vostro.rath.org> References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> Message-ID: <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> On Aug 11, 2015, at 19:33, Nikolaus Rath wrote: > >> On Aug 11 2015, Barry Warsaw wrote: >> The complicated examples of f-strings I've seen really give me the shudders. >> Maybe in practice it won't be so bad, but it's definitely true that if it can >> be done, someone will do it. So I expect to see "abuses" of them in >> the wild. > [...] >> Which is why I like the scope() and similar ideas. Something like a built-in >> that provides you with a ChainMap of the current namespaces in effect. The >> tricky bit is that you still need something like _getframe()'s depth argument, >> or perhaps the object returned by scope() -or whatever it's called- would have >> links back to the namespaces of earlier call frames. > > You mean instead of allowing expressions inside strings, you want to > make it easier for functions to mess with their callers scope? I think he was proposing an immutable mapping (or at worst one that is mutable, but is or at least may be a detached copy, a la locals()). And if he wasn't, it's trivial to change his proposal into one using immutable mappings. Which still retains all the benefits for string formatting, and doesn't have the problem you raised.
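A rough sketch of what a read-only scope() could look like, built from pieces that already exist in CPython. This is only an illustration under stated assumptions: the name `scope`, its `depth` parameter (0 meaning the immediate caller) and the choice to snapshot the locals are mine, not something from Barry's proposal.

```python
import sys
from collections import ChainMap
from types import MappingProxyType


def scope(depth=0):
    """Hypothetical helper: read-only view of the caller's namespaces."""
    # sys._getframe is CPython-specific; +1 skips scope()'s own frame.
    frame = sys._getframe(depth + 1)
    # dict(...) takes a detached copy of the locals, a la locals().
    namespaces = ChainMap(dict(frame.f_locals), frame.f_globals, frame.f_builtins)
    # The proxy rejects item assignment, so callers cannot be modified through it.
    return MappingProxyType(namespaces)


def greet():
    name = "world"
    # Lookups work, which is all that string formatting needs.
    return "hello {name}".format_map(scope())


def try_to_rebind():
    x = 3
    try:
        scope()["x"] = 4  # mappingproxy raises TypeError on assignment
    except TypeError:
        pass
    return x
```

With this, greet() can format from its own locals while try_to_rebind() leaves x untouched, which is the distinction being drawn above. Mutable objects reachable through the mapping can of course still be changed in place.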
From guido at python.org Wed Aug 12 07:47:53 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 12 Aug 2015 07:47:53 +0200 Subject: [Python-ideas] [Python-Dev] PEP-498: Literal String Formatting In-Reply-To: References: <55C55DC3.8040605@trueblade.com> <55C79A73.1030901@trueblade.com> <20150810172631.GN3737@ando.pearwood.info> <20150810143127.66c5f842@anarchist.wooz.org> <55CA1030.3060808@sdamon.com> <47D74296-BA36-4956-B67A-126148201B47@gmail.com> Message-ID: Wes, I don't know you, but your contributions to this thread are adding more noise than light. I am not the only one who is exasperated at many of your posts. Please stop. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Aug 12 08:50:40 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 12 Aug 2015 08:50:40 +0200 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> Message-ID: On Wed, Aug 12, 2015 at 6:02 AM, Andrew Barnert via Python-ideas < python-ideas at python.org> wrote: > On Aug 11, 2015, at 19:33, Nikolaus Rath wrote: > > > >> On Aug 11 2015, Barry Warsaw < > barry-+ZN9ApsXKcEdnm+yROfE0A at public.gmane.org> wrote: > >> The complicated examples of f-strings I've seen really give me the > shudders. > >> Maybe in practice it won't be so bad, but it's definitely true that if > it can > >> be done, someone will do it. So I expect to see "abuses" of them in > >> the wild. > > [...] > >> Which is why I like the scope() and similar ideas. Something like a > built-in > >> that provides you with a ChainMap of the current namespaces in effect. 
> The > >> tricky bit is that you still need something like _getframe()'s depth > argument, > >> or perhaps the object returned by scope() -or whatever it's called- > would have > >> links back to the namespaces of earlier call frames. > > > > You mean instead of allowing expressions inside strings, you want to > > make it easier for functions to mess with their callers scope? > > I think he was proposing an immutable mapping (or at worst one that is > mutable, but is or at least may be detached copy, a la locals()). > > And if he wasn't, it's trivial to change his proposal into one using > immutable mappings. Which still retains all the benefits for string > formatting, and does have the problem you raised. > If I understand the proposal for scope() correctly, it's just a cleverer way to spell locals() etc.[1] and that means I don't want it to play any role in the string formatting proposal. It also has the same problems as locals(), sys._getframe(), etc., which is that their presence makes certain optimizations harder (in IronPython IIRC the creation of frame objects is normally skipped to speed up function calls, but the optimizer must detect the presence of those functions in order to disable that optimization). That doesn't mean I'm opposed to it (I don't have a problem with locals()), but it does mean that I think their use should probably not be encouraged. TBH I'm sorry Barry, but whenever someone uses DRY as a rallying cry I get a bad taste in my mouth. The solutions that are then proposed are too often uglier than the problem. (So I'm glad PEP 498 doesn't mention DRY. :-) [1] I know it's not just locals(), but it's too much of a mouthful to give the full definition. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed Aug 12 09:30:14 2015 From: srkunze at mail.de (Sven R.
Kunze) Date: Wed, 12 Aug 2015 09:30:14 +0200 Subject: [Python-ideas] PEP 501 - i18n with marked strings In-Reply-To: References: <55C90A3E.1080906@mgmiller.net> <55C9A5F9.1030501@mail.de> <55CA2D11.5080201@mail.de> Message-ID: <55CAF606.5050503@mail.de> I think I understood that. How can I differentiate between {variables} to be translated and variables not to be translated? I thought this was the intention of Mike's idea: unifying both i and f as they are orthogonal to each other. As I don't like the $ so much, I proposed using {...} as well with a special marker i18n or something. That could be completely useless, I am unsure. On 12.08.2015 01:22, Jonathan Slenders wrote: > Not exactly. > > Take this string for instance: > > f'hello {name}' > > > And our FString implementation, very simple: > > class FString(str): > def __init__(self, value, **kwargs): > super().__init__(value.format(**self.kwargs)) > > self.value = value > > self.kwargs = kwargs > > > What the above f-string should do is create an instance of that class. > This is just a compiler detail. A preprocessor step. Like this: > > FString('hello {name}', name=str(name)) > > > FString is just an str instance, it has the actual interpolated value, > but it still contains the original uninterpolated string and all > parameters (as strings as well.) > > Now, what gettext can do, if we would wrap this string in the > underscore function, is take the "value" attribute from this string > FString, translate that, and apply the interpolation again. > > This way, we are completely compatible with the format() call. There > is no need at all for using globals/locals or _getframe(). The name > bindings are static, this is lintable. > > Please tell me if I'm missing something. > > > 2015-08-11 19:12 GMT+02:00 Sven R. Kunze >: > > I actually thought this was about a two-step process using lazy > evaluation. > > This way {name:i18n} or {name:later} basically marks lazy evaluation. > > But as it seems, i'...' 
is more supposed to do all (translation + > formatting) of this at once. My fault, sorry. > > > On 11.08.2015 10:35, Petr Viktorin wrote: > > On Tue, Aug 11, 2015 at 9:36 AM, Sven R. Kunze > > wrote: > > Also bare with me but couldn't i18n not just be another > format spec? > > i'Hello there, {name:i18n} {age}.' > > Usually it's not the substitutions that you need to translate, > but the > surrounding text. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Wed Aug 12 10:29:02 2015 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 12 Aug 2015 01:29:02 -0700 (PDT) Subject: [Python-ideas] PEP 487 Message-ID: Has there been any progress with PEP 487? I am finding myself writing a lot of boilerplate because of Python's so-called "metaclass hell". What are the problems with PEP 487? Best, Neil -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Wed Aug 12 14:25:34 2015 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 12 Aug 2015 21:25:34 +0900 Subject: [Python-ideas] PEP 487 In-Reply-To: References: Message-ID: <871tf8y9q9.fsf@uwakimon.sk.tsukuba.ac.jp> Neil Girdhar writes: > Has there been any progress with PEP 487? I am finding myself > writing a lot of boilerplate because of Python's so-called > "metaclass hell". What are the problems with PEP 487? According to the PEP, the code (metaclass, now in version 1.1 on PyPI) is in a module on PyPI and in successful use at least by the author. The PEP proposes adding it to the stdlib after experience with the PyPI module. @Martin: The section Connections to Other PEP begins, "This is a competing proposal to PEP 422". 
I suggest some kind of clarification that PEP 422 has been withdrawn. Something like "This is a competing proposal to PEP 422 (withdrawn in favor of this PEP) ...", or even just adding "(withdrawn)". (I know the header says "replaces", which is why I find the current wording confusing.) Steve From scott.b.sanderson90 at gmail.com Wed Aug 12 15:57:03 2015 From: scott.b.sanderson90 at gmail.com (Scott Sanderson) Date: Wed, 12 Aug 2015 09:57:03 -0400 Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues) Message-ID: Hi All, Occasionally I find myself wanting to unpack the values of a dictionary into local variables of a function. This most often occurs when marshalling values to/from some serialization format. For example: def do_stuff_from_json(json_dict): actual_dict = json.loads(json_dict) foo = actual_dict['foo'] bar = actual_dict['bar'] # Do stuff with foo and bar. In the same spirit as allowing argument unpacking into tuples or lists, what I'd really like to be able write is something like: def do_stuff_from_json(json_dict): # Assigns variables in the **values** of the lefthand side by doing lookups # of the corresponding keys in the result of the righthand side expression. {'foo': foo, 'bar': bar} = json.loads(json_dict) Nearly all the arguments in favor of tuple/list unpacking also apply to this construct. In particular: 1. It makes the code more self-documenting, in that the left side of the expression looks more like the expected output of the right side. 2. The construct can be implemented more efficiently by the interpreter by using a dictionary analog of the UNPACK_SEQUENCE opcode (e.g. UNPACK_MAP). An interesting question that falls out of this idea is whether/how we should handle nested structures. 
I'd expect the rule to be that something like:

    {'toplevel': {'key1': key1, 'key2': key2}} = value

would desugar into something equivalent to:

    TEMP = value['toplevel']
    key1 = TEMP['key1']
    key2 = TEMP['key2']
    del TEMP

while something like

    {'toplevel': (x, y)} = value

would desugar into something like:

    (x, y) = value['toplevel']

At the bytecode level, I'd expect this to be implemented with a new instruction, analogous to the current UNPACK_SEQUENCE, which would pop N keys and a map from the stack, and push map[key] onto the stack for each popped key. We'd then recurse through the values left on the stack, storing them as we would store the sub-lvalues if they were in a standard assignment. Thus the code for something like:

    {'name': name, 'tuple': (x, y), 'dict': {'subkey': subvalue}} = values

would translate into the following "pseudo-bytecode":

    LOAD_NAME 'values'    # Push rvalue onto the stack.
    LOAD_CONST 'dict'     # Push top-level keys onto the stack.
    LOAD_CONST 'tuple'
    LOAD_CONST 'name'
    UNPACK_MAP 3          # Unpack keys. Pops values and all keys from the stack.
                          # TOS = values['name']
                          # TOS1 = values['tuple']
                          # TOS2 = values['dict']
    STORE_FAST name       # Terminal names are simply stored.
    UNPACK_SEQUENCE 2     # Push the two entries in values['tuple'] onto the stack.
                          # TOS = values['tuple'][0]
                          # TOS1 = values['tuple'][1]
                          # TOS2 = values['dict']
    STORE_FAST x
    STORE_FAST y
    LOAD_CONST 'subkey'   # TOS = 'subkey'
                          # TOS1 = values['dict']
    UNPACK_MAP 1          # TOS = values['dict']['subkey']
    STORE_FAST subvalue

I'd be curious to hear others' thoughts on whether this seems like a reasonable idea. One open question is whether non-literals should be allowed as keys in dictionaries (the above still works as expected if the keys are allowed to be names or expressions; the LOAD_CONSTs would turn into whatever expression or LOAD_* is necessary to put the necessary value on the stack).
Another question is if/how we should handle extra keys on the right-hand side
of the assignment (my guess is that we shouldn't do anything special with that
case).

-Scott

P.S. I attempted to post this last night, but it seems to have not gone
through. Apologies for the double post if I'm mistaken about that.

From random832 at fastmail.us Wed Aug 12 17:41:18 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 12 Aug 2015 11:41:18 -0400
Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)
In-Reply-To:
References:
Message-ID: <1439394078.2428143.354454697.20039686@webmail.messagingengine.com>

On Wed, Aug 12, 2015, at 09:57, Scott Sanderson wrote:
> def do_stuff_from_json(json_dict):
>     # Assigns variables in the **values** of the lefthand side by doing lookups
>     # of the corresponding keys in the result of the righthand side expression.
>     {'foo': foo, 'bar': bar} = json.loads(json_dict)

How about:

    key = 'foo'
    key2 = 'bar'
    {key: value, key2: value2, **rest} = json.loads(json_dict)

From joejev at gmail.com Wed Aug 12 17:46:10 2015
From: joejev at gmail.com (Joseph Jevnik)
Date: Wed, 12 Aug 2015 11:46:10 -0400
Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)
In-Reply-To: <1439394078.2428143.354454697.20039686@webmail.messagingengine.com>
References: <1439394078.2428143.354454697.20039686@webmail.messagingengine.com>
Message-ID:

From a language design standpoint I think that having non-constant keys in
the unpack map makes a lot of sense. As far as implementation, I would imagine
that using non-constant expressions for the keys should be fine. If you look
at the proposed implementation, the UNPACK_MAP instruction just wants the
stack to have N values on the stack, it shouldn't matter how they got there.
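For comparison, the effect of the `{key: value, key2: value2, **rest}`
spelling with non-constant keys can be approximated today with a small
helper. This is a sketch under stated assumptions: take_keys is a
hypothetical name, and raising KeyError on a missing key is one possible
choice that the thread does not settle.

```python
import json


def take_keys(mapping, *keys):
    """Return the values for ``keys`` in the order given, plus a dict of the rest."""
    values = [mapping[key] for key in keys]   # KeyError if a key is missing
    rest = {k: v for k, v in mapping.items() if k not in keys}
    return tuple(values) + (rest,)


# The keys are ordinary runtime values, as in random832's example.
key = 'foo'
key2 = 'bar'
value, value2, rest = take_keys(
    json.loads('{"foo": 1, "bar": 2, "baz": 3}'), key, key2)
```

The helper needs no new syntax, but the targets and the keys are listed
separately, which is the duplication the proposed dict display would remove.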
From barry at python.org Wed Aug 12 18:06:27 2015
From: barry at python.org (Barry Warsaw)
Date: Wed, 12 Aug 2015 12:06:27 -0400
Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings)
References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com>
Message-ID: <20150812120627.0efc2b79@anarchist.wooz.org>

On Aug 12, 2015, at 08:50 AM, Guido van Rossum wrote:

>> I think he was proposing an immutable mapping

FTR, yes. Immutable reads are all that's required for i18n.

>It also has the same problems as locals(), sys._getframe(), etc., which is
>that their presence makes certain optimizations harder (in IronPython IIRC
>the creation of frame objects is normally skipped to speed up function
>calls, but the optimizer must detect the presence of those functions in
>order to disable that optimization). That doesn't mean I'm opposed to it (I
>don't have a problem with locals()), but it does mean that I think their
>use should probably not be encouraged.
I'm much less concerned about the performance impact that the loss of
optimization brings, because I think i18n is already generally slower... and
that's okay! I mean _() has to at least do a dictionary lookup (assuming the
catalog is warmed in memory) and then a piece-wise interpolation into the
resulting translated string. So you're already paying a runtime penalty to do
i18n.

>TBH I'm sorry Barry, but whenever someone uses DRY as a rallying cry I get a
>bad taste in my mouth. The solutions that are then proposed are too often
>uglier than the problem. (So I'm glad PEP 498 doesn't mention DRY. :-)

I can appreciate that. It reminds me of the days of Python before keyword
arguments. Remember the fun we had with tkinter back then? :)

i18n is one of those places where DRY really is a limiting factor. You just
can't force coders to pass in all the arguments to their translated strings,
say into the _() function. The code looked horrible, it's way too much typing,
and people (well, *I* ;) just won't do it. After implementing the
sys._getframe() hack, it made i18n just so much more pleasant and easy to
write, you almost couldn't not do it.

One of the things that intrigues me about this whole idea of syntactic and
compiler support is the ability to narrow down the set of substitution values
available for interpolation, by parsing the source string and passing them
into the interpolation call.

Currently, _() is forced to expose all of locals and globals to interpolation,
although I guess it could also parse out the $-placeholders in the source
string too[1]. Not doing this does open an information leak vector via
maliciously translated strings. If the source string were parsed and *only*
those names were available for interpolation, a maliciously translated string
couldn't be used to expose additional information because the keys in the
interpolation dictionary would be limited.
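The narrowing described above can be sketched with the standard library's
string.Template. This is an illustration only, not how flufl.i18n works:
extract_placeholders and restricted_substitute are hypothetical helper names,
and the namespace is passed in explicitly rather than pulled from the calling
frame.

```python
import string


def extract_placeholders(source):
    """Return the $-placeholder names used in ``source``, in order of appearance."""
    names = []
    for match in string.Template.pattern.finditer(source):
        # 'named' matches $name, 'braced' matches ${name}; '$$' matches neither.
        name = match.group('named') or match.group('braced')
        if name is not None and name not in names:
            names.append(name)
    return names


def restricted_substitute(source, translated, namespace):
    """Interpolate ``translated`` using only the names that ``source`` mentions."""
    allowed = {name: namespace[name]
               for name in extract_placeholders(source)
               if name in namespace}
    return string.Template(translated).safe_substitute(allowed)


namespace = {'name': 'Ann', 'password': 'hunter2'}
honest = restricted_substitute('Hello $name', 'Bonjour $name', namespace)
hostile = restricted_substitute('Hello $name', 'Bonjour $name ($password)', namespace)
```

Because the substitution dictionary is built from the source string, the
extra placeholder smuggled into the hostile translation is left as literal
text instead of being filled in from the namespace.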
This mythical scope() could take arguments which would name the variables in
the enclosing scopes it should export. It would still be a PITA if used
explicitly, but could work nicely if i-strings essentially boiled down to:

    placeholders = source_string.extract_placeholders()
    substitutions = scope(*placeholders)
    translated_string = i18n.lookup(source_string)
    return translated_string.safe_substitute(substitutions)

That would actually be quite useful.

Cheers,
-Barry

[1] https://gitlab.com/warsaw/flufl.i18n/issues/1

From rosuav at gmail.com Wed Aug 12 18:10:13 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 13 Aug 2015 02:10:13 +1000
Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)
In-Reply-To:
References:
Message-ID:

On Wed, Aug 12, 2015 at 11:57 PM, Scott Sanderson wrote:
> LOAD_NAME 'values'   # Push rvalue onto the stack.
> LOAD_CONST 'dict'    # Push top-level keys onto the stack.
> LOAD_CONST 'tuple'
> LOAD_CONST 'name'
> UNPACK_MAP 3         # Unpack keys. Pops values and all keys from the stack.
>                      # TOS = values['name']
>                      # TOS1 = values['tuple']
>                      # TOS2 = values['dict']
>
> STORE_FAST name      # Terminal names are simply stored.
>
> UNPACK_SEQUENCE 2    # Push the two entries in values['tuple'] onto the stack.
>                      # TOS = values['tuple'][0]
>                      # TOS1 = values['tuple'][1]
>                      # TOS2 = values['dict']
> STORE_FAST x
> STORE_FAST y

This sounds reasonable in theory; is it going to have problems with
the non-orderedness of dictionaries? With sequence unpacking, it's
straight-forward - you evaluate things in a known order, you iterate
over the thing, you assign. In this case, you might end up with some
bizarre stack manipulation needed to make this work.
Inside that UNPACK_MAP opcode, arbitrary code could be executed (imagine if
the RHS is not a dict per se, but an object with a __getitem__ method), so
it'll need to be popping some things off and pushing others on, and
presumably would need to know what goes where.

Unless, of course, this doesn't "pop" and "push", but does some sort
of replacement. Suppose you load the keys first, and only once those
are loaded, you load the rvalue - so the rvalue is on the top of the
stack. "UNPACK_MAP 3" means this:

1) Pop the top item off the stack - it is the map we're working with.
2) Reach 3 items down in the stack. Take that item, subscript our map
   with it, and replace that stack entry with the result.
3) Reach 2 items down, subscript, replace. Repeat till we subscript
   with the top of the stack.

I've no idea how plausible that is, but it'd kinda work. It would also
mean you could evaluate the keys in the order that they're shown in
the dict display *and* assign to them in that order, which the current
proposal doesn't do (it assigns in order, but evaluates in reverse
order).

Stupid, unworkable idea?

ChrisA

From steve at pearwood.info Wed Aug 12 18:38:16 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 13 Aug 2015 02:38:16 +1000
Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)
In-Reply-To:
References:
Message-ID: <20150812163816.GG5249@ando.pearwood.info>

On Wed, Aug 12, 2015 at 09:57:03AM -0400, Scott Sanderson wrote:
> Hi All,
>
> Occasionally I find myself wanting to unpack the values of a dictionary
> into local variables of a function. This most often occurs when
> marshalling values to/from some serialization format.

I think that anything that is only needed "occasionally" doesn't have a
strong claim to deserve syntax.

> For example:
>
> def do_stuff_from_json(json_dict):
>     actual_dict = json.loads(json_dict)
>     foo = actual_dict['foo']
>     bar = actual_dict['bar']
>     # Do stuff with foo and bar.
Seems reasonable and not too much of a burden to me. If I needed a lot
of keys, I'd do:

    # sequence unpacking version
    spam, eggs, cheese, foo, bar, baz = [actual_dict[key] for
        key in "spam eggs cheese foo bar baz".split()]

> In the same spirit as allowing argument unpacking into tuples or lists,
> what I'd really like to be able to write is something like:
>
> def do_stuff_from_json(json_dict):
>     # Assigns variables in the **values** of the lefthand side by doing lookups
>     # of the corresponding keys in the result of the righthand side expression.
>     {'foo': foo, 'bar': bar} = json.loads(json_dict)

I think the sequence unpacking version above reads much better than
this hypothetical dict unpacking version:

    {'foo': foo, 'bar': bar, 'baz': baz, 'spam': spam,
     'eggs': eggs, 'cheese': cheese} = json.loads(json_dict)

Both are roughly as verbose, both have a little duplication, but the
sequence unpacking version requires far fewer quotation marks and other
punctuation. I also think it's much more readable, and of course the big
advantage of it is that it works right now, you don't have to wait two
or three years to start using it in production.

If there is a downside to the sequence unpacking version, it is that it
requires a temporary variable actual_dict, but that's not a real
problem.

I don't think dict unpacking is needed when you have only two or three
variables, and I don't think your suggested syntax is readable when you
have many variables. So I would be -1 on this suggestion.

However, if you wanted to think outside the box, it's a pity that
locals() is not writable. If it were, we could do:

    locals().update(json.loads(json_dict))

although of course that might update too many local names. So, just
throwing it out there for discussion:

- Make locals() writable. If the compiler detects that locals() may be
  written to, that will have to disable the fast local variable access
  for that specific function.
More practical, and in the spirit of tuple unpacking:

    spam, eggs, cheese = **expression

being equivalent to:

    _tmp = expression
    spam = _tmp['spam']
    eggs = _tmp['eggs']
    cheese = _tmp['cheese']
    del _tmp

except that _tmp is never actually created/deleted.

This is easier to write and simpler to read, and doesn't allow nested
unpacking. (I consider that last point to be a positive feature, not a
lack.)

--
Steve

From steve at pearwood.info Wed Aug 12 18:48:28 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 13 Aug 2015 02:48:28 +1000
Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)
In-Reply-To:
References: <1439394078.2428143.354454697.20039686@webmail.messagingengine.com>
Message-ID: <20150812164828.GH5249@ando.pearwood.info>

On Wed, Aug 12, 2015 at 11:46:10AM -0400, Joseph Jevnik wrote:
> From a language design standpoint I think that having non-constant keys in
> the unpack map makes a lot of sense.

    mydict = {'spam': 1, 'eggs': 2}
    spam = 'eggs'
    eggs = 99
    {spam: spam} = mydict
    print(spam, eggs)

What gets printed? I can only guess that you want it to print

    eggs 1

rather than

    1 99

but I can't be sure. I am reasonably sure that whatever you pick, it
will surprise some people. It will also play havoc with CPython's local
variable optimization, since the compiler cannot tell what the name of
the local will be:

    def func():
        mydict = dict(foo=1, bar=2, baz=3)
        spam = random.choice(['foo', 'bar', 'baz'])
        {spam: spam} = mydict
        # which locals exist at this point?
--
Steve

From joejev at gmail.com Wed Aug 12 18:52:15 2015
From: joejev at gmail.com (Joseph Jevnik)
Date: Wed, 12 Aug 2015 12:52:15 -0400
Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)
In-Reply-To:
References:
Message-ID:

If you look carefully at the way the stack is set up, we are not iterating
over the map; instead we are executing a sequence of PyObject_GetItem calls
in the execution of the opcode and then pushing the results back onto the
stack. The order of the results is based on the order of keys that were on
the stack.

From joejev at gmail.com Wed Aug 12 18:57:53 2015
From: joejev at gmail.com (Joseph Jevnik)
Date: Wed, 12 Aug 2015 12:57:53 -0400
Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)
In-Reply-To:
References:
Message-ID:

Steven, in your example you would get `2 99`. Also, you can always tell what
the name of the local is. I think the bytecode example that Scott showed was a
pretty clear implementation example. Also, the idea of this is not that the
variable has the same name as the key, but that you can pull the values out of
a mapping and name them. This means that the idea of updating the locals with
the dict is not really the same idea. That also forces you to use all or none
of the keys instead of allowing you to take a subset.
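The `2 99` answer can be checked by desugaring the disputed assignment by
hand. This is a sketch of one reading only, the one in which the key
expression is evaluated as an ordinary expression and the value position
names the assignment target; the temporary _key is introduced here for
illustration.

```python
mydict = {'spam': 1, 'eggs': 2}
spam = 'eggs'
eggs = 99

# Hand desugaring of the hypothetical `{spam: spam} = mydict`:
_key = spam            # the key expression evaluates to 'eggs'
spam = mydict[_key]    # the target on the right of the colon is the name `spam`
del _key

result = (spam, eggs)
```

Under this reading the set of local names is fixed at compile time (only
`spam` is assigned), and only the lookup key varies at run time.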
From scoutoss at gmail.com Wed Aug 12 20:01:24 2015
From: scoutoss at gmail.com (Scott Sanderson)
Date: Wed, 12 Aug 2015 11:01:24 -0700 (PDT)
Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)
In-Reply-To:
References:
Message-ID: <518dba3d-a6b9-4602-9af2-91d9eb5dea98@googlegroups.com>

> This sounds reasonable in theory; is it going to have problems with
> the non-orderedness of dictionaries?

You can make this deterministic by always iterating over the LHS keys in
declaration order. Expressed another way, dictionary **literals** can be
ordered, even if dictionaries themselves are not ordered at runtime. We only
need to have a well-defined order when generating opcodes at compile time. I
think you were getting at this same idea with your proposal as well.

> Unless, of course, this doesn't "pop" and "push", but does some sort
> of replacement.
> Suppose you load the keys first, and only once those
> are loaded, you load the rvalue - so the rvalue is on the top of the
> stack. "UNPACK_MAP 3" means this:

I think you could make this work with either stack ordering; the compiler
would be generating both the UNPACK_* calls and the LOAD_* calls all at once,
so it could decide to order them however the interpreter found it most
convenient to work with. I think there are already opcodes that operate on
the top N elements of the stack, so whether we're actually doing true pushes
and pops is just an implementation detail.

I do agree that the accesses should happen in the order that the keys appear
in the LHS, and I'd expect nested structures to be traversed depth-first,
which would matter if the same leaf name appeared in multiple places. This
would be analogous to the fact that

    a, a = (1, 2)

results in the value of a being 2.

-Scott

From scoutoss at gmail.com Wed Aug 12 20:44:05 2015
From: scoutoss at gmail.com (Scott Sanderson)
Date: Wed, 12 Aug 2015 11:44:05 -0700 (PDT)
Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)
In-Reply-To: <20150812163816.GG5249@ando.pearwood.info>
References: <20150812163816.GG5249@ando.pearwood.info>
Message-ID:

> I think that anything that is only needed "occasionally" doesn't have a
> strong claim to deserve syntax.

I suppose I might have worded this more strongly.
I meant "occasionally" to be interpreted as "often enough that I've been
irked by not having a cleaner way to express this construct". I do appreciate
the fact that syntax extensions have a real cost, and that they should be
reserved for cases where the additional clarity and/or economy of expression
outweighs the cost of implementation, maintenance, and (perhaps most
importantly) teaching the language to others.

> I think the sequence unpacking version above reads much better than
> this hypothetical dict unpacking version:
> {'foo': foo, 'bar': bar, 'baz': baz, 'spam': spam,
> 'eggs': eggs, 'cheese': cheese} = json.loads(json_dict)

As with many things in Python, I think that how you format this expression
makes a big difference. I'd write it like this:

    {
        'foo': foo,
        'bar': bar,
        'baz': baz,
        'spam': spam,
        'eggs': eggs,
        'cheese': cheese,
    } = json.loads(json_dict)

I prefer this to your example of unpacking from a list comprehension because
I think it does a better job of expressing to a reader the expected structure
of the input data. It's also much easier to modify this to extract nested
values, ala:

    {
        'foo': foo,
        'bar': bar,
        'baz': (baz_x, baz_y),
        'spam': spam,
        'eggs': eggs,
        'cheese': cheese,
    } = json.loads(json_dict)

> However, if you wanted to think outside the box, it's a pity that locals()
> is not writable. If it were, we could do:
> locals().update(json.loads(json_dict))

locals() is already writable in certain contexts, most notably in class
bodies. This works fine, for example:

    In [1]: class Foo(object):
       ...:     locals().update({'foo': lambda self: 3})
       ...:

    In [2]: Foo().foo()
    Out[2]: 3

locals() is not writable, as you point out, in function calls. However, I'm
not sure that having a mutable locals is a good solution to this problem. As
mentioned in the original post, I most often want to do this in contexts
where I'm unpacking serialized data, in which case it's probably not a great
idea to have that data trample your namespace with no restrictions.
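The difference between the two contexts can be checked directly. This is a
sketch; the function-scope result reflects CPython's behaviour (other
implementations may differ), and the names unpack_via_locals and Holder are
made up for the example.

```python
def unpack_via_locals(data):
    foo = 'original'
    # In CPython this updates a dict that is separate from the function's
    # fast local variables, so the local `foo` is left unchanged.
    locals().update(data)
    return foo


class Holder(object):
    # In a class body, locals() is the namespace being built for the class,
    # so the update takes effect and `foo` becomes a method.
    locals().update({'foo': lambda self: 3})


in_function = unpack_via_locals({'foo': 'clobbered'})
in_class = Holder().foo()
```

So the function-scope update is silently ignored rather than raising an
error, which makes it easy to miss.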
> More practical, and in the spirit of tuple unpacking:
> spam, eggs, cheese = **expression

I like how concise this syntax is. I'd be sad that it doesn't allow unpacking
of nested expressions, though I think we disagree on whether that's actually
an issue. A more substantial objection might be that this could only work on
mapping objects with strings for keys.

- Scott

From srkunze at mail.de Wed Aug 12 21:04:42 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Wed, 12 Aug 2015 21:04:42 +0200
Subject: [Python-ideas] fork
In-Reply-To: <4A86012D-89FB-4A2E-A0D4-48D9FF09FE33@yahoo.com>
References: <991E71A1-FDB3-4BE5-B607-B74850903DCD@yahoo.com> <20150811135429.48F0C8016E@smtp04.mail.de> <4A86012D-89FB-4A2E-A0D4-48D9FF09FE33@yahoo.com>
Message-ID: <55CB98CA.2080306@mail.de>

On 12.08.2015 05:06, Andrew Barnert wrote:
> But I'm not sure what good that would do anyway. If you unwrap futures
> every time they're returned, they're not doing anything useful as
> futures in the first place; you might as well just return the values
> directly.

I think I found a better solution. The boundaries should be try: blocks
rather than functions. Why? Because they mark the boundaries for exception
handling and this is what the problem is about. I started another thread here:
https://mail.python.org/pipermail/python-list/2015-August/695313.html

If an exception is raised within a try: block that is not supposed to be
handled there, weird things might happen (wrong handling, superfluous
handling, no handling, etc.).

Confining the evaluation of result proxies within the try: blocks they are
created in would basically retain all sequential properties. So, plugging in
'fork' and removing it would basically change nothing (at least if you don't
try anything really insane which at least is disallowed by our coding
standards.
;) )

An example ('function' here means the stack frame of a function):

    def b():
        return 'a string'

    try:
        function:
            a = fork(b)
            a += 3
        function:
            b = 5
            b *= 4 * a
    except TypeError:
        print('damn, I mixed strings and numbers')

The given try: block needs to make sure it eventually collects all exceptions that would have been raised in the sequential case.

Conclusion: the approach is a compromise between:

1) deferred evaluation (later is better)
2) proper exception handling (early is better)

Best,
Sven

From abarnert at yahoo.com Wed Aug 12 23:20:59 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 12 Aug 2015 14:20:59 -0700
Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)
In-Reply-To:
References: <20150812163816.GG5249@ando.pearwood.info>
Message-ID:

On Aug 12, 2015, at 11:44, Scott Sanderson wrote:

>> I think that anything that is only needed "occasionally" doesn't have a
>> strong claim to deserve syntax.
>
> I suppose I might have worded this more strongly. I meant "occasionally" to be interpreted as "often enough that I've been irked by not having a cleaner way to express this construct"

Personally, I've been irked by not having a way to express generalized pattern matching more often than I've been irked by the fact that the limited pattern matching doesn't include dicts (to the point that my reasonably thorough but not seriously proposed idea for pattern matching didn't even include the obvious way to fit dicts into the system, and I didn't notice until someone else pointed it out). I don't know if that's because we're writing different code, or if I spend more time coming back to Python code with my brain still halfway on another language, or just what we find natural...
Personally, I'd still rather have full pattern matching (including a protocol roughly akin to copy/pickle to let arbitrary types participate in matching), but I can see why others might find the special case more useful than the general one. As for nested dicts assignment (or nested dicts and tuples), my first reaction was that you're building something very complicated but still very limited if it can handle fully general nesting of mappings and sequences but can't handle any other kind of containment. But then I realized that the exact same thing is true of JSON, and that's turned out to be pretty useful. When I use YAML, I make lots of use of things like having datetimes as a native type, but use other kinds of containers (I think I've used a multidict extension once...). So maybe my gut reaction here is wrong. > locals() is not writable, as you point out, in function calls. However, I'm not sure that having a mutable locals is a good solution to this problem. As mentioned in the original post, I most often want to do this in contexts where I'm unpacking serialized data, in which case it's probably not a great idea to have that data trample your namespace with no restrictions. What about exposing LocalsToFast to the language? Then, in the rare cases where you do want to mutate locals, you make it explicit--and it's also much more obvious that you have to name the variables somewhere and that you're potentially pessimizing the code. >> More practical, and in the spirit of tuple unpacking: >> spam, eggs, cheese = **expression > > I like how concise this syntax is. I'd be sad that it doesn't allow unpacking of nested expressions, though I think we disagree on whether that's actually an issue. > A more substantial objection might be that this could only work on mapping objects with strings for keys. The same substantial objection applies to the existing uses of **, both in passing a dict as keyword arguments and in capturing unbound keyword arguments. 
So, for example, you can't really pass a dict through a function call by passing and accepting **kw, and c = dict(a, **b) doesn't really merge two dicts--and yet it's still useful for that purpose in many cases. -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Aug 13 04:41:41 2015 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 13 Aug 2015 12:41:41 +1000 Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues) In-Reply-To: References: <20150812163816.GG5249@ando.pearwood.info> Message-ID: <20150813024140.GK5249@ando.pearwood.info> On Wed, Aug 12, 2015 at 11:44:05AM -0700, Scott Sanderson wrote: > > I think the sequence unpacking version above reads much better than > > this hypothetical dict unpacking version: > > {'foo': foo, 'bar': bar, 'baz': baz, 'spam': spam, > > 'eggs': eggs, 'cheese': cheese} = json.loads(json_dict) > > > As with many things in Python, I think that how you format this expression > makes a big difference. I'd write it like this: > > { > 'foo': foo, > 'bar': bar, > 'baz': baz, > 'spam': spam, > 'eggs': eggs, > 'cheese': cheese, > } = json.loads(json_dict) That's still awfully verbose, and not much of a saving from: foo = d['foo'] bar = d['bar'] baz = d['baz'] etc. You save a little bit of typing, but not that much. > I prefer this to your example of unpacking from a list comprehension > because I think it does a better job of expressing to a reader the expected > structure of the input data. I don't think it does. I think the above would be incomprehensible to somebody who hasn't learned the details of this. It looks like you are creating a dict, but not assigning the dict to anything. And where do the unquoted values foo, bar, etc. come from? 
They look like they should come from already existing local variables: {'foo': foo} as an expression (rather than an assignment target) requires an existing foo variable (otherwise you get a NameError). So the behaviour has to be learned, it isn't something that the reader can extrapolate from other assignment syntax. It isn't obvious what this does: {foo: bar} = some_dict because there's no one-to-one correspondence between assignment target and assignment name. With sequence unpacking, the targets are obvious: foo, bar, baz = ... clearly has assignment targets foo, bar and baz. What else could they be? It's easy to extrapolate it from single assignment foo = ... But with your syntax, you have keys and values, and it isn't clear what gets used for what. The dict display form doesn't look like any other assignment target, you have to learn it as a special case. A reader who hasn't learned the rules could be forgiven for guessing any of the following rules: (1) create a variable foo from existing variable bar (2) create a variable foo from some_dict['bar'] (3) create a variable with the name given by the value of foo, from some_dict['bar'] (4) create a variable bar from some_dict['foo'] (5) create a variable with the name given by the value of bar, from some_dict['foo'] and others. You could make that a bit more clear by requiring the keys to be quoted, so {foo: bar} = ... would be illegal, and you have to write {'foo': 'bar'}, but that's annoying. Or we could go the other way and not quote anything: {foo: bar} = d could create variable foo from d['bar']. That's not bad looking, and avoids all the quote marks, but I don't think people would guess that's the behaviour. It still doesn't look like an assignment target. And the common case is still verbose: {foo: foo} = ... What if we have expressions in there? {foo.upper() + 's': bar} = some_dict {foo: bar or baz} = some_dict I would hope both of those are syntax errors! But maybe somebody will want them. 
At least, some people will expect them, because that sort of thing works in dict displays. You even hint at arbitrary values below, with a tuple (baz_x, baz_y). > It's also much easier to modify this to > extract nested values, ala: > > { > 'foo': foo, > 'bar': bar, > 'baz': (baz_x, baz_y), > 'spam': spam, > 'eggs': eggs, > 'cheese': cheese, > } = json.loads(json_dict) So baz is a tuple of d['baz_x'], d['baz_y']? Does this mean you want to allow arbitrary expressions for the values? {'foo': func(foo or bar.upper() + "s") + baz} = d If so, what are the scoping rules? Which of func, foo, bar and baz are looked up from the right-hand side dict, and which are taken from the current scope? I think allowing arbitrary expressions cannot work in any reasonable manner, but special casing tuples (baz_x, baz_y) is too much of a special case. > More practical, and in the spirit of tuple unpacking: > > spam, eggs, cheese = **expression > > > I like how concise this syntax is. I'd be sad that it doesn't allow > unpacking of nested expressions, though I think we disagree on whether > that's actually an issue. > A more substantial objection might be that this could only work on mapping > objects with strings for keys. Does that mean that you expect your syntax to support non-identifier key lookups? {'foo': 123, 'bar': 'x y', 'baz': None} = d will look for keys 123 (or should that be '123'?), 'x y' and None (or possibly 'None')? If so, I think you've over-generalised from a fairly straightforward use-case: unpack key:values in a mapping to variables with the same name as the keys and YAGNI applies. 
For cases where the keys are not the same as the variables, or you want to use non-identifier keys, just use the good-old fashioned form: variable = d['some non-identifier'] -- Steve From rosuav at gmail.com Thu Aug 13 05:26:50 2015 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 13 Aug 2015 13:26:50 +1000 Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues) In-Reply-To: <20150813024140.GK5249@ando.pearwood.info> References: <20150812163816.GG5249@ando.pearwood.info> <20150813024140.GK5249@ando.pearwood.info> Message-ID: On Thu, Aug 13, 2015 at 12:41 PM, Steven D'Aprano wrote: > What if we have expressions in there? > > {foo.upper() + 's': bar} = some_dict > {foo: bar or baz} = some_dict > > I would hope both of those are syntax errors! But maybe somebody will > want them. At least, some people will expect them, because that sort of > thing works in dict displays. You even hint at arbitrary values below, > with a tuple (baz_x, baz_y). > > >> It's also much easier to modify this to >> extract nested values, ala: >> >> { >> 'foo': foo, >> 'bar': bar, >> 'baz': (baz_x, baz_y), >> 'spam': spam, >> 'eggs': eggs, >> 'cheese': cheese, >> } = json.loads(json_dict) > > So baz is a tuple of d['baz_x'], d['baz_y']? > > Does this mean you want to allow arbitrary expressions for the values? > > {'foo': func(foo or bar.upper() + "s") + baz} = d > > If so, what are the scoping rules? Which of func, foo, bar and baz are > looked up from the right-hand side dict, and which are taken from the > current scope? > > I think allowing arbitrary expressions cannot work in any reasonable > manner, but special casing tuples (baz_x, baz_y) is too much of a > special case. baz would be a multiple assignment target. 
The way I understand this, the keys are ordinary expressions, and the 'values' are assignment targets, and can be nested just as sequence unpacking can:

>>> x=[1,2,[3,4],5]
>>> a,b,(c,d),e = x
>>> a,b,c,d,e
(1, 2, 3, 4, 5)

So 'baz': (baz_x, baz_y) would take d['baz'] and expect it to be a sequence of length 2.

Arbitrary expressions in the values would be illogical, just as they are anywhere else:

>>> foo or bar = 1
  File "<stdin>", line 1
SyntaxError: can't assign to operator

Arbitrary expressions in the keys would make perfect sense, although I would hope they'd be rare. Whatever it evaluates to, that would be retrieved from the source object, and the result assigned to the corresponding target.

The idea's internally consistent. I'm not convinced it's particularly useful, but it does hold water.

ChrisA

From guido at python.org Thu Aug 13 06:37:01 2015
From: guido at python.org (Guido van Rossum)
Date: Thu, 13 Aug 2015 06:37:01 +0200
Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings)
In-Reply-To: <20150812120627.0efc2b79@anarchist.wooz.org>
References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org>
Message-ID:

On Wed, Aug 12, 2015 at 6:06 PM, Barry Warsaw wrote:

> [...]
> On Aug 12, 2015, at 08:50 AM, Guido van Rossum wrote:
> >It also has the same problems as locals(), sys._getframe(), etc., which is
> >that their presence makes certain optimizations harder (in IronPython IIRC
> >the creation of frame objects is normally skipped to speed up function
> >calls, but the optimizer must detect the presence of those functions in
> >order to disable that optimization). That doesn't mean I'm opposed to it (I
> >don't have a problem with locals()), but it does mean that I think their
> >use should probably not be encouraged.
> > I'm much less concerned about the performance impact loss of optimization > provides because I think i18n is already generally slower... and that's > okay! > I mean _() has to at least do a dictionary look (assuming the catalog is > warmed in memory) and then a piece-wise interpolation into the resulting > translated string. So you're already paying runtime penalty to do i18n. > Fair enough. (Though IMO the real cost of i18n is that it introduces a feeling of programming in molasses.) > [...] > i18n is one of those places where DRY really is a limiting factor. You > just > can't force coders to pass in all the arguments to their translated > strings, > say into the _() function. The code looked horrible, it's way too much > typing, and people (well, *I* ;) just won't do it. After implementing the > sys._getframe() hack, it made i18n just so much more pleasant and easy to > write, you almost couldn't not do it. > Agreed. At Dropbox we use %(name)s in our i18n strings and the code always ends up looking ugly. > One of the things that intrigues me about this whole idea of syntactic and > compiler support is the ability to narrow down the set of substitution > values > available for interpolation, by parsing the source string and passing them > into the interpolation call. > > Currently, _() is forced to expose all of locals and global to > interpolation, > although I guess it could also parse out the $-placeholders in the source > string too[1]. Not doing this does open an information leak vector via > maliciously translated strings. If the source string were parsed and > *only* > those names were available for interpolation, a maliciously translated > string > couldn't be used to expose additional information because the keys in the > interpolation dictionary would be limited. > Yes, this is a real advantage of pursuing the current set of ideas further. 
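To make the narrowing idea a bit more concrete, here is a minimal sketch, assuming string.Template-style $-placeholders like the ones Barry mentions. The helper name restricted_substitute and the sample namespace are made up for illustration; this is not how any existing _() implementation works. It parses the *source* string, keeps only the names found there, and interpolates the translated string from that restricted dictionary:

```python
from string import Template

def restricted_substitute(source, translated, namespace):
    """Interpolate `translated` using only names that appear in `source`.

    Hypothetical helper: `namespace` plays the role of locals/globals.
    """
    names = set()
    # Template.pattern is the compiled regex string.Template itself uses;
    # its 'named' and 'braced' groups hold $name and ${name} identifiers.
    for match in Template.pattern.finditer(source):
        name = match.group('named') or match.group('braced')
        if name:
            names.add(name)
    allowed = {name: namespace[name] for name in names if name in namespace}
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(translated).safe_substitute(allowed)

namespace = {'user': 'Eric', 'password': 'hunter2'}
source = 'Hello $user'
print(restricted_substitute(source, 'Bonjour $user', namespace))
print(restricted_substitute(source, 'Bonjour $user $password', namespace))
```

In this sketch a translation that sneaks in $password gets nothing back for it, because that name never appeared in the source string; the placeholder is simply left as-is in the output.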
> This mythical scope() could take arguments which would name the variables > in > the enclosing scopes it should export. It would still be a PITA if used > explicitly, but could work nicely if i-strings essentially boiled down to: > > placeholders = source_string.extract_placeholders() > substitutions = scope(*placeholders) > translated_string = i18n.lookup(source_string) > return translated_string.safe_substitute(substitutions) > > That would actually be quite useful. > Agreed. But whereas you are quite happy having only simple variable names in i18n templates, the feature required for the non-i18n use case really needs arbitrary expressions. If we marry the two, your i18n code will just have to yell at the programmer if they use something too complex for the translators as a substitution. So possibly PEP 501 can be rescued. But I think we need separate prefixes for the PEP 498 and PEP 501 use cases; perhaps f'{...}' and _'{...}'. (But it would not be up to the compiler to limit the substitution syntax in _'{...}') -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From vito.detullio at gmail.com Thu Aug 13 08:36:27 2015 From: vito.detullio at gmail.com (Vito De Tullio) Date: Thu, 13 Aug 2015 08:36:27 +0200 Subject: [Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues) References: <1439394078.2428143.354454697.20039686@webmail.messagingengine.com> <20150812164828.GH5249@ando.pearwood.info> Message-ID: Steven D'Aprano wrote: > On Wed, Aug 12, 2015 at 11:46:10AM -0400, Joseph Jevnik wrote: >> From a language design standpoint I think that having non-constant keys >> in the unpack map makes a lot of sense. > > mydict = {'spam': 1, 'eggs': 2} > spam = 'eggs' > eggs = 99 > {spam: spam} = mydict > print(spam, eggs) > > > What gets printed? I can only guess that you want it to print > > eggs 1 > > rather than > > 1 99 why? 
Replacing bound variables with their literal values, we have

    {spam: spam}   is equal to   {'eggs': spam}
    mydict         is equal to   {'spam': 1, 'eggs': 2}

so the original assignment

    {spam: spam} = mydict

is equivalent to writing

    {'eggs': spam} = {'spam': 1, 'eggs': 2}

This form of desugaring roughly wants to be read as "write into the variable 'spam' the value looked up in the {'spam': 1, 'eggs': 2} dict with the key 'eggs'", or

    spam = {'spam': 1, 'eggs': 2}['eggs']    # i.e. spam = 2

The 'variable' eggs is not touched at all in this assignment, so print(spam, eggs) prints `2 99`.

> but I can't be sure. I am reasonably sure that whatever you pick, it
> will surprise some people. It will also play havoc with CPython's local
> variable optimization, since the compiler cannot tell what the name of
> the local will be:
>
> def func():
>     mydict = dict(foo=1, bar=2, baz=3)
>     spam = random.choice(['foo', 'bar', 'baz'])
>     {spam: spam} = mydict
>     # which locals exist at this point?

The 'name' of the local is spam; the value is one of 1, 2 or 3.

As far as I can see,

    {'x1': y1, 'x2': y2, 'x3': y3} = z

can be translated to

    y1 = z['x1']
    y2 = z['x2']
    y3 = z['x3']

--
By ZeD

From srkunze at mail.de Thu Aug 13 08:48:44 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 13 Aug 2015 08:48:44 +0200
Subject: [Python-ideas] Learning from the shell in supporting asyncio background calls
In-Reply-To:
References: <55CA68A0.7080501@mail.de>
Message-ID: <55CC3DCC.20101@mail.de>

On 12.08.2015 00:37, Jonathan Slenders wrote:
> Honestly, I don't understand the issue. (Or maybe I'm too tired right
> now.)
> But I think asyncio is actually very well designed.

Nobody says asyncio is not well designed, if that is what you think others who have issues with asyncio are thinking (did that make sense?).

> We just have to keep in mind that code is either synchronous or
> asynchronous. Code that consumes too much CPU or does blocking calls
> does definitely not belong in an event driven system.
That is exactly what people complain about. They don't like this thinking. They don't like this "all in or nothing" attitude.

You may ask why? Because people want to try stuff out. But when, in order to do so, they need to convert 10 million lines of code just to *see some results*, it looks insane to them (looking at you, too, Python 3).

The point is not to have some toy projects showcasing the abilities of asyncio, but to try it out among the lines of formerly 100% synchronous code.

I am sorry but that is the world we live in, so we need to compromise; otherwise people will disagree and not follow. This is not about doing a 100% right and perfect design but about taking people with you on the journey.

Best,
Sven

From srkunze at mail.de Thu Aug 13 08:53:29 2015
From: srkunze at mail.de (Sven R. Kunze)
Date: Thu, 13 Aug 2015 08:53:29 +0200
Subject: [Python-ideas] Making concurrent.futures.Futures awaitable
In-Reply-To:
References: <55C4E22D.101@nextday.fi> <55C5FA72.7030303@nextday.fi>
Message-ID: <55CC3EE9.5030101@mail.de>

On 09.08.2015 02:22, Nick Coghlan wrote:
> There was a thread on the idea recently, but I don't have a link
> handy. Indicating CPU vs IO bound directly wouldn't work (that's
> context dependent), but allowing callables to explicitly indicate
> "recommended", "supported", "incompatible" for process pools could be
> interesting.

Thread start (more or less): https://mail.python.org/pipermail/python-ideas/2015-August/034917.html

Current post of that thread: https://mail.python.org/pipermail/python-ideas/2015-August/035211.html

And I would like to hear your insights on the try: block idea as well. :)

From eric at trueblade.com Thu Aug 13 13:58:17 2015
From: eric at trueblade.com (Eric V.
Smith) Date: Thu, 13 Aug 2015 07:58:17 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> Message-ID: <55CC8659.2080308@trueblade.com> On 08/13/2015 12:37 AM, Guido van Rossum wrote: > On Wed, Aug 12, 2015 at 6:06 PM, Barry Warsaw > wrote: > > placeholders = source_string.extract_placeholders() > substitutions = scope(*placeholders) > translated_string = i18n.lookup(source_string) > return translated_string.safe_substitute(substitutions) > > That would actually be quite useful. > > > Agreed. But whereas you are quite happy having only simple variable > names in i18n templates, the feature required for the non-i18n use case > really needs arbitrary expressions. If we marry the two, your i18n code > will just have to yell at the programmer if they use something too > complex for the translators as a substitution. So possibly PEP 501 can > be rescued. But I think we need separate prefixes for the PEP 498 and > PEP 501 use cases; perhaps f'{...}' and _'{...}'. (But it would not be > up to the compiler to limit the substitution syntax in _'{...}') For the sake of the following argument, let's agree to disagree on: - arbitrary expressions: we'll say yes - string prefix character: we'll say 'f' - how to identify expressions in a string: we'll say {...} I promise we can bikeshed about these later. I'm just using the PEP 498 version because I'm more familiar with it. 
And let's say that PEP 498 will take this: name = 'Eric' dog_name = 'Fluffy' f"My name is {name}, my dog's name is {dog_name}" And convert it to this (inspired by Victor): "My name is {0}, my dog's name is {1}".format('Eric', 'Fluffy') Resulting in: "My name is Eric, my dog's name is Fluffy" It seems to me that all you need for i18n is to instead make it produce: __i18n__("My name is {0}, my dog's name is {1}").format('Eric', 'Fluffy') The __i18n__ function would do whatever lookup is needed to produce the translated string. So, in some English dialect where pet names had to come first, it could return: 'The owner of the dog {1} is named {0}' So the result would be: 'The owner of the dog Fluffy is named Eric' I promise we can bikeshed about the name __i18n__. So the translator has no say in how the expressions are evaluated. This removes any concern about information leakage. If the source code said: f"My name is {name}, my dog's name is {dog_name.upper()}" then the string being passed to __i18n__ would remain unchanged. If by convention you wanted to not use arbitrary expressions and just use identifiers, then just make it a coding standard thing. It doesn't affect the implementation one way or the other. The default implementation for my proposed __i18n__ function (probably a builtin) would be just to return its string argument. Then you get the PEP 498 behavior. But in your module, you could say: __i18n__ = gettext.gettext and now you'd be using that machinery. The one downside of this is that the strings that the translator is translating from do not appear in the source code. The translator would have to know that the string being translated is: "My name is {0}, my dog's name is {1}" But since this only operates on f-string literals, you could mechanically extract them from the source. 
For example, given the example f-string above, my current PEP 498 implementation returns this: 'Module(body=[Expr(value=FormattedStr(value=Call(func=Attribute(value=Str(s="My name is {0}, my dog\'s name is {1}"), attr=\'format\', ctx=Load()), args=[Name(id=\'name\', ctx=Load()), Name(id=\'dog_name\', ctx=Load())], keywords=[])))])' So the translatable string can easily be extracted from the ast. I could modify the FormattedStr node to make that string easier to find. Eric. From python at mrabarnett.plus.com Thu Aug 13 14:23:10 2015 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 13 Aug 2015 13:23:10 +0100 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: <55CC8659.2080308@trueblade.com> References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> Message-ID: <55CC8C2E.5020102@mrabarnett.plus.com> On 2015-08-13 12:58, Eric V. Smith wrote: > On 08/13/2015 12:37 AM, Guido van Rossum wrote: >> On Wed, Aug 12, 2015 at 6:06 PM, Barry Warsaw > > wrote: > >> >> placeholders = source_string.extract_placeholders() >> substitutions = scope(*placeholders) >> translated_string = i18n.lookup(source_string) >> return translated_string.safe_substitute(substitutions) >> >> That would actually be quite useful. >> >> >> Agreed. But whereas you are quite happy having only simple variable >> names in i18n templates, the feature required for the non-i18n use case >> really needs arbitrary expressions. If we marry the two, your i18n code >> will just have to yell at the programmer if they use something too >> complex for the translators as a substitution. So possibly PEP 501 can >> be rescued. But I think we need separate prefixes for the PEP 498 and >> PEP 501 use cases; perhaps f'{...}' and _'{...}'. 
(But it would not be >> up to the compiler to limit the substitution syntax in _'{...}') > > For the sake of the following argument, let's agree to disagree on: > - arbitrary expressions: we'll say yes > - string prefix character: we'll say 'f' > - how to identify expressions in a string: we'll say {...} > > I promise we can bikeshed about these later. I'm just using the PEP 498 > version because I'm more familiar with it. > > And let's say that PEP 498 will take this: > > name = 'Eric' > dog_name = 'Fluffy' > f"My name is {name}, my dog's name is {dog_name}" > > And convert it to this (inspired by Victor): > > "My name is {0}, my dog's name is {1}".format('Eric', 'Fluffy') > Resulting in: > "My name is Eric, my dog's name is Fluffy" > > It seems to me that all you need for i18n is to instead make it produce: > > __i18n__("My name is {0}, my dog's name is {1}").format('Eric', 'Fluffy') > > The __i18n__ function would do whatever lookup is needed to produce the > translated string. So, in some English dialect where pet names had to > come first, it could return: > 'The owner of the dog {1} is named {0}' > > So the result would be: > 'The owner of the dog Fluffy is named Eric' > I think that looking up only the translation string and then inserting the values isn't good enough. For example, what if the string was "Found {0} matches"? If the number of matches was 1, you'd get "Found 1 matches". Ideally, you'd want to pass the values too, so that the lookup could pick the correct translation. [snip] From eric at trueblade.com Thu Aug 13 15:40:43 2015 From: eric at trueblade.com (Eric V. 
Smith) Date: Thu, 13 Aug 2015 09:40:43 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: <55CC8C2E.5020102@mrabarnett.plus.com> References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> <55CC8C2E.5020102@mrabarnett.plus.com> Message-ID: <55CC9E5B.5020808@trueblade.com> On 08/13/2015 08:23 AM, MRAB wrote: > On 2015-08-13 12:58, Eric V. Smith wrote: >> On 08/13/2015 12:37 AM, Guido van Rossum wrote: >>> On Wed, Aug 12, 2015 at 6:06 PM, Barry Warsaw >> > wrote: >> >>> >>> placeholders = source_string.extract_placeholders() >>> substitutions = scope(*placeholders) >>> translated_string = i18n.lookup(source_string) >>> return translated_string.safe_substitute(substitutions) >>> >>> That would actually be quite useful. >>> >>> >>> Agreed. But whereas you are quite happy having only simple variable >>> names in i18n templates, the feature required for the non-i18n use case >>> really needs arbitrary expressions. If we marry the two, your i18n code >>> will just have to yell at the programmer if they use something too >>> complex for the translators as a substitution. So possibly PEP 501 can >>> be rescued. But I think we need separate prefixes for the PEP 498 and >>> PEP 501 use cases; perhaps f'{...}' and _'{...}'. (But it would not be >>> up to the compiler to limit the substitution syntax in _'{...}') >> >> For the sake of the following argument, let's agree to disagree on: >> - arbitrary expressions: we'll say yes >> - string prefix character: we'll say 'f' >> - how to identify expressions in a string: we'll say {...} >> >> I promise we can bikeshed about these later. I'm just using the PEP 498 >> version because I'm more familiar with it. 
>> >> And let's say that PEP 498 will take this: >> >> name = 'Eric' >> dog_name = 'Fluffy' >> f"My name is {name}, my dog's name is {dog_name}" >> >> And convert it to this (inspired by Victor): >> >> "My name is {0}, my dog's name is {1}".format('Eric', 'Fluffy') >> Resulting in: >> "My name is Eric, my dog's name is Fluffy" >> >> It seems to me that all you need for i18n is to instead make it produce: >> >> __i18n__("My name is {0}, my dog's name is {1}").format('Eric', 'Fluffy') >> >> The __i18n__ function would do whatever lookup is needed to produce the >> translated string. So, in some English dialect where pet names had to >> come first, it could return: >> 'The owner of the dog {1} is named {0}' >> >> So the result would be: >> 'The owner of the dog Fluffy is named Eric' >> > I think that looking up only the translation string and then inserting > the values isn't good enough. > > For example, what if the string was "Found {0} matches"? > > If the number of matches was 1, you'd get "Found 1 matches". > > Ideally, you'd want to pass the values too, so that the lookup could > pick the correct translation. That's certainly doable. You could pass in the values as a tuple, and either have __i18n__ call .format itself, or still just return the translated string and then call .format on the result. def __i18n__(message, values): return message But I'm not sure how much of this to build in to the f-string machinery. gettext.gettext doesn't solve this problem by itself, either. Eric. 
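For what it's worth, the "pass the values too" variant can be sketched roughly as follows. The name i18n_lookup, the explicit singular/plural pair, and the convention that the first value is the count are all assumptions made for illustration, not part of PEP 498 or of any proposed __i18n__ signature. gettext.NullTranslations stands in for a real catalog and simply applies the English n == 1 rule:

```python
import gettext

# Stand-in catalog (an assumption for this sketch): NullTranslations holds
# no messages, so ngettext() returns the singular for n == 1 and the
# plural otherwise. A real application would load a GNUTranslations catalog.
_catalog = gettext.NullTranslations()

def i18n_lookup(singular, plural, values):
    """Hypothetical helper: choose the template from the values, then format."""
    count = values[0]  # assumed convention: the first value is the count
    template = _catalog.ngettext(singular, plural, count)
    return template.format(*values)

print(i18n_lookup("Found {0} match", "Found {0} matches", (1,)))
print(i18n_lookup("Found {0} match", "Found {0} matches", (3,)))
```

The point of the sketch is only that the lookup sees the values before the template is chosen, which is what MRAB's "Found 1 matches" example needs; how much of that belongs in the f-string machinery is the open question.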
From tritium-list at sdamon.com Thu Aug 13 15:42:56 2015 From: tritium-list at sdamon.com (Alexander Walters) Date: Thu, 13 Aug 2015 09:42:56 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: <55CC8C2E.5020102@mrabarnett.plus.com> References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> <55CC8C2E.5020102@mrabarnett.plus.com> Message-ID: <55CC9EE0.8020700@sdamon.com> On 8/13/2015 08:23, MRAB wrote: > I think that looking up only the translation string and then inserting > the values isn't good enough. > > For example, what if the string was "Found {0} matches"? > > If the number of matches was 1, you'd get "Found 1 matches". > > Ideally, you'd want to pass the values too, so that the lookup could > pick the correct translation. > > [snip] > Why would we solve this on new-formatting, but not in old-formatting when doing i18n? You have identified an existing problem (pluralization), the solutions to which would also work to solve the problem under consideration. From barry at python.org Thu Aug 13 16:00:42 2015 From: barry at python.org (Barry Warsaw) Date: Thu, 13 Aug 2015 10:00:42 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> Message-ID: <20150813100042.3f026ce5@anarchist.wooz.org> On Aug 13, 2015, at 07:58 AM, Eric V. Smith wrote: >The one downside of this is that the strings that the translator is >translating from do not appear in the source code. 
The translator would >have to know that the string being translated is: >"My name is {0}, my dog's name is {1}" I think unfortunately, this is a non-starter for the i18n use case. The message catalog must include the source string as it appears in the code because otherwise, translators will not be able to reliably map the intended meaning to their native language. They'll have to keep a mental map between source string placeholders and numeric placeholders, and I am fairly confident that this will be a source of broken translations. Is there a problem with keeping the named placeholders throughout the entire stack? Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From tjreedy at udel.edu Thu Aug 13 16:04:20 2015 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 13 Aug 2015 10:04:20 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> Message-ID: On 8/13/2015 12:37 AM, Guido van Rossum wrote: > Fair enough. (Though IMO the real cost of i18n is that it introduces a > feeling of programming in molasses.) For some structured situations, such as gui menus, the molasses is not needed. _(...) does two things: mark a string for the translator collector, and actually do the translation. Idle defines 'menudefs' structures, which are lists of menu tuples. The first item of each tuple is the string to be displayed on the menu, the second is the binding for that item, either a pseudoevent or a list of menu tuples for a submenu. A function walks the structure to extract the names to pass to tk menu calls. 
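A rough sketch of the kind of nested menu structure and walker described above, with the translation call made in exactly one place. This is not IDLE's actual code: the menudefs contents, the pseudoevent names and the walk_menudefs function are made up for illustration, and the structure is simplified to (label, binding) tuples where the binding is either a pseudoevent string or a list of further tuples for a submenu.

```python
import gettext

# Hypothetical stand-in for a menudefs structure; labels and pseudoevent
# names are invented for this example.
menudefs = [
    ("File", [
        ("New File", "<<open-new-window>>"),
        ("Recent", [
            ("Clear List", "<<clear-recent>>"),
        ]),
    ]),
    ("Edit", [
        ("Undo", "<<undo>>"),
    ]),
]


def walk_menudefs(defs, translate=gettext.gettext, depth=0):
    """Yield (depth, translated label, binding) for every menu entry.

    binding is the pseudoevent string for a leaf item and None for an
    entry that opens a submenu. translate() is the single point where
    labels are looked up, so the structure itself stays unmarked.
    """
    for label, binding in defs:
        if isinstance(binding, list):
            yield depth, translate(label), None
            yield from walk_menudefs(binding, translate, depth + 1)
        else:
            yield depth, translate(label), binding


for depth, label, binding in walk_menudefs(menudefs):
    print("  " * depth + label, binding or "")
```

Passing a function that only records its argument as translate turns the same walker into a string collector, although the labels reaching it are then no longer literals at the call site, which is the extraction concern raised later in the thread.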
For internationalization, the gettext.gettext translation call could be added in one place, where the string is passed to tk, rather than 80 places in the structure definition. An altered version of the menudefs walker could be used to collect the menu strings for translation. If we want to encourage multi-language tkinter apps, i18n code should be added somewhere public in the tkinter package (and gettext module), rather than hidden away in idlelib. -- Terry Jan Reedy From barry at python.org Thu Aug 13 16:05:16 2015 From: barry at python.org (Barry Warsaw) Date: Thu, 13 Aug 2015 10:05:16 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> <55CC8C2E.5020102@mrabarnett.plus.com> <55CC9E5B.5020808@trueblade.com> Message-ID: <20150813100516.4a712342@anarchist.wooz.org> On Aug 13, 2015, at 09:40 AM, Eric V. Smith wrote: >But I'm not sure how much of this to build in to the f-string machinery. >gettext.gettext doesn't solve this problem by itself, either. Our gettext module does have some support for plural forms, but it's probably not great. https://docs.python.org/2/library/gettext.html#gettext.GNUTranslations.ngettext See also for reference: https://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html For any built-in machinery such as f-strings we'd want to at least make sure it's possible to support plural forms. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From barry at python.org Thu Aug 13 17:39:36 2015 From: barry at python.org (Barry Warsaw) Date: Thu, 13 Aug 2015 11:39:36 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> Message-ID: <20150813113936.2a9b8595@anarchist.wooz.org> On Aug 13, 2015, at 10:04 AM, Terry Reedy wrote: >For internationalization, the gettext.gettext translation call could be added >in one place, where the string is passed to tk, rather than 80 places in the >structure definition. An altered version of the menudefs walker could be >used to collect the menu strings for translation. That would require being able to translate non-literals. I'd need the same, and it would be okay if the translation call were spelled less conveniently, as long as it's possible to both extract and translate the source strings. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From steve.dower at python.org Thu Aug 13 19:06:47 2015 From: steve.dower at python.org (Steve Dower) Date: Thu, 13 Aug 2015 10:06:47 -0700 Subject: [Python-ideas] More "ensure*" packages Message-ID: <55CCCEA7.8000406@python.org> I'd like to propose expanding the list of 3rd-party packages we bundle and install by default. (Obviously this does not apply to platforms that repackage Python and can do whatever they want, but on Windows and Mac we are fully responsible for these.) 
Currently, we bundle pip (and some of its dependencies - let's avoid that particular discussion right now please, it's on python-dev) and install it by default in a way that lets users easily update to the latest version. Including pip in the standard library would lock users into a specific version for the lifetime of that Python version, which would be a bad thing. From my point-of-view, this has been very successful in Python 2.7, 3.4 and will also be successful in 3.5. For Python 3.6, I'd like to do a similar thing with: * requests * tkinter (including tcl/tk, IDLE, and other dependencies) Given the language summit discussion at PyCon this year, I think requests is easy to justify. (Quick summary for those who weren't there: we'd love to include requests in the stdlib, but it's too important and needs much more frequent updates.) Preinstalling a given version in a way that allows updates (and maybe attempting an update on installation) sounds great to me. tkinter is worth more discussion :) For the remainder of this email, I'll use "tkinter" as shorthand to refer to Tcl, Tk, Tix, _tkinter, tkinter, idlelib/IDLE, PyDoc, turtledemo and any other dependencies or dependents that I missed. In my experience, few Python scripts depend on or assume tkinter is available. tkinter is already an optional item in the Windows installer (maybe Mac too? I don't know) and there are certainly installations of Python out there that don't have it. From this side, nothing would actually change by installing tkinter into site-packages rather than Lib. (One impact may be the start menu shortcuts for IDLE and PyDoc, but provided the entry points into those tools are kept stable we can continue adding shortcuts from the installer. People who omit tkinter and then install it later would not get shortcuts. But since they omitted it from the installer, they probably don't want them - they likely just got a package that has tkinter as a dependency.) 
IDLE is already allowed to make enhancements in maintenance branches (https://www.python.org/dev/peps/pep-0434/), and we have recently received patches that are to be applied to *four* branches. The freedom to enhance IDLE is greatly improved by making it a PyPI installable package and disconnecting it from the stdlib's schedule. How this would actually be structured is up for discussion. I believe the change can be made without sacrificing anything, and the resulting flexibility will be worth it. Thoughts? Cheers, Steve From liik.joonas at gmail.com Thu Aug 13 19:31:50 2015 From: liik.joonas at gmail.com (Joonas Liik) Date: Thu, 13 Aug 2015 20:31:50 +0300 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CCCEA7.8000406@python.org> References: <55CCCEA7.8000406@python.org> Message-ID: > Currently, we bundle pip (and some of its dependencies - let's avoid that > particular discussion right now please, it's on python-dev) and install it > by default in a way that lets users easily update to the latest version. > Including pip in the standard library would lock users into a specific > version for the lifetime of that Python version, which would be a bad thing. pip install --upgrade pip has worked well every time i have tried it.. I would like to mention that dealing with any package that has a c-extension is utter pain under windows tho, ..especially if you want somebody else to be able to run your code, how is not a dev. From donald at stufft.io Thu Aug 13 19:35:22 2015 From: donald at stufft.io (Donald Stufft) Date: Thu, 13 Aug 2015 13:35:22 -0400 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CCCEA7.8000406@python.org> References: <55CCCEA7.8000406@python.org> Message-ID: On August 13, 2015 at 1:08:11 PM, Steve Dower (steve.dower at python.org) wrote: > I'd like to propose expanding the list of 3rd-party packages we bundle > and install by default. 
(Obviously this does not apply to platforms that > repackage Python and can do whatever they want, but on Windows and Mac > we are fully responsible for these.) > > Currently, we bundle pip (and some of its dependencies - let's avoid > that particular discussion right now please, it's on python-dev) and > install it by default in a way that lets users easily update to the > latest version. Including pip in the standard library would lock users > into a specific version for the lifetime of that Python version, which > would be a bad thing. > > From my point-of-view, this has been very successful in Python 2.7, 3.4 > and will also be successful in 3.5. For Python 3.6, I'd like to do a > similar thing with: > > * requests > * tkinter (including tcl/tk, IDLE, and other dependencies) > > Given the language summit discussion at PyCon this year, I think > requests is easy to justify. (Quick summary for those who weren't there: > we'd love to include requests in the stdlib, but it's too important and > needs much more frequent updates.) Preinstalling a given version in a > way that allows updates (and maybe attempting an update on installation) > sounds great to me. > > tkinter is worth more discussion :) For the remainder of this email, > I'll use "tkinter" as shorthand to refer to Tcl, Tk, Tix, _tkinter, > tkinter, idlelib/IDLE, PyDoc, turtledemo and any other dependencies or > dependents that I missed. > > In my experience, few Python scripts depend on or assume tkinter is > available. tkinter is already an optional item in the Windows installer > (maybe Mac too? I don't know) and there are certainly installations of > Python out there that don't have it. From this side, nothing would > actually change by installing tkinter into site-packages rather than Lib. > > (One impact may be the start menu shortcuts for IDLE and PyDoc, but > provided the entry points into those tools are kept stable we can > continue adding shortcuts from the installer. 
People who omit tkinter > and then install it later would not get shortcuts. But since they > omitted it from the installer, they probably don't want them - they > likely just got a package that has tkinter as a dependency.) > > IDLE is already allowed to make enhancements in maintenance branches > (https://www.python.org/dev/peps/pep-0434/), and we have recently > received patches that are to be applied to *four* branches. The freedom > to enhance IDLE is greatly improved by making it a PyPI installable > package and disconnecting it from the stdlib's schedule. > > How this would actually be structured is up for discussion. I believe > the change can be made without sacrificing anything, and the resulting > flexibility will be worth it. > > Thoughts? > One possible thing to look at for prior art, is what Haskell does. They don't have a bunch of ensure* modules or anything like it, instead they have their compiler (which is like "Haskell Core") and then on top of that they layer a bunch of libraries (Called "Haskell Platform"). This platform releases every ~6 months and just includes something like 40 different libraries with it that represent common development tools and widely used libraries [1]. So I guess my question is, instead of continuing down a path where we add more ensure* style modules to the standard library, why not do something similar and have "Python the Language" and "The Python Platform", and the platform would be the Python language + N "important" or "popular" packages. This could release on a quicker release schedule than Python itself (since it would really be more like a meta package than anything that itself got developed) and would give the ability to ship things like this without the problems that we've had with ensurepip. From a downstream perspective they would just package all of this stuff as normal and it would just be available as normal. 
We could even publish a metapackage on PyPI that had no code of its own, but existed simply to list all of the platform packages as dependencies (with ==) and then people could easily depend on the Python "platform" in their own code. This would essentially involve someone(s) needing to be the gatekeeper of which libraries become part of the Python platform, some small packaging shims to handle the metapackage on PyPI, and then the installer stuff for OSX and Windows (probably nothing for other OSs? Or maybe a tarball? I don't know). [1] https://www.haskell.org/platform/contents.html ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA From steve.dower at python.org Thu Aug 13 19:46:30 2015 From: steve.dower at python.org (Steve Dower) Date: Thu, 13 Aug 2015 10:46:30 -0700 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: References: <55CCCEA7.8000406@python.org> Message-ID: <55CCD7F6.1010009@python.org> On 13Aug2015 1031, Joonas Liik wrote: >> Currently, we bundle pip (and some of its dependencies - let's avoid that >> particular discussion right now please, it's on python-dev) and install it >> by default in a way that lets users easily update to the latest version. >> Including pip in the standard library would lock users into a specific >> version for the lifetime of that Python version, which would be a bad thing. > > pip install --upgrade pip > has worked well every time i have tried it.. > > > I would like to mention that dealing with any package that > has a c-extension is utter pain under windows tho, > > ..especially if you want somebody else to be able to run your code, > how is not a dev. We are well aware of those issues, but it is completely off-topic for this thread. 
Cheers, Steve From alex.gronholm at nextday.fi Thu Aug 13 20:49:10 2015 From: alex.gronholm at nextday.fi (Alex Grönholm) Date: Thu, 13 Aug 2015 21:49:10 +0300 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: References: <55CCCEA7.8000406@python.org> Message-ID: <55CCE6A6.7000505@nextday.fi> 13.08.2015, 20:35, Donald Stufft wrote: > > On August 13, 2015 at 1:08:11 PM, Steve Dower (steve.dower at python.org) wrote: >> I'd like to propose expanding the list of 3rd-party packages we bundle >> and install by default. (Obviously this does not apply to platforms that >> repackage Python and can do whatever they want, but on Windows and Mac >> we are fully responsible for these.) >> >> Currently, we bundle pip (and some of its dependencies - let's avoid >> that particular discussion right now please, it's on python-dev) and >> install it by default in a way that lets users easily update to the >> latest version. Including pip in the standard library would lock users >> into a specific version for the lifetime of that Python version, which >> would be a bad thing. >> >> From my point-of-view, this has been very successful in Python 2.7, 3.4 >> and will also be successful in 3.5. For Python 3.6, I'd like to do a >> similar thing with: >> >> * requests >> * tkinter (including tcl/tk, IDLE, and other dependencies) >> >> Given the language summit discussion at PyCon this year, I think >> requests is easy to justify. (Quick summary for those who weren't there: >> we'd love to include requests in the stdlib, but it's too important and >> needs much more frequent updates.) Preinstalling a given version in a >> way that allows updates (and maybe attempting an update on installation) >> sounds great to me. 
>> >> tkinter is worth more discussion :) For the remainder of this email, >> I'll use "tkinter" as shorthand to refer to Tcl, Tk, Tix, _tkinter, >> tkinter, idlelib/IDLE, PyDoc, turtledemo and any other dependencies or >> dependents that I missed. >> >> In my experience, few Python scripts depend on or assume tkinter is >> available. tkinter is already an optional item in the Windows installer >> (maybe Mac too? I don't know) and there are certainly installations of >> Python out there that don't have it. From this side, nothing would >> actually change by installing tkinter into site-packages rather than Lib. >> >> (One impact may be the start menu shortcuts for IDLE and PyDoc, but >> provided the entry points into those tools are kept stable we can >> continue adding shortcuts from the installer. People who omit tkinter >> and then install it later would not get shortcuts. But since they >> omitted it from the installer, they probably don't want them - they >> likely just got a package that has tkinter as a dependency.) >> >> IDLE is already allowed to make enhancements in maintenance branches >> (https://www.python.org/dev/peps/pep-0434/), and we have recently >> received patches that are to be applied to *four* branches. The freedom >> to enhance IDLE is greatly improved by making it a PyPI installable >> package and disconnecting it from the stdlib's schedule. >> >> How this would actually be structured is up for discussion. I believe >> the change can be made without sacrificing anything, and the resulting >> flexibility will be worth it. >> >> Thoughts? >> > One possible thing to look at for prior art, is what Haskell does. They don't have a bunch of ensure* modules or anything like it, instead they have their compiler (which is like "Haskell Core") and then on top of that they layer a bunch of libraries (Called "Haskell Platform"). 
This platform releases every ~6 months and just includes something like 40 different libraries with it that represent common development tools and widely used libraries [1]. > > So I guess my question is, instead of continuing down a path where we add more ensure* style modules to the standard library, why not do something similar and have "Python the Language" and "The Python Platform", and the platform would be the Python language + N "important" or "popular" packages. This could release on a quicker release schedule than Python itself (since it would really be more like a meta package than anything that itself got developed) and would give the ability to ship things like this without the problems that we've had with ensurepip. From a downstream perspective they would just package all of this stuff as normal and it would just be available as normal. We could even publish a metapackage on PyPI that had no code of its own, but existed simply to list all of the platform packages as dependencies (with ==) and then people could easily depend on the Python "platform" in their own code. > > This would essentially involve someone(s) needing to be the gatekeeper of which libraries become part of the Python platform, some small packaging shims to handle the metapackage on PyPI, and then the installer stuff for OSX and Windows (probably nothing for other OSs? Or maybe a tarball? I don't know). Amen to this. 
This is EXACTLY where I've hoped Python would go :) > > [1] https://www.haskell.org/platform/contents.html > > ----------------- > Donald Stufft > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From steve.dower at python.org Thu Aug 13 22:28:09 2015 From: steve.dower at python.org (Steve Dower) Date: Thu, 13 Aug 2015 13:28:09 -0700 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CCE6A6.7000505@nextday.fi> References: <55CCCEA7.8000406@python.org> <55CCE6A6.7000505@nextday.fi> Message-ID: <55CCFDD9.4090108@python.org> Didn't see Donald's email, so I'm replying via Alex's reply. 13.08.2015, 20:35, Donald Stufft wrote: > So I guess my question is, instead of continuing down a path where we > add more ensure* style modules to the standard library, why not do > something similar and have "Python the Language" and "The Python > Platform", and the platform would be the Python language + N > "important" or "popular" packages. This could release on a quicker > release schedule than Python itself (since it would really be more > like a meta package than anything that itself got developed) and would > give the ability to ship things like this without the problems that > we've had with ensurepip. From a downstream perspective they would > just package all of this stuff as normal and it would just be > available as normal. We could even publish a metapackage on PyPI that > had no code of its own, but existed simply to list all of the > platform packages as dependencies (with ==) and then people could > easily depend on the Python "platform" in their own code. 
> > This would essentially involve someone(s) needing to be the gatekeeper > of which libraries become part of the Python platform, some small > packaging shims to handle the metapackage on PyPI, and then the > installer stuff for OSX and Windows (probably nothing for other OSs? > Or maybe a tarball? I don't know). So basically we could add a requirements.txt into the core CPython repo and installers could trigger it (with user permission) on install? We could call it "python-platform.txt", and even add an "ensurepythonplatform" module to run the install (all names are only suggestions). I'd still like to bundle wheels with the latest available versions at build so that non-networked installs can get the packages, if not necessarily the latest. This seems like a better long term approach than one-ensure-module-per-package. I feel like moving tkinter&co onto PyPI is the more controversial suggestion :) Cheers, Steve From abarnert at yahoo.com Thu Aug 13 22:32:42 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 13 Aug 2015 13:32:42 -0700 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CCCEA7.8000406@python.org> References: <55CCCEA7.8000406@python.org> Message-ID: <2D46E3CE-465E-4CE9-8C19-A16F1086F5B0@yahoo.com> On Aug 13, 2015, at 10:06, Steve Dower wrote: > > From my point-of-view, this has been very successful in Python 2.7, 3.4 and will also be successful in 3.5. For Python 3.6, I'd like to do a similar thing with: > > * requests > * tkinter (including tcl/tk, IDLE, and other dependencies) Is the latter actually doable? Can tkinter be packaged in such a way that it includes or downloads or (best of all) downloads only if changed an entire Tcl/Tk installation (without affecting any other Tcl/Tk installations)? And, besides the technical question, at least on OS X we recommend ActiveTcl; does their weird licensing allow Python to just download and install that automatically? 
And, even if that is possible, does that mean I'll end up with 27 copies of ActiveTcl on my laptop (one for each Python and each virtual env) plus Apple's Tcl, instead of the 1 copy I installed manually? With a 30MB download for each one? If all of those are easily solved, then this seems like a cool idea. From donald at stufft.io Thu Aug 13 22:48:16 2015 From: donald at stufft.io (Donald Stufft) Date: Thu, 13 Aug 2015 16:48:16 -0400 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CCFDD9.4090108@python.org> References: <55CCCEA7.8000406@python.org> <55CCE6A6.7000505@nextday.fi> <55CCFDD9.4090108@python.org> Message-ID: On August 13, 2015 at 4:29:11 PM, Steve Dower (steve.dower at python.org) wrote: > Didn't see Donald's email, so I'm replying via Alex's reply. Hopefully you get this, I got some error about SPF last time I sent the email :/ > > So basically we could add a requirements.txt into the core CPython repo > and installers could trigger it (with user permission) on install? We > could call it "python-platform.txt", and even add an > "ensurepythonplatform" module to run the install (all names are only > suggestions). > > I'd still like to bundle wheels with the latest available versions at > build so that non-networked installs can get the packages, if not > necessarily the latest. This seems like a better long term approach than > one-ensure-module-per-package. Well, I'd probably make the "Python Platform" sort of a super-installer that has inside of it the Python installer and then a requirements.txt or whatever along with whatever bundled wheels it needs. Leave the old Python installers alone and continue to generate them (as a sort of "Minimal" installation option). The benefit to having a super installer over top of the core installers, is that you can update and version this super installer independently of the Python installers, so we could do releases more often, maybe every 3-6 months or so. 
This would essentially just be pulling in the latest Python version, and the latest versions of each of the bundled libraries. To be clear, the bundled libraries would be bundled into the super installer itself, not into Python. The idea being, that this is something that gets layered overtop of the "Python Runtime" (aka Python the language and standard library) and doesn't require anything special or any changes to Python itself. I *think* this will make it easier for downstream redistributors because they could implement this "Python Platform" using just a metapackage and typical dependency information and wouldn't need to deal with any of the mess that the bundled ensurepip module has created. This metapackage would just depend on Python, and each of the included third party libraries in the platform. This isn't really something that python-dev itself would have to do (since it requires no changes to Python itself), however there's benefit to python-dev doing it both in the official-ness of it, and getting it put onto www.python.org and the documentation. Ideally I think we'd change the downloads to push the Python Platform for installers over the actual Python runtime. > > I feel like moving tkinter&co onto PyPI is the more controversial > suggestion :) I don't have an opinion specific to tkinter & co, other than I think a smaller standard library and a larger external ecosystem combined with the platform thing I described above is a better direction to go in. I've long wanted such a thing. 
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA From steve.dower at python.org Thu Aug 13 22:52:00 2015 From: steve.dower at python.org (Steve Dower) Date: Thu, 13 Aug 2015 13:52:00 -0700 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <2D46E3CE-465E-4CE9-8C19-A16F1086F5B0@yahoo.com> References: <55CCCEA7.8000406@python.org> <2D46E3CE-465E-4CE9-8C19-A16F1086F5B0@yahoo.com> Message-ID: <55CD0370.5060705@python.org> On 13Aug2015 1332, Andrew Barnert wrote: > On Aug 13, 2015, at 10:06, Steve Dower wrote: >> >> From my point-of-view, this has been very successful in Python 2.7, 3.4 and will also be successful in 3.5. For Python 3.6, I'd like to do a similar thing with: >> >> * requests >> * tkinter (including tcl/tk, IDLE, and other dependencies) > > Is the latter actually doable? I believe so, but I'm only really familiar with how we currently distribute it on Windows. > Can tkinter be packaged in such a way that it includes or downloads or (best of all) downloads only if changed an entire Tcl/Tk installation (without affecting any other Tcl/Tk installations)? _tkinter.pyd (on Windows) links directly against Tcl and Tk's binaries, and while theoretically the version of Tcl/Tk being used could be replaced, I'm not aware of anyone doing this or even whether it works. It's certainly not trivial, because our tcl and tk binaries are installed right next to _tkinter, so you need to break your own installation in order to do it. My idea was that whichever part of the package includes _tkinter would also include the matched version of Tix, Tcl and Tk. There may be some issues arise out of that (beyond breaking people who assumed these files would never move) - I haven't done a full investigation yet. I do know there are some environment variable settings that can cause problems. 
As I understand it, nobody should be importing anything other than `tkinter`, which means we have a chance to resolve those easily, but a change like this would be very likely to break people who aren't using the public documented interfaces. (I'm totally okay with breaking them, BTW, and this would be a 3.5->3.6 change, not 3.5.1.) > And, besides the technical question, at least on OS X we recommend ActiveTcl; does their weird licensing allow Python to just download and install that automatically? If we're currently downloading and installing that automatically, then yes (or at least I'd assume so). If we're not, then we will continue to not do it. > And, even if that is possible, does that mean I'll end up with 27 copies of ActiveTcl on my laptop (one for each Python and each virtual env) plus Apple's Tcl, instead of the 1 copy I installed manually? I'd expect not. If a separate install is a valid setup, then I'd expect that to continue to be the case. But as I say, I'm most familiar with Windows where that isn't something you can do. Cheers, Steve > With a 30MB download for each one? > > If all of those are easily solved, then this seems like a cool idea. > From steve.dower at python.org Thu Aug 13 23:28:09 2015 From: steve.dower at python.org (Steve Dower) Date: Thu, 13 Aug 2015 14:28:09 -0700 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: References: <55CCCEA7.8000406@python.org> <55CCE6A6.7000505@nextday.fi> <55CCFDD9.4090108@python.org> Message-ID: <55CD0BE9.4010704@python.org> On 13Aug2015 1348, Donald Stufft wrote: > On August 13, 2015 at 4:29:11 PM, Steve Dower (steve.dower at python.org) wrote: >> Didn't see Donald's email, so I'm replying via Alex's reply. > > Hopefully you get this, I got some error about SPF last time I sent the email :/ I'm on a new mail server, so it's entirely possible that it's one of my settings. Can you forward the full error to me off-list? 
> >> >> So basically we could add a requirements.txt into the core CPython repo >> and installers could trigger it (with user permission) on install? We >> could call it "python-platform.txt", and even add an >> "ensurepythonplatform" module to run the install (all names are only >> suggestions). >> >> I'd still like to bundle wheels with the latest available versions at >> build so that non-networked installs can get the packages, if not >> necessarily the latest. This seems like a better long term approach than >> one-ensure-module-per-package. > > > Well, I'd probably make the "Python Platform" sort of a super-installer that has inside of it the Python installer and then a requirements.txt or whatever along with whatever bundled wheels it needs. Leave the old Python installers alone and continue to generate them (as a sort of "Minimal" installation option). The benefit to having a super installer over top of the core installers, is that you can update and version this super installer independently of the Python installers, so we could do releases more often, maybe every 3-6 months or so. This would essentially just be pulling in the latest Python version, and the latest versions of each of the bundled libraries. To be clear, the bundled libraries would be bundled into the super installer itself, not into Python. > > The idea being, that this is something that gets layered overtop of the "Python Runtime" (aka Python the language and standard library) and doesn't require anything special or any changes to Python itself. I *think* this will make it easier for downstream redistributors because they could implement this "Python Platform" using just a metapackage and typical dependency information and wouldn't need to deal with any of the mess that the bundled ensurepip module has created. This metapackage would just depend on Python, and each of the included third party libraries in the platform. 
> > This isn?t really something that python-dev itself would have to do (since it requires no changes to Python itself), however there?s benefit to python-dev doing it both in the official-ness of it, and getting it put onto www.python.org and the documentation. Ideally I think we?d change the downloads to push the Python Platform for installers over the actual Python runtime. It's already a fairly crowded marketplace, at least on Windows. Anaconda, Canopy, WinPython, Pythonxy and Portable Python all come to mind, but none of them have reliably replaced the official Python installer. In large part, I suspect this is because they do too much - most include the scipy stack and at least one (typically 4-5) editors. What might be interesting is if we installed the meta-package into the Lib directory rather than Lib/site-packages. That also opens up the possibility of removing old/deprecated modules from the stdlib but having an install option to restore them. Which helps break up the stdlib a bit and make the core Python install lighter, but still isn't really suitable for something like requests. I also don't see any real gain in making it a separate thing from the main installer unless you're planning 10+ packages to be in there. I would expect an incredibly small amount of packages to be available - those with no significant competitors, extremely broad uses, and completely portable. Cheers, Steve > >> >> I feel like moving tkinter&co onto PyPI is the more controversial >> suggestion :) > > I don?t have an opinion specific to tkinter & co, other than I think a smaller standard library and a larger external ecosystem combined with the platform thing I described above is a better direction to go in. I?ve long wanted such a thing. 
>
> -----------------
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

From donald at stufft.io Thu Aug 13 23:50:13 2015
From: donald at stufft.io (Donald Stufft)
Date: Thu, 13 Aug 2015 17:50:13 -0400
Subject: [Python-ideas] More "ensure*" packages
In-Reply-To: <55CD0BE9.4010704@python.org>
References: <55CCCEA7.8000406@python.org> <55CCE6A6.7000505@nextday.fi> <55CCFDD9.4090108@python.org> <55CD0BE9.4010704@python.org>
Message-ID:

On August 13, 2015 at 5:29:15 PM, Steve Dower (steve.dower at python.org) wrote:
>
> It's already a fairly crowded marketplace, at least on Windows. Anaconda, Canopy, WinPython, Pythonxy and Portable Python all come to mind, but none of them have reliably replaced the official Python installer. In large part, I suspect this is because they do too much - most include the scipy stack and at least one (typically 4-5) editors.

I think that's also discounting the *huge* benefit of something being offered as the official Python thing. I think a lot of people are hesitant to install those other things because they come from someone else, or they aren't even aware of them because they searched for Python on Google and got the official installer.

>
> What might be interesting is if we installed the meta-package into the Lib directory rather than Lib/site-packages. That also opens up the possibility of removing old/deprecated modules from the stdlib but having an install option to restore them. Which helps break up the stdlib a bit and make the core Python install lighter, but still isn't really suitable for something like requests.

Right, I don't think it's suitable at all to install something that is typically installed from PyPI into anything other than Lib/site-packages.

>
> I also don't see any real gain in making it a separate thing from the main installer unless you're planning 10+ packages to be in there. I would expect an incredibly small amount of packages to be available - those with no significant competitors, extremely broad uses, and completely portable.
>

Can we release new, different, installers for say, a hypothetical 3.5.0 without cutting a new release of 3.5.0 that include updated versions of the bundled software? If so, how will people know they are using the latest version of that installer? I know that requests will not be very happy being bundled like that if getting a new version is tied to a new version of Python being released.

This is one of the primary benefits of a separate installer: the installer (really the set of things that get installed) gets a version number. You can release it independently of a Python release, so if requests has a security issue then you can just roll out a new version of the platform installer without that affecting Python at all.

Another major benefit of a separate installer that layers on top of Python is reducing (or eliminating) the friction it will cause with downstream redistributors. The "sort of stdlib, sort of not" status of pip is weird and has caused a bit of a problem. I'm still dealing with the fallout of that, and while we had to do it that way in order for it to work inside of virtual environments, I don't think we need to do this that way.

I also think the separation just makes way more sense: when you have ensure* it means that things like python-pip depend on Python, but then Python also depends on them. I don't think that Python the runtime needs to depend on requests and I think that doing that is going the wrong way.
What this really is, is just a collection of preinstalled packages, so treating it like that seems like a better option.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

From wes.turner at gmail.com Fri Aug 14 01:03:01 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Thu, 13 Aug 2015 18:03:01 -0500
Subject: [Python-ideas] More "ensure*" packages
In-Reply-To:
References: <55CCCEA7.8000406@python.org> <55CCE6A6.7000505@nextday.fi> <55CCFDD9.4090108@python.org> <55CD0BE9.4010704@python.org>
Message-ID:

On Thu, Aug 13, 2015 at 4:50 PM, Donald Stufft wrote:

> On August 13, 2015 at 5:29:15 PM, Steve Dower (steve.dower at python.org) wrote:
> >
> > It's already a fairly crowded marketplace, at least on Windows. Anaconda, Canopy, WinPython, Pythonxy and Portable Python all come to mind, but none of them have reliably replaced the official Python installer. In large part, I suspect this is because they do too much - most include the scipy stack and at least one (typically 4-5) editors.
>
> I think that's also discounting the *huge* benefit of something being offered as the official Python thing. I think a lot of people are hesitant to install those other things because they come from someone else, or they aren't even aware of them because they searched for Python on Google and got the official installer.
>
> >
> > What might be interesting is if we installed the meta-package into the Lib directory rather than Lib/site-packages. That also opens up the possibility of removing old/deprecated modules from the stdlib but having an install option to restore them. Which helps break up the stdlib a bit and make the core Python install lighter, but still isn't really suitable for something like requests.
>
> Right, I don't think it's suitable at all to install something that is typically installed from PyPI into anything other than Lib/site-packages.
>
> >
> > I also don't see any real gain in making it a separate thing from the main installer unless you're planning 10+ packages to be in there. I would expect an incredibly small amount of packages to be available - those with no significant competitors, extremely broad uses, and completely portable.
> >
>
> Can we release new, different, installers for say, a hypothetical 3.5.0 without cutting a new release of 3.5.0 that include updated versions of the bundled software? If so, how will people know they are using the latest version of that installer? I know that requests will not be very happy being bundled like that if getting a new version is tied to a new version of Python being released.
>
> This is one of the primary benefits of a separate installer: the installer (really the set of things that get installed) gets a version number. You can release it independently of a Python release, so if requests has a security issue then you can just roll out a new version of the platform installer without that affecting Python at all.
>
> Another major benefit of a separate installer that layers on top of Python is reducing (or eliminating) the friction it will cause with downstream redistributors. The "sort of stdlib, sort of not" status of pip is weird and has caused a bit of a problem. I'm still dealing with the fallout of that, and while we had to do it that way in order for it to work inside of virtual environments, I don't think we need to do this that way.
>
> I also think the separation just makes way more sense: when you have ensure* it means that things like python-pip depend on Python, but then Python also depends on them. I don't think that Python the runtime needs to depend on requests and I think that doing that is going the wrong way. What this really is, is just a collection of preinstalled packages, so treating it like that seems like a better option.
>

Challenges

- [ ] repeatable build scripts (to ensure reproducible environments)
  - tox
  - Pip
  - Dockerfiles (for specific operating system)
  - Installer script / PACKAGES (to be installed/called by each Dockerfile)
  - What about Windows?
- [ ] installing third party packages
  (python -m ensurepip; pip install -U pip; pip install -r requirements.txt)

If you're suggesting that Python should test and maintain a distribution of specific PyPI packages, which commands do I need to add to my Dockerfiles, and OSX/Windows?

- https://github.com/ipython/ipython/wiki/Install:-Docker#anaconda--ipython-configurations
- https://www.python.org/dev/buildbot/
- https://wiki.python.org/moin/BuildbotOnWindows (this is probably out of date; download and build which packages every time?)
From wes.turner at gmail.com Fri Aug 14 01:13:02 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Thu, 13 Aug 2015 18:13:02 -0500
Subject: [Python-ideas] More "ensure*" packages
In-Reply-To:
References: <55CCCEA7.8000406@python.org> <55CCE6A6.7000505@nextday.fi> <55CCFDD9.4090108@python.org> <55CD0BE9.4010704@python.org>
Message-ID:

"The SciPy Stack"
http://www.scipy.org/install.html
http://www.scipy.org/stackspec.html
https://westurner.org/tools/#scipy-stack

http://docs.continuum.io/anaconda/pkg-docs

tk 8.5.18 (Linux, Mac)
requests 2.7.0

https://www.enthought.com/products/canopy/package-index/

requests 2.7.0
From wes.turner at gmail.com Fri Aug 14 01:15:12 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Thu, 13 Aug 2015 18:15:12 -0500
Subject: [Python-ideas] More "ensure*" packages
In-Reply-To:
References: <55CCCEA7.8000406@python.org> <55CCE6A6.7000505@nextday.fi> <55CCFDD9.4090108@python.org> <55CD0BE9.4010704@python.org>
Message-ID:

And then for OS system packages:

whohas "requests"
whohas "python-requests"

https://github.com/whohas/whohas
From steve.dower at python.org Fri Aug 14 01:25:25 2015
From: steve.dower at python.org (Steve Dower)
Date: Thu, 13 Aug 2015 16:25:25 -0700
Subject: [Python-ideas] More "ensure*" packages
In-Reply-To:
References: <55CCCEA7.8000406@python.org> <55CCE6A6.7000505@nextday.fi> <55CCFDD9.4090108@python.org> <55CD0BE9.4010704@python.org>
Message-ID: <55CD2765.5010402@python.org>

On 13Aug2015 1450, Donald Stufft wrote:
> On August 13, 2015 at 5:29:15 PM, Steve Dower (steve.dower at python.org) wrote:
>>
>> It's already a fairly crowded marketplace, at least on Windows. Anaconda, Canopy, WinPython, Pythonxy and Portable Python all come to mind, but none of them have reliably replaced the official Python installer. In large part, I suspect this is because they do too much - most include the scipy stack and at least one (typically 4-5) editors.
>
> I think that's also discounting the *huge* benefit of something being offered as the official Python thing. I think a lot of people are hesitant to install those other things because they come from someone else, or they aren't even aware of them because they searched for Python on Google and got the official installer.

Agreed, but that sort of endorsement needs to be handled very very carefully.

>
>>
>> What might be interesting is if we installed the meta-package into the Lib directory rather than Lib/site-packages. That also opens up the possibility of removing old/deprecated modules from the stdlib but having an install option to restore them. Which helps break up the stdlib a bit and make the core Python install lighter, but still isn't really suitable for something like requests.
>
> Right, I don't think it's suitable at all to install something that is typically installed from PyPI into anything other than Lib/site-packages.

Yes, but if the aim is to lighten the stdlib, then installing the heavier stdlib into its usual place would make the transition easier. Not a huge concern right now though.

>
>>
>> I also don't see any real gain in making it a separate thing from the main installer unless you're planning 10+ packages to be in there. I would expect an incredibly small amount of packages to be available - those with no significant competitors, extremely broad uses, and completely portable.
>>
>
> Can we release new, different, installers for say, a hypothetical 3.5.0 without cutting a new release of 3.5.0 that include updated versions of the bundled software? If so, how will people know they are using the latest version of that installer? I know that requests will not be very happy being bundled like that if getting a new version is tied to a new version of Python being released.

We could, but I don't want to, mainly to avoid the "which version am I on" problem. What I want is for the question to be "do I have the latest version" and the answer to be "pip install -U ...". I would also insist that something like requests should do that on initial install anyway, so you only get the bundled version if you install offline.

> This is one of the primary benefits of a separate installer: the installer (really the set of things that get installed) gets a version number. You can release it independently of a Python release, so if requests has a security issue then you can just roll out a new version of the platform installer without that affecting Python at all.
>
> Another major benefit of a separate installer that layers on top of Python is reducing (or eliminating) the friction it will cause with downstream redistributors. The "sort of stdlib, sort of not" status of pip is weird and has caused a bit of a problem. I'm still dealing with the fallout of that, and while we had to do it that way in order for it to work inside of virtual environments, I don't think we need to do this that way.

Obviously by becoming a downstream redistributor you no longer affect the other downstream redistributors (unless they see you as competition, which is more of a concern on Windows where distros generally don't have platform lock-in).

> I also think the separation just makes way more sense: when you have ensure* it means that things like python-pip depend on Python, but then Python also depends on them. I don't think that Python the runtime needs to depend on requests and I think that doing that is going the wrong way. What this really is, is just a collection of preinstalled packages, so treating it like that seems like a better option.

Maybe this should just be a Windows (and maybe Mac) only thing then. Distros that provide python3-requests (and maybe eventually python3-tkinter/idle) can choose to make that a default install if they like, so upstream doesn't need to cover it, but until a distro becomes more popular than the python.org release we won't see that on Windows. The tcl/tk dependency is also most blatant on Windows, since we apparently depend on the system's version on every other platform.

Cheers,
Steve

From tjreedy at udel.edu Fri Aug 14 03:58:55 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 13 Aug 2015 21:58:55 -0400
Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings)
In-Reply-To: <20150813113936.2a9b8595@anarchist.wooz.org>
References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <20150813113936.2a9b8595@anarchist.wooz.org>
Message-ID:

On 8/13/2015 11:39 AM, Barry Warsaw wrote:
> On Aug 13, 2015, at 10:04 AM, Terry Reedy wrote:
>
>> For internationalization, the gettext.gettext translation call could be added in one place, where the string is passed to tk, rather than 80 places in the structure definition. An altered version of the menudefs walker could be used to collect the menu strings for translation.
>
> That would require being able to translate non-literals.

I don't understand. Idle's menus are built from string literals -- no variables, no interpolation -- like 'File', 'Open', 'Open Module', etc. I think this is fairly typical.

> I'd need the same, and it would be okay if the translation call were spelled less conveniently, as long as it's possible to both extract and translate the source strings.

With table-driven ui creation, extraction for human translators and replacement of the original by the translation can be done with a pair of related functions. With code-driven ui creation (as currently with Idle dialogs), an extraction function may be possible (if the string literals are tagged with keywords such as 'title=' or 'text=') but translation still requires addition of a _() call for each argument that needs translation.

--
Terry Jan Reedy

From tjreedy at udel.edu Fri Aug 14 06:34:07 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 14 Aug 2015 00:34:07 -0400
Subject: [Python-ideas] More "ensure*" packages
In-Reply-To: <55CCCEA7.8000406@python.org>
References: <55CCCEA7.8000406@python.org>
Message-ID:

On 8/13/2015 1:06 PM, Steve Dower wrote:
> I'd like to propose expanding the list of 3rd-party packages we bundle and install by default. (Obviously this does not apply to platforms that repackage Python and can do whatever they want, but on Windows and Mac we are fully responsible for these.)
>
> Currently, we bundle pip (and some of its dependencies - let's avoid that particular discussion right now please, it's on python-dev) and install it by default in a way that lets users easily update to the latest version. Including pip in the standard library would lock users into a specific version for the lifetime of that Python version, which would be a bad thing.
>
> From my point-of-view, this has been very successful in Python 2.7, 3.4 and will also be successful in 3.5. For Python 3.6, I'd like to do a similar thing with:
>
> * requests

Not in stdlib, so easier availability is a plus.

> * tkinter (including tcl/tk, IDLE, and other dependencies)

In stdlib, heavily used by beginners, who will not be helped by the change and who may possibly be harmed by reduced availability. Why did you pick the tkinter group instead of some obsolete and little used modules, such as asyncore and asynchat?

I think any discussion of breaking up the stdlib should wait until the new workflow is chosen and implemented.

> tkinter is worth more discussion :) For the remainder of this email, I'll use "tkinter" as shorthand to refer to Tcl, Tk, Tix, _tkinter, tkinter, idlelib/IDLE, PyDoc, turtledemo and any other dependencies or dependents that I missed.

turtle itself.

> In my experience, few Python scripts depend on or assume tkinter is available.

No person's experience is representative. On Stack Overflow, tkinter has a higher rate of questions than, say, requests or PIL. I suspect its usage is also higher than those modules among students learning Python.

> tkinter is already an optional item in the Windows installer

I am pretty sure that this is due to the large size of tcl/tk. This is, of course, less relevant now than 20 years ago. The offline docs and test suite (/lib/test/) are also optional.

> (maybe Mac too? I don't know) and there are certainly installations of Python out there that don't have it. From this side, nothing would actually change by installing tkinter into site-packages rather than Lib.

There are multiple questions here: where is x developed, where is it installed from, where is it installed to, and what is it installed with.

Educational machines tend to be as locked down as corporate machines. Many places allow 'python and its stdlib' as a package, but no other python packages. If such policies exclude installing into site-packages, then the proposed change is crippling.

[Shifting focus from 'tkinter' to Idle...]

> (One impact may be the start menu shortcuts for IDLE and PyDoc.

For students who are not allowed to use a console, the icons are essential.

> IDLE is already allowed to make enhancements in maintenance branches (https://www.python.org/dev/peps/pep-0434/),

PEP 434 formalized what had more or less been the practice for several years before.

> and we have recently received patches that are to be applied to *four* branches.

So what? This is not limited to Idle.
There are, of course, *four* branches because we have not dropped 2.7 and started 3.6 earlier than usual. > The freedom to enhance IDLE is greatly improved by making it > a PyPI installable package Are you volunteering to do this? Pardon my skepticism, but you said yourself that you work at Microsoft on "a direct competitor to Idle". > and disconnecting it from the stdlib's schedule. You already explained above that the stdlib schedule is pretty much not an issue for Idle. And as you already know, Idle is already on track to get a makeover in the next year *with things as they are*. Anyway, just as there are separate lists for other specialized topics, like packaging, time zones, and core workflow, there is a separate list, Idle-sig*, for discussing how to improve Idle. Anyone interested in this should join us there. *mirrored on news.gmane.org as gmane.comp.python.idle with no mail.python.org subscription required. -- Terry Jan Reedy From furrykef at gmail.com Fri Aug 14 11:32:53 2015 From: furrykef at gmail.com (Kef Schecter) Date: Fri, 14 Aug 2015 04:32:53 -0500 Subject: [Python-ideas] Draft PEP: Automatic Globbing of Filenames in argparse on Windows Message-ID: PEP: XXX Title: Automatic Globbing of Filenames in argparse on Windows Version: $Revision$ Last-Modified: $Date$ Author: Kef Schecter Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 14-Aug-2015 Python-Version: 3.6 Post-History: Abstract ======== This PEP proposes to add functionality to argparse to allow glob (wildcard) expressions to be handled automagically on Windows. Motivation ========== For many command-line tools, it is handy to be able to specify wildcards in order to operate on more than one file at a time. On Unix-like systems, this is handled automatically by the shell. On Windows, however, the default shell does not have this behavior, nor does Microsoft's PowerShell. Yet Windows users generally expect wildcards to work. 
For example, most built-in commands such as ``dir`` and ``type`` accept wildcard arguments, and have since the early days of MS-DOS.

It is already possible for programmers to work around this issue, but it is a bit cumbersome and it is easy to make the behavior almost, but not quite, correct. Moreover, since Python has a "batteries included" philosophy, and this is a very common feature, it is the author's opinion that the correct functionality should be available out of the box.


How It Must Be Done Currently
=============================

::

    if platform.system() == 'Windows':
        filenames = []
        for filename in args.files:
            if '*' in filename or '?' in filename or '[' in filename:
                filenames += glob.glob(filename)
            else:
                filenames.append(filename)
        args.files = filenames


Why This Is a Problem
=====================

- Authors, especially those who use Unix-like systems, will usually not bother to add this code unless users specifically request it, and perhaps not even then. How often have you seen this code in a program?

- It is easy to forget the platform check or not understand why it is necessary. Automatically globbing filenames on a Unix-like system is wrong because the shell is supposed to handle it already; on such a system, if the program sees a name like ``*.txt``, then it means the user explicitly specified the name of a file that, improbable as it may seem, has an asterisk in its filename. On Windows, filenames with the characters ``*`` and ``?`` in their name are not possible, so this is partially irrelevant on Windows even when using a Unix-like shell such as bash (but see `Square Brackets`_ below).

- It is easy to forget to check the string for wildcard characters before passing it to glob.glob. If the user specifies a filename with no wildcards such as ``foo.txt``, and foo.txt does not exist, then glob.glob will silently ignore the file, giving the program no opportunity to print a message such as "No file named foo.txt".
- glob.glob may not be quite the right function to use. See `Square Brackets`_ below. - It is boilerplate code that is applicable to a large number of programs without change, which suggests it belongs in a library. Solution ======== Add a keyword argument to argparse.ArgumentParser.add_argument called ``glob``. If it is true, it will automatically glob filenames using code much like the boilerplate code given earlier in `How It Must Be Done Currently`_. This argument is only meaningful when nargs is set to an appropriate value such as '+' or '*'. The default value of this argument should be False. This ensures backward compatibility with existing programs that assume wildcards are not expanded, such as a program that accepts a regex as an argument. A possibly better behavior might be to make this argument default to True (enabling the functionality automagically without the programmer needing to be aware of it) and only expand wildcard arguments that are not provided in quotes, similar to how Unix-like shells behave. However, there appears to be no simple way to tell whether an argument was supplied in quotes or not; the strings in sys.argv already have had the quotes removed. Square Brackets =============== It has been noted above that the characters ``*`` and ``?`` will never appear in filenames on Windows. However, the characters ``[`` and ``]``, which glob.glob uses for wildcards, **can** be used in filenames, and may not be especially uncommon. There are three possible ways of handling this: 1. Use a version of glob.glob without the wildcard functionality that ``[`` and ``]`` provide. This type of wildcard has never been standard for wildcard arguments to MS-DOS or Windows command-line programs. 2. Specify some kind of escaping mechanism; for example, ``\[foo\].txt`` would refer to a file that has ``[`` and ``]`` in its filename. This may not be intuitive behavior for Windows users. 3. Keep glob.glob's standard functionality. 
Programs using this feature will not be able to operate on files that have square brackets in their names. Of these, the first should adhere best to the principle of least surprise. Windows users do not expect square brackets to form wildcard expressions. If they want such functionality, they will probably already be using a shell such as bash that handles it for them. Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: From mal at egenix.com Fri Aug 14 13:22:08 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 14 Aug 2015 13:22:08 +0200 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CCCEA7.8000406@python.org> References: <55CCCEA7.8000406@python.org> Message-ID: <55CDCF60.3080902@egenix.com> On 13.08.2015 19:06, Steve Dower wrote: > I'd like to propose expanding the list of 3rd-party packages we bundle and install by default. > (Obviously this does not apply to platforms that repackage Python and can do whatever they want, but > on Windows and Mac we are fully responsible for these.) > > Currently, we bundle pip (and some of its dependencies - let's avoid that particular discussion > right now please, it's on python-dev) and install it by default in a way that lets users easily > update to the latest version. Including pip in the standard library would lock users into a specific > version for the lifetime of that Python version, which would be a bad thing. > > From my point-of-view, this has been very successful in Python 2.7, 3.4 and will also be successful > in 3.5. 
For Python 3.6, I'd like to do a similar thing with: > > * requests > * tkinter (including tcl/tk, IDLE, and other dependencies) requests is already installed as part of pip, along with a whole set of other packages (but not exposed at the top level), so moving it to its own ensure package wouldn't really change much in terms of approach. The problem I see with requests is that they sometimes have glitches in their releases causing them not to be usable, so the version that gets "ensured" would need some extra testing by whoever manages the list of packages. Also note that the pre-packaged version in pip is not managed by the package manager (because it doesn't see it), so you will sooner or later end up with multiple copies of the requests package in your site-packages. Not sure about tkinter. Requiring newbies to run an ensure script to be able to run IDLE doesn't sound like a good idea. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 14 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2015-08-12: Released mxODBC 3.3.4 ... http://egenix.com/go80 2015-08-22: FrOSCon 2015 ... 8 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From rosuav at gmail.com Fri Aug 14 13:47:42 2015 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 14 Aug 2015 21:47:42 +1000 Subject: [Python-ideas] Draft PEP: Automatic Globbing of Filenames in argparse on Windows In-Reply-To: References: Message-ID: On Fri, Aug 14, 2015 at 7:32 PM, Kef Schecter wrote: > PEP: XXX > Title: Automatic Globbing of Filenames in argparse on Windows > > For many command-line tools, it is handy to be able to specify > wildcards in order to operate on more than one file at a time. On > Unix-like systems, this is handled automatically by the shell. On > Windows, however, the default shell does not have this behavior, nor > does Microsoft's PowerShell. > > Yet Windows users generally expect wildcards to work. For example, > most built-in commands such as ``dir`` and ``type`` accept wildcard > arguments, and have since the early days of MS-DOS. How does this interact with the 'fileinput' module? Can you tie in with that? > How It Must Be Done Currently > ============================= > > if platform.system() == 'Windows': > filenames = [] > for filename in args.files: > if '*' in filename or '?' in filename or '[' in filename: > filenames += glob.glob(filename) > else: > filenames.append(filename) > args.files = filenames Disgusting. :) Definitely needs to be buried away. > Add a keyword argument to argparse.ArgumentParser.add_argument called > ``glob``. If it is true, it will automatically glob filenames using > code much like the boilerplate code given earlier in `How It Must Be > Done Currently`_. This argument is only meaningful when nargs is set > to an appropriate value such as '+' or '*'. 
+1 > A possibly better behavior might be to make this argument default to > True (enabling the functionality automagically without the programmer > needing to be aware of it) and only expand wildcard arguments that are > not provided in quotes, similar to how Unix-like shells behave. > However, there appears to be no simple way to tell whether an argument > was supplied in quotes or not; the strings in sys.argv already have > had the quotes removed. -1. Since Windows users aren't generally used to escaping arguments (eg compare Windows's "dir /s *.py" to Unix's "find . -name \*.py", where the latter will be backslash-protected), I would advise against glob expansion any time the program doesn't explicitly ask for it. > 1. Use a version of glob.glob without the wildcard functionality that > ``[`` and ``]`` provide. This type of wildcard has never been > standard for wildcard arguments to MS-DOS or Windows command-line > programs. > > Of these, the first should adhere best to the principle of least > surprise. Windows users do not expect square brackets to form > wildcard expressions. If they want such functionality, they will > probably already be using a shell such as bash that handles it for > them. +1. It also shouldn't do any other form of expansion (eg braces) that people wouldn't expect of a standard Windows command like dir or copy. One other difference from glob.glob() that I'd recommend: If the spec doesn't match any files, return it unchanged, as bash does. (glob.glob will return an empty list.) Otherwise, sounds good to me - make it easy to DTRT on all platforms. ChrisA From eric at trueblade.com Fri Aug 14 15:05:46 2015 From: eric at trueblade.com (Eric V. 
Smith) Date: Fri, 14 Aug 2015 09:05:46 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: <20150813100042.3f026ce5@anarchist.wooz.org> References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> <20150813100042.3f026ce5@anarchist.wooz.org> Message-ID: <55CDE7AA.9010006@trueblade.com> On 08/13/2015 10:00 AM, Barry Warsaw wrote: > On Aug 13, 2015, at 07:58 AM, Eric V. Smith wrote: > >> The one downside of this is that the strings that the translator >> is translating from do not appear in the source code. The >> translator would have to know that the string being translated >> is: "My name is {0}, my dog's name is {1}" > > I think unfortunately, this is a non-starter for the i18n use case. > The message catalog must include the source string as it appears in > the code because otherwise, translators will not be able to > reliably map the intended meaning to their native language. > They'll have to keep a mental map between source string > placeholders and numeric placeholders, and I am fairly confident > that this will be a source of broken translations. > > Is there a problem with keeping the named placeholders throughout > the entire stack? I guess not. It complicates things in the non-translated case, but I think it's probably all workable. I'll give it some thought. But is that enough? I'm not exactly sure what goal we're trying to achieve here. If it's to entirely replace the gettext module, including things like ngettext, then I think it's not an achievable goal, and we should just give up. If it's only to replace gettext.gettext (commonly used as "_"), then I think there's hope. Eric.
From eric at trueblade.com Fri Aug 14 15:04:55 2015 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 14 Aug 2015 09:04:55 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <20150813113936.2a9b8595@anarchist.wooz.org> Message-ID: <55CDE777.70803@trueblade.com> On 08/13/2015 09:58 PM, Terry Reedy wrote: > On 8/13/2015 11:39 AM, Barry Warsaw wrote: >> On Aug 13, 2015, at 10:04 AM, Terry Reedy wrote: >> >>> For internationalization, the gettext.gettext translation call could >>> be added >>> in one place, where the string is passed to tk, rather than 80 places >>> in the >>> structure definition. An altered version of the menudefs walker >>> could be >>> used to collect the menu strings for translation. >> >> That would require being able to translate non-literals. > > I don't understand, Idle's menus are built from string literals -- no > variables, not interpolation -- like 'File', 'Open', 'Open Module', etc. > I think this is fairly typical. It's the "could be added in one place" part that would require working on non-literals. In that one place, you'd be operating on a variable, not a literal. Eric.
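[Editorial sketch, not part of the original message.] The literal-versus-variable point above can be illustrated with the "deferred translation" idiom: the literals are only marked where they are defined, and the real translation call happens once, on a variable, where the string is consumed. Everything named below (N_, MENUDEFS, build_menu_labels) is hypothetical and is not Idle's actual code; it also assumes the extraction tool is told to treat N_ as a marker (for example via xgettext's --keyword option).

```python
import gettext


def N_(message):
    """No-op marker (hypothetical name).

    It returns the string unchanged; its only job is to let an
    extraction tool find the literal in the source.
    """
    return message


# Hypothetical menu structure built purely from marked string literals.
MENUDEFS = [
    ("file", [N_("Open"), N_("Open Module"), N_("Save")]),
    ("edit", [N_("Undo"), N_("Redo")]),
]


def build_menu_labels(menudefs, translate=gettext.gettext):
    """Translate every label in one place.

    Here `translate` is called on a variable (`entry`), not on a
    literal, which is why the literals had to be marked separately.
    """
    labels = {}
    for menu_name, entries in menudefs:
        labels[menu_name] = [translate(entry) for entry in entries]
    return labels


if __name__ == "__main__":
    # With no message catalog available, gettext.gettext is expected to
    # hand each string back untranslated.
    print(build_menu_labels(MENUDEFS))
```

The design trade-off is the one discussed in the thread: the translation call lives in a single place, but each literal still needs some marker so the catalog can be built.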
From random832 at fastmail.us Fri Aug 14 15:42:34 2015 From: random832 at fastmail.us (random832 at fastmail.us) Date: Fri, 14 Aug 2015 09:42:34 -0400 Subject: [Python-ideas] Draft PEP: Automatic Globbing of Filenames in argparse on Windows In-Reply-To: References: Message-ID: <1439559754.3115277.356234777.573D3B23@webmail.messagingengine.com> On Fri, Aug 14, 2015, at 05:32, Kef Schecter wrote: > A possibly better behavior might be to make this argument default to > True (enabling the functionality automagically without the programmer > needing to be aware of it) and only expand wildcard arguments that are > not provided in quotes, similar to how Unix-like shells behave. > However, there appears to be no simple way to tell whether an argument > was supplied in quotes or not; the strings in sys.argv already have > had the quotes removed. -1, "C:\Some Directory With Spaces\*.txt" is a valid wildcard for most Windows programs. > 2. Specify some kind of escaping mechanism; for example, > ``\[foo\].txt`` would refer to a file that has ``[`` and ``]`` in > its filename. This may not be intuitive behavior for Windows > users. Especially if you're using backslash for quoting - how do you differentiate it from backslash as a directory separator? Vim manages, but it's a bit headache-inducing to look at. > 3. Keep glob.glob's standard functionality. Programs using this > feature will not be able to operate on files that have square > brackets in their names. I actually think #1 is the correct way, but it's worth noting that using [[] for left bracket already works. This solution is also used in Vim (even on Windows) and MS-SQL. Right bracket needs no escaping since it's outside brackets to begin with (but you can if you want). You can even quote * and ? that way (filenames won't have them on Windows, but can on Unix). > Of these, the first should adhere best to the principle of least > surprise. Windows users do not expect square brackets to form > wildcard expressions. 
If they want such functionality, they will > probably already be using a shell such as bash that handles it for > them. Speaking of least surprise... while _most_ of the quirks of Windows wildcards are extremely obscure and can probably be safely ignored for the purpose of this feature (my objection in the scandir discussion was to relying on the _real_ Windows wildcard implementation in a cross-platform function without explicitly documenting it, rather than saying it needs to be emulated), the fact that you can use *.* to match all files (including those that don't include a dot) and *. to match only files that do not include a dot is well-known. Filenames on Windows cannot actually end with a dot. From donald at stufft.io Fri Aug 14 15:44:46 2015 From: donald at stufft.io (Donald Stufft) Date: Fri, 14 Aug 2015 09:44:46 -0400 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CDCF60.3080902@egenix.com> References: <55CCCEA7.8000406@python.org> <55CDCF60.3080902@egenix.com> Message-ID: On August 14, 2015 at 7:22:56 AM, M.-A. Lemburg (mal at egenix.com) wrote: > > requests is already installed as part of pip, along with a whole > set of other packages (but not exposed at the top-level), so > moving it to its own ensure package wouldn't really change much > in terms of approach. This isn't really true except in a very abstract sense. Yes, requests exists because of pip, but it exists inside of pip (``import pip._vendor.requests``) not anywhere where normal people should be importing it.
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA From furrykef at gmail.com Fri Aug 14 15:47:28 2015 From: furrykef at gmail.com (Kef Schecter) Date: Fri, 14 Aug 2015 08:47:28 -0500 Subject: [Python-ideas] Draft PEP: Automatic Globbing of Filenames in argparse on Windows In-Reply-To: References: Message-ID: On Fri, Aug 14, 2015 at 6:47 AM, Chris Angelico wrote: > On Fri, Aug 14, 2015 at 7:32 PM, Kef Schecter wrote: >> PEP: XXX >> Title: Automatic Globbing of Filenames in argparse on Windows >> >> For many command-line tools, it is handy to be able to specify >> wildcards in order to operate on more than one file at a time. On >> Unix-like systems, this is handled automatically by the shell. On >> Windows, however, the default shell does not have this behavior, nor >> does Microsoft's PowerShell. >> >> Yet Windows users generally expect wildcards to work. For example, >> most built-in commands such as ``dir`` and ``type`` accept wildcard >> arguments, and have since the early days of MS-DOS. > > How does this interact with the 'fileinput' module? Can you tie in with that? The fileinput module likewise does not expand wildcards and will choke on an argument such as "*.txt", but I think for that module it would be safe to make wildcard expansion automatic on Windows, since it's known that all the arguments are filenames. From mal at egenix.com Fri Aug 14 16:17:44 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 14 Aug 2015 16:17:44 +0200 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: References: <55CCCEA7.8000406@python.org> <55CDCF60.3080902@egenix.com> Message-ID: <55CDF888.7000003@egenix.com> On 14.08.2015 15:44, Donald Stufft wrote: > > > On August 14, 2015 at 7:22:56 AM, M.-A. 
Lemburg (mal at egenix.com) wrote: >> >> requests is already installed as part of pip, along with a whole >> set of other packages (but not exposed at the top-level), so >> moving it to its own ensure package wouldn't really change much >> in terms of approach. > > > This isn't really true except in a very abstract sense. Yes, requests exists because of pip, but it exists inside of pip (``import pip._vendor.requests``) not anywhere where normal people should be importing it. Right. What I'm saying is that requests is already installed, so moving it from inside pip to top-level isn't much of a change in terms of "do we want requests to be installed via ensure or not". -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 14 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2015-08-12: Released mxODBC 3.3.4 ... http://egenix.com/go80 2015-08-22: FrOSCon 2015 ... 8 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From barry at python.org Fri Aug 14 17:00:54 2015 From: barry at python.org (Barry Warsaw) Date: Fri, 14 Aug 2015 11:00:54 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> <20150813100042.3f026ce5@anarchist.wooz.org> <55CDE7AA.9010006@trueblade.com> Message-ID: <20150814110054.75be1c4c@anarchist.wooz.org> On Aug 14, 2015, at 09:05 AM, Eric V. Smith wrote: >On 08/13/2015 10:00 AM, Barry Warsaw wrote: >> Is there a problem with keeping the named placeholders throughout >> the entire stack? > >I guess not. It complicates things in the non-translated case, but I >think it's probably all workable. I'll give it some thought. Thanks. >But is that enough? I'm not exactly sure what goal we're trying to >achieve here. If it's to entirely replace the gettext module, >including things like ngettext, then I think it's not an achievable >goal, and we should just give up. If it's only to replace >gettext.gettext (commonly used as "_"), then I think there's hope. From my perspective, exactly this. I don't expect or want to replace gettext. Part of the focus of PEP 501 is to enable gettext as a use case, binding it opportunistically to the __interpolate__ built-in. Without that binding, you get "normal" interpolation. Cheers, -Barry
From barry at python.org Fri Aug 14 17:07:09 2015 From: barry at python.org (Barry Warsaw) Date: Fri, 14 Aug 2015 11:07:09 -0400 Subject: [Python-ideas] More "ensure*" packages References: <55CCCEA7.8000406@python.org> Message-ID: <20150814110709.036ce425@anarchist.wooz.org> On Aug 13, 2015, at 10:06 AM, Steve Dower wrote: >I'd like to propose expanding the list of 3rd-party packages we bundle and >install by default. (Obviously this does not apply to platforms that >repackage Python and can do whatever they want, but on Windows and Mac we are >fully responsible for these.) Indeed. Can we *please* keep the source tarball pure? Even ensurepip gives (some) downstreams plenty of headaches. I'm totally fine with the OS X and Windows installers bundling whatever makes users on those platforms have a better experience. Cheers, -Barry From tjreedy at udel.edu Fri Aug 14 17:24:21 2015 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 14 Aug 2015 11:24:21 -0400 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CCCEA7.8000406@python.org> References: <55CCCEA7.8000406@python.org> Message-ID: On 8/13/2015 1:06 PM, Steve Dower wrote: > From my point-of-view, this has been very successful in Python 2.7, 3.4 > and will also be successful in 3.5. For Python 3.6, I'd like to do a > similar thing with: > > * requests The idea, I take it, is to make it easier for people, especially beginners, to have a more fully-loaded installation when the installer quits, without having to find the console and run pip. The requests example points to presenting users with a curated list of packages they might want to install.
A non-exclusive alternative is to present a list of packages users have already installed with other versions. "You have the following 3rd party modules installed for 3.4. Which of these would you like installed for 3.5? [x] numpy [x] pillow [x] pygame ... " This would save people from having to learn about and use a requirements list. For 2 to 3 upgrades, additional work would be required to check whether packages have py 3 versions, and if not, whether there are py 3 replacements. Pillow can be suggested and installed as a plug-compatible replacement for PIL. Most other cases are harder. But any work the installer can do to make upgrades easier would be very helpful. -- Terry Jan Reedy From steve.dower at python.org Fri Aug 14 17:52:49 2015 From: steve.dower at python.org (Steve Dower) Date: Fri, 14 Aug 2015 08:52:49 -0700 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: References: <55CCCEA7.8000406@python.org> Message-ID: <55CE0ED1.9010905@python.org> On 13Aug2015 2134, Terry Reedy wrote: > On 8/13/2015 1:06 PM, Steve Dower wrote: >> I'd like to propose expanding the list of 3rd-party packages we bundle >> and install by default. (Obviously this does not apply to platforms that >> repackage Python and can do whatever they want, but on Windows and Mac >> we are fully responsible for these.) >> >> Currently, we bundle pip (and some of its dependencies - let's avoid >> that particular discussion right now please, it's on python-dev) and >> install it by default in a way that lets users easily update to the >> latest version. Including pip in the standard library would lock users >> into a specific version for the lifetime of that Python version, which >> would be a bad thing. >> >> From my point-of-view, this has been very successful in Python 2.7, 3.4 >> and will also be successful in 3.5. 
For Python 3.6, I'd like to do a >> similar thing with: >> >> * requests > > Not in stdlib, so easier availability is a plus > >> * tkinter (including tcl/tk, IDLE, and other dependencies) > > In stdlib, heavily used by beginners, who will not be helped by the > change and who may possibly be harmed by reduced availability. It's the biggest (only?) application in the standard library. idlelib is not documented, and I describe below how this wouldn't necessarily reduce availability or ease of entry for anyone. > Why did you pick the tkinter group instead some obsolete and little used > modules, such as asyncore and asynchat? I don't want to promote those to PyPI packages, I'd rather just remove them completely. > I think any discussion of breaking up the stdlib should wait until the > new workflow is chosen and implemented. Fair, but this is also a good way to test alternate workflows as well as simplify things for the people working on IDLE (and I acknowledge I haven't raised it on the IDLE list yet, but if the response is "never take it out of the standard library" then there's no point asking people "would you prefer to contribute to IDLE outside of the Python stdlib"). >> tkinter is worth more discussion :) For the remainder of this email, >> I'll use "tkinter" as shorthand to refer to Tcl, Tk, Tix, _tkinter, >> tkinter, idlelib/IDLE, PyDoc, turtledemo and any other dependencies or >> dependents that I missed. > > turtle itself. Thank you. >> In my experience, few Python scripts depend on or assume tkinter is >> available. > > No person's experience is representative. On Stackoverflow, tkinter has > a higher rate of tkinter questions than for, say, requests or pil. I > suspect its usage is also higher than those modules among students > learning Python. I really don't like it when StackOverflow is used for this sort of metric (it's actually fairly popular at work right now, and also for comparing R and Python). 
High question rates on StackOverflow indicate that people had problems they weren't able to solve using the documentation. A more interesting number would be the number of users who have received badges for high quality and popular answers, as that would indicate a thriving community. (And I haven't checked, but I'd expect Django and matplotlib to far exceed tkinter on either of these metrics.) > > tkinter is already an optional item in the Windows installer > > I am pretty sure that this is due to the large size of tcl/tk. This is, > of course, less relevant now than 20 years ago. The offline docs and > test suite (/lib/test/) are also optional. As far as I'm concerned, it is precisely because of the large size. Python is being installed on so many more servers, VMs and containers now that size reduction is a perfectly legitimate goal IMHO. >> (maybe Mac too? I don't know) and there are certainly installations of >> Python out there that don't have it. From this side, nothing would >> actually change by installing tkinter into site-packages rather than Lib. > > There are multiple questions here: where is x developed, where is it > installed from, and where is it installed to, and what is it installed > with. Educational machines tend to be as locked down as corporate > machines. Many places allow 'python and its stdlib' as a package, but > no other python packages. If such policies excludes installing into > site-packages, then the proposed change is crippling. * where x is developed is up to whoever is doing the development. Provided a package appears on PyPI, it will work for my proposal. * it would be installed from a wheel bundled into the main installer, exactly as for pip and setuptools currently. * it would be installed with pip (so to "ensuretkinter" you'd have to "ensurepip" - I don't see this being a major issue. 
If someone wants to lock down package installation, they need to lock down users and not the Python installation) I've heard of plenty of places that disallow packages (I get to deal with a lot of enterprises through my work), but anything in the core installer is okay. tkinter would be in the core installer still, just like today. > [Shifting focus from 'tkinter' to Idle...] > >> (One impact may be the start menu shortcuts for IDLE and PyDoc. > > For students who are not allowed to use a console, the icons are essential. Please don't quote part of my sentence and add a full stop so it looks like I stopped there. Here is what I originally wrote: (One impact may be the start menu shortcuts for IDLE and PyDoc, but provided the entry points into those tools are kept stable we can continue adding shortcuts from the installer. People who omit tkinter and then install it later would not get shortcuts. But since they omitted it from the installer, they probably don't want them - they likely just got a package that has tkinter as a dependency.) As you can see, I have acknowledged this is a problem, proposed a solution, identified a subsequent problem, and stated why I don't think it's blocking. Pretending that I'm simply trying to remove the entry point is unfair and dishonest. >> IDLE is already allowed to make enhancements in maintenance branches >> (https://www.python.org/dev/peps/pep-0434/), > > PEP 434 formalized what had more or less been the practice for several > years before. > >> and we have recently >> received patches that are to be applied to *four* branches. > > So what? This is not limited to Idle. There are, of course, *four* > branches because we have not dropped 2.7 and started 3.6 earlier than > usual. As I said earlier, the difference between Idle and the (majority of the) rest of the stdlib is that Idle is an application. > > The freedom to enhance IDLE is greatly improved by making it > > a PyPI installable package > > Are you volunteering to do this? 
Pardon my skepticism, but you said > yourself that you work at Microsoft on "a direct competitor to Idle". I'm certainly willing to help get it started. Right now one of my jobs (along with the other Windows core devs) is maintaining the tcl/tk builds and the sub-installer for tkinter. This change would simplify that part of my volunteer efforts, and so I am willing to invest in it. That's not saying I'm going to become an Idle contributor. The "direct competitor" comment was intended as tongue in cheek (it's from a more private email thread, for anyone wondering where it showed up), but I do acknowledge that one of our (free - not a core business) products provides similar functionality to Idle. It is difficult to make any sort of design decision transparently enough that suggestions of sabotage can be silenced (I hope that's what I'm achieving here - it's certainly the point). >> and disconnecting it from the stdlib's schedule. > > You already explained above that the stdlib schedule is pretty much not > an issue for Idle. And as you already know, Idle is already on track to > get a makeover in the next year *with things as they are*. My experience with our own product is that being disconnected from Visual Studio's schedule made it much easier to do makeovers whenever we want. I believe Idle (as an application) and tkinter (as a wrapper around a 3rd-party library) would also benefit from this. (As an aside, so would the ssl and hashlib modules, except those both have other dependencies in the stdlib.) > Anyway, just as there are separate lists for other specialized topics, > like packaging, time zones, and core workflow, there is a separate list, > Idle-sig*, for discussing how to improve Idle. Anyone interested in this > should join us there. > > *mirrored on news.gmane.org as gmane.comp.python.idle with no > mail.python.org subscription required. > Thanks for the invite.
As I mentioned earlier, if there's opposition here to bundling tkinter rather than having it as a core part of the stdlib, I don't see any reason discussing things like PyPI packages and improved workflows on Idle-sig right now. Cheers, Steve From steve.dower at python.org Fri Aug 14 17:57:53 2015 From: steve.dower at python.org (Steve Dower) Date: Fri, 14 Aug 2015 08:57:53 -0700 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CDCF60.3080902@egenix.com> References: <55CCCEA7.8000406@python.org> <55CDCF60.3080902@egenix.com> Message-ID: <55CE1001.4020003@python.org> On 14Aug2015 0422, M.-A. Lemburg wrote: > On 13.08.2015 19:06, Steve Dower wrote: >> I'd like to propose expanding the list of 3rd-party packages we bundle and install by default. >> (Obviously this does not apply to platforms that repackage Python and can do whatever they want, but >> on Windows and Mac we are fully responsible for these.) >> >> Currently, we bundle pip (and some of its dependencies - let's avoid that particular discussion >> right now please, it's on python-dev) and install it by default in a way that lets users easily >> update to the latest version. Including pip in the standard library would lock users into a specific >> version for the lifetime of that Python version, which would be a bad thing. >> >> From my point-of-view, this has been very successful in Python 2.7, 3.4 and will also be successful >> in 3.5. For Python 3.6, I'd like to do a similar thing with: >> >> * requests >> * tkinter (including tcl/tk, IDLE, and other dependencies) > > requests is already installed as part of pip, along with a whole > set of other packages (but not exposed at the top-level), so > moving it to its own ensure package wouldn't really change much > in terms of approach. 
> > The problem I see with requests is that they sometimes > have glitches in their releases causing them not to be usable, > so the version that gets "ensured" would need some extra testing > by whoever manages the list of packages. I'm interested in this. What sort of glitches are we talking about here? Are they not caught by the requests team's tests? Why would someone else be able to test it better than them? I'd certainly be okay with locking in the version at rc1 time to give people a chance for wider testing. I'd be very nervous about updating any bundled package on the day that the final release is built. > Also notes that the pre-packaged > version in pip is not managed by the package manager (because > it doesn't see it), so you will sooner or later end up with multiple > requests package copies in your site-packages. pip has decided to vendor requests to avoid issues like this. It's unfortunate, but it is the best way to ensure that you can update requests securely even if you get a broken version. > Not sure about tkinter. Requiring newbies to run an ensure script > to be able to run IDLE doesn't sound like a good idea. > Maybe I misunderstand how the ensure scripts work on other platforms? On Windows (and in the makefile), the installation runs it for them. Only people who edit the makefile and build from source would have to run it manually, and I'm fairly sure you don't get to claim to be a newbie at that point :) Of course, if distros disable the ensure scripts, it's on them to make sure their users have access to the packages they need. Distros can already remove Idle/tkinter if they want to. 
Cheers, Steve From steve.dower at python.org Fri Aug 14 18:00:22 2015 From: steve.dower at python.org (Steve Dower) Date: Fri, 14 Aug 2015 09:00:22 -0700 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <20150814110709.036ce425@anarchist.wooz.org> References: <55CCCEA7.8000406@python.org> <20150814110709.036ce425@anarchist.wooz.org> Message-ID: <55CE1096.5020603@python.org> On 14Aug2015 0807, Barry Warsaw wrote: > On Aug 13, 2015, at 10:06 AM, Steve Dower wrote: > >> I'd like to propose expanding the list of 3rd-party packages we bundle and >> install by default. (Obviously this does not apply to platforms that >> repackage Python and can do whatever they want, but on Windows and Mac we are >> fully responsible for these.) > > Indeed. Can we *please* keep the source tarball pure? Even ensurepip gives > (some) downstreams plenty of headaches. > > I'm totally fine with the OS X and Windows installers bundling whatever makes > users on those platforms have a better experience. I'm fine with restricting it to the installers we build ourselves too. I guess the core of the discussion (which I got wrong in my quote above) is about turning some 1st-party packages into 3rd-party packages as far as the source tarball is concerned. How would downstreams react to having to get tkinter/Idle/etc. from a separate repo? Cheers, Steve > Cheers, > -Barry > From encukou at gmail.com Fri Aug 14 18:27:11 2015 From: encukou at gmail.com (Petr Viktorin) Date: Fri, 14 Aug 2015 18:27:11 +0200 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CE1096.5020603@python.org> References: <55CCCEA7.8000406@python.org> <20150814110709.036ce425@anarchist.wooz.org> <55CE1096.5020603@python.org> Message-ID: On Fri, Aug 14, 2015 at 6:00 PM, Steve Dower wrote: > How would downstreams react to having to get tkinter/Idle/etc. from a > separate repo? It shouldn't be much of a problem. In Fedora, we already provide tkinter in a separate package from the rest of Python. 
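[Editor's sketch] Since downstreams such as Fedora already ship tkinter separately, a script that wants a GUI may find it missing. One hedged way to cope is to probe for the module before relying on it. The helper name `has_module` below is purely illustrative and is not something proposed in this thread; only `importlib.import_module` is a real stdlib call.

```python
import importlib


def has_module(name):
    """Return True if `name` can be imported in this installation.

    `has_module` is a hypothetical helper for illustration only.
    Note that a successful probe leaves the module imported.
    """
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# A script could then fall back to a console interface when no GUI
# toolkit is installed, instead of failing at import time.
ui_mode = "gui" if has_module("tkinter") else "console"
```

Whether tkinter lives in Lib or in site-packages makes no difference to a check like this, which is one reason the install location matters less than availability.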
From mal at egenix.com Fri Aug 14 18:41:49 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 14 Aug 2015 18:41:49 +0200 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CE1001.4020003@python.org> References: <55CCCEA7.8000406@python.org> <55CDCF60.3080902@egenix.com> <55CE1001.4020003@python.org> Message-ID: <55CE1A4D.8050809@egenix.com> On 14.08.2015 17:57, Steve Dower wrote: > On 14Aug2015 0422, M.-A. Lemburg wrote: >> On 13.08.2015 19:06, Steve Dower wrote: >>> I'd like to propose expanding the list of 3rd-party packages we bundle and install by default. >>> (Obviously this does not apply to platforms that repackage Python and can do whatever they want, but >>> on Windows and Mac we are fully responsible for these.) >>> >>> Currently, we bundle pip (and some of its dependencies - let's avoid that particular discussion >>> right now please, it's on python-dev) and install it by default in a way that lets users easily >>> update to the latest version. Including pip in the standard library would lock users into a specific >>> version for the lifetime of that Python version, which would be a bad thing. >>> >>> From my point-of-view, this has been very successful in Python 2.7, 3.4 and will also be successful >>> in 3.5. For Python 3.6, I'd like to do a similar thing with: >>> >>> * requests >>> * tkinter (including tcl/tk, IDLE, and other dependencies) >> >> requests is already installed as part of pip, along with a whole >> set of other packages (but not exposed at the top-level), so >> moving it to its own ensure package wouldn't really change much >> in terms of approach. >> >> The problem I see with requests is that they sometimes >> have glitches in their releases causing them not to be usable, >> so the version that gets "ensured" would need some extra testing >> by whoever manages the list of packages. > > I'm interested in this. What sort of glitches are we talking about here? E.g. 
2.5.2 -> 2.5.3 > Are they not caught by the requests team's tests? Why would someone else > be able to test it better than them? No, but someone will have to decide which version is stable enough to put into the ensure package. > I'd certainly be okay with locking in the version at rc1 time to give people a chance for wider > testing. I'd be very nervous about updating any bundled package on the day that the final release is > built. > >> Also notes that the pre-packaged >> version in pip is not managed by the package manager (because >> it doesn't see it), so you will sooner or later end up with multiple >> requests package copies in your site-packages. > > pip has decided to vendor requests to avoid issues like this. It's unfortunate, but it is the best > way to ensure that you can update requests securely even if you get a broken version. Right, and so the question is not so much: "do we want ensure to install requests (and all the other pip and requests dependencies) ?" but rather: "why not expose those bundled version as top-level installs ?" >> Not sure about tkinter. Requiring newbies to run an ensure script >> to be able to run IDLE doesn't sound like a good idea. >> > > Maybe I misunderstand how the ensure scripts work on other platforms? On Windows (and in the > makefile), the installation runs it for them. Depends on which Python version you are talking about. For Python 2.7, ensurepip is not run during installation. For Python 3.4, it is enabled per default. On Unix platforms, you have to run "python2 -m ensurepip" to have it installed, or configure Python 2.7 with --with-ensurepip to have it run during installation (default is not to run ensurepip during install). 
> Only people who edit the makefile and build from > source would have to run it manually, and I'm fairly sure you don't get to > claim to be a newbie at that point :) > > Of course, if distros disable the ensure scripts, it's on them to make sure their users have access > to the packages they need. Distros can already remove Idle/tkinter if they want to. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 14 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2015-08-12: Released mxODBC 3.3.4 ... http://egenix.com/go80 2015-08-22: FrOSCon 2015 ... 8 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From cory at lukasa.co.uk Sat Aug 15 04:06:44 2015 From: cory at lukasa.co.uk (Cory Benfield) Date: Fri, 14 Aug 2015 22:06:44 -0400 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CE1A4D.8050809@egenix.com> References: <55CCCEA7.8000406@python.org> <55CDCF60.3080902@egenix.com> <55CE1001.4020003@python.org> <55CE1A4D.8050809@egenix.com> Message-ID: On 14 August 2015 at 12:41, M.-A. Lemburg wrote: >>> The problem I see with requests is that they sometimes >>> have glitches in their releases causing them not to be usable, >>> so the version that gets "ensured" would need some extra testing >>> by whoever manages the list of packages. >> >> I'm interested in this. What sort of glitches are we talking about here? > > E.g. 
2.5.2 -> 2.5.3 For those who don't want to look this up, the error was that we updated our bundled certificates, which caused cert validation failures on websites offering certain trust chains. This would be difficult/impossible to find with pre-release testing, except by sheer good luck, because it only affected a small number of websites that have no common thread between them. This is inevitable with any form of network protocol implementation, sadly: we tend to hit unexpected edge cases in our dependencies (in this case, OpenSSL's trust chain logic). >> Are they not caught by the requests team's tests? Why would someone else >> be able to test it better than them? > > No, but someone will have to decide which version is stable enough to > put into the ensure package. I cannot speak for the project yet (all three maintainers are currently on holiday, so team communication is not particularly high bandwidth at the moment!), but I suspect we'd be really worried about any system that does not obtain the most recent release of requests, or that cannot respond quickly to security releases in requests. Cory From eric at trueblade.com Sat Aug 15 04:32:37 2015 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 14 Aug 2015 22:32:37 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: <20150814110054.75be1c4c@anarchist.wooz.org> References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> <20150813100042.3f026ce5@anarchist.wooz.org> <55CDE7AA.9010006@trueblade.com> <20150814110054.75be1c4c@anarchist.wooz.org> Message-ID: <1E16A760-61CD-4B2C-941C-C40BE0208870@trueblade.com> On Aug 14, 2015, at 11:00 AM, Barry Warsaw wrote: > >> On Aug 14, 2015, at 09:05 AM, Eric V. 
Smith wrote: >> >>> On 08/13/2015 10:00 AM, Barry Warsaw wrote: >>> Is there a problem with keeping the named placeholders throughout >>> the entire stack? >> >> I guess not. It complicates things in the non-translated case, but I >> think it's probably all workable. I'll give it some thought. > > Thanks. > >> But is that enough? I'm not exactly sure what goal we're trying to >> achieve here. If it's to entirely replace the gettext module, >> including things like ngettext, then I think it's not an achievable >> goal, and we should just give up. If it's only to replace >> gettext.gettext (commonly used as "_"), then I think there's hope. > > From my perspective, exactly this. I don't expect or want to replace > gettext. Part of the focus of PEP 501 is to enable gettext as a use case, > binding it opportunistically to the __interpolate__ built-in. Without that > binding, you get "normal" interpolation. One thing that concerns me about gettext integration is the tooling support. For example, could pygettext be taught about f-strings, and could it be made to handle cases such as the 3rd example in: https://docs.python.org/3/library/gettext.html#deferred-translations ? That is: some f-strings in a module that are i18n-aware, and some that aren't. If the "built in" nature of f-strings means that the tooling can't detect all of the desired use cases, should we move forward with an i18n-friendly version of f-strings? I'm concerned about designing a lot of plumbing for i18n that no one will end up using because it can't do quite enough. Eric.
From rustompmody at gmail.com Sat Aug 15 09:54:41 2015 From: rustompmody at gmail.com (Rustom Mody) Date: Sat, 15 Aug 2015 00:54:41 -0700 (PDT) Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CCE6A6.7000505@nextday.fi> References: <55CCCEA7.8000406@python.org> <55CCE6A6.7000505@nextday.fi> Message-ID: On Friday, August 14, 2015 at 12:19:44 AM UTC+5:30, Alex Grönholm wrote: > > 13.08.2015, 20:35, Donald Stufft wrote: > > > > One possible thing to look at for prior art, is what Haskell does. They > don't have a bunch of ensure* modules or anything like it, instead they > have their compiler (which is like "Haskell Core") and then on top of that > they layer a bunch of libraries (called "Haskell Platform"). This platform > releases every ~6 months and just includes something like 40 different > libraries with it that represent common development tools and widely used > libraries [1]. > > > > So I guess my question is, instead of continuing down a path where we > add more ensure* style modules to the standard library, why not do > something similar and have "Python the Language" and "The Python Platform", > and the platform would be the Python language + N "important" or "popular" > packages. This could release on a quicker release schedule than Python > itself (since it would really be more like a meta package than anything > that itself got developed) and would give the ability to ship things like > this without the problems that we've had with ensurepip. From a downstream > perspective they would just package all of this stuff as normal and it > would just be available as normal. We could even publish a metapackage on > PyPI that had no code of its own, but existed simply to list all of the > platform packages as dependencies (with ==) and then people could easily > depend on the Python "platform" in their own code.
> > > > This would essentially involve someone(s) needing to be the gatekeeper > of which libraries become part of the Python platform, some small packaging > shims to handle the metapackage on PyPI, and then the installer stuff for > OSX and Windows (probably nothing for other OSs? Or maybe a tarball? I > don't know). > Amen to this. This is EXACTLY where I've hoped Python would go :) > > As an idea that's fine. As an actual example, Haskell's package system is more broken than most -- see "cabal hell". In fact Haskell's package system is ironically un-functional. Last I knew, you can 'cabal install foo' but you can't 'cabal uninstall foo' thereafter; you can only 'cabal unregister foo', or delete all Haskell packages and start over! That's like saying that after apt-get install foo the only way of undoing it is to reinstall Linux. More generally, large non-trivial systems -- not just languages like python, haskell, ruby etc, but even emacs, tex, eclipse, firefox etc -- have their own packaging systems. These are all upstream of distro systems like apt/portage etc, which are usually more stable and thoroughly thought out but, by definition of downstream, somewhat stale. It would be good if choices like increasing 'ensures' keep in mind the required symbiosis between these two packaging worlds. Ideally -- somewhat utopian -- we can imagine a world of federated package management, where one could say # apt-get subsystem python install django and apt gets django from PyPI rather than from debian/ubuntu repos. From eric at trueblade.com Sat Aug 15 14:27:54 2015 From: eric at trueblade.com (Eric V.
Smith) Date: Sat, 15 Aug 2015 08:27:54 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: <55CC8659.2080308@trueblade.com> References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> Message-ID: <55CF304A.2080100@trueblade.com> On 08/13/2015 07:58 AM, Eric V. Smith wrote: > On 08/13/2015 12:37 AM, Guido van Rossum wrote: >> On Wed, Aug 12, 2015 at 6:06 PM, Barry Warsaw > > wrote: > >> >> placeholders = source_string.extract_placeholders() >> substitutions = scope(*placeholders) >> translated_string = i18n.lookup(source_string) >> return translated_string.safe_substitute(substitutions) >> >> That would actually be quite useful. >> >> >> Agreed. But whereas you are quite happy having only simple variable >> names in i18n templates, the feature required for the non-i18n use case >> really needs arbitrary expressions. If we marry the two, your i18n code >> will just have to yell at the programmer if they use something too >> complex for the translators as a substitution. So possibly PEP 501 can >> be rescued. But I think we need separate prefixes for the PEP 498 and >> PEP 501 use cases; perhaps f'{...}' and _'{...}'. (But it would not be >> up to the compiler to limit the substitution syntax in _'{...}') > > For the sake of the following argument, let's agree to disagree on: > - arbitrary expressions: we'll say yes > - string prefix character: we'll say 'f' > - how to identify expressions in a string: we'll say {...} > > I promise we can bikeshed about these later. I'm just using the PEP 498 > version because I'm more familiar with it. 
> > And let's say that PEP 498 will take this: > > name = 'Eric' > dog_name = 'Fluffy' > f"My name is {name}, my dog's name is {dog_name}" > > And convert it to this (inspired by Victor): > > "My name is {0}, my dog's name is {1}".format('Eric', 'Fluffy') > Resulting in: > "My name is Eric, my dog's name is Fluffy" > > It seems to me that all you need for i18n is to instead make it produce: > > __i18n__("My name is {0}, my dog's name is {1}").format('Eric', 'Fluffy') > > The __i18n__ function would do whatever lookup is needed to produce the > translated string. So, in some English dialect where pet names had to > come first, it could return: > 'The owner of the dog {1} is named {0}' > > So the result would be: > 'The owner of the dog Fluffy is named Eric' > > I promise we can bikeshed about the name __i18n__. > > So the translator has no say in how the expressions are evaluated. This > removes any concern about information leakage. If the source code said: > f"My name is {name}, my dog's name is {dog_name.upper()}" > > then the string being passed to __i18n__ would remain unchanged. If by > convention you wanted to not use arbitrary expressions and just use > identifiers, then just make it a coding standard thing. It doesn't > affect the implementation one way or the other. > > The default implementation for my proposed __i18n__ function (probably a > builtin) would be just to return its string argument. Then you get the > PEP 498 behavior. But in your module, you could say: > __i18n__ = gettext.gettext > and now you'd be using that machinery. > > The one downside of this is that the strings that the translator is > translating from do not appear in the source code. The translator would > have to know that the string being translated is: > "My name is {0}, my dog's name is {1}" Okay, here's a new proposal that handles Barry's concern about the format strings passed to __i18n__ not having the same contents as the source code. 
Instead of translating:

    name = 'Eric'
    dog_name = 'Fluffy'
    f"My name is {name}, my dog's name is {dog_name}"

to:

    __i18n__("My name is {0}, my dog's name is {1}").format('Eric', 'Fluffy')

we instead translate it to:

    __i18n__("My name is {name}, my dog's name is {dog_name}").format_map({'name': 'Eric', 'dog_name': 'Fluffy'})

The string would be unchanged from the value of the f-string. The keys in the dict would be exactly the expressions inside the braces in the f-string. The values in the dict would be the values of the expressions in the f-string.

This solution works for cases where the expressions inside braces are either simple identifiers, or are more complicated expressions. For i18n work, I'd expect them to all be simple identifiers, but that need not be the case. I consider this a code review item.

We could add something like PEP 501's iu-strings, that would be interpolated but not translated, so we could mix translated and non-translated strings in the same module. Probably not spelled fu-strings, though!

We'd probably want to add a str.safe_format_map to match the behavior of string.Template.safe_substitute, or add a parameter to str.format_map. I'm not sure how this parameter would get set from an f-string, or if it would always default to "safe" for the __i18n__ case. Maybe instead of __i18n__ just doing the string lookup, it would also be responsible for calling .format_map or .safe_format_map, so it could choose the behavior it wanted on a per-module basis.

Eric.
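[Editor's sketch] The lookup-then-format_map flow described above can be approximated in today's Python. This is only an illustration under stated assumptions: `i18n_lookup` stands in for the proposed `__i18n__` hook, `CATALOG` is a made-up in-memory translation table, and `SafeDict` imitates the suggested "safe" mode. None of these names exist in the stdlib or in either PEP. The sketch covers simple identifiers only, because the existing str.format_map treats a dot inside a field name as attribute access rather than as part of an expression.

```python
# Hypothetical translation table: source string -> translated string.
CATALOG = {
    "My name is {name}, my dog's name is {dog_name}":
        "The owner of the dog {dog_name} is named {name}",
}


def i18n_lookup(source):
    """Stand-in for the proposed __i18n__: return a translation if one exists.

    Falling back to the source string mirrors the suggested default of
    returning the argument unchanged.
    """
    return CATALOG.get(source, source)


class SafeDict(dict):
    """Leave unknown placeholders in place instead of raising KeyError,
    loosely imitating string.Template.safe_substitute."""

    def __missing__(self, key):
        return "{" + key + "}"


def render(source, values, safe=False):
    """Look up the translation, then substitute the already-evaluated values."""
    mapping = SafeDict(values) if safe else values
    return i18n_lookup(source).format_map(mapping)


name = 'Eric'
dog_name = 'Fluffy'
result = render("My name is {name}, my dog's name is {dog_name}",
                {'name': name, 'dog_name': dog_name})
```

Because the values are computed before the lookup, the translated string can reorder placeholders but cannot cause any new expression to be evaluated, which is the information-leakage property argued for earlier in the thread.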
From rosuav at gmail.com Sat Aug 15 15:13:22 2015 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 15 Aug 2015 23:13:22 +1000 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: <55CF304A.2080100@trueblade.com> References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> <55CF304A.2080100@trueblade.com> Message-ID: On Sat, Aug 15, 2015 at 10:27 PM, Eric V. Smith wrote: > Instead of translating: > name = 'Eric' > dog_name = 'Fluffy' > f"My name is {name}, my dog's name is {dog_name}" > > to: > __i18n__("My name is {0}, my dog's name is {1}").format('Eric', 'Fluffy') > > We instead translate it to: > __i18n__("My name is {name}, my dog's name is > {dog_name}").format_map({'name':'Eric', 'dog_name':'Fluffy') > > The string would be unchanged from value of the f-string. The keys in > the dict would be exactly the expressions inside the braces in the > f-string. The values in the dict would be the value of the expressions > in the f-string. > > This solution works for cases where the expressions inside braces are > either simple identifiers, or are more complicated expressions. I know it's a ridiculous corner case, but what if an expression occurs more than once? Will it be evaluated more than once, or will the exact text of the expression be used as, in effect, a lookup key? With simple expressions it won't make any difference, but anywhere else in Python, if you use the same expression twice, it'll be evaluated twice. user = "rosuav" f"You can log in with user name {user} and your provided password, and your web site is now online at http://{user}.amazinghosting.example/ for all to see. Thank you for using Amazing Hosting!" 
This kind of example should definitely be supported, but what about a function call? f"... user name {user()} ... http://{user()}.amazinghosting.example/" Do that in any other form of expression, and people will expect two calls. With i18n it'd be impossible to distinguish the two, but I'd still normally expect user() to get called twice. ChrisA From eric at trueblade.com Sat Aug 15 15:19:10 2015 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 15 Aug 2015 09:19:10 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) In-Reply-To: References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> <55CF304A.2080100@trueblade.com> Message-ID: It would be evaluated twice, and in the dict once, with a value of the second time it was calculated. -- Eric. Top posted from my phone. > On Aug 15, 2015, at 9:13 AM, Chris Angelico wrote: > >> On Sat, Aug 15, 2015 at 10:27 PM, Eric V. Smith wrote: >> Instead of translating: >> name = 'Eric' >> dog_name = 'Fluffy' >> f"My name is {name}, my dog's name is {dog_name}" >> >> to: >> __i18n__("My name is {0}, my dog's name is {1}").format('Eric', 'Fluffy') >> >> We instead translate it to: >> __i18n__("My name is {name}, my dog's name is >> {dog_name}").format_map({'name':'Eric', 'dog_name':'Fluffy') >> >> The string would be unchanged from value of the f-string. The keys in >> the dict would be exactly the expressions inside the braces in the >> f-string. The values in the dict would be the value of the expressions >> in the f-string. >> >> This solution works for cases where the expressions inside braces are >> either simple identifiers, or are more complicated expressions. > > I know it's a ridiculous corner case, but what if an expression occurs > more than once? 
Will it be evaluated more than once, or will the exact > text of the expression be used as, in effect, a lookup key? With > simple expressions it won't make any difference, but anywhere else in > Python, if you use the same expression twice, it'll be evaluated > twice. > > user = "rosuav" > f"You can log in with user name {user} and your provided password, and > your web site is now online at http://{user}.amazinghosting.example/ > for all to see. Thank you for using Amazing Hosting!" > > This kind of example should definitely be supported, but what about a > function call? > > f"... user name {user()} ... http://{user()}.amazinghosting.example/" > > Do that in any other form of expression, and people will expect two > calls. With i18n it'd be impossible to distinguish the two, but I'd > still normally expect user() to get called twice. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From mark.tse at neverendingqs.com Sat Aug 15 21:18:07 2015 From: mark.tse at neverendingqs.com (Mark Tse) Date: Sat, 15 Aug 2015 15:18:07 -0400 Subject: [Python-ideas] Map to Many Function Message-ID: Currently, when the function for map() returns a list, the resulting object is an iterable of lists: >>> list(map(lambda x: [x, x], [1, 2, 3, 4])) [[1, 1], [2, 2], [3, 3], [4, 4]] However, a function to convert each element to multiple elements, similar to flatMap (Java) or SelectMany (C#) does not exist, for doing the following: >>> list(mapmany(lambda x: [x, x], [1, 2, 3, 4])) [1, 1, 2, 2, 3, 3, 4, 4] Proposal: new built-in method or standard library function to do mapmany. Sample use case: Library JSON data returns a list of authors, and each author has a list of books: { [ { 'author': 'name', 'books': ['book1', 'book2'] }, { 'author': 'name', 'books': ['book3', 'book4'] }, ...
] } allbooks = list(mapmany(lambda x: x['books'], json)) (First time posting - please redirect me to things I need to read if something's wrong with the format of this suggestion). Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Aug 15 21:33:48 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 15 Aug 2015 15:33:48 -0400 Subject: [Python-ideas] Map to Many Function In-Reply-To: References: Message-ID: On Sat, Aug 15, 2015 at 3:18 PM, Mark Tse wrote: > However, a function to convert each element to multiple elements, similar to > flatMap (Java) or SelectMany (C#) does not exist, for doing the following: > >>>> list(mapmany(lambda x: [x, x], [1, 2, 3, 4])) > [1, 1, 2, 2, 3, 3, 4, 4] >>> from itertools import chain >>> list(chain.from_iterable((map(lambda x: [x, x], range(5))))) [0, 0, 1, 1, 2, 2, 3, 3, 4, 4] From encukou at gmail.com Sat Aug 15 21:34:58 2015 From: encukou at gmail.com (Petr Viktorin) Date: Sat, 15 Aug 2015 21:34:58 +0200 Subject: [Python-ideas] Map to Many Function In-Reply-To: References: Message-ID: On Sat, Aug 15, 2015 at 9:18 PM, Mark Tse wrote: > Currently, when the function for map() returns a list, the resulting object > is an iterable of lists: > >>>> list(map(lambda x: [x, x], [1, 2, 3, 4])) > [[1, 1], [2, 2], [3, 3], [4, 4]] > > However, a function to convert each element to multiple elements, similar to > flatMap (Java) or SelectMany (C#) does not exist, for doing the following: > >>>> list(mapmany(lambda x: [x, x], [1, 2, 3, 4])) > [1, 1, 2, 2, 3, 3, 4, 4] There's no built-in to do it, but with a little help from itertools, you can get the effect: >>> import itertools >>> list(itertools.chain.from_iterable(map(lambda x: [x, x], [1, 2, 3, 4]))) [1, 1, 2, 2, 3, 3, 4, 4] Wrapping this up in a "mapmany" function should take about two lines of code. 
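The two-line wrapper mentioned above can be sketched as follows. This is only an illustration built on itertools.chain.from_iterable, not a proposed stdlib implementation; the name "mapmany" is taken from the proposal, and the sample "authors" data is made up for the example.

```python
from itertools import chain


def mapmany(func, iterable):
    """Apply func to each item and lazily chain the results, one level deep."""
    return chain.from_iterable(map(func, iterable))


# Hypothetical sample data, shaped like the use case in the proposal.
authors = [
    {'author': 'name1', 'books': ['book1', 'book2']},
    {'author': 'name2', 'books': ['book3', 'book4']},
]

doubled = list(mapmany(lambda x: [x, x], [1, 2, 3, 4]))
allbooks = list(mapmany(lambda a: a['books'], authors))
print(doubled)
print(allbooks)
```

The result is an iterator, like map() in Python 3, and only one level of nesting is removed, so tuples or lists inside the returned sequences are left intact.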
From emile at fenx.com Sun Aug 16 00:17:32 2015 From: emile at fenx.com (Emile van Sebille) Date: Sat, 15 Aug 2015 15:17:32 -0700 Subject: [Python-ideas] Map to Many Function In-Reply-To: References: Message-ID: On 8/15/2015 12:18 PM, Mark Tse wrote: > Currently, when the function for map() returns a list, the resulting object > is an iterable of lists: > >>>> list(map(lambda x: [x, x], [1, 2, 3, 4])) > [[1, 1], [2, 2], [3, 3], [4, 4]] > > However, a function to convert each element to multiple elements, similar > to flatMap (Java) or SelectMany (C#) does not exist, for doing the > following: In addition to the itertools solutions already posted, there's also a flatten function that'll do it: Python 2.7.6 (default, Mar 22 2014, 22:59:56) [GCC 4.8.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from compiler.ast import flatten >>> flatten ((map(lambda x: [x, x], [1, 2, 3, 4]))) [1, 1, 2, 2, 3, 3, 4, 4] >>> Emile From wes.turner at gmail.com Sun Aug 16 00:54:28 2015 From: wes.turner at gmail.com (Wes Turner) Date: Sat, 15 Aug 2015 17:54:28 -0500 Subject: [Python-ideas] Map to Many Function In-Reply-To: References: Message-ID: On Aug 15, 2015 5:18 PM, "Emile van Sebille" wrote: > > On 8/15/2015 12:18 PM, Mark Tse wrote: >> >> Currently, when the function for map() returns a list, the resulting object >> is an iterable of lists: >> >>>>> list(map(lambda x: [x, x], [1, 2, 3, 4])) >> >> [[1, 1], [2, 2], [3, 3], [4, 4]] >> >> However, a function to convert each element to multiple elements, similar >> to flatMap (Java) or SelectMany (C#) does not exist, for doing the >> following: > > > In addition to the itertools solutions already posted, there's also a flatten function that'll do it: > > Python 2.7.6 (default, Mar 22 2014, 22:59:56) > [GCC 4.8.2] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> from compiler.ast import flatten Thanks! 
Hadn't been aware that there is a flatten() func in stdlib. > >>> flatten ((map(lambda x: [x, x], [1, 2, 3, 4]))) > > [1, 1, 2, 2, 3, 3, 4, 4] > >>> > > Emile > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Sun Aug 16 01:08:28 2015 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Sat, 15 Aug 2015 18:08:28 -0500 Subject: [Python-ideas] Map to Many Function In-Reply-To: References: Message-ID: Wow...I was completely unaware of this! It's worth noting that *this is Python 2-only*. It will give an ImportError with Python 3. :( I really wish the stdlib had something like this, considering that I use it constantly. On Sat, Aug 15, 2015 at 5:17 PM, Emile van Sebille wrote: > On 8/15/2015 12:18 PM, Mark Tse wrote: > >> Currently, when the function for map() returns a list, the resulting >> object >> is an iterable of lists: >> >> list(map(lambda x: [x, x], [1, 2, 3, 4])) >>>>> >>>> [[1, 1], [2, 2], [3, 3], [4, 4]] >> >> However, a function to convert each element to multiple elements, similar >> to flatMap (Java) or SelectMany (C#) does not exist, for doing the >> following: >> > > In addition to the itertools solutions already posted, there's also a > flatten function that'll do it: > > Python 2.7.6 (default, Mar 22 2014, 22:59:56) > [GCC 4.8.2] on linux2 > Type "help", "copyright", "credits" or "license" for more information. 
> >>> from compiler.ast import flatten
> >>> flatten ((map(lambda x: [x, x], [1, 2, 3, 4])))
> [1, 1, 2, 2, 3, 3, 4, 4]
> >>>
>
> Emile
>
>
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

--
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your program. Something's wrong.
http://kirbyfan64.github.io/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org Sun Aug 16 01:51:12 2015
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Aug 2015 02:51:12 +0300
Subject: [Python-ideas] Map to Many Function
In-Reply-To:
References:
Message-ID:

On Sun, Aug 16, 2015 at 1:17 AM, Emile van Sebille wrote:

> On 8/15/2015 12:18 PM, Mark Tse wrote:
>
>> Currently, when the function for map() returns a list, the resulting
>> object
>> is an iterable of lists:
>>
>> >>> list(map(lambda x: [x, x], [1, 2, 3, 4]))
>> [[1, 1], [2, 2], [3, 3], [4, 4]]
>>
>> However, a function to convert each element to multiple elements, similar
>> to flatMap (Java) or SelectMany (C#) does not exist, for doing the
>> following:
>>
>
> In addition to the itertools solutions already posted, there's also a
> flatten function that'll do it:
>
> Python 2.7.6 (default, Mar 22 2014, 22:59:56)
> [GCC 4.8.2] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from compiler.ast import flatten
> >>> flatten ((map(lambda x: [x, x], [1, 2, 3, 4])))
> [1, 1, 2, 2, 3, 3, 4, 4]
> >>>

That function isn't meant for public consumption though.

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From wes.turner at gmail.com Sun Aug 16 02:47:51 2015 From: wes.turner at gmail.com (Wes Turner) Date: Sat, 15 Aug 2015 19:47:51 -0500 Subject: [Python-ideas] Map to Many Function In-Reply-To: References: Message-ID: On Aug 15, 2015 6:52 PM, "Guido van Rossum" wrote: > > On Sun, Aug 16, 2015 at 1:17 AM, Emile van Sebille wrote: >> >> On 8/15/2015 12:18 PM, Mark Tse wrote: >>> >>> Currently, when the function for map() returns a list, the resulting object >>> is an iterable of lists: >>> >>>>>> list(map(lambda x: [x, x], [1, 2, 3, 4])) >>> >>> [[1, 1], [2, 2], [3, 3], [4, 4]] >>> >>> However, a function to convert each element to multiple elements, similar >>> to flatMap (Java) or SelectMany (C#) does not exist, for doing the >>> following: >> >> >> In addition to the itertools solutions already posted, there's also a flatten function that'll do it: >> >> Python 2.7.6 (default, Mar 22 2014, 22:59:56) >> [GCC 4.8.2] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. >> >>> from compiler.ast import flatten >> >>> flatten ((map(lambda x: [x, x], [1, 2, 3, 4]))) >> [1, 1, 2, 2, 3, 3, 4, 4] >> >>> > > > That function isn't meant for public consumption though. * notes about flatten in toolz: https://github.com/pytoolz/toolz/issues/176 * flatten in fn.py: https://github.com/kachayev/fn.py#itertools-recipes > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wes.turner at gmail.com Sun Aug 16 03:29:02 2015 From: wes.turner at gmail.com (Wes Turner) Date: Sat, 15 Aug 2015 20:29:02 -0500 Subject: [Python-ideas] Map to Many Function In-Reply-To: References: Message-ID: On Aug 15, 2015 2:35 PM, "Petr Viktorin" wrote: > > On Sat, Aug 15, 2015 at 9:18 PM, Mark Tse wrote: > > Currently, when the function for map() returns a list, the resulting object > > is an iterable of lists: > > > >>>> list(map(lambda x: [x, x], [1, 2, 3, 4])) > > [[1, 1], [2, 2], [3, 3], [4, 4]] > > > > However, a function to convert each element to multiple elements, similar to > > flatMap (Java) or SelectMany (C#) does not exist, for doing the following: > > > >>>> list(mapmany(lambda x: [x, x], [1, 2, 3, 4])) > > [1, 1, 2, 2, 3, 3, 4, 4] > > There's no built-in to do it, but with a little help from itertools, > you can get the effect: > > >>> import itertools > >>> list(itertools.chain.from_iterable(map(lambda x: [x, x], [1, 2, 3, 4]))) > [1, 1, 2, 2, 3, 3, 4, 4] > > Wrapping this up in a "mapmany" function should take about two lines of code. I think this is something like toolz.itertoolz.mapcat: https://toolz.readthedocs.org/en/latest/api.html#toolz.itertoolz.mapcat > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.us Sun Aug 16 03:57:18 2015 From: random832 at fastmail.us (random832 at fastmail.us) Date: Sat, 15 Aug 2015 21:57:18 -0400 Subject: [Python-ideas] Map to Many Function In-Reply-To: References: Message-ID: <1439690238.1821156.357247953.40570399@webmail.messagingengine.com> On Sat, Aug 15, 2015, at 18:54, Wes Turner wrote: > Thanks! Hadn't been aware that there is a flatten() func in stdlib. 
You should be aware that this will flatten _any_ list or tuple elements inside the elements, and it is gone in python 3. Also, it constructs the result as a list rather than an iterator, if that matters to you. From wes.turner at gmail.com Sun Aug 16 04:02:44 2015 From: wes.turner at gmail.com (Wes Turner) Date: Sat, 15 Aug 2015 21:02:44 -0500 Subject: [Python-ideas] Map to Many Function In-Reply-To: <1439690238.1821156.357247953.40570399@webmail.messagingengine.com> References: <1439690238.1821156.357247953.40570399@webmail.messagingengine.com> Message-ID: On Aug 15, 2015 8:57 PM, wrote: > > On Sat, Aug 15, 2015, at 18:54, Wes Turner wrote: > > Thanks! Hadn't been aware that there is a flatten() func in stdlib. > > You should be aware that this will flatten _any_ list or tuple elements > inside the elements, and it is gone in python 3. So it would then flatten e.g. strings without flinching > > Also, it constructs the result as a list rather than an iterator, if > that matters to you. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.us Sun Aug 16 04:06:13 2015 From: random832 at fastmail.us (random832 at fastmail.us) Date: Sat, 15 Aug 2015 22:06:13 -0400 Subject: [Python-ideas] Map to Many Function In-Reply-To: References: <1439690238.1821156.357247953.40570399@webmail.messagingengine.com> Message-ID: <1439690773.1822121.357251761.5157C1DA@webmail.messagingengine.com> On Sat, Aug 15, 2015, at 22:02, Wes Turner wrote: > On Aug 15, 2015 8:57 PM, wrote: > > > > On Sat, Aug 15, 2015, at 18:54, Wes Turner wrote: > > > Thanks! Hadn't been aware that there is a flatten() func in stdlib. 
> > > > You should be aware that this will flatten _any_ list or tuple elements > > inside the elements, and it is gone in python 3. > > So it would then flatten e.g. strings without flinching No, a string isn't a tuple or a list. The point is it will turn (1, 2, [3, (4, 5), 6, [7, 8, [9, 10]]]) into [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] So if you have [[(1, 2), (1, 2)], [(3, 4), (3, 4)]] it will become [1, 2, 1, 2, 3, 4, 3, 4] while the mapmany idea you originally discussed, and the solutions other people have given with itertools chain, would give [(1, 2), (1, 2), (3, 4), (3, 4)] Look for yourself, the source code is pretty understandable: def flatten(seq): l = [] for elt in seq: t = type(elt) if t is tuple or t is list: for elt2 in flatten(elt): l.append(elt2) else: l.append(elt) return l You can see it recursively calls flatten on every tuple or list element. From wes.turner at gmail.com Sun Aug 16 04:15:29 2015 From: wes.turner at gmail.com (Wes Turner) Date: Sat, 15 Aug 2015 21:15:29 -0500 Subject: [Python-ideas] Map to Many Function In-Reply-To: <1439690773.1822121.357251761.5157C1DA@webmail.messagingengine.com> References: <1439690238.1821156.357247953.40570399@webmail.messagingengine.com> <1439690773.1822121.357251761.5157C1DA@webmail.messagingengine.com> Message-ID: On Aug 15, 2015 9:06 PM, wrote: > > > > On Sat, Aug 15, 2015, at 22:02, Wes Turner wrote: > > On Aug 15, 2015 8:57 PM, wrote: > > > > > > On Sat, Aug 15, 2015, at 18:54, Wes Turner wrote: > > > > Thanks! Hadn't been aware that there is a flatten() func in stdlib. > > > > > > You should be aware that this will flatten _any_ list or tuple elements > > > inside the elements, and it is gone in python 3. > > > > So it would then flatten e.g. strings without flinching > > No, a string isn't a tuple or a list. 
> > The point is it will turn (1, 2, [3, (4, 5), 6, [7, 8, [9, 10]]]) into > [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] > > So if you have [[(1, 2), (1, 2)], [(3, 4), (3, 4)]] it will become [1, > 2, 1, 2, 3, 4, 3, 4] while the mapmany idea you originally discussed, > and the solutions other people have given with itertools chain, would > give [(1, 2), (1, 2), (3, 4), (3, 4)] > > Look for yourself, the source code is pretty understandable: > > def flatten(seq): > l = [] > for elt in seq: > t = type(elt) > if t is tuple or t is list: > for elt2 in flatten(elt): > l.append(elt2) > else: > l.append(elt) > return l > > You can see it recursively calls flatten on every tuple or list element. Got it. So there is no forwardports.flatten in py3k? > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sun Aug 16 05:07:10 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 15 Aug 2015 20:07:10 -0700 Subject: [Python-ideas] Map to Many Function In-Reply-To: References: <1439690238.1821156.357247953.40570399@webmail.messagingengine.com> <1439690773.1822121.357251761.5157C1DA@webmail.messagingengine.com> Message-ID: On Aug 15, 2015, at 19:15, Wes Turner wrote: > > > On Aug 15, 2015 9:06 PM, wrote: > > > > > > > > On Sat, Aug 15, 2015, at 22:02, Wes Turner wrote: > > > On Aug 15, 2015 8:57 PM, wrote: > > > > > > > > On Sat, Aug 15, 2015, at 18:54, Wes Turner wrote: > > > > > Thanks! Hadn't been aware that there is a flatten() func in stdlib. > > > > > > > > You should be aware that this will flatten _any_ list or tuple elements > > > > inside the elements, and it is gone in python 3. > > > > > > So it would then flatten e.g. strings without flinching > > > > No, a string isn't a tuple or a list. 
> > > > The point is it will turn (1, 2, [3, (4, 5), 6, [7, 8, [9, 10]]]) into > > [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] > > > > So if you have [[(1, 2), (1, 2)], [(3, 4), (3, 4)]] it will become [1, > > 2, 1, 2, 3, 4, 3, 4] while the mapmany idea you originally discussed, > > and the solutions other people have given with itertools chain, would > > give [(1, 2), (1, 2), (3, 4), (3, 4)] > > > > Look for yourself, the source code is pretty understandable: > > > > def flatten(seq): > > l = [] > > for elt in seq: > > t = type(elt) > > if t is tuple or t is list: > > for elt2 in flatten(elt): > > l.append(elt2) > > else: > > l.append(elt) > > return l > > > > You can see it recursively calls flatten on every tuple or list element. > > Got it. So there is no forwardports.flatten in py3k? > Why would you expect a forward port of an undocumented function, especially one that trivial? Also, given that flatten doesn't do what you want here, and there's also a stdlib function (chain) that does what you want? -------------- next part -------------- An HTML attachment was scrubbed... URL: From 4kir4.1i at gmail.com Sun Aug 16 16:16:29 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Sun, 16 Aug 2015 17:16:29 +0300 Subject: [Python-ideas] Map to Many Function References: Message-ID: <87lhdbjp36.fsf@gmail.com> Mark Tse writes: > Currently, when the function for map() returns a list, the resulting object > is an iterable of lists: > >>>> list(map(lambda x: [x, x], [1, 2, 3, 4])) > [[1, 1], [2, 2], [3, 3], [4, 4]] > > However, a function to convert each element to multiple elements, similar > to flatMap (Java) or SelectMany (C#) does not exist, for doing the > following: > >>>> list(mapmany(lambda x: [x, x], [1, 2, 3, 4])) > [1, 1, 2, 2, 3, 3, 4, 4] > > Proposal: new built-in method or standard library function to do mapmany. 
>

There is itertools.chain:

  >>> from itertools import chain
  >>> list(chain.from_iterable(map(lambda x: [x, x], [1, 2, 3, 4])))
  [1, 1, 2, 2, 3, 3, 4, 4]
  >>> [item for x in [1, 2, 3, 4] for item in [x, x]]
  [1, 1, 2, 2, 3, 3, 4, 4]

> Sample use case:
> Library JSON data returns a list of authors, and each author has a list of
> books:
>
> { [ { 'author': 'name', 'books': ['book1', 'book2'] }, { 'author': 'name,
> 'books': ['book3', 'book4'] }, ... ] }
>
> allbooks = list(mapmany(lambda x: x['books'], json))
>

  from operator import itemgetter
  allbooks = list(chain.from_iterable(map(itemgetter('books'), json_data)))

Or

  allbooks = [book for x in json_data for book in x['books']]

From mal at egenix.com Sun Aug 16 17:03:54 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 16 Aug 2015 17:03:54 +0200
Subject: [Python-ideas] More "ensure*" packages
In-Reply-To:
References: <55CCCEA7.8000406@python.org> <55CDCF60.3080902@egenix.com>
 <55CE1001.4020003@python.org> <55CE1A4D.8050809@egenix.com>
Message-ID: <55D0A65A.1090706@egenix.com>

On 15.08.2015 04:06, Cory Benfield wrote:
> On 14 August 2015 at 12:41, M.-A. Lemburg wrote:
>>>> The problem I see with requests is that they sometimes
>>>> have glitches in their releases causing them not to be usable,
>>>> so the version that gets "ensured" would need some extra testing
>>>> by whoever manages the list of packages.
>>>
>>> I'm interested in this. What sort of glitches are we talking about here?
>>
>> E.g. 2.5.2 -> 2.5.3
>
> For those who don't want to look this up, the error was that we
> updated our bundled certificates, which caused cert validation
> failures on websites offering certain trust chains. This would be
> difficult/impossible to find with pre-release testing, except by sheer
> good luck, because it only affected a small number of websites that
> have no common thread between them.
This is inevitable with any form > of network protocol implementation, sadly: we tend to hit unexpected > edge cases in our dependencies (in this case, OpenSSL's trust chain > logic). Sorry, should have added some more context. Thanks for adding it. >>> Are they not caught by the requests team's tests? Why would someone else >>> be able to test it better than them? >> >> No, but someone will have to decide which version is stable enough to >> put into the ensure package. > > I cannot speak for the project yet (all three maintainers are > currently on holiday, so team communication is not particularly high > bandwidth at the moment!), but I suspect we'd be really worried about > any system that does not obtain the most recent release of requests, > or that cannot respond quickly to security releases in requests. The problem here is that the ensure package would include one particular package version and install this per default (with an option to update it to the most recent release, if possible during install, as is done for ensurepip in Python 3.4). I doubt that people will regularly run a package update on all their virtualenvs and Python installations to get the most recent requests version, so this needs to be taken into account somehow. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 16 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2015-08-12: Released mxODBC 3.3.4 ... http://egenix.com/go80 2015-08-22: FrOSCon 2015 ... 6 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From p.f.moore at gmail.com Mon Aug 17 01:08:26 2015 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 17 Aug 2015 00:08:26 +0100 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <55CE1A4D.8050809@egenix.com> References: <55CCCEA7.8000406@python.org> <55CDCF60.3080902@egenix.com> <55CE1001.4020003@python.org> <55CE1A4D.8050809@egenix.com> Message-ID: On 14 August 2015 at 17:41, M.-A. Lemburg wrote: >> pip has decided to vendor requests to avoid issues like this. It's unfortunate, but it is the best >> way to ensure that you can update requests securely even if you get a broken version. > > Right, and so the question is not so much: "do we want ensure to > install requests (and all the other pip and requests dependencies) ?" > but rather: "why not expose those bundled version as top-level installs ?" It's important to note that pip vendors the modules it does for a number of reasons - one, to ensure that they are always available, two so that there's no circular dependency (we need requests available to upgrade requests), and three, to ensure we have a stable, tested version (so that we don't have to field "pip is broken" bugs where someone has an in-development version of requests installed, for example). The first of the above two reasons is one that could be handled by ensure making these dependencies available at the top level, but the second and third advantages would be lost if that happened. We have had enough issues raised with pip because users have problems with the system OpenSSL installation (which we have to depend on, and which we can't control) that I, for one, would be reluctant for us to expose ourselves to yet more potential issues in the same vein. (Albeit much less likely to occur, as the circumstances which would cause a problem are pretty unusual). 
And someone would need to audit all of the uses of (currently) vendored modules in pip, to verify that if they were made top-level dependencies, they could upgrade themselves in place without (say) importing a previously unused sub-module after the uninstall had occurred but before the subsequent install. And keep that audit up to date as changes occur with pip to ensure no problems are introduced via new PRs. Paul From stephen at xemacs.org Mon Aug 17 02:27:14 2015 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 17 Aug 2015 09:27:14 +0900 Subject: [Python-ideas] More "ensure*" packages In-Reply-To: References: <55CCCEA7.8000406@python.org> Message-ID: <87twrywyhp.fsf@uwakimon.sk.tsukuba.ac.jp> Donald Stufft writes: > One possible thing to look at for prior art, is what Haskell > does. I haven't done Darcs in anger for several years but I still follow their mailing lists, and it seems like the GHC people are happy breaking the world for Darcs every 6 months. Darcs is a relatively old member of the GHC world having many internal components that since have be superseded by "platform" modules, and apparently the style of much of the core code is considered idiosyncratic, so it may be an unusual case. Still, it doesn't sound to me like Haskell provides anything like the stability promises that Python makes for the language and the stdlib. From rustompmody at gmail.com Mon Aug 17 02:56:38 2015 From: rustompmody at gmail.com (Rustom Mody) Date: Sun, 16 Aug 2015 17:56:38 -0700 (PDT) Subject: [Python-ideas] More "ensure*" packages In-Reply-To: <87twrywyhp.fsf@uwakimon.sk.tsukuba.ac.jp> References: <55CCCEA7.8000406@python.org> <87twrywyhp.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Monday, August 17, 2015 at 5:57:48 AM UTC+5:30, Stephen J. Turnbull wrote: > > Donald Stufft writes: > > > One possible thing to look at for prior art, is what Haskell > > does. 
>
> I haven't done Darcs in anger for several years but I still follow
> their mailing lists, and it seems like the GHC people are happy
> breaking the world for Darcs every 6 months. Darcs is a relatively
> old member of the GHC world having many internal components that since
> have be superseded by "platform" modules, and apparently the style of
> much of the core code is considered idiosyncratic, so it may be an
> unusual case. Still, it doesn't sound to me like Haskell provides
> anything like the stability promises that Python makes for the
> language and the stdlib.
>
>

That's an interesting point to bring up.
Yes, darcs is an old member of the ghc world.
And yes, Haskell provides poor stability guarantees compared to Python.
And since these two -- stable language vs research language -- pull in
opposite ways, the inevitable happened: ghc junked darcs for git for its own
development :-)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org Mon Aug 17 17:41:10 2015
From: barry at python.org (Barry Warsaw)
Date: Mon, 17 Aug 2015 11:41:10 -0400
Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings)
In-Reply-To: <1E16A760-61CD-4B2C-941C-C40BE0208870@trueblade.com>
References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com>
 <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org>
 <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com>
 <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com>
 <20150813100042.3f026ce5@anarchist.wooz.org> <55CDE7AA.9010006@trueblade.com>
 <20150814110054.75be1c4c@anarchist.wooz.org>
 <1E16A760-61CD-4B2C-941C-C40BE0208870@trueblade.com>
Message-ID: <20150817114110.20f78020@anarchist.wooz.org>

On Aug 14, 2015, at 10:32 PM, Eric V. Smith wrote:

>One thing that concerns me about gettext integration is the tooling
>support.

That worries me about PEP 498 too, but for different reasons.
See my other follow up (in python-dev I think). >For example, could pygettext be taught about f-strings, and could it be made >to handle cases such as the 3rd example in: >https://docs.python.org/3/library/gettext.html#deferred-translations ? > >That is: some f-strings in a module that are i18n-aware, and some that >aren't. If the "built in" nature of f-strings mean that the tooling can't >detect all of the desired use cases, should we move forward with an >i18n-friendly version of f-strings? I'm concerned about designing a lot of >plumbing for i18n, but no one will end up using because it can't do quite >enough. That's a great question. It could be solved by having a prefix explicitly for i18n extraction, e.g. PEP 501's i-strings. I agree that mixing translatable strings with strings-not-to-be-translated is an issue worth figuring out because you don't want to overload translators with a bunch of string they don't have to translate. As for deferred translations, they are rare enough that some alternative spelling is IMHO acceptable. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From barry at python.org Mon Aug 17 17:51:17 2015 From: barry at python.org (Barry Warsaw) Date: Mon, 17 Aug 2015 11:51:17 -0400 Subject: [Python-ideas] Fix the DRY problem (was Re: PEP 501 - i18n with marked strings) References: <55C90A3E.1080906@mgmiller.net> <55C9D1D2.1070703@nedbatchelder.com> <20150811112043.676c6038@anarchist.wooz.org> <8737zp8ccx.fsf@vostro.rath.org> <2096801B-F7EB-4DCD-A257-77F2CBCE241B@yahoo.com> <20150812120627.0efc2b79@anarchist.wooz.org> <55CC8659.2080308@trueblade.com> <55CF304A.2080100@trueblade.com> Message-ID: <20150817115117.39d44f9f@anarchist.wooz.org> On Aug 15, 2015, at 08:27 AM, Eric V. 
Smith wrote: >We instead translate it to: >__i18n__("My name is {name}, my dog's name is >{dog_name}").format_map({'name':'Eric', 'dog_name':'Fluffy') > >The string would be unchanged from value of the f-string. The keys in >the dict would be exactly the expressions inside the braces in the >f-string. The values in the dict would be the value of the expressions >in the f-string. +1 >We could add something like's PEP 501's iu-strings, that would be >interpolated but not translated, so we could mix translated and >non-translated strings in the same module. Probably not spelled >fu-strings, though! One of the things I've mentioned to Nick about PEP 501 is the difference between i"foo" and iu"foo". The former gets mapped to __interpolate__() while the latter gets mapped to __interpolateu__(). Nick makes the case for this distinction based on the ability to override __interpolate__() in the local namespace to implement i18n, whereas __interpolateu__() - while technically still able to override - would generally just be left to the "normal" non-i18n interpolation. I countered with a proposal that a context manager could be used, but Nick points out that you can't really *unbind* __interpolate__() when the context manager exits. This still seems weird to me. There's no distinction in Python 3 between "foo" and u"foo" with the latter having been re-added to aid in migrations between Python 2 and 3. But with PEP 501, this introduces a functional distinction between i"foo" and iu"foo" (and ui"foo"?). It's handy, but seems to be a fairly significant difference from the current use if u-prefixes. I'm sympathetic but still skeptical. ;) >We'd probably want to add a str.safe_format_map to match the behavior of >string.Template.safe_substitute, or add a parameter to str.format_map. I'm >not sure how this parameter would get set from an f-string, or if it would >always default to "safe" for the __i18n__ case. 
>
>Maybe instead of __i18n__ just doing the string lookup, it would also be
>responsible for calling .format_map or .safe_format_map, so it could choose
>the behavior it wanted on a per-module basis.

You always want safe-substitution for i18n because you can't let broken
translations break your application (i.e. by causing exceptions to be thrown).
It's the lesser of two evils to just include the original, un-interpolated
placeholder in the final string.

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL:

From kesavarapu.siva at gmail.com Mon Aug 17 19:19:54 2015
From: kesavarapu.siva at gmail.com (shiva prasanth)
Date: Mon, 17 Aug 2015 22:49:54 +0530
Subject: [Python-ideas] pip moduletree new argument
Message-ID:

pip list only shows the package name chosen by the author, which is not
necessarily a name that can be imported. To find out what to import, we have
to search the internet for how the author has laid out the package.

What I want to introduce is a new argument for pip. Its output will show the
list of all available packages. If there is a subpackage it should be
represented as a branch of the tree, and all the leaves should be modules.
It is just like the pstree command in Linux.

$pip moduletree name
name==version
packagename-----+----{subpackage1}-----+----{module1}
                |                      +----{module2}
                |                      +----{module3}
                +----{subpackage2}------+---{module1}
                +----{subpackage3}------+-----{module}
                +-----mainmodule1
                +-----mainmodule2

$pip moduletree
# will give the above output for each package in pip list

$pip moduletree packagename --classes
will output the available class names in the leaf node
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From yselivanov.ml at gmail.com Mon Aug 17 22:13:14 2015
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 17 Aug 2015 16:13:14 -0400
Subject: [Python-ideas] template strings
Message-ID: <55D2405A.6020404@gmail.com>

In ECMAScript 6 there is a concept of Template Strings [1]. What if we add
something similar in Python?

Some key ideas
--------------

1. Template Strings (TS) will be built on top of PEP 498 machinery (if
accepted).

2. The syntax will match the following:

    {python identifier}{optional whitespace}{string literal}

where "python identifier" can be any valid python name *except* r, u, b,
or f.

Some examples of valid TS:

    ##
    _'foo {bar}'

    ##
    sql = db.escape
    sql """
    SELECT ... FROM ...
    """

    ##
    from framework import html
    html"""
    {caption}
    """

3. A special magic method will be added: __format_str__(string,
values_map).

For instance,

    b = 10
    print(_'something { b+1 }')

will be equivalent to

    b = 10
    print(_.__format_str__('something { b+1 }', {' b+1 ': b + 1}))

(or however PEP 498 will be implemented).

Some use cases
--------------

1. i18n and PEP 501

Pros:

- No global __interpolate__ builtin (hard to have more than one i18n lib
  in one project)

- Easy to restrict the exact interpolation syntax:

    class T:
        def __format_str__(self, string, values_map):
            for name in values_map:
                if not name.isidentifier():
                    raise ValueError('i18n string only support ...')
            ...

    _ = T()
    _'spam: {spam and ham}'  # will raise a ValueError
    _'spam: {ham}'           # will be interpolated

- Can have more than one i18n lib:

    a.py:
        from lib1 import _
        print(_'...')

    b.py:
        from gettext import gettext as _
        print(_'...')

2. SQL queries

Being able to write

    db.query(db'SELECT * FROM users WHERE name = {name} AND ...')

instead of

    db.query('SELECT * FROM users WHERE name = {} AND ...', name)

3. Automatic HTML escaping (see [2] for __markup__ protocol, for instance):

name = '. My (limited) experience with pyxl at Dropbox also suggests that
html often is constructed programmatically in multiple stages, so it's
important to be able to include already-interpolated html fragments into
another html block.

- In SQL the evaluation of $N is often built into the SQL parser.

- Honestly, subprocess.call(i'echo $filename') looks like it's referencing
  an environment variable, not a variable in the Python code.

[1] I am not endorsing pyxl -- its use is currently controversial at
Dropbox. But its "coding: pyxl" hack is easily adapted for other syntax
experiments (e.g. https://github.com/JukkaL/mypy/tree/master/mypy/codec).

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ron3200 at gmail.com Mon Aug 24 03:23:56 2015
From: ron3200 at gmail.com (Ron Adam)
Date: Sun, 23 Aug 2015 21:23:56 -0400
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DA66C5.7000105@trueblade.com>
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com>
Message-ID:

On 08/23/2015 08:35 PM, Eric V. Smith wrote:
> Thanks for PEP 501. Maybe I'll add delayed interpolation to PEP 498!
>
> On a more serious note, I'm thinking of adding i-strings to my f-string
> implementation. I have some ideas that the format_spec (the :.3f stuff)
> could be used by the code that eventually does the string interpolation.
> For example, sql(i-string) might want to interpret this expression using
> __sql__, instead of how str(i-string) would use __format__. Then the
> sql() machinery could look at the format_spec and pass it to the value's
> __sql__ method.
>
> For example:
> sql(i'select {date:as_date} from {tablename}'
>
> might call date.__sql__('as_date'), which would know how to cast to the
> write datatype (this happens to me all the time).
>
> This is one reason I'm thinking of ditching !s, !r, and !a, at least for
> the first implementation of PEP 498: they're not needed, and are not
> generally applicable if we add the hooks I'm considering into i-strings.

In the .format() mini language is there a way to format an in place
literal value? (ok... need an example for this one.)

    "{Name: {'John Doe':?<30} {'123-123-1234':?>13}\n".format()

What would '?' be?

In this case the values are given, but not formatted yet. I was thinking
this would allow interpolating the values, then translating, and finally
formatting the translated string.

It seems part of the problem is that the insertion of the values and the
formatting may be tied too closely to each other. Field formatting and
value formatting are two separate things. By separating them into two
well defined steps, we may be able to do...

    "{Name: {name:<30} {number:>13}\n".interpolate().translate().format()

And possibly a literal syntax for that could just be expanded to the
chained method calls. Probably 'i' and/or 'f' would do, but 't' for
translate seems like it may be nice. And if someone wanted to, they can
still do each step separately by using the methods explicitly.

Cheers,
Ron

From steve at pearwood.info Mon Aug 24 03:24:59 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 24 Aug 2015 11:24:59 +1000
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DA66C5.7000105@trueblade.com>
References: <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com>
Message-ID: <20150824012457.GD3881@ando.pearwood.info>

On Sun, Aug 23, 2015 at 08:35:17PM -0400, Eric V. Smith wrote:

> I think the string interpolation object is interesting. It's basically
> what Petr Viktorin and Chris Angelico discussed and suggested here:
> https://mail.python.org/pipermail/python-ideas/2015-August/035303.html.

Are you sure that's the right URL? It seems only barely relevant to me.
It has Chris replying to Petr, but it's a vague suggestion of a "quantum
string interpolation" (Chris' words) with no details. He asks:

"How hard would this be to implement? Something that isn't a string,
retains all the necessary information, and then collapses to a string
when someone looks at it?"

I looked ahead a dozen or two posts, and can't see any further
discussion. Have I missed something?

--
Steve

From eric at trueblade.com Mon Aug 24 03:31:06 2015
From: eric at trueblade.com (Eric V.
Smith) Date: Sun, 23 Aug 2015 21:31:06 -0400 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <20150824012457.GD3881@ando.pearwood.info> References: <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <20150824012457.GD3881@ando.pearwood.info> Message-ID: <76F0A8EA-BA03-40F1-913E-760A201978CC@trueblade.com> > On Aug 23, 2015, at 9:24 PM, Steven D'Aprano wrote: > >> On Sun, Aug 23, 2015 at 08:35:17PM -0400, Eric V. Smith wrote: >> >> I think the string interpolation object is interesting. It's basically >> what Petr Viktorin and Chris Angelico discussed and suggested here: >> https://mail.python.org/pipermail/python-ideas/2015-August/035303.html. > > Are you sure that's the right URL? It seems only barely relevant to me. > It has Chris replying to Petr, but it's a vague suggestion of a "quantum > string interpolation" (Chris' words) with no details. He asks: > > "How hard would this be to implement? Something that isn't a string, > retains all the necessary information, and then collapses to a string > when someone looks at it?" > > I looked ahead a dozen or two posts, and can't see any further > discussion. Have I missed something? That's the right url. I thought they were talking about the same thing. I even had a response written about it, saying it would always require str() for the simple use case. Then I accidentally deleted it before I sent it :( Maybe I read too much in to it. Eric. 
From ncoghlan at gmail.com Mon Aug 24 03:41:42 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 24 Aug 2015 11:41:42 +1000 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DA66C5.7000105@trueblade.com> References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> Message-ID: On 24 August 2015 at 10:35, Eric V. Smith wrote: > On 08/22/2015 09:37 PM, Nick Coghlan wrote: >> The trick would be to make interpolation lazy *by default* (preserving >> the triple of the raw template string, the parsed fields, and the >> expression values), and put the default rendering in the resulting >> object's *__str__* method. > > At this point, I think PEPs 498 and 501 have converged, except for the > delayed string interpolation object (which I realize is important) and > how expressions are identified in the strings (which I consider less > important). > > I think the string interpolation object is interesting. It's basically > what Petr Viktorin and Chris Angelico discussed and suggested here: > https://mail.python.org/pipermail/python-ideas/2015-August/035303.html. Aha, I though I'd seen that idea go by in one of the threads, but I didn't remember where :) I'll add Petr and Chris to the acknowledgements section in 501. > My suggestion would be to add both f-strings (PEP 498) and i-strings (as > they're currently called in PEP 501), but with the exact same syntax to > identify and evaluate expressions. I don't particularly care what the > prefixes are. I'd add the plain f-strings first, then i-strings maybe > later. There are definitely some issues with delayed interpolation we > need to think about. An f-string would be shorthand for str(i-string). +1, as this is the point of view I've come to as well. 
> I think it's hyperbolic to refers f-strings as a new string formatting > language. With one small difference (detailed in PEP 498, and with zero > usage I could find in the stdlib outside of tests), f-strings are a > strict superset of str.format() strings (but not the arguments to > .format of course). I think f-strings are no more different from > str.format strings than PEP 501 i-strings are to string.Template strings. Yeah, that's a fair criticism of my rhetoric, so I'll stop saying that. > From what I can tell in the stdlib and in the wild, str.format() has > hundreds or thousands of times more usage that string.Template. I > realize that the reasons are not necessarily related to the syntax of > the replacement strings, but you can't say most people aren't familiar > with str.format(). Right, and I think we can actually make an example driven decision on that front by looking at potential *target* formats for template rendering. After all, one of the interesting discoveries we made in having both str.__mod__ and str.format available is that %-formatting is a great way to template str.format strings, and vice-versa, since the meta-characters don't conflict, so you can minimise the escaping needed. For use cases like writing object __repr__ methods, I don't think the choice of $-substitution or {}-substitution matters - neither $ nor {} are likely to appear in the desired output (except as part of interpolated values), so escaping shouldn't be common regardless of which we choose. (Side note: __repr__ and _str__ implementations are likely worth highlighting as a good use case for the new syntax!) I think things get more interesting once we start talking about interpolation targets other than "human readable text". 

For example, one of the neat (/scary, depending on how you feel about
this kind of feature) things I realised in working on the latest draft
of PEP 501 is that you could use it to template *Python code*,
including eagerly bound references to objects in the current scope.
That is:

    a = b + c

could instead be written as:

    a = eval(str(i"$b + $c"))

That's not very interesting if all you do is immediately call eval()
on it, but it's a lot more interesting if you instead want to do
things like extract the AST, dispatch the operation for execution in
another process, etc. For example, you could use this capability to
build eagerly bound closures, which wouldn't see changes in name
bindings, but *would* see state changes in mutable objects.

With $-substitution, that "just works", as $ generally isn't
syntactically significant in Python code - it can only appear inside
strings (and potentially interpolation templates). With
{}-substitution, you'd have to double all the braces for dictionary
displays, dictionary comprehensions and set comprehensions. In example
form:

    data = {k:v for k, v in source}

becomes:

    data = eval(str(i"{k:v for k, v in $source}"))

rather than:

    data = eval(f"{{k:v for k, v in {source}}}")

You hit a similar problem if you're targeting Django or Jinja2
templates, or any content that involves l20n style JavaScript
translation strings: the use of braces for substitution expressions in
the interpolation template conflicts with their use in the target
format.

So far, the only target rendering environments I've come up with where
$-substitution would create a conflict are shell commands and
JavaScript localisation using Mozilla's l20n syntax, and in both of
those, I'd actually *want* the Python lookup to take precedence over
the target environment lookup (and doubling the prefix to "$$" for
target environment lookup seems quite reasonable when you actually do
want to do the name lookup in the target environment).
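
The escaping trade-off described above can be tried out today with the two
substitution syntaxes Python already ships: string.Template for
$-substitution and str.format for {}-substitution. The following is only a
sketch under that assumption; it does not use the proposed i-string or
f-string syntax, and the variable names are illustrative.

```python
from string import Template

source = [("a", 1), ("b", 2)]

# $-substitution: the braces of the dict comprehension need no escaping.
dollar_code = Template("{k: v for k, v in $source}").substitute(
    source=repr(source))

# {}-substitution: every literal brace in the target code must be doubled.
brace_code = "{{k: v for k, v in {source}}}".format(source=repr(source))

# Both templates render to the same Python source text.
assert dollar_code == brace_code

# Evaluating the rendered text builds the dictionary.
data = eval(dollar_code)
print(dollar_code)
print(data)
```

Note that both variants pass repr(source) as text, so this only shows the
escaping cost, not the eager binding of live objects discussed above.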
>> That description is probably as clear as mud, though, so back to the >> PEP I go! :) > > Thanks for PEP 501. Maybe I'll add delayed interpolation to PEP 498! > > On a more serious note, I'm thinking of adding i-strings to my f-string > implementation. I have some ideas that the format_spec (the :.3f stuff) > could be used by the code that eventually does the string interpolation. > For example, sql(i-string) might want to interpret this expression using > __sql__, instead of how str(i-string) would use __format__. Then the > sql() machinery could look at the format_spec and pass it to the value's > __sql__ method. Yeah, that's the key reason PEP 501 is careful to treat them as opaque strings that it merely transports through to the renderer. The *default* renderer would expect them to be str.format format specifiers, but other renderers may either disallow them entirely, or expect them to do something different. > For example: > sql(i'select {date:as_date} from {tablename}' > > might call date.__sql__('as_date'), which would know how to cast to the > write datatype (this happens to me all the time). > > This is one reason I'm thinking of ditching !s, !r, and !a, at least for > the first implementation of PEP 498: they're not needed, and are not > generally applicable if we add the hooks I'm considering into i-strings. +1 from me. Given arbitrary expression support, it's both entirely possible and more explicit to write the builtin calls directly (obj!a, obj!r, obj!s -> ascii(obj), repr(obj), str(obj)) Regards, Nick. 
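
The equivalence mentioned above (obj!a, obj!r, obj!s versus ascii(obj),
repr(obj), str(obj)) can be checked against str.format, which already
supports the three conversion flags. A small sketch; the sample value is
arbitrary:

```python
value = "caf\u00e9"

# Each conversion flag is shorthand for calling the matching builtin first.
assert "{!s}".format(value) == "{}".format(str(value))
assert "{!r}".format(value) == "{}".format(repr(value))
assert "{!a}".format(value) == "{}".format(ascii(value))

# A format spec is applied to the converted string, not the original object.
padded = "{!r:>10}".format(value)
assert padded == "{:>10}".format(repr(value))
print(padded)
```

With arbitrary expressions allowed inside the braces, the right-hand form
of each assertion is what would be written directly in the template.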
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon Aug 24 03:49:25 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 24 Aug 2015 11:49:25 +1000 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <20150824012457.GD3881@ando.pearwood.info> References: <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <20150824012457.GD3881@ando.pearwood.info> Message-ID: On 24 August 2015 at 11:24, Steven D'Aprano wrote: > On Sun, Aug 23, 2015 at 08:35:17PM -0400, Eric V. Smith wrote: > >> I think the string interpolation object is interesting. It's basically >> what Petr Viktorin and Chris Angelico discussed and suggested here: >> https://mail.python.org/pipermail/python-ideas/2015-August/035303.html. > > Are you sure that's the right URL? It seems only barely relevant to me. > It has Chris replying to Petr, but it's a vague suggestion of a "quantum > string interpolation" (Chris' words) with no details. He asks: > > "How hard would this be to implement? Something that isn't a string, > retains all the necessary information, and then collapses to a string > when someone looks at it?" > > I looked ahead a dozen or two posts, and can't see any further > discussion. Have I missed something? That's the level of detail I remembered seeing, and it fairly concisely describes PEP 501's types.InterpolationTemplate - it's an object that isn't a string (it's an unrendered template that carries with it all the information needed to render itself on demand) that renders itself to a plain string when you look at it with str(). So the answer to Chris's initial "How hard would this be to implement?" question turned out to be "Not very, once we thought through the details" :) Cheers, Nick. 
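
As a rough illustration of "an object that isn't a string ... that renders
itself to a plain string when you look at it with str()", here is a minimal
sketch. It is not PEP 501's types.InterpolationTemplate: the class name
LazyTemplate, its attributes, and the render() hook are all hypothetical,
and the values are passed in by hand rather than captured by the compiler.

```python
class LazyTemplate:
    """Hypothetical stand-in for an unrendered interpolation template."""

    def __init__(self, raw_template, values):
        self.raw_template = raw_template  # template text, kept unrendered
        self.values = dict(values)        # expression values, captured eagerly

    def render(self, renderer=None):
        # A custom renderer (e.g. for i18n or SQL) receives the raw parts;
        # the default simply defers to str.format_map.
        if renderer is None:
            return self.raw_template.format_map(self.values)
        return renderer(self.raw_template, self.values)

    def __str__(self):
        # Collapse to a plain string only when someone asks for one.
        return self.render()


def shouting(raw_template, values):
    # Example alternative renderer: upper-case every interpolated value.
    upper = {key: str(val).upper() for key, val in values.items()}
    return raw_template.format_map(upper)


template = LazyTemplate("My name is {name}", {"name": "Eric"})
print(str(template))
print(template.render(shouting))
```

Keeping the raw template and the values separate until render time is what
lets a different renderer reuse the same object.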
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From wes.turner at gmail.com Mon Aug 24 04:31:58 2015 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 23 Aug 2015 21:31:58 -0500 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> Message-ID: On Sun, Aug 23, 2015 at 8:41 PM, Nick Coghlan wrote: > On 24 August 2015 at 10:35, Eric V. Smith wrote: > > On 08/22/2015 09:37 PM, Nick Coghlan wrote: > >> The trick would be to make interpolation lazy *by default* (preserving > >> the triple of the raw template string, the parsed fields, and the > >> expression values), and put the default rendering in the resulting > >> object's *__str__* method. > > > > At this point, I think PEPs 498 and 501 have converged, except for the > > delayed string interpolation object (which I realize is important) and > > how expressions are identified in the strings (which I consider less > > important). > > > > I think the string interpolation object is interesting. It's basically > > what Petr Viktorin and Chris Angelico discussed and suggested here: > > https://mail.python.org/pipermail/python-ideas/2015-August/035303.html. > > Aha, I though I'd seen that idea go by in one of the threads, but I > didn't remember where :) > > I'll add Petr and Chris to the acknowledgements section in 501. > > > My suggestion would be to add both f-strings (PEP 498) and i-strings (as > > they're currently called in PEP 501), but with the exact same syntax to > > identify and evaluate expressions. I don't particularly care what the > > prefixes are. I'd add the plain f-strings first, then i-strings maybe > > later. There are definitely some issues with delayed interpolation we > > need to think about. 
An f-string would be shorthand for str(i-string). > > +1, as this is the point of view I've come to as well. > > > I think it's hyperbolic to refers f-strings as a new string formatting > > language. With one small difference (detailed in PEP 498, and with zero > > usage I could find in the stdlib outside of tests), f-strings are a > > strict superset of str.format() strings (but not the arguments to > > .format of course). I think f-strings are no more different from > > str.format strings than PEP 501 i-strings are to string.Template strings. > > Yeah, that's a fair criticism of my rhetoric, so I'll stop saying that. > > > From what I can tell in the stdlib and in the wild, str.format() has > > hundreds or thousands of times more usage that string.Template. I > > realize that the reasons are not necessarily related to the syntax of > > the replacement strings, but you can't say most people aren't familiar > > with str.format(). > > Right, and I think we can actually make an example driven decision on > that front by looking at potential *target* formats for template > rendering. After all, one of the interesting discoveries we made in > having both str.__mod__ and str.format available is that %-formatting > is a great way to template str.format strings, and vice-versa, since > the meta-characters don't conflict, so you can minimise the escaping > needed. > > For use cases like writing object __repr__ methods, I don't think the > choice of $-substitution or {}-substitution matters - neither $ nor {} > are likely to appear in the desired output (except as part of > interpolated values), so escaping shouldn't be common regardless of > which we choose. (Side note: __repr__ and _str__ implementations are > likely worth highlighting as a good use case for the new syntax!) > > I think things get more interesting once we start talking about > interpolation targets other than "human readable text". 
> > For example, one of the neat (/scary, depending on how you feel about > this kind of feature) things I realised in working on the latest draft > of PEP 501 is that you could use it to template *Python code*, > including eagerly bound references to objects in the current scope. > That is: > > a = b + c > > could instead be written as: > > a = eval(str(i"$b + $c")) > > That's not very interesting if all you do is immediately call eval() > on it, but it's a lot more interesting if you instead want to do > things like extract the AST, dispatch the operation for execution in > another process, etc. For example, you could use this capability to > build eagerly bound closures, which wouldn't see changes in name > bindings, but *would* see state changes in mutable objects. > > With $-substitution, that "just works", as $ generally isn't > syntactically significant in Python code - it can only appear inside > strings (and potentially interpolation templates). With > {}-substitution, you'd have to double all the braces for dictionary > displays, dictionary comprehensions and set comprehensions. In example > form: > > data = {k:v for k, v in source} > > becomes: > > data = eval(str(i"{k:v for k, v in $source}")) > > rather than: > > data = eval(f"{{k:v for k, v in {{source}}}}")) > > You hit a similar problem if you're targeting Django or Jinja2 > templates, or any content that involves l20n style JavaScript > translation strings: the use of braces for substitution expressions in > the interpolation template conflicts with their use in the target > format. 
> > So far, the only target rendering environments I've come up with where > $-substitution would create a conflict are shell commands and > JavaScript localisation using Mozilla's l20n syntax, and in both of > those, I'd actually *want* the Python lookup to take precedence over > the target environment lookup (and doubling the prefix to "$$" for > target environment lookup seems quite reasonable when you actually do > want to do the name lookup in the target environment). > > >> That description is probably as clear as mud, though, so back to the > >> PEP I go! :) > > > > Thanks for PEP 501. Maybe I'll add delayed interpolation to PEP 498! > > > > On a more serious note, I'm thinking of adding i-strings to my f-string > > implementation. I have some ideas that the format_spec (the :.3f stuff) > > could be used by the code that eventually does the string interpolation. > > For example, sql(i-string) might want to interpret this expression using > > __sql__, instead of how str(i-string) would use __format__. Then the > > sql() machinery could look at the format_spec and pass it to the value's > > __sql__ method. > > Yeah, that's the key reason PEP 501 is careful to treat them as opaque > strings that it merely transports through to the renderer. The > *default* renderer would expect them to be str.format format > specifiers, but other renderers may either disallow them entirely, or > expect them to do something different. > > > For example: > > sql(i'select {date:as_date} from {tablename}' > > > > might call date.__sql__('as_date'), which would know how to cast to the > > write datatype (this happens to me all the time). > > > > This is one reason I'm thinking of ditching !s, !r, and !a, at least for > > the first implementation of PEP 498: they're not needed, and are not > > generally applicable if we add the hooks I'm considering into i-strings. > > +1 from me. 
Given arbitrary expression support, it's both entirely > possible and more explicit to write the builtin calls directly (obj!a, > obj!r, obj!s -> ascii(obj), repr(obj), str(obj)) > IIUC, to do this with SQL, > sql(i'select {date:as_date} from {tablename}' needs to be ['select ', unescaped(date, 'as_date'), 'from ', unescaped(tablename)] so that e.g. sql_92(), sql_2011() would know that 'select ' is presumably implicitly escaped * https://en.wikipedia.org/wiki/SQL#Interoperability_and_standardization * http://docs.sqlalchemy.org/en/rel_1_0/dialects/ * https://docs.djangoproject.com/en/1.7/ref/models/queries/#f-expressions "Django F-Expressions" > Regards, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From humbert at uni-wuppertal.de Mon Aug 24 06:19:47 2015 From: humbert at uni-wuppertal.de (Prof. Dr. L. Humbert) Date: Mon, 24 Aug 2015 06:19:47 +0200 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal Message-ID: <55DA9B63.3010208@uni-wuppertal.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, when looking at pep-0484 I find the *non-orthogonal* construction, which may lead to a misconcetion: What students should be able to code: 1. 
variant
#-------------wishful----------------------------------\
class Tree:
    def __init__(self, left: Tree, right: Tree):
        self.left = left
        self.right = right


what students have to write instead:

#-------------bad workaround----------------------------\
class Tree:
    def __init__(self, left: 'Tree', right: 'Tree'):
        self.left = left
        self.right = right

/
Please enable:
from __future__ import annotations

so the *first* variant should be possible
\
At this very moment (python 3.5rc1), it is not possible, but we need it,
so the construction will be orthogonal from the point of view of
students(!) - _one_ concept should work in different circumstances.

TNX
Ludger Humbert
- --
https://twitter.com/n770
http://ddi.uni-wuppertal.de/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iEYEARECAAYFAlXam2MACgkQJQsN9FQ+jJ+RHgCfdcTgjVmZ3ULLwjerpJ3NdN7d
NH8AoIvdTqWbkcfi7o8e7JuAYXbgZk0V
=OMhU
-----END PGP SIGNATURE-----

From joejev at gmail.com Mon Aug 24 06:39:34 2015
From: joejev at gmail.com (Joseph Jevnik)
Date: Mon, 24 Aug 2015 00:39:34 -0400
Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal
In-Reply-To: <55DA9B63.3010208@uni-wuppertal.de>
References: <55DA9B63.3010208@uni-wuppertal.de>
Message-ID:

What is the intended behavior if `Tree` is already a name in scope?

On Mon, Aug 24, 2015 at 12:19 AM, Prof. Dr. L. Humbert <
humbert at uni-wuppertal.de> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi,
> when looking at pep-0484 I find the *non-orthogonal* construction, which
> may lead to a misconcetion:
>
> What students should be able to code:
>
> 1.
varinat > #-------------wishful----------------------------------\ > class Tree: > def __init__(self, left: Tree, right: Tree): > self.left = left > self.right = right > > > what students have to write instead: > > #-------------bad workaround----------------------------\ > class Tree: > def __init__(self, left: 'Tree', right: 'Tree'): > self.left = left > self.right = right > > / > Please enable: > from __future__ import annotations > > so the *first* variant should be possible > \ > At this very moment (python 3.5rc1), it is not possible, but we need it, > so the construction will be orthogonal from the point of view for > students(!) - _one_ concept should work in different circumstances. > > TNX > Ludger Humbert > - -- > https://twitter.com/n770 > http://ddi.uni-wuppertal.de/ > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2 > > iEYEARECAAYFAlXam2MACgkQJQsN9FQ+jJ+RHgCfdcTgjVmZ3ULLwjerpJ3NdN7d > NH8AoIvdTqWbkcfi7o8e7JuAYXbgZk0V > =OMhU > -----END PGP SIGNATURE----- > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Mon Aug 24 07:00:45 2015 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 24 Aug 2015 00:00:45 -0500 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> Message-ID: On Sun, Aug 23, 2015 at 9:31 PM, Wes Turner wrote: > > > On Sun, Aug 23, 2015 at 8:41 PM, Nick Coghlan wrote: > >> On 24 August 2015 at 10:35, Eric V. 
Smith wrote:
>> > On 08/22/2015 09:37 PM, Nick Coghlan wrote:
>> >> The trick would be to make interpolation lazy *by default* (preserving
>> >> the triple of the raw template string, the parsed fields, and the
>> >> expression values), and put the default rendering in the resulting
>> >> object's *__str__* method.
>> >
>> > At this point, I think PEPs 498 and 501 have converged, except for the
>> > delayed string interpolation object (which I realize is important) and
>> > how expressions are identified in the strings (which I consider less
>> > important).
>> >
>> > I think the string interpolation object is interesting. It's basically
>> > what Petr Viktorin and Chris Angelico discussed and suggested here:
>> > https://mail.python.org/pipermail/python-ideas/2015-August/035303.html.
>>
>> Aha, I thought I'd seen that idea go by in one of the threads, but I
>> didn't remember where :)
>>
>> I'll add Petr and Chris to the acknowledgements section in 501.
>>
>> > My suggestion would be to add both f-strings (PEP 498) and i-strings (as
>> > they're currently called in PEP 501), but with the exact same syntax to
>> > identify and evaluate expressions. I don't particularly care what the
>> > prefixes are. I'd add the plain f-strings first, then i-strings maybe
>> > later. There are definitely some issues with delayed interpolation we
>> > need to think about. An f-string would be shorthand for str(i-string).
>>
>> +1, as this is the point of view I've come to as well.
>>
>> > I think it's hyperbolic to refer to f-strings as a new string formatting
>> > language. With one small difference (detailed in PEP 498, and with zero
>> > usage I could find in the stdlib outside of tests), f-strings are a
>> > strict superset of str.format() strings (but not the arguments to
>> > .format of course). I think f-strings are no more different from
>> > str.format strings than PEP 501 i-strings are to string.Template
>> strings.
>>
>> Yeah, that's a fair criticism of my rhetoric, so I'll stop saying that.
>>
>> > From what I can tell in the stdlib and in the wild, str.format() has
>> > hundreds or thousands of times more usage than string.Template. I
>> > realize that the reasons are not necessarily related to the syntax of
>> > the replacement strings, but you can't say most people aren't familiar
>> > with str.format().
>>
>> Right, and I think we can actually make an example driven decision on
>> that front by looking at potential *target* formats for template
>> rendering. After all, one of the interesting discoveries we made in
>> having both str.__mod__ and str.format available is that %-formatting
>> is a great way to template str.format strings, and vice-versa, since
>> the meta-characters don't conflict, so you can minimise the escaping
>> needed.
>>
>> For use cases like writing object __repr__ methods, I don't think the
>> choice of $-substitution or {}-substitution matters - neither $ nor {}
>> are likely to appear in the desired output (except as part of
>> interpolated values), so escaping shouldn't be common regardless of
>> which we choose. (Side note: __repr__ and __str__ implementations are
>> likely worth highlighting as a good use case for the new syntax!)
>>
>> I think things get more interesting once we start talking about
>> interpolation targets other than "human readable text".
>>
>> For example, one of the neat (/scary, depending on how you feel about
>> this kind of feature) things I realised in working on the latest draft
>> of PEP 501 is that you could use it to template *Python code*,
>> including eagerly bound references to objects in the current scope.
>> That is:
>>
>>     a = b + c
>>
>> could instead be written as:
>>
>>     a = eval(str(i"$b + $c"))
>>
>> That's not very interesting if all you do is immediately call eval()
>> on it, but it's a lot more interesting if you instead want to do
>> things like extract the AST, dispatch the operation for execution in
>> another process, etc. For example, you could use this capability to
>> build eagerly bound closures, which wouldn't see changes in name
>> bindings, but *would* see state changes in mutable objects.
>>
>> With $-substitution, that "just works", as $ generally isn't
>> syntactically significant in Python code - it can only appear inside
>> strings (and potentially interpolation templates). With
>> {}-substitution, you'd have to double all the braces for dictionary
>> displays, dictionary comprehensions and set comprehensions. In example
>> form:
>>
>>     data = {k:v for k, v in source}
>>
>> becomes:
>>
>>     data = eval(str(i"{k:v for k, v in $source}"))
>>
>> rather than:
>>
>>     data = eval(f"{{k:v for k, v in {source}}}")
>>
>> You hit a similar problem if you're targeting Django or Jinja2
>> templates, or any content that involves l20n style JavaScript
>> translation strings: the use of braces for substitution expressions in
>> the interpolation template conflicts with their use in the target
>> format.
>>
>> So far, the only target rendering environments I've come up with where
>> $-substitution would create a conflict are shell commands and
>> JavaScript localisation using Mozilla's l20n syntax, and in both of
>> those, I'd actually *want* the Python lookup to take precedence over
>> the target environment lookup (and doubling the prefix to "$$" for
>> target environment lookup seems quite reasonable when you actually do
>> want to do the name lookup in the target environment).
>>
>> >> That description is probably as clear as mud, though, so back to the
>> >> PEP I go! :)
>> >
>> > Thanks for PEP 501. Maybe I'll add delayed interpolation to PEP 498!
>> >
>> > On a more serious note, I'm thinking of adding i-strings to my f-string
>> > implementation. I have some ideas that the format_spec (the :.3f stuff)
>> > could be used by the code that eventually does the string interpolation.
>> > For example, sql(i-string) might want to interpret this expression using
>> > __sql__, instead of how str(i-string) would use __format__. Then the
>> > sql() machinery could look at the format_spec and pass it to the value's
>> > __sql__ method.
>>
>> Yeah, that's the key reason PEP 501 is careful to treat them as opaque
>> strings that it merely transports through to the renderer. The
>> *default* renderer would expect them to be str.format format
>> specifiers, but other renderers may either disallow them entirely, or
>> expect them to do something different.
>>
>> > For example:
>> > sql(i'select {date:as_date} from {tablename}')
>> >
>> > might call date.__sql__('as_date'), which would know how to cast to the
>> > right datatype (this happens to me all the time).
>> >
>> > This is one reason I'm thinking of ditching !s, !r, and !a, at least for
>> > the first implementation of PEP 498: they're not needed, and are not
>> > generally applicable if we add the hooks I'm considering into i-strings.
>>
>> +1 from me. Given arbitrary expression support, it's both entirely
>> possible and more explicit to write the builtin calls directly (obj!a,
>> obj!r, obj!s -> ascii(obj), repr(obj), str(obj))
>>
>
> IIUC, to do this with SQL,
>
> > sql(i'select {date:as_date} from {tablename}')
>
> needs to be
>
> ['select ', unescaped(date, 'as_date'), 'from ', unescaped(tablename)]
>
> so that e.g.
> sql_92(), sql_2011()
> would know that 'select ' is presumably implicitly escaped
>
> * https://en.wikipedia.org/wiki/SQL#Interoperability_and_standardization
> * http://docs.sqlalchemy.org/en/rel_1_0/dialects/
> * https://docs.djangoproject.com/en/1.7/ref/models/queries/#f-expressions
> "Django F-Expressions"
>
>

For reference, the SQLAlchemy Expression API solves for (safer) method-chaining, nesting *Python* expression API; or you can reuse a raw SQL connection from a ConnectionPool.

Django F-Objects are relevant because they are deferred (and compiled in context to the query context); similar to the objectives of a given SQL syntax templating, parameterization, and serialization library.

Django Q-Objects are similar, in that an f-string is basically an iterator of AND-ed expressions where AND means string concatenation.

Personally, I'd pretty much always just reflect the tables or map them out and write SQLAlchemy Python expressions which are then compiled to a particular dialect (and quoted appropriately, **avoiding CWE-89** surviving across table renames, managing migrations).

Is it sometimes faster to write SQL by hand?

* I'd write the [SQLAlchemy], serialize to SQL, [and modify] (because I should have namespaced Python table attrs for those attrs anyway, even if it requires table introspection and reflection at (every/pool) instantiation)
* you can always execute query with a raw connection with an ORM (and then **refactor (REF) string-ified table and column names**)

Each ORM (and DBAPI) have parametrization settings (e.g. '%' or '?' or configuration_setting) which should not collide with the f-string syntax.

* DBAPI v2.0 https://www.python.org/dev/peps/pep-0249/
* SQLite DBAPI https://docs.python.org/2/library/sqlite3.html https://docs.python.org/3/library/sqlite3.html

http://docs.sqlalchemy.org/en/rel_1_0/core/tutorial.html#conjunctions

>>> s = select([(users.c.fullname +
...               ", " + addresses.c.email_address).
...                label('title')]).\
...        where(users.c.id == addresses.c.user_id).\
...        where(users.c.name.between('m', 'z')).\
...        where(
...               or_(
...                  addresses.c.email_address.like('%@aol.com'),
...                  addresses.c.email_address.like('%@msn.com')
...               )
...        )
>>> conn.execute(s).fetchall()
SELECT users.fullname || ? || addresses.email_address AS title
FROM users, addresses
WHERE users.id = addresses.user_id AND users.name BETWEEN ? AND ? AND
(addresses.email_address LIKE ? OR addresses.email_address LIKE ?)
(', ', 'm', 'z', '%@aol.com', '%@msn.com')
[(u'Wendy Williams, wendy@aol.com',)]

http://docs.sqlalchemy.org/en/rel_1_0/core/tutorial.html#using-textual-sql

>>> from sqlalchemy.sql import text
>>> s = text(
...     "SELECT users.fullname || ', ' || addresses.email_address AS title "
...         "FROM users, addresses "
...         "WHERE users.id = addresses.user_id "
...         "AND users.name BETWEEN :x AND :y "
...         "AND (addresses.email_address LIKE :e1 "
...             "OR addresses.email_address LIKE :e2)")
>>> conn.execute(s, x='m', y='z', e1='%@aol.com', e2='%@msn.com').fetchall()
[(u'Wendy Williams, wendy@aol.com',)]

SQLAlchemy is not async-compatible (besides, most drivers block); it's debatable whether async would be faster, anyway: https://bitbucket.org/zzzeek/sqlalchemy/issues/3414/asyncio-and-sqlalchemy

>> Regards,
>> Nick.
>>
>> --
>> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stephen at xemacs.org Mon Aug 24 09:07:30 2015
From: stephen at xemacs.org (Stephen J.
Turnbull)
Date: Mon, 24 Aug 2015 16:07:30 +0900
Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal
In-Reply-To:
References: <55DA9B63.3010208@uni-wuppertal.de>
Message-ID: <87h9np6u6l.fsf@uwakimon.sk.tsukuba.ac.jp>

Joseph Jevnik writes:

 > What is the intended behavior [of annotating the argument to a
 > method in class Tree with `arg: Tree`] if `Tree` is already a name
 > in scope?

I would expect "class" to somehow redeclare the class name as it does when executed:

>>> class List:
...     pass
...
>>> l = List(2,3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object() takes no parameters
>>> class List:
...     def __init__(self, *elts):
...         self.car, *rest = elts
...         if rest:
...             self.cdr = List(*rest)
...         else:
...             self.cdr = None
...
>>> l = List(1,2,3)
>>> print(l.cdr.cdr.car)
3
>>> print(l.cdr.cdr.cdr)
None
>>>

Regardless of the plausibility of the proposed behavior, the reason for the behavior defined by PEP 484 is explained there. Annotations (including type hints), like default values, are evaluated at the time of method *definition* per PEP 3107, and that value is bound to the default slot. OTOH, the class *declaration* is implemented by binding the name to the class, which occurs once the class object is available, ie, *after* the methods are defined and added to the object.

Guido and Mark (the BDFL-Delegate) clearly considered the current situation to be acceptable if not elegant, and left elegance for a future PEP (which somebody else will have to write, I guess). There's no guarantee of acceptance, since it seems to involve a backward-incompatible change.

N.B. I don't think other users of annotations would be happy with "from __future__ import annotations", which could easily be taken as a deprecation of their use cases. The name of the __future__ import will probably have to be bikeshedded a bit.
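
The evaluation-order point in the message above can be sketched as follows. This is only an illustration and not part of the original post: the names EagerNode, Tree, link and graft are made up for the example. A default value stands in for the eager case, because some later Python versions may defer the evaluation of annotations themselves, while defaults are still evaluated when the "def" statement runs. The string annotation is the forward-reference spelling from PEP 484, resolved afterwards with typing.get_type_hints().

```python
import typing

# Default values are evaluated while the "def" statement executes, which
# happens inside the class body, before the class name has been bound.
try:
    class EagerNode:
        def link(self, other=EagerNode):  # EagerNode is not bound yet
            return other
    eager_lookup_failed = False
except NameError:
    eager_lookup_failed = True


# PEP 484 forward reference: the annotation is only a string here, so
# nothing has to be looked up while the class body runs.
class Tree:
    def graft(self, other: "Tree") -> "Tree":
        return other


# Resolve the strings later, once "Tree" is bound. The namespace is passed
# explicitly so the lookup does not depend on where this snippet is run.
hints = typing.get_type_hints(Tree.graft, localns={"Tree": Tree})

print(eager_lookup_failed)
print(hints["other"] is Tree)
```

The string form defers the name lookup until something asks for the hints, which is the trade-off the message describes as acceptable if not elegant.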
From tritium-list at sdamon.com Mon Aug 24 09:07:49 2015
From: tritium-list at sdamon.com (Alexander Walters)
Date: Mon, 24 Aug 2015 03:07:49 -0400
Subject: [Python-ideas] Why does sys.flags have both .interactive and .inspect?
In-Reply-To: <55DA4B09.3040400@mrabarnett.plus.com>
References: <55DA1133.3000904@sdamon.com> <55DA1D05.7010300@mrabarnett.plus.com> <55DA1E76.4060700@sdamon.com> <55DA4B09.3040400@mrabarnett.plus.com>
Message-ID: <55DAC2C5.8090504@sdamon.com>

On 8/23/2015 18:36, MRAB wrote:
> I tested it, and they _can_ be different.
>
> sys.flags.interactive will be true if you include the -i switch.
>
> sys.flags.inspect will be true if you include the -i switch or set the
> PYTHONINSPECT environment
> variable.
>
> Thus, if you don't include the -i switch but _do_ set PYTHONINSPECT,
> sys.flags.interactive will be
> false and sys.flags.inspect will be true.

That seems significant enough to be documented. I'll work on a patch in the morning.

From abarnert at yahoo.com Mon Aug 24 09:25:02 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Mon, 24 Aug 2015 00:25:02 -0700
Subject: [Python-ideas] Why does sys.flags have both .interactive and .inspect?
In-Reply-To: <55DAC2C5.8090504@sdamon.com>
References: <55DA1133.3000904@sdamon.com> <55DA1D05.7010300@mrabarnett.plus.com> <55DA1E76.4060700@sdamon.com> <55DA4B09.3040400@mrabarnett.plus.com> <55DAC2C5.8090504@sdamon.com>
Message-ID: <73C4B992-E2AA-4C89-888D-14981AB9EAF9@yahoo.com>

On Aug 24, 2015, at 00:07, Alexander Walters wrote:
>
>> On 8/23/2015 18:36, MRAB wrote:
>> I tested it, and they _can_ be different.
>>
>> sys.flags.interactive will be true if you include the -i switch.
>>
>> sys.flags.inspect will be true if you include the -i switch or set the PYTHONINSPECT environment
>> variable.
>>
>> Thus, if you don't include the -i switch but _do_ set PYTHONINSPECT, sys.flags.interactive will be
>> false and sys.flags.inspect will be true.
>
> That seems significant enough to be documented.
> I'll work on a patch in the morning.

Wouldn't it be better to write a more complete explanation of what "inspect" and "interactive" mean, so it's obvious that the one must imply the other but not vice-versa, instead of just saying that one implies the other and still leaving what they actually mean a mystery to anyone who doesn't read the C source?

From encukou at gmail.com Mon Aug 24 09:28:00 2015
From: encukou at gmail.com (Petr Viktorin)
Date: Mon, 24 Aug 2015 09:28:00 +0200
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <20150824012457.GD3881@ando.pearwood.info>
References: <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <20150824012457.GD3881@ando.pearwood.info>
Message-ID:

On Mon, Aug 24, 2015 at 3:24 AM, Steven D'Aprano wrote:
> On Sun, Aug 23, 2015 at 08:35:17PM -0400, Eric V. Smith wrote:
>
>> I think the string interpolation object is interesting. It's basically
>> what Petr Viktorin and Chris Angelico discussed and suggested here:
>> https://mail.python.org/pipermail/python-ideas/2015-August/035303.html.
>
> Are you sure that's the right URL? It seems only barely relevant to me.
> It has Chris replying to Petr, but it's a vague suggestion of a "quantum
> string interpolation" (Chris' words) with no details. He asks:
>
> "How hard would this be to implement? Something that isn't a string,
> retains all the necessary information, and then collapses to a string
> when someone looks at it?"
>
> I looked ahead a dozen or two posts, and can't see any further
> discussion. Have I missed something?

Actually, it's I who missed something - replied from a phone, and sent the reply to Chris only instead of to the list. And that killed further discussion, it seems.
My answer was:

> Not too hard, but getting the exact semantics right could be tricky.
> It's probably something the language/stdlib should enable, rather than
> having it in the stdlib itself.

This seems roughly in line with what Guido was saying earlier. (Am I misrepresenting your words, Guido?)

I thought a bit about what's bothering me with this idea, and I realized I just don't like that "quantum effect" - collapsing when something looks at a value.
All the parts up to that point sound OK, it's the str() that seems too magical to me.

We could require a more explicit function, not just str(), to format the string:

>>> t0=1; t1=2; n=3
>>> template = i"Peeled {n} onions in {t1-t0:.2f}s"
>>> str(template)
types.InterpolationTemplate(template="Peeled {n} onions in {t1-t0:.2f}s", fields=(('Peeled', 0, 'n', '', ''), ...), values=(3, 1))
>>> format_template(template)  # (or make it a method?)
'Peeled 3 onions in 1s'

This no longer feels "too magic" to me, and it would allow some experimentation before (if ever) InterpolationTemplate grows a more convenient str().

Compared to f-strings, all this is doing is exposing the intermediate structure. (What the "i" really stands for is "internal".)
Now f-strings would be just i-strings with a default formatter applied.

And, InterpolationTemplate should only allow attribute access (i.e. it shouldn't be structseq). That way the internal structure can be changed later, and the "old" attributes can be synthesized on access.
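
The split proposed in the message above, where str() only shows the intermediate structure and rendering needs an explicit call, can be tried out today without new syntax. The following is a minimal sketch under stated assumptions, not the PEP 501 design: InterpolationTemplate and format_template are hypothetical stand-ins written as an ordinary class and function, the namespace is passed in by hand instead of being captured by an i"..." literal, and parsing is borrowed from the public string.Formatter().parse(), which only understands str.format-style fields.

```python
import string


class InterpolationTemplate:
    """Hypothetical stand-in for the object an i-string might produce."""

    def __init__(self, template, namespace):
        self.template = template
        # Each entry is (literal_text, expression, format_spec, conversion).
        self.fields = tuple(string.Formatter().parse(template))
        # Expressions are evaluated once, eagerly, when the object is built.
        self.values = tuple(
            eval(expr, dict(namespace))
            for _, expr, _, _ in self.fields
            if expr
        )

    def __repr__(self):
        return "InterpolationTemplate(template={!r}, values={!r})".format(
            self.template, self.values
        )

    # No __str__ is defined, so str() falls back to __repr__ and never
    # renders the template implicitly.


def format_template(tmpl):
    """Explicit default renderer (conversions such as !r are ignored here)."""
    values = iter(tmpl.values)
    parts = []
    for literal, expr, spec, _ in tmpl.fields:
        parts.append(literal)
        if expr:
            parts.append(format(next(values), spec))
    return "".join(parts)


t0, t1, n = 1, 2, 3
template = InterpolationTemplate(
    "Peeled {n} onions in {t1-t0:.2f}s", {"t0": t0, "t1": t1, "n": n}
)
print(str(template))              # shows the structure, not the rendering
print(format_template(template))  # format(1, '.2f') renders as '1.00'
```

Because the object is a plain class rather than a tuple, callers can only reach template, fields and values by attribute, which leaves room to change the internal layout later.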
From python-ideas at mgmiller.net Mon Aug 24 10:51:58 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 24 Aug 2015 01:51:58 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com>
Message-ID: <55DADB2E.2020300@mgmiller.net>

On 08/23/2015 06:41 PM, Nick Coghlan wrote:
> You hit a similar problem if you're targeting Django or Jinja2
> templates, or any content that involves l20n style JavaScript
> translation strings: the use of braces for substitution expressions in

Hi, this part I don't get, maybe because it's so late here. Why create Django/Jinja2/l20n templates inside Python code using another templating language (whether Template or .format)?

Those kinds of templates should be in dedicated text files, no?

-Mike

From ncoghlan at gmail.com Mon Aug 24 11:48:05 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 24 Aug 2015 19:48:05 +1000
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DADB2E.2020300@mgmiller.net>
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <55DADB2E.2020300@mgmiller.net>
Message-ID:

On 24 August 2015 at 18:51, Mike Miller wrote:
>
> On 08/23/2015 06:41 PM, Nick Coghlan wrote:
>>
>> You hit a similar problem if you're targeting Django or Jinja2
>> templates, or any content that involves l20n style JavaScript
>> translation strings: the use of braces for substitution expressions in
>
> Hi, this part I don't get, maybe because it's so late here.
> Why create
> Django/Jinja2/l20n templates inside Python code using another templating
> language (whether Template or .format)?
>
> Those kinds of templates should be in dedicated text files, no?

Think of meta-templating tools like cookie-cutter or DevAssistant (or the project wizards in an IDE) - for those kinds of tools, "source file formats" are actually output formats. Once you look at enough different parts of the software development pipeline you find that pretty much *every* input format is an output format for some other tool :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Mon Aug 24 11:53:59 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 24 Aug 2015 19:53:59 +1000
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <20150824012457.GD3881@ando.pearwood.info>
Message-ID:

On 24 August 2015 at 17:28, Petr Viktorin wrote:
> I thought a bit about what's bothering me with this idea, and I
> realized I just don't like that "quantum effect" - collapsing when
> something looks at a value.
> All the parts up to that point sound OK, it's the str() that seems too
> magical to me.
>
> We could require a more explicit function, not just str(), to format the string:
>
>>>> t0=1; t1=2; n=3
>>>> template = i"Peeled {n} onions in {t1-t0:.2f}s"
>>>> str(template)
> types.InterpolationTemplate(template="Peeled {n} onions in
> {t1-t0:.2f}s", fields=(('Peeled', 0, 'n', '', ''), ...), values=(3,
> 1))
>>>> format_template(template) # (or make it a method?)
> 'Peeled 3 onions in 1s'
>
> This no longer feels "too magic" to me, and it would allow some
> experimentation before (if ever) InterpolationTemplate grows a more
> convenient str().
Another option would be to put the default rendering in __format__, and let __str__ fall through to __repr__. That way str(template) wouldn't render the template, but format(template) would.

> Compared to f-strings, all this is doing is exposing the intermediate
> structure. (What the "i" really stands for is "internal".)
> Now f-strings would be just i-strings with a default formatter applied.
>
> And, InterpolationTemplate should only allow attribute access (i.e. it
> shouldn't be structseq). That way the internal structure can be
> changed later, and the "old" attributes can be synthesized on access.

Yeah, that's fair. I added the __iter__ to make some of the examples prettier, but it probably isn't worth the loss of future flexibility.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guettliml at thomas-guettler.de Mon Aug 24 12:07:06 2015
From: guettliml at thomas-guettler.de (Thomas Güttler)
Date: Mon, 24 Aug 2015 12:07:06 +0200
Subject: [Python-ideas] Properties for classes possible?
In-Reply-To:
References: <55D57989.1020704@thomas-guettler.de>
Message-ID: <55DAECCA.70200@thomas-guettler.de>

On 20.08.2015 at 17:29, Guido van Rossum wrote:
> I think it's reasonable to propose @classproperty as a patch to CPython. It needs to be C code. Not sure about the
> writable version. The lazy=True part is not appropriate for the stdlib (it's just a memoize pattern).

What's the next step?

My knowledge of the programming language C is very limited. I am not able to write a patch for CPython.
I could write a patch which looks like this:

{{{
# From http://stackoverflow.com/a/5192374/633961

class classproperty(object):
    def __init__(self, f):
        self.f = f
    def __get__(self, obj, owner):
        return self.f(owner)
}}}

--
Thomas Guettler http://www.thomas-guettler.de/

From ncoghlan at gmail.com Mon Aug 24 12:14:21 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 24 Aug 2015 20:14:21 +1000
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com>
Message-ID:

On 24 August 2015 at 11:41, Nick Coghlan wrote:
> That's not very interesting if all you do is immediately call eval()
> on it, but it's a lot more interesting if you instead want to do
> things like extract the AST, dispatch the operation for execution in
> another process, etc. For example, you could use this capability to
> build eagerly bound closures, which wouldn't see changes in name
> bindings, but *would* see state changes in mutable objects.

Offering a nice early binding syntax is a question I've been pondering for years (cf. PEPs 403 and 3150), so I'm intrigued by this question of whether or not f-strings and i-strings might be able to deliver those in a way that's more attractive than the current options.

This idea doesn't necessarily need deferred interpolation, so I'll use the current PEP 498 f-string prefix and substitution expression syntax. Consider the following function definition:

    def defer(expr):
        return eval("lambda: (" + expr + ")")

We can use this today as a strange way of writing a lambda expression:

    >>> f = defer("42")
    >>> f
    <function <lambda> at 0x7f1c0314eae8>
    >>> f()
    42

There's no reason to do that, of course - you'd just use an actual lambda expression instead.
However, f-strings will make it possible for folks to write code like this:

    callables = [defer(f"{i}") for i in range(10)]

"{i}" in that example isn't a one-element set, it's a substitution expression that interpolates "str(i)" into the formatted string, which is then evaluated by "defer" as if the template contained the literal value of "i" at the time of interpolation, rather than being a lazy reference to a closure variable.

(If you were to get appropriately creative with exec, you could even use a trick like this to define multiline lambdas)

Regards,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From p.f.moore at gmail.com Mon Aug 24 13:35:44 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 24 Aug 2015 12:35:44 +0100
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <20150824012457.GD3881@ando.pearwood.info>
Message-ID:

On 24 August 2015 at 10:53, Nick Coghlan wrote:
>> We could require a more explicit function, not just str(), to format the string:
>>
>>>>> t0=1; t1=2; n=3
>>>>> template = i"Peeled {n} onions in {t1-t0:.2f}s"
>>>>> str(template)
>> types.InterpolationTemplate(template="Peeled {n} onions in
>> {t1-t0:.2f}s", fields=(('Peeled', 0, 'n', '', ''), ...), values=(3,
>> 1))
>>>>> format_template(template) # (or make it a method?)
>> 'Peeled 3 onions in 1s'
>>
>> This no longer feels "too magic" to me, and it would allow some
>> experimentation before (if ever) InterpolationTemplate grows a more
>> convenient str().
>
> Another option would be to put the default rendering in __format__,
> and let __str__ fall through to __repr__. That way str(template)
> wouldn't render the template, but format(template) would.

I'm once again losing the thread of all the variations being proposed.
As a reality check, is the expectation that something like the following will still be possible:

    print(f"Iteration {n}: Duration {end-start} seconds")

This is an improvement over the two current approaches:

    print("Iteration {}: Duration {} seconds".format(n, end-start))
    print("Iteration %s: Duration %s seconds" % (n, end-start))

because it's less verbose than the former, and less punctuation-heavy (and old-fashioned ;-)) than the latter.

Explicit str() calls or temporary variables or anything like that are no improvement over the current options. Of course they may offer more advanced features, but let's not lose the 80% case for the sake of the 20% (that's actually more like 95-5, to be honest).

Paul

From steve at pearwood.info Mon Aug 24 14:00:34 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 24 Aug 2015 22:00:34 +1000
Subject: [Python-ideas] Deferred evaluation [was Re: Draft PEP on string interpolation]
In-Reply-To:
References: <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com>
Message-ID: <20150824120033.GJ3881@ando.pearwood.info>

On Mon, Aug 24, 2015 at 08:14:21PM +1000, Nick Coghlan wrote:

> This idea doesn't necessarily need deferred interpolation, so I'll use
> the current PEP 498 f-string prefix and substitution expression
> syntax. Consider the following function definition:
>
>     def defer(expr):
>         return eval("lambda: (" + expr + ")")
>
> We can use this today as a strange way of writing a lambda expression:
>
>     >>> f = defer("42")
>     >>> f
>     <function <lambda> at 0x7f1c0314eae8>
>     >>> f()
>     42
>
> There's no reason to do that, of course - you'd just use an actual
> lambda expression instead.

There's a problem with the idea of using eval to defer objects -- it relies on your object having an eval'able representation. Try to defer() the following list L:

    L = []
    L.append(L)

But putting that aside...

> However, f-strings will make it possible for folks to write code like this:
>
>     callables = [defer(f"{i}") for i in range(10)]

How is that different from this?

    callables = [defer(str(i)) for i in range(10)]

If they are not the same, then what would this return?

    [func() for func in callables]

I expect it to give [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]. Am I wrong?

> "{i}" in that example isn't a one-element set, it's a substitution
> expression that interpolates "str(i)" into the formatted string,

I understand that f-strings are evaluated at the time of, um, their evaluation, so this would be equivalent to:

    callables = [defer("0"), defer("1"), defer("2"), ..., defer("9")]

> which
> is then evaluated by "defer" as if the template contained the literal
> value of "i" at the time of interpolation, rather than being a lazy
> reference to a closure variable.

I'm completely lost. How would you get a closure variable here? I mean, I know how to get a closure in general terms, e.g.:

    [(lambda : i) for i in range(10)]

but I'm not seeing where you would get a closure *specifically* in this situation with your defer function.

--
Steve

From eric at trueblade.com Mon Aug 24 14:41:45 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Mon, 24 Aug 2015 08:41:45 -0400
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <20150824012457.GD3881@ando.pearwood.info>
Message-ID: <55DB1109.7010208@trueblade.com>

On 08/24/2015 07:35 AM, Paul Moore wrote:
> I'm once again losing the thread of all the variations being proposed.
>
> As a reality check, is the expectation that something like the
> following will still be possible:
>
> print(f"Iteration {n}: Duration {end-start} seconds")

Yes, that's the PEP 498 proposal.
I think (and this is just my opinion) that if we do something more complicated, like the delayed interpolation of i-strings, that we'd still keep f-strings. And further, while internally we may rewrite f-strings to use the i-string infrastructure, to the user they'd still look like the same f-strings. > Explicit str() calls or temporary variables or anything like that are > no improvement over the current options. Of course they may offer more > advanced features, but let's not lose the 80% case for the sake of the > 20% (that's actually more like 95-5, to be honest). Agreed. Eric. From encukou at gmail.com Mon Aug 24 14:46:03 2015 From: encukou at gmail.com (Petr Viktorin) Date: Mon, 24 Aug 2015 14:46:03 +0200 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DB1109.7010208@trueblade.com> References: <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <20150824012457.GD3881@ando.pearwood.info> <55DB1109.7010208@trueblade.com> Message-ID: On Mon, Aug 24, 2015 at 2:41 PM, Eric V. Smith wrote: > On 08/24/2015 07:35 AM, Paul Moore wrote: >> I'm once again losing the thread of all the variations being proposed. >> >> As a reality check, is the expectation that something like the >> following will still be possible: >> >> print(f"Iteration {n}: Duration {end-start} seconds") > > Yes, that's the PEP 498 proposal. I think (and this is just my opinion) > that if we do something more complicated, like the delayed interpolation > of i-strings, that we'd still keep f-strings. > > And further, while internally we may rewrite f-strings to use the > i-string infrastructure, to the user they'd still look like the same > f-strings. > >> Explicit str() calls or temporary variables or anything like that are >> no improvement over the current options. 
Of course they may offer more >> advanced features, but let's not lose the 80% case for the sake of the >> 20% (that's actually more like 95-5, to be honest). > > Agreed. Indeed. On the other hand, let's make reasonably sure that next year we won't need yet another syntax for the 20%. From p.f.moore at gmail.com Mon Aug 24 17:03:49 2015 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 24 Aug 2015 16:03:49 +0100 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DB1109.7010208@trueblade.com> References: <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <20150824012457.GD3881@ando.pearwood.info> <55DB1109.7010208@trueblade.com> Message-ID: On 24 August 2015 at 13:41, Eric V. Smith wrote: > On 08/24/2015 07:35 AM, Paul Moore wrote: >> I'm once again losing the thread of all the variations being proposed. >> >> As a reality check, is the expectation that something like the >> following will still be possible: >> >> print(f"Iteration {n}: Duration {end-start} seconds") > > Yes, that's the PEP 498 proposal. I think (and this is just my opinion) > that if we do something more complicated, like the delayed interpolation > of i-strings, that we'd still keep f-strings. OK. That's my point, essentially - the discussion has drifted into much more complex areas, with comments about how the wider-ranging proposals cover the f-string case as a subset, and I just wanted to be sure that there wasn't an implied "so we don't need f-strings any more" in there. (Nick at one point spoke quite strongly against adding multiple ways of doing the same thing). Paul From eric at trueblade.com Mon Aug 24 17:14:53 2015 From: eric at trueblade.com (Eric V. 
Smith) Date: Mon, 24 Aug 2015 11:14:53 -0400 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> Message-ID: <55DB34ED.30304@trueblade.com> On 08/23/2015 09:13 PM, Guido van Rossum wrote: > But for i-strings, I think it would be good if we could gather more > actual experience using them. Every potential use case brought up for > these so far (translation, html/shell/sql quoting) feels like there's a > lot of work needing to be done to see if the idea is actually viable > there. It would be a shame if we added all the (considerable!) machinery > for i-strings and all we got was yet another way to do it > (https://xkcd.com/927/), without killing at least one competing approach > (similar to the way .format() has failed to replace %). > > It's tough to envision how we could gather more experience with > i-strings *without* building them into the language, but I'm really > hesitant to add them without more experience. (This is the "new on the > job market" paradox. :-) Maybe they could be emulated using a function > call that uses sys._getframe() under the covers? Or maybe it's possible > to cook up an experiment using other syntax hooks? E.g. the coding hack > used in pyxl (https://github.com/dropbox/pyxl).[1] I hope you don't mind that I borrowed the keys to the time machine. 
I'm using the implementation of _string.formatter_parser() that I added for
implementing string.Formatter:

---8<---------------------------------------------
import sys
import _string

class i:
    def __init__(self, s):
        self.s = s
        locals = sys._getframe(1).f_locals
        globals = sys._getframe(1).f_globals
        self.values = {}
        # evaluate the expressions
        for literal, expr, format_spec, conversion in \
                _string.formatter_parser(self.s):
            if expr:
                # eval() takes globals first, then locals
                value = eval(expr, globals, locals)
                self.values[expr] = value

    def __str__(self):
        result = []
        for literal, expr, format_spec, conversion in \
                _string.formatter_parser(self.s):
            result.append(literal)
            if expr:
                value = self.values[expr]
                result.append(value.__format__(format_spec))
        return ''.join(result)
---8<---------------------------------------------

So now, instead of i"x={x}", we say i("x={x}").

Let's use it with str:

>>> x = i('Version in caps {sys.version[0:7].upper()}')
>>> x
<__main__.i object at 0x7f1653311e90>
>>> str(x)
'Version in caps 3.6.0A0'

Cool. Now let's whip up a simple i18n example:

>>> def gettext(s):
...     # Our complicated string lookup
...     if s == 'My name is {name}, my dog is {dog}':
...         return 'Mi perro es {dog}, y mi nombre es {name}'
...     return s
...
>>> def _(istring):
...     result = []
...     # do the gettext lookup
...     s = gettext(istring.s)
...     # use the values from our original istring,
...     # but the literals and ordering from our
...     # looked-up string
...     for literal, expr, format_spec, conversion in \
...             _string.formatter_parser(s):
...         result.append(literal)
...         if expr is not None:
...             result.append(istring.values[expr])
...     return ''.join(result)
...
>>> name = 'Eric'
>>> dog = 'Misty'
>>> x = i('My name is {name}, my dog is {dog}')
>>> str(x)
'My name is Eric, my dog is Misty'
>>> _(x)
'Mi perro es Misty, y mi nombre es Eric'
>>>

That should be enough to play with i-strings in logging, sql, xml, etc.
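
The experiment above can also be sketched without sys._getframe() or the
private _string module. What follows is only a minimal sketch and not part of
the original post: IString and translate are hypothetical names chosen for
illustration, the caller's namespace is assumed to be passed in explicitly as a
dict, and string.Formatter().parse() is used as the public wrapper around
_string.formatter_parser().

```python
import string


class IString:
    """Hypothetical stand-in for the i() class above (illustration only)."""

    def __init__(self, template, namespace):
        self.template = template
        self.values = {}
        # parse() yields (literal_text, field_name, format_spec, conversion)
        for _literal, expr, _spec, _conv in string.Formatter().parse(template):
            if expr:
                # Evaluate eagerly against the explicitly supplied namespace.
                # eval() on untrusted templates is unsafe; this is a toy.
                self.values[expr] = eval(expr, {}, namespace)

    def __str__(self):
        result = []
        for literal, expr, spec, _conv in string.Formatter().parse(self.template):
            result.append(literal)
            if expr:
                result.append(format(self.values[expr], spec))
        return ''.join(result)


def translate(istring, catalog):
    """Render with the catalog's literals and ordering, but the stored values."""
    template = catalog.get(istring.template, istring.template)
    result = []
    for literal, expr, spec, _conv in string.Formatter().parse(template):
        result.append(literal)
        if expr:
            result.append(format(istring.values[expr], spec))
    return ''.join(result)


catalog = {
    'My name is {name}, my dog is {dog}':
        'Mi perro es {dog}, y mi nombre es {name}',
}
x = IString('My name is {name}, my dog is {dog}',
            {'name': 'Eric', 'dog': 'Misty'})
print(str(x))
print(translate(x, catalog))
```

Passing the namespace explicitly costs some convenience compared with the
frame lookup, but it keeps the sketch independent of CPython internals.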
Several things should be addressed: hiding the call to
_string.formatter_parser inside the 'i' class, for example. And of course
don't use sys._getframe. But the ideas are all there.

I can't swear that _string.formatter_parser will parse all known
expressions, since that's not what it was designed to do. It will likely
fail with expressions that contain strings and braces, for example. I
haven't really checked. But hey, what do you want for free?

With a slight tweak, this code even works with 2.7: replace
"_string.formatter_parser" with "str._formatter_parser". Unfortunately,
2.7 will then only support very simple expressions. Oh, well.

Enjoy!

Eric.

From eric at trueblade.com Mon Aug 24 17:55:35 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Mon, 24 Aug 2015 11:55:35 -0400
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DB34ED.30304@trueblade.com>
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <55DB34ED.30304@trueblade.com>
Message-ID: <55DB3E77.5070309@trueblade.com>

I should have added: this is for i-strings that look like PEP 498's
f-strings. I'm not trying to jump to conclusions about the syntax: I'm
just trying to reuse some code, and making i-strings and f-strings look
like str.format strings allows me to reuse lots of infrastructure (as I
hope can be seen from this example).

For the final version, we can choose whatever syntax makes sense. I
would argue for i"Value={value}" (same for f-strings), but if we decide
to make it something else, I'll live with the decision.

Eric.

From Nikolaus at rath.org Mon Aug 24 18:30:12 2015
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Mon, 24 Aug 2015 09:30:12 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: (Nick Coghlan's message of "Sun, 23 Aug 2015 14:09:58 +1000")
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net>
Message-ID: <874mjotzsb.fsf@thinkpad.rath.org>

On Aug 23 2015, Nick Coghlan wrote:
> On 23 August 2015 at 11:37, Nick Coghlan wrote:
>> However, I'm now coming full circle back to the idea of making this a
>> string prefix, so that would instead look like:
>>
>>     subprocess.call($"echo $filename")
>>
>> The trick would be to make interpolation lazy *by default* (preserving
>> the triple of the raw template string, the parsed fields, and the
>> expression values), and put the default rendering in the resulting
>> object's *__str__* method.
>
> Indeed, after working through this latest change, I ended up back
> where I started from a syntactic perspective, with a proposal for
> i(nterpolated)-strings rather than f(ormatted)-strings:
> https://www.python.org/dev/peps/pep-0501/
>
> With appropriate modifications to subprocess.call, the proposal would
> then enable us to write a *safe* shell command interpolation as:
>
>     subprocess.call(i"echo $filename")

I like the idea, but *please* stop using this example. It's just terrible.

Firstly, subprocess.call defaults to shell=False, so this wouldn't even
work.

Secondly, subprocess.call(['echo', filename]) looks orders of magnitude
cleaner.
Thirdly, your i-string wouldn't even know how to quote because it
doesn't know what shell you are using.

Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

"Time flies like an arrow, fruit flies like a Banana."

From eric at trueblade.com Mon Aug 24 19:10:39 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Mon, 24 Aug 2015 13:10:39 -0400
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DB3E77.5070309@trueblade.com>
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <55DB34ED.30304@trueblade.com> <55DB3E77.5070309@trueblade.com>
Message-ID: <55DB500F.1090803@trueblade.com>

And because I can't leave well enough alone, here's an improved version.
It includes a little logging example, plus an implementation of
f-strings. Again, using f("") instead of f"".

It might only work with the hg tip (what will be 3.6). I don't have a
3.5 around to test it with. It won't work with 3.3 due to changes in
_string.formatter_parser. It's possible simpler expressions might work,
but I'm not well motivated to try it out.

Eric.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: istring.py
Type: text/x-python
Size: 2982 bytes
Desc: not available
URL:

From barry at python.org Mon Aug 24 19:12:11 2015
From: barry at python.org (Barry Warsaw)
Date: Mon, 24 Aug 2015 13:12:11 -0400
Subject: [Python-ideas] Draft PEP on string interpolation
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net>
Message-ID: <20150824131211.0bb1a429@anarchist>

On Aug 21, 2015, at 10:52 PM, Mike Miller wrote:

>Which syntax would you rather have for translation? (Knowing that you might
>give a different answer for standard interpolation.)

For i18n, $-strings (aka PEP 292, string.Template) is by far the best choice.
Translators are very familiar with the syntax, having used it now for many
years (and not just in a Python context), and it's very difficult for
non-technical folks to get wrong.

I don't see any advantages to springing yet another i18n interpolation syntax
on translators, and I definitely don't see the advantage of introducing a
*second* i18n syntax to translators of Python programs.

If that means PEP 498/501 isn't appropriate for Python i18n, so be it. What
we have now works, even if its implementation requires the use of some
frowned-upon APIs, and the use of function syntax for marking and invocation.

Cheers,
-Barry

From tritium-list at sdamon.com Mon Aug 24 19:12:11 2015
From: tritium-list at sdamon.com (Alexander Walters)
Date: Mon, 24 Aug 2015 13:12:11 -0400
Subject: [Python-ideas] Why does sys.flags have both .interactive and .inspect?
In-Reply-To: <73C4B992-E2AA-4C89-888D-14981AB9EAF9@yahoo.com>
References: <55DA1133.3000904@sdamon.com> <55DA1D05.7010300@mrabarnett.plus.com> <55DA1E76.4060700@sdamon.com> <55DA4B09.3040400@mrabarnett.plus.com> <55DAC2C5.8090504@sdamon.com> <73C4B992-E2AA-4C89-888D-14981AB9EAF9@yahoo.com>
Message-ID: <55DB506B.9030508@sdamon.com>

On 8/24/2015 03:25, Andrew Barnert wrote:
> Wouldn't it be better to write a more complete explanation of what
> "inspect" and "interactive" mean, so it's obvious that the one must
> imply the other but not vice-versa, instead of just saying that one
> implies the other and still leaving what they actually mean a mystery
> to anyone who doesn't read the C source?

Of...course?

From tjreedy at udel.edu Mon Aug 24 19:17:12 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 24 Aug 2015 13:17:12 -0400
Subject: [Python-ideas] Properties for classes possible?
In-Reply-To: <55DAECCA.70200@thomas-guettler.de>
References: <55D57989.1020704@thomas-guettler.de> <55DAECCA.70200@thomas-guettler.de>
Message-ID:

On 8/24/2015 6:07 AM, Thomas Güttler wrote:
>
> On 20.08.2015 at 17:29, Guido van Rossum wrote:
>> I think it's reasonable to propose @classproperty as a patch to
>> CPython. It needs to be C code. Not sure about the
>> writable version. The lazy=True part is not appropriate for the
>> stdlib (it's just a memoize pattern).
>
> What's the next step?

Open an issue on the tracker. Quote Guido's message above with list
name, date, and thread name -- or pipermail archive url. Add python
code below, or revision thereof, for someone to translate to C.

> My knowledge of the programming language C is very limited. I am not
> able to write a patch for CPython.
>
> I could write a patch which looks like this:
>
> {{{
> # From http://stackoverflow.com/a/5192374/633961
>
> class classproperty(object):
>     def __init__(self, f):
>         self.f = f
>     def __get__(self, obj, owner):
>         return self.f(owner)
>
> }}}

-- 
Terry Jan Reedy

From guido at python.org Mon Aug 24 19:38:33 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 24 Aug 2015 10:38:33 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <20150824131211.0bb1a429@anarchist>
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <20150824131211.0bb1a429@anarchist>
Message-ID:

On Mon, Aug 24, 2015 at 10:12 AM, Barry Warsaw wrote:

> On Aug 21, 2015, at 10:52 PM, Mike Miller wrote:
>
> >Which syntax would you rather have for translation? (Knowing that you might
> >give a different answer for standard interpolation.)
>
> For i18n, $-strings (aka PEP 292, string.Template) is by far the best choice.
> Translators are very familiar with the syntax, having used it now for many
> years (and not just in a Python context), and it's very difficult for
> non-technical folks to get wrong.
>
> I don't see any advantages to springing yet another i18n interpolation syntax
> on translators, and I definitely don't see the advantage of introducing a
> *second* i18n syntax to translators of Python programs.
>
> If that means PEP 498/501 isn't appropriate for Python i18n, so be it. What
> we have now works, even if its implementation requires the use of some
> frowned-upon APIs, and the use of function syntax for marking and invocation.
>

That's fair, and I'm glad we have this clear position on the table.

I cannot accept $ interpolation in the language definition. I also don't
want PEP 498 and 501 to use different interpolation syntaxes.
So to me, this means that i18n is off the table as a motivation for PEP 501
(it never was on the table for 498), and Nick can focus on motivational
examples from html/sql/shell code injection for PEP 501 (but only if he can
live with the PEP 498 surface syntax for interpolation).

-- 
--Guido van Rossum (python.org/~guido)

From wes.turner at gmail.com Mon Aug 24 20:14:35 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Mon, 24 Aug 2015 13:14:35 -0500
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <20150824131211.0bb1a429@anarchist>
Message-ID:

On Aug 24, 2015 12:39 PM, "Guido van Rossum" wrote:
> (...), and Nick can focus on motivational examples from html/sql/shell
> code injection for PEP 501 (but only if he can live with the PEP 498
> surface syntax for interpolation).

f('select {date} from {tablename}')
~= ['select ', UnescapedStr(date), 'from ', UnescapedStr(tablename)]

* UnescapedUntranslatedSoencodedStr
* _repr_shell
* quote or not?
* _repr_html
* charset, encoding
* _repr_sql
* WHERE x LIKE '%\%%'

>
> --
> --Guido van Rossum (python.org/~guido)
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From encukou at gmail.com Mon Aug 24 20:15:45 2015 From: encukou at gmail.com (Petr Viktorin) Date: Mon, 24 Aug 2015 20:15:45 +0200 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <20150824131211.0bb1a429@anarchist> Message-ID: On Mon, Aug 24, 2015 at 7:38 PM, Guido van Rossum wrote: > On Mon, Aug 24, 2015 at 10:12 AM, Barry Warsaw wrote: >> >> On Aug 21, 2015, at 10:52 PM, Mike Miller wrote: >> >> >Which syntax would you rather have for translation? (Knowing that you >> > might >> >give a different answer for standard interpolation.) >> >> For i18n, $-strings (aka PEP 292, string.Template) is by far the best >> choice. >> Translators are very familiar with the syntax, having used it now for many >> years (and not just in a Python context), and it's very difficult for >> non-technical folks to get wrong. >> >> I don't see any advantages to springing yet another i18n interpolation >> syntax >> on translators, and I definitely don't see the advantage of introducing a >> *second* i18n syntax to translators of Python programs. >> >> If that means PEP 498/501 isn't appropriate for Python i18n, so be it. >> What >> we have now works, even if its implementation requires the use of some >> frowned-upon APIs, and the use of function syntax for marking and >> invocation. > > > That's fair, and I'm glad we have this clear position on the table. > > I cannot accept $ interpolation in the language definition. I also don't > want PEP 498 and 501 to use different interpolation syntaxes. 
So to me, this
> means that i18n is off the table as a motivation for PEP 501 (it never was
> on the table for 498), and Nick can focus on motivational examples from
> html/sql/shell code injection for PEP 501 (but only if he can live with the
> PEP 498 surface syntax for interpolation).

The $ syntax might be a requirement for Barry, but it's definitely not
required for translations at large.

I agree that it *is* hard to introduce a new marker syntax in a
project, since any change in a string will generally require
re-translation in all languages. For flufl.i18n, $ is definitely best.
But it might not be best for new projects/libraries.

Translators can get familiar with lots of things; the projects I
helped translate used %1 (Qt/KDE) or %s (C/printf). Many Python
projects (e.g. Django [0]) use "%(name)s" markers, where translators
often leave off the "s". The brace syntax would be a big improvement.

[0] https://github.com/django/django/blob/master/django/conf/locale/en/LC_MESSAGES/django.po

From python-ideas at mgmiller.net Mon Aug 24 20:21:14 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 24 Aug 2015 11:21:14 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <55DADB2E.2020300@mgmiller.net>
Message-ID: <55DB609A.7000301@mgmiller.net>

Ok thanks,

I know someone out there is probably using templating to make templating
templates. But, we're getting out into the wilderness here. The original use
cases were shell scripts and "whipping up a quick string", which I'd argue are
more important.

Cheers,
-Mike

On 08/24/2015 02:48 AM, Nick Coghlan wrote:
> On 24 August 2015 at 18:51, Mike Miller wrote
>> Hi, this part I don't get, maybe because it's so late here.
Why create
>> Django/Jinja2/i20n templates inside Python code using another templating
>> language (whether Template or .format)?
>>
>> Those kind of templates should be in dedicated text files, no?
>
> Think of meta-templating tools like cookie-cutter or DevAssistant (or
> the project wizards in an IDE) - for those kinds of tools, "source
> file formats" are actually output formats. Once you look at enough
> different parts of the software development pipeline you find that
> pretty much *every* input format is an output format for some other
> tool :)
>
> Cheers,
> Nick.
>

From wes.turner at gmail.com Mon Aug 24 20:31:27 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Mon, 24 Aug 2015 13:31:27 -0500
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DB609A.7000301@mgmiller.net>
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <55DADB2E.2020300@mgmiller.net> <55DB609A.7000301@mgmiller.net>
Message-ID:

On Aug 24, 2015 1:21 PM, "Mike Miller" wrote:
>
> Ok thanks, I know someone out there is probably using templating to make
> templating templates. But, we're getting out into the wilderness here. The
> original use cases were shell scripts

Printf/str.format/str.__mod__/string concatenation are often
*dangerou;\n\s** in context to shell scripts (unless you're building a
"para"+"meter" that will itself be quoted/escaped; or passing tuple cmds to
eg subprocess.Popen); which is why I would use pypi:sarge for Python
2.x+,3.x+ here. Or yield a sequence of typed strings which can be
contextually ANDed.

> and "whipping up a quick string", which I'd argue are more important.
>
> Cheers,
> -Mike

From barry at python.org Mon Aug 24 20:55:42 2015
From: barry at python.org (Barry Warsaw)
Date: Mon, 24 Aug 2015 14:55:42 -0400
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <20150824131211.0bb1a429@anarchist>
Message-ID: <20150824145542.778b432a@limelight.wooz.org>

On Aug 24, 2015, at 10:38 AM, Guido van Rossum wrote:

>I cannot accept $ interpolation in the language definition. I also don't
>want PEP 498 and 501 to use different interpolation syntaxes. So to me,
>this means that i18n is off the table as a motivation for PEP 501 (it never
>was on the table for 498), and Nick can focus on motivational examples from
>html/sql/shell code injection for PEP 501 (but only if he can live with the
>PEP 498 surface syntax for interpolation).

I agree with this. Ignoring i18n, str.format() syntax is greatly preferred
over old-school %-syntax IMO, so focusing 498/501 on being compatible with the
former makes a lot of sense. Hopefully we can continue to make %-syntax
obsolete, deprecated, or at least disfavored.

Cheers,
-Barry

From eric at trueblade.com Mon Aug 24 22:10:55 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Mon, 24 Aug 2015 16:10:55 -0400
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DB500F.1090803@trueblade.com>
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <55DB34ED.30304@trueblade.com> <55DB3E77.5070309@trueblade.com> <55DB500F.1090803@trueblade.com>
Message-ID: <55DB7A4F.2050404@trueblade.com>

And here's an example with regexes, and a format_spec to say whether to
escape the text or not:

import re

def to_re(istring):
    # escape the value of the embedded expressions
    result = []
    for part in istring.parts():
        result.append(part.literal)
        if part.expr is not None:
            if part.format_spec == 'raw':
                result.append(part.value)
            else:
                result.append(re.escape(part.value))
    return re.compile(''.join(result))

delimiter = '+'
trailing_re = r'\S+'
regex = i(r'{delimiter}\d+{delimiter}{trailing_re:raw}')
print(to_re(regex))

If we did i-strings for real, that line would be:

regex =
ri'{delimiter}\d+{delimiter}{trailing_re:raw}' I'm not really sold on i-strings yet. But there's enough here for people to play with. Eric. On 08/24/2015 01:10 PM, Eric V. Smith wrote: > And because I can't leave well enough alone, here's an improved version. > It includes a little logging example, plus an implementation of > f-strings. Again, using f("") instead of f"". > > It might only work with the hg tip (what will be 3.6). I don't have a > 3.5 around to test it with. It won't work with 3.3 due to changes in > _string.formatter_parse. It's possible simpler expressions might work, > but I'm not well motivated to try it out. > > Eric. > > On 08/24/2015 11:55 AM, Eric V. Smith wrote: >> I should have added: this is for i-strings that look like PEP 498's >> f-strings. I'm not trying to jump to conclusions about the syntax: I'm >> just trying to reuse some code, and making i-strings and f-strings look >> like str.format strings allows me to reuse lots of infrastructure (as I >> hope can be seen from this example). >> >> For the final version, we can choose whatever syntax makes sense. I >> would argue for i"Value={value}" (same for f-strings), but if we decide >> to make it something else, I'll live with the decision. >> >> Eric. >> >> On 08/24/2015 11:14 AM, Eric V. Smith wrote: >>> On 08/23/2015 09:13 PM, Guido van Rossum wrote: >>>> But for i-strings, I think it would be good if we could gather more >>>> actual experience using them. Every potential use case brought up for >>>> these so far (translation, html/shell/sql quoting) feels like there's a >>>> lot of work needing to be done to see if the idea is actually viable >>>> there. It would be a shame if we added all the (considerable!) machinery >>>> for i-strings and all we got was yet another way to do it >>>> (https://xkcd.com/927/), without killing at least one competing approach >>>> (similar to the way .format() has failed to replace %). 
>>>>
>>>> It's tough to envision how we could gather more experience with
>>>> i-strings *without* building them into the language, but I'm really
>>>> hesitant to add them without more experience. (This is the "new on the
>>>> job market" paradox. :-) Maybe they could be emulated using a function
>>>> call that uses sys._getframe() under the covers? Or maybe it's possible
>>>> to cook up an experiment using other syntax hooks? E.g. the coding hack
>>>> used in pyxl (https://github.com/dropbox/pyxl).[1]
>>>
>>>
>>> I hope you don't mind that I borrowed the keys to the time machine. I'm
>>> using the implementation of _string.formatter_parser() that I added for
>>> implementing string.Formatter:
>>>
>>> ---8<---------------------------------------------
>>> import sys
>>> import _string
>>>
>>> class i:
>>>     def __init__(self, s):
>>>         self.s = s
>>>         locals = sys._getframe(1).f_locals
>>>         globals = sys._getframe(1).f_globals
>>>         self.values = {}
>>>         # evaluate the expressions
>>>         for literal, expr, format_spec, conversion in \
>>>                 _string.formatter_parser(self.s):
>>>             if expr:
>>>                 # eval() takes globals first, then locals
>>>                 value = eval(expr, globals, locals)
>>>                 self.values[expr] = value
>>>
>>>     def __str__(self):
>>>         result = []
>>>         for literal, expr, format_spec, conversion in \
>>>                 _string.formatter_parser(self.s):
>>>             result.append(literal)
>>>             if expr:
>>>                 value = self.values[expr]
>>>                 result.append(value.__format__(format_spec))
>>>         return ''.join(result)
>>> ---8<---------------------------------------------
>>>
>>> So now, instead of i"x={x}", we say i("x={x}").
>>>
>>> Let's use it with str:
>>>
>>>>>> x = i('Version in caps {sys.version[0:7].upper()}')
>>>>>> x
>>> <__main__.i object at 0x7f1653311e90>
>>>>>> str(x)
>>> 'Version in caps 3.6.0A0'
>>>
>>>
>>> Cool. Now let's whip up a simple i18n example:
>>>
>>>>>> def gettext(s):
>>> ...     # Our complicated string lookup
>>> ...     if s == 'My name is {name}, my dog is {dog}':
>>> ...         return 'Mi pero es {dog}, y mi nombre es {name}'
>>> ...     return s
>>> ...
>>>>>> def _(istring):
>>> ...     result = []
>>> ...     # do the gettext lookup
>>> ...     s = gettext(istring.s)
>>> ...     # use the values from our original istring,
>>> ...     #  but the literals and ordering from our
>>> ...     #  looked-up string
>>> ...     for literal, expr, format_spec, conversion in \
>>> ...             _string.formatter_parser(s):
>>> ...         result.append(literal)
>>> ...         if expr is not None:
>>> ...             result.append(istring.values[expr])
>>> ...     return ''.join(result)
>>> ...
>>>>>> name = 'Eric'
>>>>>> dog = 'Misty'
>>>>>> x = i('My name is {name}, my dog is {dog}')
>>>>>> str(x)
>>> 'My name is Eric, my dog is Misty'
>>>>>> _(x)
>>> 'Mi pero es Misty, y mi nombre es Eric'
>>>>>>
>>>
>>> That should be enough to play with i-strings in logging, sql, xml, etc.
>>>
>>> Several things should be addressed: hiding the call to
>>> _string.formatter_parser inside the 'i' class, for example. And of course
>>> don't use sys._getframe. But the ideas are all there.
>>>
>>> I can't swear that _string.formatter_parser will parse all known
>>> expressions, since that's not what it was designed to do. It will likely
>>> fail with expressions that contain strings and braces, for example. I
>>> haven't really checked. But hey, what do you want for free?
>>>
>>> With a slight tweak, this code even works with 2.7: replace
>>> "_string.formatter_parser" with "str._formatter_parser". Unfortunately,
>>> 2.7 will then only support very simple expressions. Oh, well.
>>>
>>> Enjoy!
>>>
>>> Eric.
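
For anyone who wants to try the emulation quoted above without the private
_string module, here is a rough sketch of the same idea. It assumes that
string.Formatter().parse() (the public wrapper around the same parser) is an
acceptable substitute, it still relies on the CPython-specific
sys._getframe(), and the demo() helper is made up purely for illustration.
It is not the attached implementation from this thread.

```python
import string
import sys


class i:
    """Emulated i-string: evaluates the {expr} fields now, renders on str()."""

    def __init__(self, s):
        self.s = s
        frame = sys._getframe(1)  # the caller's frame; CPython-specific
        self.values = {}
        for _literal, expr, _spec, _conv in string.Formatter().parse(s):
            if expr:
                # eval() expects globals first, then locals
                self.values[expr] = eval(expr, frame.f_globals, frame.f_locals)

    def __str__(self):
        result = []
        for literal, expr, spec, _conv in string.Formatter().parse(self.s):
            result.append(literal)
            if expr:
                result.append(format(self.values[expr], spec))
        return ''.join(result)


def demo():
    # Hypothetical caller: the names are looked up in this function's frame.
    name = 'Eric'
    dog = 'Misty'
    return str(i('My name is {name}, my dog is {dog}'))


print(demo())
```

As in the original, conversions such as !r are ignored and the parser was
never meant for arbitrary expressions, so anything containing braces or
quotes inside a field may not survive.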

From njs at pobox.com Mon Aug 24 22:44:02 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 24 Aug 2015 13:44:02 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <20150824131211.0bb1a429@anarchist>
Message-ID:

On Mon, Aug 24, 2015 at 10:38 AM, Guido van Rossum wrote:
> I cannot accept $ interpolation in the language definition. I also don't
> want PEP 498 and 501 to use different interpolation syntaxes. So to me, this
> means that i18n is off the table as a motivation for PEP 501 (it never was
> on the table for 498), and Nick can focus on motivational examples from
> html/sql/shell code injection for PEP 501 (but only if he can live with the
> PEP 498 surface syntax for interpolation).

From the early part of this discussion [1], I had the impression that
the goal was that eventually string interpolation would be on by
default for all strings, with PEP 498 intended as an intermediate step
towards that goal. Is that still true, or is the plan now that
interpolated strings will always require an explicit marker (like
'f')?

I ask because if they *do* require an explicit marker, then obviously
the best thing is for the syntax to match that of .format. But, if
this will be enabled for all strings in Python 3.something, then it
seems like we should be careful now to make sure that the syntax is
clearly distinct from that used for .format ("${...}" or "\{...}" or
...), because anything else creates nasty compatibility problems for
people trying to write format template strings that work on both old
and new Pythons.

(This is also assuming that f-string interpolation and the eventual
plain-old-string interpolation will use the same syntax, but that
seems like a highly desirable property to me..)

-n

[1] http://thread.gmane.org/gmane.comp.python.ideas/34980

-- 
Nathaniel J. Smith -- http://vorpus.org

From python-ideas at mgmiller.net Mon Aug 24 22:57:45 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 24 Aug 2015 13:57:45 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55D65E4F.1040608@mgmiller.net>
References: <55D65E4F.1040608@mgmiller.net>
Message-ID: <55DB8549.3070908@mgmiller.net>

Hi, here's my latest idea, riffing on others' latest this weekend.

Let's call these e-strings (for expression), as it's easier to refer to the
letter of the proposals than three-digit numbers.

So, an e-string looks like an f-string, though at compile-time, it is
converted to an object instead (like i-string):

    print(e'Hello {friend}, filename: {filename}.')  # converts to ==>

    print(estr('Hello {friend}, filename: {filename}.', friend=friend,
               filename=filename))

An estr is a subclass of str, therefore able to do the nice things a string can do.
Rendering is deferred, and it also has a raw member, escape(), and
translate() methods:

    class estr(str):
        # init: saves self.raw, args, kwargs for later
        # methods, ops render it
        # def escape(self, escape_func):  # handles escaping
        # def translate(self, template, safe=True): # optional i18n support

To make it as simple as possible to use by end-developers, it 1) doesn't
require str() to be run explicitly, it renders itself when needed via its
various methods and operators.  Look for .raw, if you need the original.

Also, 2) a bit of responsibility is pushed to stdlib/pypi.  In a handful of
sensitive places, the object is checked beforehand and escaped when needed:

    def os_system(command):  # imagine os.system, subprocess, dbapi, etc.
        if isinstance(command, estr):
            command = command.escape(shlex.quote)  # each chooses its own rules
        do_something(command)

This means a billion lines of code using e-strings won't have to care about
them, only a handful of places.  What is easiest to type is now safe as well:

    os.system(e'cat {filename}')  # sleep easy

A translate method might be available also (though we may have given up on
i18n already), to provide a new raw string from a message catalog:

    rendered = message.translate(translated_message)  # fmt syntax TBD

This should enable the safety and features we'd like, without burdening the
everyday user.  I've created a sample script, here is the output:

    # consider: estr('Hello {friend}, filename: {filename}.')
    friend: 'John'
    filename: "somefile; rm -rf ~ 'foo' "

    original: Hello {friend}, filename: {filename}.
    print(): Hello John, filename: somefile; rm -rf ~ 'foo' .

    shell escape: Hello John, filename: 'somefile; rm -rf ~ '"'"'foo'"'"' '.
    html escape: Hello John, filename: somefile; rm -rf ~ 'foo' <html>.
    sql escape: Hello "John", filename: "somefile; rm -rf ~ 'foo' ".
    logger DEBUG Hello John, filename: somefile; rm -rf ~ 'foo' .

    upper+utf8: b"HELLO JOHN, FILENAME: SOMEFILE; RM -RF ~ 'FOO' ."
    translated: Hola John, archivo: somefile; rm -rf ~ 'foo' .

Anything I've missed?

-Mike

On 08/20/2015 04:10 PM, Mike Miller wrote:
> The ground seems to be settling on the issue, so I have tried my hand at a grand
> unified pep for string interpolation.
>

From Nikolaus at rath.org Mon Aug 24 23:28:47 2015
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Mon, 24 Aug 2015 14:28:47 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DB8549.3070908@mgmiller.net> (Mike Miller's message of "Mon, 24 Aug 2015 13:57:45 -0700")
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net>
Message-ID: <87pp2cs7e8.fsf@thinkpad.rath.org>

On Aug 24 2015, Mike Miller wrote:
> Also, 2) a bit of responsibility is pushed to stdlib/pypi.  In a
> handful of sensitive places, the object is checked beforehand and
> escaped when needed:
>
>     def os_system(command):  # imagine os.system, subprocess, dbapi, etc.
>         if isinstance(command, estr):
>             command = command.escape(shlex.quote)  # each chooses its own rules
>         do_something(command)
>
> This means a billion lines of code using e-strings won't have to care
> about them, only a handful of places.  What is easiest to type is now
> safe as well:
>
>     os.system(e'cat {filename}')  # sleep easy

*shudder*. After years of efforts to get people not to do this, you want
to change course by 180 degrees and start telling people this is ok if
they add an additional single character in front of the string?

This sounds like a very bad idea to me for many reasons:

 - People will forget to type the 'e', and things will appear to work
   but be buggy.
 - People will forget that they need the 'e' (and the same thing will
   happen, further reinforcing the thought that the e is not required)
 - People will be confused because other languages don't have the 'e'
   (hmm. how do I do this in Perl? I guess I'll just drop the
   'e'... *check*, works, great!)
 - People will assume that their my_custom_system() call also
   special-cases e strings and escapes them (which it won't).

Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             “Time flies like an arrow, fruit flies like a Banana.”

From wes.turner at gmail.com Mon Aug 24 23:29:07 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Mon, 24 Aug 2015 16:29:07 -0500
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DB8549.3070908@mgmiller.net>
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net>
Message-ID:

On Mon, Aug 24, 2015 at 3:57 PM, Mike Miller wrote:

> Hi, here's my latest idea, riffing on other's latest this weekend.
>
> Let's call these e-strings (for expression), as it's easier to refer to
> the letter of the proposals than three digit numbers.
>
> So, an e-string looks like an f-string, though at compile-time, it is
> converted to an object instead (like i-string):
>
>     print(e'Hello {friend}, filename: {filename}.')  # converts to ==>
>
>     print(estr('Hello {friend}, filename: {filename}.', friend=friend,
>                filename=filename))
>
> An estr is a subclass of str, therefore able to do the nice things a
> string can do.  Rendering is deferred, and it also has a raw member,
> escape(), and translate() methods:
>
>     class estr(str):
>         # init: saves self.raw, args, kwargs for later
>         # methods, ops render it
>         # def escape(self, escape_func):  # handles escaping
>         # def translate(self, template, safe=True): # optional i18n support
>

* How do I overload/subclass [class estr()]?
* Does it always just read LC_ALL='utf8' (or where do I specify that
  global/thread/frame-local?)
* How do I escape_func?

Jinja2 uses MarkupSafe, with a class named Markup:

    class Markup():
        def __html__()
        def __html_format__()

IPython can display objects with _repr_fmt_() callables, which TBH I prefer
because it's not name mangled and so more easily testable.
[3,4]

Existing IPython rich display methods [5,6,7,8]

    _mime_map = dict(
        _repr_png_="image/png",
        _repr_jpeg_="image/jpeg",
        _repr_svg_="image/svg+xml",
        _repr_html_="text/html",
        _repr_json_="application/json",
        _repr_javascript_="application/javascript",
    )
    # _repr_latex_ = "text/latex"
    # _repr_retina_ = "image/png"

Suggested IPython methods

- [ ] _repr_shell_
  - [ ] single_quote_shell_escape
  - [ ] double_quote_shell_escape
- [ ] _repr_sql_ (*NOTE: SQL variants, otherworldly-escaping dependency /
  newb errors)

[1] https://pypi.python.org/pypi/MarkupSafe
[2] https://github.com/mitsuhiko/markupsafe
[3] https://ipython.org/ipython-doc/dev/config/integrating.html
[4] https://ipython.org/ipython-doc/dev/config/integrating.html#rich-display
[5] https://github.com/ipython/ipython/blob/master/IPython/utils/capture.py
[6] https://github.com/ipython/ipython/blob/master/IPython/utils/tests/test_capture.py
[7] https://github.com/ipython/ipython/blob/master/IPython/core/display.py
[8] https://github.com/ipython/ipython/blob/master/IPython/core/tests/test_display.py

* IPython: _repr_fmt_()
* MarkupSafe: __html__()

> To make it as simple as possible to use by end-developers, it 1) doesn't
> require str() to be run explicitly, it renders itself when needed via its
> various methods and operators.  Look for .raw, if you need the original.
>
> Also, 2) a bit of responsibility is pushed to stdlib/pypi.  In a handful
> of sensitive places, the object is checked beforehand and escaped when
> needed:
>
>     def os_system(command):  # imagine os.system, subprocess, dbapi, etc.
>         if isinstance(command, estr):
>             command = command.escape(shlex.quote)  # each chooses its own rules
>         do_something(command)
>
> This means a billion lines of code using e-strings won't have to care
> about them, only a handful of places.
> What is easiest to type is now safe as well:
>
>     os.system(e'cat {filename}')  # sleep easy
>
> A translate method might be available also (though we may have given up on
> i18n already), to provide a new raw string from a message catalog:
>
>     rendered = message.translate(translated_message)  # fmt syntax TBD
>
> This should enable the safety and features we'd like, without burdening
> the everyday user.  I've created a sample script, here is the output:
>
>     # consider: estr('Hello {friend}, filename: {filename}.')
>     friend: 'John'
>     filename: "somefile; rm -rf ~ 'foo' "
>
>     original: Hello {friend}, filename: {filename}.
>     print(): Hello John, filename: somefile; rm -rf ~ 'foo' .
>
>     shell escape: Hello John, filename: 'somefile; rm -rf ~ '"'"'foo'"'"' '.
>     html escape: Hello John, filename: somefile; rm -rf ~ 'foo' <html>.
>     sql escape: Hello "John", filename: "somefile; rm -rf ~ 'foo' ".
>     logger DEBUG Hello John, filename: somefile; rm -rf ~ 'foo' .
>
>     upper+utf8: b"HELLO JOHN, FILENAME: SOMEFILE; RM -RF ~ 'FOO' ."
>     translated: Hola John, archivo: somefile; rm -rf ~ 'foo' .
>
>
> Anything I've missed?
>
> -Mike
>
>
> On 08/20/2015 04:10 PM, Mike Miller wrote:
>
>> The ground seems to be settling on the issue, so I have tried my hand at
>> a grand
>> unified pep for string interpolation.
>>

From guido at python.org Mon Aug 24 23:49:51 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 24 Aug 2015 14:49:51 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <20150824131211.0bb1a429@anarchist>
Message-ID:

On Mon, Aug 24, 2015 at 1:44 PM, Nathaniel Smith wrote:

> From the early part of this discussion [1], I had the impression that
> the goal was that eventually string interpolation would be on by
> default for all strings, with PEP 498 intended as an intermediate step
> towards that goal. Is that still true, or is the plan now that
> interpolated strings will always require an explicit marker (like
> 'f')?
>

That was not received well, so I think it's dead.

> I ask because if they *do* require an explicit marker, then obviously
> the best thing is for the syntax to match that of .format. But, if
> this will be enabled for all strings in Python 3.something, then it
> seems like we should be careful now to make sure that the syntax is
> clearly distinct from that used for .format ("${...}" or "\{...}" or
> ...), because anything else creates nasty compatibility problems for
> people trying to write format template strings that work on both old
> and new Pythons.
>

Good point.

> (This is also assuming that f-string interpolation and the eventual
> plain-old-string interpolation will use the same syntax, but that
> seems like a highly desirable property to me..)
>
> -n
>
> [1] http://thread.gmane.org/gmane.comp.python.ideas/34980
>
> --
> Nathaniel J. Smith -- http://vorpus.org
>

-- 
--Guido van Rossum (python.org/~guido)

From python-ideas at mgmiller.net Mon Aug 24 23:54:40 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 24 Aug 2015 14:54:40 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <87pp2cs7e8.fsf@thinkpad.rath.org>
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org>
Message-ID: <55DB92A0.3010909@mgmiller.net>

On 08/24/2015 02:28 PM, Nikolaus Rath wrote:
> *shudder*. After years of efforts to get people not to do this, you want
> to change course by 180 degrees and start telling people this is ok if
> they add an additional single character in front of the string?
>
> This sounds like a very bad idea to me for many reasons:
>
>  - People will forget to type the 'e', and things will appear to work
>    but be buggy.
>  - People will forget that they need the 'e' (and the same thing will
>    happen, further reinforcing the thought that the e is not required)
>  - People will be confused because other languages don't have the 'e'
>    (hmm. how do I do this in Perl? I guess I'll just drop the
>    'e'... *check*, works, great!)
>  - People will assume that their my_custom_system() call also
>    special-cases e strings and escapes them (which it won't).
>

No, since the variables will not be replaced, therefore the command-line
won't work.

The previous proposals ignored this altogether.  A partial solution is better
than none, I think.

I don't propose we document this as the recommended way, anyway.
subprocess.call('foo', shell=False) is that.  This is just a way to do the
right thing in a number of common situations where we can do it.
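
A small sketch of that recommended no-shell form, for comparison.  The child
process here is simply the Python interpreter echoing its first argument,
which is an assumption made only so the snippet runs anywhere; it stands in
for whatever external program would really be called.

```python
import subprocess
import sys

filename = "somefile; rm -rf ~ 'foo' "

# No shell is involved: each list item reaches the child as one argument,
# so the shell metacharacters in `filename` are never interpreted.
child = [sys.executable, '-c', 'import sys; sys.stdout.write(sys.argv[1])']
echoed = subprocess.check_output(child + [filename])

# The child saw the value exactly as given, quotes and semicolon included.
print(echoed == filename.encode())
```

With an argument list there is nothing to quote, which is why this stays the
safer default whenever a shell is not actually needed.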

-Mike

From p.f.moore at gmail.com Mon Aug 24 23:54:54 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 24 Aug 2015 22:54:54 +0100
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <87pp2cs7e8.fsf@thinkpad.rath.org>
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org>
Message-ID:

On 24 August 2015 at 22:28, Nikolaus Rath wrote:
>>     os.system(e'cat {filename}')  # sleep easy
>
> *shudder*. After years of efforts to get people not to do this, you want
> to change course by 180 degrees and start telling people this is ok if
> they add an additional single character in front of the string?
>
> This sounds like a very bad idea to me for many reasons:
>
>  - People will forget to type the 'e', and things will appear to work
>    but be buggy.
>  - People will forget that they need the 'e' (and the same thing will
>    happen, further reinforcing the thought that the e is not required)
>  - People will be confused because other languages don't have the 'e'
>    (hmm. how do I do this in Perl? I guess I'll just drop the
>    'e'... *check*, works, great!)
>  - People will assume that their my_custom_system() call also
>    special-cases e strings and escapes them (which it won't).

Agreed. In a convenience library where it's absolutely clear that a
shell is involved (something like sarge or invoke) this is OK, but not
in the stdlib as the "official" way to call external programs.

Also:

 - People will fail to understand the difference between e'...' and
   f'...' and will use the wrong one when using os.system, and things
   will work correctly but with security vulnerabilities.
 - Teaching Python will be complicated by needing to explain why both
   f'...' and e'...' exist, and what the difference is. Trying to do
   that for beginners without baffling them with discussions of security
   vulnerabilities will be challenging...

Paul

From python-ideas at mgmiller.net Mon Aug 24 23:59:06 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 24 Aug 2015 14:59:06 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net>
Message-ID: <55DB93AA.1020001@mgmiller.net>

On 08/24/2015 02:29 PM, Wes Turner wrote:
>
> * How do I overload/subclass [class estr()]?

    class wes_estr(estr):
        pass

> * Does it always just read LC_ALL='utf8' (or where do I specify that
>   global/thread/frame-local?)

No, I just chose that in my script to show that it supports str
functionality, for example .encode('utf-8'); it is not otherwise related to
estr.  I should post the script.

> * How do I escape_func?

You pass in a function that does the escaping.

> Jinja2 uses MarkupSafe, with a class named Markup:
>
>     class Markup():
>         def __html__()
>         def __html_format__()

By letting the caller set the escaping rules via a passed function, estr does
not have to know anything about escaping, and is much simpler.  Also, the
caller could supply its own escaping rules.

-Mike

From python-ideas at mgmiller.net Tue Aug 25 00:06:54 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 24 Aug 2015 15:06:54 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org>
Message-ID: <55DB957E.4050005@mgmiller.net>

On 08/24/2015 02:54 PM, Paul Moore wrote:
> Agreed. In a convenience library where it's absolutely clear that a
> shell is involved (something like sarge or invoke) this is OK, but not
> in the stdlib as the "official" way to call external programs.

Don't focus on os.system(); it could be any function and is not particularly
relevant, nor do I recommend this line as the official way.

Remember Nick Coghlan's statement that the "easy way should be the right way"?
That's what this is trying to accomplish.

> - People will fail to understand the difference between e'...' and
>   f'...' and will use the wrong one when using os.system, and things
>   will work correctly but with security vulnerabilities.

I don't recommend e'' and f'', only e'' at this moment.

-Mike

From wes.turner at gmail.com Tue Aug 25 00:21:30 2015
From: wes.turner at gmail.com (Wes Turner)
Date: Mon, 24 Aug 2015 17:21:30 -0500
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DB957E.4050005@mgmiller.net>
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB957E.4050005@mgmiller.net>
Message-ID:

On Mon, Aug 24, 2015 at 5:06 PM, Mike Miller wrote:

>
> On 08/24/2015 02:54 PM, Paul Moore wrote:
> > Agreed. In a convenience library where it's absolutely clear that a
> > shell is involved (something like sarge or invoke) this is OK, but not
> > in the stdlib as the "official" way to call external programs.
>
> Don't focus on os.system(), it could be any function, and not particularly
> relevant, nor do I recommend this line as the official way.
>
> Remember Nick Coghlan's statement that the "easy way should be the right
> way"?
> That's what this is trying to accomplish.
>
> > - People will fail to understand the difference between e'...' and
> >   f'...' and will use the wrong one when using os.system, and things
> >   will work correctly but with security vulnerabilities.
>
> I don't recommend e'' and f'', only e'' at this moment.

How would e strings prevent this:

    In [1]: import subprocess

    In [2]: subprocess.call('echo 1\necho 2', shell=True)
    1
    2
    Out[2]: 0

    In [3]: import sarge

    In [4]: sarge.run('echo 1\necho 2')
    1 echo 2
    Out[4]:

    In [5]: sarge.shell_quote??
    Signature: sarge.shell_quote(s)
    Source:
    def shell_quote(s):
        """
        Quote text so that it is safe for Posix command shells.

        For example, "*.py" would be converted to "'*.py'". If the text is
        considered safe it is returned unquoted.

        :param s: The value to quote
        :type s: str (or unicode on 2.x)
        :return: A safe version of the input, from the point of view of Posix
                 command shells
        :rtype: The passed-in type
        """
        assert isinstance(s, string_types)
        if not s:
            result = "''"
        elif not UNSAFE.search(s):
            result = s
        else:
            result = "'%s'" % s.replace("'", r"'\''")
        return result
    File:      ~/.local/lib/python2.7/site-packages/sarge/__init__.py
    Type:      function

From a code review standpoint, my eyes are tired and I'd rather have more
than 1 character to mistype (because of the hamming distance between really
all of the proposed single-letter string prefixes, and u'' and r'', and e")

>
>
> -Mike
>

From python-ideas at mgmiller.net Tue Aug 25 00:26:13 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 24 Aug 2015 15:26:13 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB957E.4050005@mgmiller.net>
Message-ID: <55DB9A05.5060206@mgmiller.net>

In the given example it uses shlex.quote on each variable:

    https://docs.python.org/dev/library/shlex.html#shlex.quote

Btw, no one has to use this form, it simply helps when someone does.
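
To make the per-variable quoting concrete, here is a rough sketch.  The estr
class below is a hypothetical, simplified stand-in (it renders eagerly rather
than lazily, and takes its values as keyword arguments), and shell_command()
is an invented name for whatever shell-facing function would do the check; it
is not the linked example script.

```python
import shlex


class estr(str):
    """Hypothetical, simplified stand-in for the proposed e-string object."""

    def __new__(cls, raw, **values):
        # Simplification: render eagerly so the object is usable as a str.
        self = super().__new__(cls, raw.format(**values))
        self.raw = raw
        self.values = values
        return self

    def escape(self, escape_func):
        # Re-render the raw template with every value escaped first.
        escaped = {k: escape_func(str(v)) for k, v in self.values.items()}
        return self.raw.format(**escaped)


def shell_command(command):
    """What a shell-facing function might do before running `command`."""
    if isinstance(command, estr):
        command = command.escape(shlex.quote)
    return command


filename = "somefile; rm -rf ~ 'foo' "
print(shell_command(estr('cat {filename}', filename=filename)))
print(shell_command(estr('echo {arg}', arg='1\necho 2')))
```

Each interpolated value ends up as a single quoted shell word, so a newline
arriving through a variable cannot start a second command.  A newline written
directly into the literal template is still passed through untouched, since
only the values are quoted.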

-Mike

From python-ideas at mgmiller.net Tue Aug 25 00:27:39 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 24 Aug 2015 15:27:39 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DB8549.3070908@mgmiller.net>
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net>
Message-ID: <55DB9A5B.4000902@mgmiller.net>

Here's the example script to demonstrate:

    https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_example.py

-Mike

From njs at pobox.com Tue Aug 25 00:32:25 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 24 Aug 2015 15:32:25 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <87pp2cs7e8.fsf@thinkpad.rath.org>
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org>
Message-ID:

On Mon, Aug 24, 2015 at 2:28 PM, Nikolaus Rath wrote:
> On Aug 24 2015, Mike Miller wrote:
>> Also, 2) a bit of responsibility is pushed to stdlib/pypi.  In a
>> handful of sensitive places, the object is checked beforehand and
>> escaped when needed:
>>
>>     def os_system(command):  # imagine os.system, subprocess, dbapi, etc.
>>         if isinstance(command, estr):
>>             command = command.escape(shlex.quote)  # each chooses its own rules
>>         do_something(command)
>>
>> This means a billion lines of code using e-strings won't have to care
>> about them, only a handful of places.  What is easiest to type is now
>> safe as well:
>>
>>     os.system(e'cat {filename}')  # sleep easy
>
> *shudder*. After years of efforts to get people not to do this, you want
> to change course by 180 degrees and start telling people this is ok if
> they add an additional single character in front of the string?
The problem is that despite years of effort trying to get people not to do things like this, it's still the case that if you look at, say, MITRE's ranked list of the "top 25 most dangerous software errors": https://cwe.mitre.org/top25/index.html then numbers #1, #2, and #4 are improper quoting. (#3 is buffer overflows.) Or if you look at the OWASP consensus list on the most critical web application security risks ("based on 8 datasets from 7 firms that specialize in application security, including 4 consulting companies and 3 tool/SaaS vendors (1 static, 1 dynamic, and 1 with both). This data spans over 500,000 vulnerabilities..."), then numbers #1 and #3 are improper quoting: https://www.owasp.org/index.php/Top_10_2013-Top_10 I mean, it's great that the rise of languages like Python that have easy range-checked string manipulation has knocked buffer overflows out of the #1 spot, but... :-) Guido is right that the nice thing about classic string interpolation is that its use in many languages gives us tons of data about how it works in practice. But one of the things that data tells us is that it actually causes a lot of problems! Do we actually want to continue the status quo, where one set of people keep designing languages features to make it easier and easier to slap strings together, and then another set of people spend increasing amounts of energy trying to educate all the users about why they shouldn't actually use those features? It wouldn't be the end of the world (that's why we call it "the status quo" ;-)), and trying to design something new and better is always difficult and risky, but this seems like a good moment to think very hard about whether there's a better way. (And possibly about whether that better way is something we could put up on PyPI now while the 3.6 freeze is still a year out...) -n -- Nathaniel J. 
Smith -- http://vorpus.org From njs at pobox.com Tue Aug 25 00:37:08 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 24 Aug 2015 15:37:08 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DB8549.3070908@mgmiller.net> References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> Message-ID: On Mon, Aug 24, 2015 at 1:57 PM, Mike Miller wrote: > Hi, here's my latest idea, riffing on other's latest this weekend. > > Let's call these e-strings (for expression), as it's easier to refer to the > letter of the proposals than three digit numbers. > > So, an e-string looks like an f-string, though at compile-time, it is > converted to an object instead (like i-string): > > print(e'Hello {friend}, filename: {filename}.') # converts to ==> > > print(estr('Hello {friend}, filename: {filename}.', friend=friend, > filename=filename)) > > An estr is a subclass of str, therefore able to do the nice things a string > can do. Rendering is deferred, and it also has a raw member, escape(), and > translate() methods: > > class estr(str): > # init: saves self.raw, args, kwargs for later > # methods, ops render it > # def escape(self, escape_func): # handles escaping > # def translate(self, template, safe=True): # optional i18n support > > To make it as simple as possible to use by end-developers, it 1) doesn't > require str() to be run explicitly, it renders itself when needed via its > various methods and operators. Look for .raw, if you need the original. This is a really interesting idea. You could potentially re-use PyUnicode_READY to do the default rendering. Some things to think about: - If I concatenate two e-string objects, or an e-string and a regular string, or interpolate an e-string into an e-string, then what happens? - How problematic will it be that an e-string pins all the interpolated objects in memory for its lifetime? -n -- Nathaniel J. 
Smith -- http://vorpus.org From python-ideas at mgmiller.net Tue Aug 25 00:45:21 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Mon, 24 Aug 2015 15:45:21 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> Message-ID: <55DB9E81.4060500@mgmiller.net> On 08/24/2015 03:37 PM, Nathaniel Smith wrote: > - If I concatenate two e-string objects, or an e-string and a regular > string, or interpolate an e-string into an e-string, then what > happens? In the example URL I just posted, concatenation renders each string before concatenation, then returns a regular string with both concatenated. If interp into interp ((boggle)), when the passed one gets formatted, the formatting operation will render it. Good test case. > - How problematic will it be that an e-string pins all the > interpolated objects in memory for its lifetime? It will be an object holding a raw template string, and a number of variables. In normal usage I don't suspect it to be a problem. -Mike From guido at python.org Tue Aug 25 00:45:34 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 24 Aug 2015 15:45:34 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> Message-ID: On Mon, Aug 24, 2015 at 3:32 PM, Nathaniel Smith wrote: > [...] > I mean, it's great that the rise of languages like Python that have > easy range-checked string manipulation has knocked buffer overflows > out of the #1 spot, but... :-) > > Guido is right that the nice thing about classic string interpolation > is that its use in many languages gives us tons of data about how it > works in practice. But one of the things that data tells us is that it > actually causes a lot of problems!
Do we actually want to continue the > status quo, where one set of people keep designing languages features > to make it easier and easier to slap strings together, and then > another set of people spend increasing amounts of energy trying to > educate all the users about why they shouldn't actually use those > features? It wouldn't be the end of the world (that's why we call it > "the status quo" ;-)), and trying to design something new and better > is always difficult and risky, but this seems like a good moment to > think very hard about whether there's a better way. > Or maybe from the persistence of quoting bugs we could conclude that the ways people slap strings together have very little effect on this category of bugs? > (And possibly about whether that better way is something we could put > up on PyPI now while the 3.6 freeze is still a year out...) > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Tue Aug 25 00:55:59 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 25 Aug 2015 00:55:59 +0200 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <55DA9B63.3010208@uni-wuppertal.de> References: <55DA9B63.3010208@uni-wuppertal.de> Message-ID: <55DBA0FF.2000900@mail.de> On 24.08.2015 06:19, Prof. Dr. L. Humbert wrote: > At this very moment (python 3.5rc1), it is not possible, but we need it, > so the construction will be orthogonal from the point of view for > students(!) - _one_ concept should work in different circumstances. How about not using type hints when teaching the basics of trees? Type hints do not replace good variables names. What about using left_tree instead of left and right_tree instead of right? 
That should simplify the example for the students, remove advanced concepts and teach them something about right naming (conventions) which is desirable these days when looking at production code (readability, maintainability and so forth). Regards, Sven R. Kunze -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Tue Aug 25 02:20:08 2015 From: barry at python.org (Barry Warsaw) Date: Mon, 24 Aug 2015 20:20:08 -0400 Subject: [Python-ideas] Draft PEP on string interpolation References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <55DB34ED.30304@trueblade.com> <55DB3E77.5070309@trueblade.com> Message-ID: <20150824202008.43410b34@anarchist.wooz.org> On Aug 24, 2015, at 11:55 AM, Eric V. Smith wrote: >I should have added: this is for i-strings that look like PEP 498's >f-strings. I'm not trying to jump to conclusions about the syntax: I remember something else about $-strings, based on Mailman's experience. Originally we also used %(foo)s strings, but when that reached the breaking point (and PEP 292 was implemented), we changed to $-strings. At that point we had to provide an upgrade path for settings with the original %-strings. It turns out to not be too difficult to translate between them. It would probably not be difficult to translate from $foo to {foo} either, so with a properly defined hook, the porcelain could use $-strings while all the underlying machinery could still use {}-strings. It would probably have to be roughly limited to simple name lookups with dot-chasing, and maybe it's not worth it. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From eric at trueblade.com Tue Aug 25 03:51:20 2015 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 24 Aug 2015 21:51:20 -0400 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <55DB34ED.30304@trueblade.com> Message-ID: <55DBCA18.4020407@trueblade.com> On 08/24/2015 07:55 PM, Andrew Barnert wrote: > On Aug 24, 2015, at 08:14, Eric V. Smith wrote: >> >>> On 08/23/2015 09:13 PM, Guido van Rossum wrote: >>> But for i-strings, I think it would be good if we could gather more >>> actual experience using them. Every potential use case brought up for >>> these so far (translation, html/shell/sql quoting) feels like there's a >>> lot of work needing to be done to see if the idea is actually viable >>> there. It would be a shame if we added all the (considerable!) machinery >>> for i-strings and all we got was yet another way to do it >>> (https://xkcd.com/927/), without killing at least one competing approach >>> (similar to the way .format() has failed to replace %). >>> >>> It's tough to envision how we could gather more experience with >>> i-strings *without* building them into the language, but I'm really >>> hesitant to add them without more experience. (This is the "new on the >>> job market" paradox. :-) Maybe they could be emulated using a function >>> call that uses sys._getframe() under the covers? Or maybe it's possible >>> to cook up an experiment using other syntax hooks? E.g. the coding hack >>> used in pyxl (https://github.com/dropbox/pyxl).[1] >> >> >> I hope you don't mind that I borrowed the keys to the time machine. 
I'm >> using the implementation of _string.formatter_parser() that I added for >> implementing string.Formatter: > > Nifty! When I get a chance, I'll slap this together with an import hook using the untokenize hack, so I can actually play with i-strings (and f-strings) with the proposed syntax without needing a patch. If it looks good, I can write a real implementation that doesn't have all the untokenize problems, which could also eliminate the need for _getframe. (To make it backportable to 3.3/2.7 we'd still need to backport formatter_parser, right? But that still seems like something that could be done and posted on PyPI.) I don't know what the untokenize problems are, so I'm not sure I can help there. I also don't think I'd base any real implementation on _string.formatter_parser: it won't be terribly efficient. I've created a project on Bitbucket: https://bitbucket.org/ericvsmith/istring where I'm playing with a "join" method and a callback interface, without ever exposing the looping and parsing to the caller. I think that would be a better interface than an iterator exposing the various parts of the string. But, as Guido suggests above, it's all just an academic exercise to understand how to best use i-strings. I suggest providing feedback on their API before implementing anything more serious. Eric. From Nikolaus at rath.org Tue Aug 25 04:05:36 2015 From: Nikolaus at rath.org (Nikolaus Rath) Date: Mon, 24 Aug 2015 19:05:36 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DB92A0.3010909@mgmiller.net> (Mike Miller's message of "Mon, 24 Aug 2015 14:54:40 -0700") References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB92A0.3010909@mgmiller.net> Message-ID: <87vbc46s27.fsf@vostro.rath.org> On Aug 24 2015, Mike Miller wrote: > On 08/24/2015 02:28 PM, Nikolaus Rath wrote: >> *shudder*.
After years of efforts to get people not to do this, you want >> to change course by 180 degrees and start telling people this is ok if >> they add an additional single character in front of the string? >> >> This sounds like a very bad idea to me for many reasons: >> >> - People will forget to type the 'e', and things will appear to work >> but be buggy. >> - People will forget that they need the 'e' (and the same thing will >> happen, further reinforcing the thought that the e is not required) >> - People will be confused because other languages don't have the 'e' >> (hmm. how do I do this in Perl? I guess I'll just drop the >> 'e'... *check*, works, great!) >> - People will assume that their my_custom_system() call also >> special-cases e strings and escapes them (which it won't). >> > > No, since the variables will not be replaced, therefore the > command-line won't work. How is that compatible with your statement that > This means a billion lines of code using e-strings won't have to care > about them, only a handful of places. Either str(estr) performs interpolation (so billions of lines of code don't have to change, and my custom system()-like call gets an interpolated string as well until I change it to be estr-aware), or it does not (and billions of lines of code will break when they unexpectedly get an estr instead of a str). Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F “Time flies like an arrow, fruit flies like a Banana.”
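The dilemma in the message above can be made concrete with a small sketch. Everything here is an assumption for illustration: EStr is a toy stand-in for the proposed estr (it renders eagerly rather than lazily, and it is not the class from Mike Miller's example script), and aware_system / unaware_system are hypothetical functions that only return the command they would have run. It may help to see that a str subclass which renders itself transparently hands an unescaped string to any consumer that has not been taught about it.

```python
import shlex


class EStr(str):
    """Toy stand-in for the proposed e-string object (hypothetical)."""

    def __new__(cls, template, **values):
        # Render up front so the object behaves like a plain str everywhere.
        self = super().__new__(cls, template.format(**values))
        self.raw = template          # the unrendered template
        self.fields = dict(values)   # the interpolated values
        return self

    def escape(self, escape_func):
        # Re-render the template with every value passed through escape_func.
        escaped = {name: escape_func(str(value))
                   for name, value in self.fields.items()}
        return self.raw.format(**escaped)


def aware_system(command):
    """Hypothetical consumer that knows about EStr and escapes it."""
    if isinstance(command, EStr):
        command = command.escape(shlex.quote)
    return command  # returned instead of executed, for demonstration


def unaware_system(command):
    """Hypothetical consumer written before EStr existed."""
    return command  # sees only the already-rendered, unescaped text


filename = "notes.txt; rm -rf ~"
cmd = EStr("cat {filename}", filename=filename)
print(aware_system(cmd))
print(unaware_system(cmd))
```

The first call yields a shell-quoted command, while the second yields the raw interpolation, even though both calls look identical at the call site.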
From ron3200 at gmail.com Tue Aug 25 04:23:42 2015 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 24 Aug 2015 22:23:42 -0400 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DB9E81.4060500@mgmiller.net> References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> Message-ID: On 08/24/2015 06:45 PM, Mike Miller wrote: >> - How problematic will it be that an e-string pins all the >> interpolated objects in memory for its lifetime? > > It will be an object holding a raw template string, and a number of > variables. In normal usage I don't suspect it to be a problem. If an object's __str__ method could have an optional fmt='spec' argument, then an estring could just hold strings, and not the object references. That also prevents surprises if the object is mutated between the time its estring is created and when the estring is used as a string. For that matter it prevents an estring from printing one way at one time, and another at another time. I don't know if the formatting can be split like this... Where an object is formatted to a string representation, and then that is formatted to a field specification. The latter being things like width, fill, right, center, and left. These are independent of the object and belong to the string. Things like number of places and sign or to use leading or trailing zeros are part of the object being converted to a string.
Cheers, Ron From python-ideas at mgmiller.net Tue Aug 25 04:36:47 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Mon, 24 Aug 2015 19:36:47 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <87vbc46s27.fsf@vostro.rath.org> References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB92A0.3010909@mgmiller.net> <87vbc46s27.fsf@vostro.rath.org> Message-ID: <55DBD4BF.7010908@mgmiller.net> On 08/24/2015 07:05 PM, Nikolaus Rath wrote: > How is that compatible with your statement that > >> This means a billion lines of code using e-strings won't have to care >> about them, only a handful of places. > > Either str(estr) performs interpolation (so billions of lines of code > don't have to change, and my custom system()-like call get's an > interpolated string as well until I change it to be estr-aware), or it > does not (and billions of lines of code will break when they > unexpectedly get an estr instead of a str). > Not sure I understand... your system_like() call already accepts strings that could be formatted? The estr adds a protection (by escaping variables) that didn't exist in the past. It is not removing any protections or best practices. It is therefore safer than the f-string version, but you read additional protection as more dangerous, perhaps because someone in the future might get lazy. Is that right? But, people are already lazy (in a manner...), so it looks like a small win to me. By "don't have to care" I don't mean we throw out best practices, only that doing the right thing (rephrased as, not doing the wrong thing) becomes easier, as Nick C. taught is a good idea in his PEP. Any future docs certainly won't be shouting, "do this with os.system!!! It's safe now!!" They will still direct to subprocess.call(). In fact I'm sorry I mentioned os.system at all, it's just a few hours ago someone chewed out Nick C. for using subprocess.call() in his examples. 
;) -Mike From eric at trueblade.com Tue Aug 25 04:42:26 2015 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 24 Aug 2015 22:42:26 -0400 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> Message-ID: <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> > On Aug 24, 2015, at 10:23 PM, Ron Adam wrote: > > On 08/24/2015 06:45 PM, Mike Miller wrote: >>> - How problematic will it be that an e-string pins all the >>> interpolated objects in memory for its lifetime? >> >> It will be an object holding a raw template string, and a number of >> variables. In normal usage I don't suspect it to be a problem. > > If an object's __str__ method could have an optional fmt='spec' argument, then an estring could just hold strings, and not the object references. That also prevents surprises if the object is mutated between the time its estring is created and when the estring is used as a string. For that matter it prevents an estring from printing one way at one time, and another at another time. > > I don't know if the formatting can be split like this... Where an object is formatted to a string representation, and then that is formatted to a field specification. The latter being things like width, fill, right, center, and left. These are independent of the object and belong to the string. Things like number of places and sign or to use leading or trailing zeros are part of the object being converted to a string. It's not possible. For example, look at all of the number format options. How would you implement hex conversions? Or datetime %A? Eric.
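A short sketch of why the split discussed above does not work in general. The helper names two_step and try_format are hypothetical and exist only for this illustration; format(), str and datetime.date are standard. The idea, as a rough demonstration rather than anything from the thread's implementations: alignment and width survive a "stringify first, apply the spec afterwards" approach, but type-specific specs such as hex conversion or strftime codes are interpreted by the object's own __format__ and are meaningless to str.

```python
import datetime


def two_step(obj, spec):
    """Hypothetical split: convert to str first, then apply the spec to the str."""
    return format(str(obj), spec)


def try_format(func, obj, spec):
    """Call func(obj, spec) and report a ValueError as text instead of raising."""
    try:
        return func(obj, spec)
    except ValueError as exc:
        return "ValueError: {}".format(exc)


day = datetime.date(2015, 8, 24)

# Width/alignment works either way; 'x' and '%A' need the original object.
for obj, spec in [(255, ">6"), (255, "x"), (day, "%A")]:
    one = try_format(format, obj, spec)
    two = try_format(two_step, obj, spec)
    print("spec {!r}: one step -> {!r}, two steps -> {!r}".format(spec, one, two))
```

Only the first spec gives the same result both ways; for the other two, the two-step version fails because str.__format__ does not understand them.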
From steve at pearwood.info Tue Aug 25 04:52:09 2015 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 25 Aug 2015 12:52:09 +1000 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <55DA9B63.3010208@uni-wuppertal.de> References: <55DA9B63.3010208@uni-wuppertal.de> Message-ID: <20150825025209.GK3881@ando.pearwood.info> On Mon, Aug 24, 2015 at 06:19:47AM +0200, Prof. Dr. L. Humbert wrote: [...] > What students should be able to code: > > 1. varinat > #-------------wishful----------------------------------\ > class Tree: > def __init__(self, left: Tree, right: Tree): > self.left = left > self.right = right That would be very nice, but I don't see how it would be possible in a dynamic language with Python's rules. Can you explain how you expect this to work? In particular, what happens if Tree already has a value? Tree = "This is my Tree!" class Tree: def __init__(self, left: Tree, right: Tree): ... What happens if you use Tree outside of an annotation? # somehow, we enable forward declarations class Tree: x = Tree # What is the value of x here? The difficulty is that annotations are not merely declarations to the compiler, they have runtime effects as well. If they were pure declarations, we could invent some ad hoc rule like "if an annotation is an unbound name, inside a class, and that name is the same as the class, then treat it as a forward declaration". But we can't, because the __init__ function object needs to set the annotations before the Tree class exists. So the annotation needs to be something that actually exists. In order for the annotation to use Tree (without quotation marks) the name Tree needs to be bound to some existing value, and that value is used as the annotation, not the Tree class. If you can think of some way around this restriction, preferably one which is backwards-compatible (although that is not absolutely required) then please suggest it. 
> what students have to write instead: > > #-------------bad workaround----------------------------\ > class Tree: > def __init__(self, left: 'Tree', right: 'Tree'): > self.left = left > self.right = right I don't think this is a "bad" work-around. I think it is quite a good one. It is sad that we need a work-around, but given that we do, this is simple to use and learn: If the type already exists, you can annotate variables with the type itself. But if the type doesn't yet exist (say you are still constructing it), you will get a NameError, so you can use the name of the class as a string as a forward declaration: class Tree: # At this point, Tree is still being constructed, and the class # doesn't yet exist, so we need to use a forward reference. def __eq__(self, other: 'Tree') -> bool: ... # At this point, the Tree class exists and no forward reference # is needed. def builder(data: List) -> Tree: ... > / > Please enable: > from __future__ import annotations > > so the *first* variant should be possible > \ > At this very moment (python 3.5rc1), it is not possible, but we need it, > so the construction will be orthogonal from the point of view for > students(!) - _one_ concept should work in different circumstances. I agree that is desirable, but surely many languages have some sort of forward declaration syntax? I know that both the Pascal and C families of languages do. -- Steve From stephen at xemacs.org Tue Aug 25 05:05:54 2015 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 25 Aug 2015 12:05:54 +0900 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <55DBA0FF.2000900@mail.de> References: <55DA9B63.3010208@uni-wuppertal.de> <55DBA0FF.2000900@mail.de> Message-ID: <87fv386p9p.fsf@uwakimon.sk.tsukuba.ac.jp> Sven R. Kunze writes: > On 24.08.2015 06:19, Prof. Dr. L.
Humbert wrote: > > At this very moment (python 3.5rc1), it is not possible, but we need it, > > so the construction will be orthogonal from the point of view for > > students(!) - _one_ concept should work in different circumstances. > > How about not using type hints when teaching the basics of trees? That's somewhat unfair. I'll let Prof. Humbert explain his own thinking, but I can imagine a number of pedagogical contexts where I would use Python because it doesn't get in the programmer's way very often, but require students to provide type hints as a compact (and machine-checkable!) way of documenting that aspect of their design. I found his pedagogical approach perfectly plausible when he posted originally. From greg.ewing at canterbury.ac.nz Tue Aug 25 02:03:17 2015 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 25 Aug 2015 12:03:17 +1200 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DA66C5.7000105@trueblade.com> References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> Message-ID: <55DBB0C5.1040301@canterbury.ac.nz> Eric V. Smith wrote: > An f-string would be shorthand for str(i-string). If I understand correctly, the point of i-strings would be to make it easy to do things like SQL argument interpolation the right way. But if sql(f-string) is still legal (as it seems like it would have to be for quite a while to come, for backwards compatibility) then the wrong way is still just as easy as the right way, and no less obvious (what do the letters "f" and "i" have to do with SQL?). So it seems to me that having both f-strings and i-strings will just add a lot of complication and confusion without really helping anything.
-- Greg From ncoghlan at gmail.com Tue Aug 25 08:22:54 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 25 Aug 2015 16:22:54 +1000 Subject: [Python-ideas] Deferred evaluation [was Re: Draft PEP on string interpolation] In-Reply-To: <20150824120033.GJ3881@ando.pearwood.info> References: <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <20150824120033.GJ3881@ando.pearwood.info> Message-ID: On 24 August 2015 at 22:00, Steven D'Aprano wrote: > I mean, I know how to get a closure in general terms, e.g.: > > [(lambda : i) for i in range(10)] > > but I'm not seeing where you would get a closure *specifically* in > this situation with your defer function. I was wrong when I thought you could do this trick with f-strings - you need the delayed interpolation offered by PEP 501's i-strings in order to access the original objects directly. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Aug 25 08:35:20 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 25 Aug 2015 16:35:20 +1000 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <20150824012457.GD3881@ando.pearwood.info> <55DB1109.7010208@trueblade.com> Message-ID: On 25 August 2015 at 01:03, Paul Moore wrote: > On 24 August 2015 at 13:41, Eric V. Smith wrote: >> On 08/24/2015 07:35 AM, Paul Moore wrote: >>> I'm once again losing the thread of all the variations being proposed. >>> >>> As a reality check, is the expectation that something like the >>> following will still be possible: >>> >>> print(f"Iteration {n}: Duration {end-start} seconds") >> >> Yes, that's the PEP 498 proposal.
I think (and this is just my opinion) >> that if we do something more complicated, like the delayed interpolation >> of i-strings, that we'd still keep f-strings. > > OK. That's my point, essentially - the discussion has drifted into > much more complex areas, with comments about how the wider-ranging > proposals cover the f-string case as a subset, and I just wanted to be > sure that there wasn't an implied "so we don't need f-strings any > more" in there. (Nick at one point spoke quite strongly against adding > multiple ways of doing the same thing). That was before my proposed design converged on being a potential implementation detail of Eric's, though :) Now we have the option of adding types.InterpolationTemplate as an implementation detail of f-strings, and then deciding *later* whether we want to allow the creation of interpolation templates with deferred rendering. In that regard, Guido suggested that I split PEP 501 into two different PEPs, one for deferred rendering (which could be done as an implementation detail of f-strings, with f"templated {text}" being shorthand for format(i"templated {text}")), and another for $-substitution over {}-substitution (which would be a competing proposal for the surface syntax of the substitution expressions). I think that's a good idea, so I'll do that some time this week (not sure when, though). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Tue Aug 25 10:29:16 2015 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 25 Aug 2015 09:29:16 +0100 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DB94F6.8050001@mgmiller.net> References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB94F6.8050001@mgmiller.net> Message-ID: On 24 August 2015 at 23:04, Mike Miller wrote: > On 08/24/2015 02:54 PM, Paul Moore wrote: >> >> Agreed.
In a convenience library where it's absolutely clear that a >> shell is involved (something like sarge or invoke) this is OK, but not >> in the stdlib as the "official" way to call external programs. > > Hmm, don't focus on os.system(), it could be any function, and not > particularly relevant, nor do I recommend this line as the official way. Well, can you use an example that isn't misleading in its security implications? Specifically, I assumed from your use of os.system that you were proposing that the stdlib function (specifically in this case, a function that we've been trying to deprecate in favour of more secure alternatives for years) be updated to understand e-strings. > Remember Nick Coghlan's statement that the "easy way should be the right > way"? That's what this is trying to accomplish. But the right way is not to use os.system, so I don't *want* it to be easy. If you have a better example than running shell commands, please explain. (If your example is running full-blown shell syntax, rather than single commands, please give a more complicated example and we can let the debate explode into one about portability of shell constructs - but os.system to run a single command with a set of arguments is *wrong* and subprocess.Popen was created to replace it with a cross-platform, secure by default, solution). >> - People will fail to understand the difference between e'...' and >> f'...' and will use the wrong one when using os.system, and things >> will work correctly but with security vulnerabilities. > > > I don't recommend e'' and f'', only e'' at this moment. Then I'm strongly against this. 
As I've stated on a number of occasions, to me the crucial main use of any variation on this proposal is print(f"Iteration {n}: Duration {end-start}") If your e-string proposal works for this (via some consequence of implicitly calling str()) then it may still be on the cards - but the need for explicit str() calls in pathlib is a source of frustration there, so I'd like to be 100% sure that your proposal doesn't result in a need for explicit str() calls anywhere before accepting that e-strings can replace f-strings. By the way, the terminology in this thread (e-strings, f-strings, i-strings...) is dreadful. We need names that capture the essential differences (I've already proposed "format strings" for f-strings). Naming is important! Paul From abarnert at yahoo.com Tue Aug 25 10:35:24 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 25 Aug 2015 01:35:24 -0700 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <20150825025209.GK3881@ando.pearwood.info> References: <55DA9B63.3010208@uni-wuppertal.de> <20150825025209.GK3881@ando.pearwood.info> Message-ID: <3EF5B4F2-FE26-4E5C-BBF5-A10175DDF454@yahoo.com> On Aug 24, 2015, at 19:52, Steven D'Aprano wrote: > > I agree that is desirable, but surely many languages have some sort of > forward declaration syntax? I know that both the Pascal and C families > of languages do. What would a forward declaration mean in Python? In C, a forward declaration for a struct tag specifies that it is a struct tag. You can reference "struct spam *" as a type after that, but you can't reference "struct spam", because you need the size for that, which doesn't exist yet. You can't dereference a spam or access a member of a spam. The only thing you know is that a thing called struct spam exists, and is a struct type rather than a function type or native value typedef. That wouldn't do any good in Python. To be useful, it would have to mean something very different. 
For example, it could bind the name to some magic marker that means "after something else is bound to this name, go back and fix up everything that made a reference to this magic marker to refer to the bound value instead". (Presumably any method on the marker value just raises a NoValueYetException or something.) From eric at trueblade.com Tue Aug 25 15:54:08 2015 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 25 Aug 2015 09:54:08 -0400 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> Message-ID: <55DC7380.50004@trueblade.com> On 08/24/2015 06:37 PM, Nathaniel Smith wrote: > On Mon, Aug 24, 2015 at 1:57 PM, Mike Miller wrote: >> Hi, here's my latest idea, riffing on other's latest this weekend. >> >> Let's call these e-strings (for expression), as it's easier to refer to the >> letter of the proposals than three digit numbers. >> >> So, an e-string looks like an f-string, though at compile-time, it is >> converted to an object instead (like i-string): >> >> print(e'Hello {friend}, filename: {filename}.') # converts to ==> >> >> print(estr('Hello {friend}, filename: {filename}.', friend=friend, >> filename=filename)) >> >> An estr is a subclass of str, therefore able to do the nice things a string >> can do. Rendering is deferred, and it also has a raw member, escape(), and >> translate() methods: >> >> class estr(str): >> # init: saves self.raw, args, kwargs for later >> # methods, ops render it >> # def escape(self, escape_func): # handles escaping >> # def translate(self, template, safe=True): # optional i18n support >> >> To make it as simple as possible to use by end-developers, it 1) doesn't >> require str() to be run explicitly, it renders itself when needed via its >> various methods and operators. Look for .raw, if you need the original. > > This is a really interesting idea. > > You could potentially re-use PyUnicode_READY to do the default rendering. 
I doubt you could get this to work, although feel free to prove me wrong. I think you'll end up with the same decision Pathlib made (PEP 428): don't derive from str. > Some things to think about: > > - If I concatenate two e-string objects, or an e-string and a regular > string, or interpolate an e-string into an e-string, then what > happens? > > - How problematic will it be that an e-string pins all the > interpolated objects in memory for its lifetime? Well, it seems to work for logging, but those don't tend to stay around very long. But this is one of the reasons to play with a sample implementation, to understand these sorts of issues. Eric. From Nikolaus at rath.org Tue Aug 25 17:02:39 2015 From: Nikolaus at rath.org (Nikolaus Rath) Date: Tue, 25 Aug 2015 08:02:39 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DBD4BF.7010908@mgmiller.net> (Mike Miller's message of "Mon, 24 Aug 2015 19:36:47 -0700") References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB92A0.3010909@mgmiller.net> <87vbc46s27.fsf@vostro.rath.org> <55DBD4BF.7010908@mgmiller.net> Message-ID: <8761439zsg.fsf@thinkpad.rath.org> On Aug 24 2015, Mike Miller wrote: > On 08/24/2015 07:05 PM, Nikolaus Rath wrote: >> How is that compatible with your statement that >> >>> This means a billion lines of code using e-strings won't have to care >>> about them, only a handful of places. >> >> Either str(estr) performs interpolation (so billions of lines of code >> don't have to change, and my custom system()-like call get's an >> interpolated string as well until I change it to be estr-aware), or it >> does not (and billions of lines of code will break when they >> unexpectedly get an estr instead of a str). >> > > Not sure I understand... your system_like() call already accepts > strings that could be formatted? 
I'm talking about someone who has implemented a function (for whatever reason) that behaves like os.system(). Say something like this (probably the calls are all wrong because I didn't look them up, but I trust everyone knows what I mean): def nonblocking_system(cmd): if os.fork() == 0: os.exec('/bin/sh', '-c', cmd) With this function, people have to be really careful about injection vulnerabilities - just like with os.system(): os.system('rm %s' % file) # danger! nonblocking_system('rm %s' % file) # danger! But now you're proposing that os.system() get's support for e-strings, which are then properly quoted. Now we have this: os.system(e'rm {file}') # ok nonblocking_system(e'rm {file}') # you'd think it's ok, but it's not I think this is a terrible situation, because you can never be quite sure where an e-string is ok (because the function is prepared for it), and where it will act just like a string. > The estr adds a protection (by escaping variables) that didn't exist > in the past. It is not removing any protections or best practices. No, but it muddles the water as to what is good and what is bad practice. 'rm {file}' has always been bad practice, but with e-strings e'rm {file}' may or may not be bad practice, depending what you do with it. Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F ?Time flies like an arrow, fruit flies like a Banana.? From tjreedy at udel.edu Tue Aug 25 17:15:08 2015 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 25 Aug 2015 11:15:08 -0400 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <55DA9B63.3010208@uni-wuppertal.de> References: <55DA9B63.3010208@uni-wuppertal.de> Message-ID: On 8/24/2015 12:19 AM, Prof. Dr. L. Humbert wrote: > What students should be able to code: > > 1. 
variant
> #-------------wishful----------------------------------\
> class Tree:
>     def __init__(self, left: Tree, right: Tree):
>         self.left = left
>         self.right = right

As you should know, at least after reading previous responses, making
this work would require one of two major changes to Python class
statements.

1. The class name has special (context sensitive) meaning in enclosed
def statements. The compiler would have to compile def statements
differently than it would the same def statements not in a Tree class.
It would then have to patch all methods after the class is created.
See the annofix function below.

A proposal to make the definition name of a function special within
its definition has already been rejected.

2. Class statements would initially create an empty class bound to the
class name. This could break back compatibility, and would require
cleanup in case of a syntax error in the body. This would be similar
to import statements initially putting an empty module in sys.modules
to support circular imports. This is messy and still bug prone in use.

> what students have to write instead:
>
> #-------------bad workaround----------------------------\
> class Tree:
>     def __init__(self, left: 'Tree', right: 'Tree'):
>         self.left = left
>         self.right = right

You did not say why you think this is bad. Is it a) students have to
type "'"s?, or b) the resulting annotations are strings instead of the
class? The latter can easily be fixed.

---
from types import FunctionType

def annofix(klass):
    # Replace every annotation equal to the class name (a string)
    # with the class object itself.
    classname = klass.__name__
    for ob in klass.__dict__.values():
        if type(ob) is FunctionType:
            annotations = ob.__annotations__
            for arg, anno in annotations.items():
                if anno == classname:
                    annotations[arg] = klass
    return klass

@annofix
class Tree:
    def __init__(self, left: 'Tree', right: 'Tree'):
        self.left = left
        self.right = right

print(Tree.__init__.__annotations__)
# {'left': <class '__main__.Tree'>, 'right': <class '__main__.Tree'>}
---

An alternative is to use a placeholder object instead of the class name.
This is less direct, but not repeating the name of the class throughout
the definition makes it easier to rename the class or copy methods to
another class.

---
# An annotation object meaning 'the class this method is defined in'
class Klass: pass

def annofix2(klass):
    # Uses FunctionType, imported in the previous example.
    for ob in klass.__dict__.values():
        if type(ob) is FunctionType:
            annotations = ob.__annotations__
            for arg, anno in annotations.items():
                if anno == Klass:
                    annotations[arg] = klass
    return klass

@annofix2
class Tree2:
    def __init__(self, left: Klass, right: Klass):
        self.left = left
        self.right = right

print(Tree2.__init__.__annotations__)
# {'right': <class '__main__.Tree2'>, 'left': <class '__main__.Tree2'>}
---

--
Terry Jan Reedy

From humbert at uni-wuppertal.de  Tue Aug 25 18:01:17 2015
From: humbert at uni-wuppertal.de (Prof. Dr. L. Humbert)
Date: Tue, 25 Aug 2015 18:01:17 +0200
Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal
In-Reply-To:
References: <55DA9B63.3010208@uni-wuppertal.de>
Message-ID: <55DC914D.5010603@uni-wuppertal.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 25.08.2015 17:15, Terry Reedy wrote:
> You did not say why you think this is bad. Is it a) students have to
> type "'"s?, or b) the resulting annotations are strings instead of the
> class?

Ok, the answer for short: a)

Please let me explain it a bit:

> As you should know, at least after reading previous responses, making
> this work would require one of two major changes to Python class
> statements.

1st class pedagogical/didactical thinking ?
Consider: there are recursively defined ADTs and we want to enable
students to understand concepts and produce python-code to realize what
they understood.

The main point:
if the students already understood that it is possible to place type
hints for arguments and results of functions/methods, they should be
able to reuse the notation in an orthogonal manner.

For example:
>>> def b (first: int, second: str) -> List[bool]:
...     print(first)
...     return [second.find(str(first))>=0]
...
>>> b(34, "Die 13 langen Nasen rauben ihm (34J) den Schlaf") 34 [True] >>> When it comes to recursive ADTs they should be able to write >>> class Tree: ... def __init__(self, left: Tree, right: Tree): ... self.left = left ... self.right = right TNX Ludger - -- https://twitter.com/n770 http://ddi.uni-wuppertal.de/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEARECAAYFAlXckU0ACgkQJQsN9FQ+jJ+zGgCdHZvnTcM5H4YMGVa/S0hv/c2o g8IAn2ZEFy8sL0f8uZDnzr1yFcFHc3A+ =Ex97 -----END PGP SIGNATURE----- From erik.m.bray at gmail.com Tue Aug 25 18:19:24 2015 From: erik.m.bray at gmail.com (Erik Bray) Date: Tue, 25 Aug 2015 12:19:24 -0400 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: References: <55DA9B63.3010208@uni-wuppertal.de> Message-ID: On Tue, Aug 25, 2015 at 11:15 AM, Terry Reedy wrote: > 2. Class statements would initially create an empty class bound to the class > name. This could break back compatibility, and would require cleanup in > case of a syntax error in the body. This would be similar to import > statements initially putting a empty module in sys.modules to support > circular imports. This is messy and still bug prone is use. I have been thinking about this lately in a different context, and I would very much favor this approach. I think in large part because it works this way for modules it would make sense for it to work for classes as well. The fact that ClassName is bound to an object that will *eventually* become the class as soon as the parser has read in: class ClassName: represents, to me (and I would suspect to many students as well), the least astonishment. I realize it would be a very non-trivial change, however. >> what students have to write instead: >> >> #-------------bad workaround----------------------------\ >> class Tree: >> def __init__(self, left: 'Tree', right: 'Tree'): >> self.left = left >> self.right = right > > What about: >>> class Tree: ... """Forward declaration of Tree type.""" ... 
>>> class Tree(Tree): ... """Tree implementation.""" ... def __init__(self, left: Tree, right: Tree): ... self.left = left ... self.right = right A little ugly, and potentially error-prone (but only, I think, in exceptional cases). It's also a decent opportunity to teach something about forward-declaration, which I think is worth knowing about. And I think this makes what's going on clearer than the string-based workaround. I didn't follow every single thread about PEP-484 though and I don't know if, or why this approach to forward-declaration was rejected. Erik From srkunze at mail.de Tue Aug 25 18:43:53 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 25 Aug 2015 18:43:53 +0200 Subject: [Python-ideas] Properties for classes possible? In-Reply-To: <55D6CCAF.7030900@thomas-guettler.de> References: <55D57989.1020704@thomas-guettler.de> <55D5F04B.40006@mail.de> <55D5F2E5.6000906@thomas-guettler.de> <55D5F8EB.2050509@mail.de> <55D6037D.5050707@mail.de> <55D6CCAF.7030900@thomas-guettler.de> Message-ID: <55DC9B49.5000901@mail.de> The thinking was more like: instancemethod -> property classmethod -> classproperty staticmethod -> staticproperty On 21.08.2015 09:01, Thomas G?ttler wrote: > > > Am 20.08.2015 um 18:42 schrieb Sven R. Kunze: >> What about 'staticproperties'? > > Yes, that sound good. In my case a read-only classproperty is all I need. > > I guess that is what you mean with "staticproperty". 
> > Regards, > Thomas G?ttler > From steve at pearwood.info Tue Aug 25 18:56:59 2015 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 26 Aug 2015 02:56:59 +1000 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <3EF5B4F2-FE26-4E5C-BBF5-A10175DDF454@yahoo.com> References: <55DA9B63.3010208@uni-wuppertal.de> <20150825025209.GK3881@ando.pearwood.info> <3EF5B4F2-FE26-4E5C-BBF5-A10175DDF454@yahoo.com> Message-ID: <20150825165659.GL3881@ando.pearwood.info> On Tue, Aug 25, 2015 at 01:35:24AM -0700, Andrew Barnert wrote: > On Aug 24, 2015, at 19:52, Steven D'Aprano wrote: > > > > I agree that is desirable, but surely many languages have some sort of > > forward declaration syntax? I know that both the Pascal and C families > > of languages do. > > What would a forward declaration mean in Python? I thought it was obvious from context, not to mention from the example given by the OP. Its a reference to something that doesn't exist yet, namely the class still in the process of being created. E.g.: class Tree: def merge(self, other:'Tree') -> 'Tree': ... The string 'Tree' is a forward reference to the Tree class, as far as either the type-checker or a human reader is concerned. The annotations will, of course, be strings. But they will be understood as a reference to the Tree class. I mean reference in the sense of "to refer to", not in the technical sense of "pointer". Aside: we could use a decorator which replaces all annotations of the form 'Tree' with the actual Tree class itself. In pseudo-code: def decorate(cls): for each method in cls: for key, val in method.__annotations__: if val == cls.__name__: method.__annotations__[key] = cls @decorate class Tree: ... This may be useful for runtime introspection, but it comes too late to be of any use to any type-checker that runs at compile-time or earlier. > To be useful, it would have to mean something very different. 
For > example, it could bind the name to some magic marker that means "after > something else is bound to this name, go back and fix up everything > that made a reference to this magic marker to refer to the bound value > instead". You're over complicating this. (Snarky comments regarding "a-strings" for annotations can go straight to /dev/null :-) Both PEP 484 and mypy call "use the class name as a string as a stand in for the actual class" a "forward reference": https://www.python.org/dev/peps/pep-0484/#forward-references http://mypy.readthedocs.org/en/latest/kinds_of_types.html#class-name-forward-references and the OP's example of annotations in the Tree class comes straight out of the PEP. I am sorry if I mislead you by being sloppy and calling them "forward declaration" sometimes. -- Steve From srkunze at mail.de Tue Aug 25 19:21:00 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 25 Aug 2015 19:21:00 +0200 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <55DC914D.5010603@uni-wuppertal.de> References: <55DA9B63.3010208@uni-wuppertal.de> <55DC914D.5010603@uni-wuppertal.de> Message-ID: <55DCA3FC.1040105@mail.de> On 25.08.2015 18:01, Prof. Dr. L. Humbert wrote: > > 1st class pedagogical/didactical thinking ? > Consider: there are recursive defined ADTs and we want to enable > students to understand concepts and produce python-code to realize, what > they understood. > > The main point: > if the students already understood, that it is possible to place type > hints to place type hints for arguments and results of functions/methods > they should be able to reuse the notation in an orthogonal manner. I am sorry about going back to this but why not teaching this in a different lesson? 
> When it comes to recursive ADTs they should be able to write > class Tree: > def __init__(self, left: Tree, right: Tree): > self.left = left > self.right = right I still think, only looking at recursive ADTs, it is enough to for them to write class Tree: def __init__(self, left_tree, right_tree): self.left_tree = left_tree self.right_tree = right_tree This way, you can teach them about code style, proper variable names and so forth. Regards, Sven R. Kunze -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Tue Aug 25 19:48:39 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 25 Aug 2015 19:48:39 +0200 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: References: <55DA9B63.3010208@uni-wuppertal.de> Message-ID: <55DCAA77.30903@mail.de> On 25.08.2015 17:15, Terry Reedy wrote: > On 8/24/2015 12:19 AM, Prof. Dr. L. Humbert wrote: > >> What students should be able to code: >> >> 1. varinat >> #-------------wishful----------------------------------\ >> class Tree: >> def __init__(self, left: Tree, right: Tree): >> self.left = left >> self.right = right > > As you should know, at least after reading previous responses, making > this work would require one of two major changes to Python class > statements. > > 1. The class name has special (context sensitive) meaning in enclosed > def statements. The compiler would have to compile def statements > differently than it would the same def statements not in a Tree class. > It would then have to patch all methods after the class is created. > See the annoclass function below. > > A proposal to make the definition name of a function special within > its definition has already been rejected. > > 2. Class statements would initially create an empty class bound to the > class name. This could break back compatibility, and would require > cleanup in case of a syntax error in the body. 
This would be similar
> to import statements initially putting a empty module in sys.modules
> to support circular imports. This is messy and still bug prone is use.

Although I do not agree with the intentions of the OP, I would love to
have "more forward references" in Python.

I think the main issue here is the gap between intuition and what the
compiler actually does. The following line:

class MyClass: # first appearance of MyClass

basically creates MyClass in the mind of the developer reading this
piece of code. Thus, he expects to be able to use it after this line.
However, Python assigns the class to the name MyClass only at the end of
the class definition. Thus, it is usable only after that.

People get around this (especially since one doesn't need it that
often), but it still feels... different.

Best,
Sven

From python-ideas at mgmiller.net  Tue Aug 25 19:49:18 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Tue, 25 Aug 2015 10:49:18 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <8761439zsg.fsf@thinkpad.rath.org>
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB92A0.3010909@mgmiller.net> <87vbc46s27.fsf@vostro.rath.org> <55DBD4BF.7010908@mgmiller.net> <8761439zsg.fsf@thinkpad.rath.org>
Message-ID: <55DCAA9E.4000103@mgmiller.net>

On 08/25/2015 08:02 AM, Nikolaus Rath wrote:
> No, but it muddles the water as to what is good and what is bad
> practice. 'rm {file}' has always been bad practice, but with e-strings
> e'rm {file}' may or may not be bad practice, depending what you do with
> it.

It would be bad practice since the function is deprecated, or just
discouraged.

But, are you implying that the escaping could be bypassed? Would that be
possible?
-Mike From python-ideas at mgmiller.net Tue Aug 25 20:02:27 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Tue, 25 Aug 2015 11:02:27 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB94F6.8050001@mgmiller.net> Message-ID: <55DCADB3.5010906@mgmiller.net> On 08/25/2015 01:29 AM, Paul Moore wrote: >> Remember Nick Coghlan's statement that the "easy way should be the right >> way"? That's what this is trying to accomplish. > > But the right way is not to use os.system, so I don't *want* it to be Ok, a few hours before someone complained to Nick that he was using subprocess.call as an example when it didn't completely apply. So I moved to the other alternative example that could be helped, os.system. I have no particular love for it, and am not recommending it. It was just one function out of many that needs input to be escaped as far as I was concerned. I didn't forsee that that the function would be focused on to the point of the derailing the idea. I suppose I'll try again if you'll bear with me. > If your e-string proposal works for this (via some consequence of > implicitly calling str()) then it may still be on the cards - but the > need for explicit str() calls in pathlib is a source of frustration In my original message (of this sub-thread) this is one of the main paragraphs: > To make it as simple as possible to use by end-developers, it 1) doesn't require > str() to be run explicitly, it renders itself when needed via its various > methods and operators. Look for .raw, if you need the original. Also if you check the example script at the bitbucket url, you'll see it is the case, though I've not yet implemented every case. -Mike From srkunze at mail.de Tue Aug 25 20:03:40 2015 From: srkunze at mail.de (Sven R. 
Kunze) Date: Tue, 25 Aug 2015 20:03:40 +0200 Subject: [Python-ideas] Forward-References & Out-of-order declaration Message-ID: <55DCADFC.4030806@mail.de> While reading material for implementing some kind of import hooks and so forth for the xfork package, I came across this one: http://stackoverflow.com/questions/16907186/python-model-inheritance-and-order-of-model-declaration Reactivated in my mind, after reading Prof. Humbert's request of adding a "more forwarded" referencing of classes, I just wanted to ask: 1) What is the general opinion regarding having a more declarative style when writing modules? 2) That given, what is the general opinion about introducing the out-of-order declaration asked for in the StackOverflow posts? Best, Sven From python-ideas at mgmiller.net Tue Aug 25 20:07:39 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Tue, 25 Aug 2015 11:07:39 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB94F6.8050001@mgmiller.net> Message-ID: <55DCAEEB.9040501@mgmiller.net> On 08/25/2015 01:29 AM, Paul Moore wrote: > By the way, the terminology in this thread (e-strings, f-strings, > i-strings...) is dreadful. We need names that capture the essential > differences (I've already proposed "format strings" for f-strings). > Naming is important! Agreed, I have said the same in the context of the written PEPs, however in informal conversation, I think f,i, and e, are convenient short-hand for the various ideas. In my PEP draft you'll see no mention of -strings. 
-Mike From tjreedy at udel.edu Tue Aug 25 20:07:33 2015 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 25 Aug 2015 14:07:33 -0400 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: References: <55DA9B63.3010208@uni-wuppertal.de> Message-ID: On 8/25/2015 12:19 PM, Erik Bray wrote: > On Tue, Aug 25, 2015 at 11:15 AM, Terry Reedy wrote: >> 2. Class statements would initially create an empty class bound to the class >> name. This could break back compatibility, and would require cleanup in >> case of a syntax error in the body. This would be similar to import >> statements initially putting a empty module in sys.modules to support >> circular imports. This is messy and still bug prone is use. 'in use'. > I have been thinking about this lately in a different context, and I > would very much favor this approach. I think in large part because it > works this way for modules it would make sense for it to work for > classes as well. The fact that ClassName is bound to an object that > will *eventually* become the class as soon as the parser has read in: > > class ClassName: > > represents, to me (and I would suspect to many students as well), the > least astonishment. > > I realize it would be a very non-trivial change, however. It might be more useful to have def statements work that way (bind name to blank function object). Then def fac(n, _fac=fac): # less confusing than 'fac=fac' return _fac(n-1)*n if n > 1 else 1 would actually be recursive regardless of external name bindings. But as is the case with modules, exposing incomplete objects easily leads to buggy code, such as def f(n, code=f.__code__): pass >>> what students have to write instead: >>> >>> #-------------bad workaround----------------------------\ >>> class Tree: >>> def __init__(self, left: 'Tree', right: 'Tree'): >>> self.left = left >>> self.right = right > What about: >>>> class Tree: > ... """Forward declaration of Tree type.""" > ... 
>>>> class Tree(Tree): > ... """Tree implementation.""" > ... def __init__(self, left: Tree, right: Tree): > ... self.left = left > ... self.right = right > > A little ugly, and potentially error-prone (but only, I think, in > exceptional cases). It's also a decent opportunity to teach something > about forward-declaration, which I think is worth knowing about. And > I think this makes what's going on clearer than the string-based > workaround. I like this better than my decorator version. Notice that if Python were changed so that 'Tree' were bound to a blank class first thing, then Tree(Tree) would be subclassing itself, breaking code like the above unless a special rule was added to remove a class from its list of subclasses. -- Terry Jan Reedy From eric at trueblade.com Tue Aug 25 20:36:54 2015 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 25 Aug 2015 14:36:54 -0400 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <20150824202008.43410b34@anarchist.wooz.org> References: <55D65E4F.1040608@mgmiller.net> <55D6778C.6050600@mgmiller.net> <55D6911E.1080109@mgmiller.net> <55D7B292.9010909@mgmiller.net> <20150821213843.5394b5e8@limelight.wooz.org> <55D80E17.40407@mgmiller.net> <55DA66C5.7000105@trueblade.com> <55DB34ED.30304@trueblade.com> <55DB3E77.5070309@trueblade.com> <20150824202008.43410b34@anarchist.wooz.org> Message-ID: <55DCB5C6.60308@trueblade.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 08/24/2015 08:20 PM, Barry Warsaw wrote: > On Aug 24, 2015, at 11:55 AM, Eric V. Smith wrote: > >> I should have added: this is for i-strings that look like PEP >> 498's f-strings. I'm not trying to jump to conclusions about the >> syntax: > > I remember something else about $-strings, based on Mailman's > experience. Originally we also used %(foo)s strings, but when that > reached the breaking point (and PEP 292 was implemented), we > changed to $-strings. 
At that point we had to provide an upgrade > path for settings with the original %-strings. > > It turns out to not be too difficult to translate between them. > It would probably not be difficult to translate from $foo to {foo} > either, so with a properly defined hook, the porcelain could use > $-strings while all the underlying machinery could still use > {}-strings. It would probably have to be roughly limited to > simple name lookups with dot-chasing, and maybe it's not worth it. In https://bitbucket.org/ericvsmith/istring, in i18n.py, I've added the awesomely named convert_istring_format_to_dollar_format(). It also checks that you've only used identifiers and not specified a format_spec or a conversion character (exact specs TBD). I've not implemented the reverse function. I imagine you'd convert to $ format as part of extracting the strings from the source, do the translation, then convert back as part of building the translation database. It also shows how to implement _() with i-strings, including safe substitution required by a bad translation. I also have examples for logging and building up regex's from i-strings. I'm mainly using this to investigate the best API for i-strings. So far, I just have one method, join, that takes some callbacks. It also lets you substitute alternate strings, as needed for the _() examples. But this is all just an experiment. I'm not sold at all on the concept of i-strings (and even less so on the nearly equivalent e-strings). Eric. 
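A rough sketch of the {}-to-$ conversion described above, for readers who want to experiment. This is not the code from the istring repository; the function name braces_to_dollar is hypothetical, and the sketch assumes only plain identifiers are allowed (no format_spec, no conversion character, no dotted names), which is stricter than what the final rules might be.

```python
from string import Formatter, Template


def braces_to_dollar(template):
    """Convert 'Hello {name}' into 'Hello ${name}' (hypothetical helper)."""
    parts = []
    for literal, field, spec, conversion in Formatter().parse(template):
        # A literal '$' must be doubled so string.Template leaves it alone.
        parts.append(literal.replace('$', '$$'))
        if field is None:
            continue  # trailing literal text, no replacement field
        if spec or conversion:
            raise ValueError('format specs and conversions are not supported')
        if not field.isidentifier():
            raise ValueError('only plain identifiers are supported: %r' % field)
        parts.append('${' + field + '}')
    return ''.join(parts)


converted = braces_to_dollar('Hello {friend}, filename: {filename}.')
rendered = Template(converted).substitute(friend='John', filename='a.txt')
```

The reverse direction (from $-strings back to {}-strings) is left out here, as it is in the message above.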
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAEBAgAGBQJV3LXGAAoJENxauZFcKtNxtu8H/1Sqrr8gyDIQ5piBPj77Hh3E 285Mmk9wrqgd9Xl3dLJBIb5p0H6GvMQi3DezGHDIBpqPBQneA+1cNpMuFJL07WKw tDXxsqacsiXPdxA9qx+iLP6cb1mwpsC3OtURZDPeVZPU6Ic/aIRk1DdShBleIlH6 v/X6BMQz0mrI/PpI364jo39hUr81iU0XWExeiigOWZu//nkjV+WeOUbdpQCBYl2M VEpGl5f2TlY0O85MBFdPc8RKGnROq7OyLhi8SvY+gknGPhwMI+gGeh19vyUPpKfW CEqDju5KWmYW7sCJ0e7JQ+Z5IvSBIAgQoJmfxibW4rhLbc73YwlaGaoYwt831lM= =Drm6 -----END PGP SIGNATURE----- From Nikolaus at rath.org Tue Aug 25 20:40:19 2015 From: Nikolaus at rath.org (Nikolaus Rath) Date: Tue, 25 Aug 2015 11:40:19 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DCAA9E.4000103@mgmiller.net> (Mike Miller's message of "Tue, 25 Aug 2015 10:49:18 -0700") References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB92A0.3010909@mgmiller.net> <87vbc46s27.fsf@vostro.rath.org> <55DBD4BF.7010908@mgmiller.net> <8761439zsg.fsf@thinkpad.rath.org> <55DCAA9E.4000103@mgmiller.net> Message-ID: <878u8z43fw.fsf@thinkpad.rath.org> On Aug 25 2015, Mike Miller wrote: > On 08/25/2015 08:02 AM, Nikolaus Rath wrote: >> No, but it muddles the water as to what is good and what is bad >> practice. 'rm {file}' has always been bad practice, but with e-strings >> e'rm {file}' may or may not be bad practice, depending what you do with >> it. > > It would be bad practice since the function is deprecated, or just > discouraged. What function? > But, are you implying that the escaping could be bypassed? Would that > be possible? According to you, yes. Just look at your example: | def os_system(command): # imagine os.system, subprocess, dbapi, etc. | if isinstance(command, estr): | command = command.escape(shlex.quote) # each chooses its own rules | do_something(command) So any function that doesn't special-case estr will "bypass" the escaping and pass it do it's version of the do_something() function without quoting. 
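To make the concern concrete, here is a small sketch. The estr class below is only a hypothetical stand-in for the proposed type (it renders eagerly and keeps the template and values around), and aware_system/unaware_system are made-up names that return the command line they would run instead of executing anything.

```python
import shlex


class estr(str):
    # Hypothetical stand-in for the proposed e-string type, not a real API.
    def __new__(cls, raw, **values):
        self = super().__new__(cls, raw.format(**values))
        self.raw = raw
        self.values = values
        return self

    def escape(self, func):
        # Re-render the template with every value passed through func.
        escaped = {k: func(str(v)) for k, v in self.values.items()}
        return self.raw.format(**escaped)


def aware_system(command):
    # A callee that special-cases estr and quotes the interpolated values.
    if isinstance(command, estr):
        command = command.escape(shlex.quote)
    return str(command)


def unaware_system(command):
    # A callee written before estr existed: it just sees a str.
    return str(command)


file = 'x; rm -rf ~'
cmd = estr('rm {file}', file=file)
safe = aware_system(cmd)
unsafe = unaware_system(cmd)
```

Both calls look identical at the call site, yet only the first one quotes the value; that is the ambiguity being pointed out.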
Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

"Time flies like an arrow, fruit flies like a Banana."

From python-ideas at mgmiller.net  Tue Aug 25 20:54:09 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Tue, 25 Aug 2015 11:54:09 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <878u8z43fw.fsf@thinkpad.rath.org>
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB92A0.3010909@mgmiller.net> <87vbc46s27.fsf@vostro.rath.org> <55DBD4BF.7010908@mgmiller.net> <8761439zsg.fsf@thinkpad.rath.org> <55DCAA9E.4000103@mgmiller.net> <878u8z43fw.fsf@thinkpad.rath.org>
Message-ID: <55DCB9D1.5060901@mgmiller.net>

On 08/25/2015 11:40 AM, Nikolaus Rath wrote:
> So any function that doesn't special-case estr will "bypass" the
> escaping and pass it to its version of the do_something() function
> without quoting.

Yes, system(command % dangerous) was dangerous and will still be.
Confining input to e-strings is probably not practical. That's a good
point.

-Mike

From python-ideas at mgmiller.net  Tue Aug 25 21:06:55 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Tue, 25 Aug 2015 12:06:55 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55D65E4F.1040608@mgmiller.net>
References: <55D65E4F.1040608@mgmiller.net>
Message-ID: <55DCBCCF.2010502@mgmiller.net>

TL;DR: (Version 2, hopefully more clear)

Let's discuss whether to make "doing the right thing as easy as doing
the wrong thing" a desired goal for string interpolation.

Details -- we could:

1) Automatically escape potentially dangerous input variables to
   sensitive functions, or

2) Make developers do it the hard way, making them completely
   responsible for safety, and always responsible. (Knowing that often
   they don't).

3) Some combination of the two.

A trivial implementation of 1) is below.
Instead of rendering the string immediately, it is deferred until use,
with template and parameters stashed inside an object, allowing the
receiver to specify escaping/quoting rules.

---------------------------------

Let's call these e-strings (for expression), as it's easier to refer to
the letter of the proposals than three digit numbers.

So, an e-string looks like an f-string, though at compile-time, it is
converted to an object instead (like an i-string):

    print(e'Hello {friend}, filename: {filename}.')  # converts to ==>

    print(estr('Hello {friend}, filename: {filename}.', friend=friend,
               filename=filename))

An estr is a subclass of str, therefore able to do the nice things a
string can do. Rendering is deferred until the variable is used, and it
also has a .raw member, escape(), and translate() methods:

    class estr(str):
        # init: saves self.raw, args, kwargs for later
        # methods, ops render it
        # def escape(self, escape_func):  # handles escaping
        # def translate(self, template, safe=True):  # optional i18n support

To make it as simple as possible to use by end-developers, it:

1) Doesn't require str() to be run explicitly, it renders itself when
   needed via its various methods and operators. Look for .raw, if you
   need the original. Also,

2) A bit of responsibility is pushed to stdlib/pypi. In a handful of
   sensitive places, the object is checked beforehand and escaped when
   needed:

    # imagine html, db, subprocess input etc.
    def sensitive_func_that_escapes(input):
        if isinstance(input, estr):
            input = input.escape(shlex.quote)  # each chooses its own rules
        do_something(input)

This means numerous callers using e-strings won't have to do explicit
escaping, only a handful of callee libraries will--which is common with
database apis, for example. What is easiest to type is now safe as well:

    sensitive_func_that_escapes(e'user input: {input}')  # sleep easy

This could enable the safety and features we'd like, without burdening
the everyday user.
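For anyone who wants to poke at the idea without the e'' syntax, the following is a minimal sketch of what such a class might look like. It is not the script linked below: it renders eagerly in __new__ for simplicity (an assumption of this sketch, not the deferred behaviour described above), leaves out translate(), and only shows how one stored template can be re-rendered under different escaping rules chosen by the receiver.

```python
import html
import shlex


class estr(str):
    """Sketch of an e-string: a str that remembers template and values."""

    def __new__(cls, raw, **kwargs):
        # Render eagerly so the object is usable anywhere a str is expected.
        self = super().__new__(cls, raw.format(**kwargs))
        self.raw = raw        # the original, unrendered template
        self.kwargs = kwargs  # the interpolated values, kept for escaping
        return self

    def escape(self, escape_func):
        # Apply the receiver's escaping rule to each value, then re-render.
        escaped = {k: escape_func(str(v)) for k, v in self.kwargs.items()}
        return self.raw.format(**escaped)


# What e'Hello {friend}, filename: {filename}.' might expand to:
msg = estr('Hello {friend}, filename: {filename}.',
           friend='<b>John</b>', filename='somefile; rm -rf ~')

plain = str(msg)
for_shell = msg.escape(shlex.quote)
for_html = msg.escape(html.escape)
```

Each receiver picks its own escape function; the caller only ever writes the template once.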
I've created a sample script to demonstrate at: https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_example.py Here is the output: # consider: e'Hello {friend}, filename: {filename}.' friend: 'John' filename: "somefile; rm -rf ~ 'foo' " original: Hello {friend}, filename: {filename}. w/ print(): Hello John, filename: somefile; rm -rf ~ 'foo' . shell escape: Hello John, filename: 'somefile; rm -rf ~ '"'"'foo'"'"' '. html escape: Hello John, filename: somefile; rm -rf ~ 'foo' <html>. sql escape: Hello "John", filename: "somefile; rm -rf ~ 'foo' ". logger DEBUG Hello John, filename: somefile; rm -rf ~ 'foo' . upper+encode: b"HELLO JOHN, FILENAME: SOMEFILE; RM -RF ~ 'FOO' ." translated?: Hola John, archivo: somefile; rm -rf ~ 'foo' . Is this automatic escaping desired? Or should we continue to make the end-developer fully responsible for escaping input? -Mike From python-ideas at mgmiller.net Tue Aug 25 21:42:40 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Tue, 25 Aug 2015 12:42:40 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DCBCCF.2010502@mgmiller.net> References: <55D65E4F.1040608@mgmiller.net> <55DCBCCF.2010502@mgmiller.net> Message-ID: <55DCC530.404@mgmiller.net> Here is another variation that renders the estr immediately, and makes a new copy when escaping: https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_example_immediate.py This would eliminate surprises or potential race-conditions, though it may hinder flexibility. 
-Mike From p.f.moore at gmail.com Tue Aug 25 22:01:53 2015 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 25 Aug 2015 21:01:53 +0100 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DCADB3.5010906@mgmiller.net> References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> <55DB94F6.8050001@mgmiller.net> <55DCADB3.5010906@mgmiller.net> Message-ID: First of all, please accept my apologies for anything that's already been explained that I've missed. This thread is huge and confusing, and honestly I don't have the time to do much more than skim it. I'm certainly commenting without doing all the research - I'm trying to avoid gross errors, but how well I succeed I can't say. On 25 August 2015 at 19:02, Mike Miller wrote: > > On 08/25/2015 01:29 AM, Paul Moore wrote: >>> >>> Remember Nick Coghlan's statement that the "easy way should be the right >>> way"? That's what this is trying to accomplish. >> >> But the right way is not to use os.system, so I don't *want* it to be > > Ok, a few hours before someone complained to Nick that he was using > subprocess.call as an example when it didn't completely apply. So I moved > to the other alternative example that could be helped, os.system. I have no > particular love for it, and am not recommending it. It was just one > function out of many that needs input to be escaped as far as I was > concerned. I understand your point about "one function of many". Ignoring its flaws for a moment, I understand that os.system is an easily understood one of the many. What I'm not clear about is whether under your proposal, os.system would need to be *changed* to accept e-strings, or whether it would work as it stands. Having just read through your PEP, I'm not sure I'm any the wiser. 
I think what you're saying is that x = 12 foo(e'number = {x}') will be translated *at compile time* to x = 12 foo(''.join(['number = ', str(x)])) But that seems to leave no way for anything to "safely quote" x. And there's no obvious way foo can influence the quoting if it *doesn't* need rewriting to know about e-strings. So you seem to be saying that e-strings will only be "safely quoted" if the function using them knows to do so. Which leads back to the question of how a user can *know* that an e-string will be safe or not when used with a particular function. That's not making the right way easy, it's giving function writers a new way to make the right thing easy - but only if they target Python versions with e-strings only, or they offer two options, one for e-strings and one for old-style strings. That's likely to make it *harder* for the end user to chose the safe option. I'm sure I'm missing something fundamental in the above, because that's so far away from offering the benefits you're suggesting. But I can't work out what. > I didn't forsee that that the function would be focused on to the point of > the derailing the idea. I suppose I'll try again if you'll bear with me. Sorry. Maybe a better idea would be to show how someone would need to write a safe os.system. If you're saying "you can just use os.system unchanged from the current version", then see above - I don't understand how. >> If your e-string proposal works for this (via some consequence of >> implicitly calling str()) then it may still be on the cards - but the >> need for explicit str() calls in pathlib is a source of frustration > > > In my original message (of this sub-thread) this is one of the main > paragraphs: > >> To make it as simple as possible to use by end-developers, it 1) doesn't >> require >> str() to be run explicitly, it renders itself when needed via its various >> methods and operators. Look for .raw, if you need the original. Please explain "renders itself". 
What "rendering" is done when it's passed as an argument to a function (e.g. os.system)? Put this another way, what is type(e'foo {x}')? If it's not str, then at least some code (notably os.system in your example, as it wants "safe quoting") using e-strings will need to know about them. If it *is* str, then I'm baffled, as above. > Also if you check the example script at the bitbucket url, you'll see it is > the case, though I've not yet implemented every case. Sorry, I couldn't work out what bitbucket URL you meant. If you meant your PEP, there's a lot of code samples in there, but I'm not clear which bit you mean :-( Paul From guido at python.org Tue Aug 25 22:48:53 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 25 Aug 2015 13:48:53 -0700 Subject: [Python-ideas] Forward-References & Out-of-order declaration In-Reply-To: <55DCADFC.4030806@mail.de> References: <55DCADFC.4030806@mail.de> Message-ID: On Tue, Aug 25, 2015 at 11:03 AM, Sven R. Kunze wrote: > While reading material for implementing some kind of import hooks and so > forth for the xfork package, I came across this one: > > http://stackoverflow.com/questions/16907186/python-model-inheritance-and-order-of-model-declaration > > Reactivated in my mind, after reading Prof. Humbert's request of adding a > "more forwarded" referencing of classes, I just wanted to ask: > > 1) What is the general opinion regarding having a more declarative style > when writing modules? > It would be a nice idea for a different language. Python's execution model would have to be changed dramatically in order to support this (outside the very narrow case of annotations or the cases where it already works). This just isn't going to happen -- too much code relies on the existing execution model. > 2) That given, what is the general opinion about introducing the > out-of-order declaration asked for in the StackOverflow posts? > I couldn't extract an actual proposal from the stackoverflow link you gave. 
-- 
--Guido van Rossum (python.org/~guido)

From abarnert at yahoo.com Tue Aug 25 23:17:22 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Tue, 25 Aug 2015 14:17:22 -0700
Subject: [Python-ideas] Forward-References & Out-of-order declaration
In-Reply-To: <55DCADFC.4030806@mail.de>
References: <55DCADFC.4030806@mail.de>
Message-ID:

On Aug 25, 2015, at 11:03, Sven R. Kunze wrote:
>
> While reading material for implementing some kind of import hooks and so forth for the xfork package, I came across this one:
> http://stackoverflow.com/questions/16907186/python-model-inheritance-and-order-of-model-declaration
>
> Reactivated in my mind, after reading Prof. Humbert's request of adding a "more forwarded" referencing of classes, I just wanted to ask:
>
> 1) What is the general opinion regarding having a more declarative style when writing modules?
> 2) That given, what is the general opinion about introducing the out-of-order declaration asked for in the StackOverflow posts?

The fact that the global namespace is imperative rather than declarative
means that you can use functions and classes in defining later functions,
classes, and constants. This is essential for features like decorators,
metaclasses, dynamically-generated types like namedtuples, etc.

Sure, you could probably come up with ways to replace all of those features
with similar features that didn't require the existing implementation (make
everything lazy, use a two-level store, or just come up with special-purpose
workarounds for each feature), or require users to do things differently
(e.g., you can only use decorators imported from a separate scope), but that
would be a pretty different language.
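As an illustration of the point about the imperative namespace (a small sketch, with `traced`, `Point` and `origin_distance` being made-up names), each module-level statement below depends on a name bound by an earlier one:

```python
from collections import namedtuple

# Each statement runs in order, so later definitions can use earlier ones.
Point = namedtuple('Point', ['x', 'y'])   # a class created by a function call


def traced(func):
    # An ordinary function, defined first, then used as a decorator below.
    def wrapper(*args):
        wrapper.calls += 1
        return func(*args)
    wrapper.calls = 0
    return wrapper


@traced                      # requires 'traced' to exist at this point
def origin_distance(p):
    return (p.x ** 2 + p.y ** 2) ** 0.5


ORIGIN = Point(0, 0)         # a constant built from the class defined above
print(origin_distance(Point(3, 4)), origin_distance.calls)
```

Reordering these statements so that a use precedes its definition raises NameError when the module runs, which is the behaviour a purely declarative model would have to replace.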
From p.f.moore at gmail.com Tue Aug 25 23:19:48 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 25 Aug 2015 22:19:48 +0100
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DCBCCF.2010502@mgmiller.net>
References: <55D65E4F.1040608@mgmiller.net> <55DCBCCF.2010502@mgmiller.net>
Message-ID:

On 25 August 2015 at 20:06, Mike Miller wrote:
> This means numerous callers using e-strings won't have to do explicit
> escaping, only a handful of callee libraries will--which is common with
> database apis, for example. What is easiest to type is now safe as well::
>
>     sensitive_func_that_escapes_input(e'user input: {input}') # sleep easy

OK. The issue here is that if the user mistakenly calls a function that
*doesn't* escape its input, expecting that it will, there will be a silent
vulnerability. The problem isn't what I thought it was, using the wrong type
of string, it's more about using the wrong function.

Of course, having two functions one of which is e-string aware and safe, and
one of which isn't, and is unsafe, is a pretty bad API. Or is it? Developers
will for quite a long time have to deal with providing compatibility for
versions of Python with and without e-strings. Consider pyinvoke -
invoke.run() runs shell commands, much like os.system. Suppose version X of
pyinvoke adds e-string support. If I write a program using e-strings and
invoke.run, it's safe for people with pyinvoke version X installed, but
unsafe if my users have version X-1 installed. That's a pretty nasty bug.

I honestly have no idea how significant this risk is. But it's something
that should be considered when claiming that the proposal makes it "hard to
do the wrong thing".

Paul.
From abarnert at yahoo.com Tue Aug 25 23:24:45 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 25 Aug 2015 14:24:45 -0700 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <20150825165659.GL3881@ando.pearwood.info> References: <55DA9B63.3010208@uni-wuppertal.de> <20150825025209.GK3881@ando.pearwood.info> <3EF5B4F2-FE26-4E5C-BBF5-A10175DDF454@yahoo.com> <20150825165659.GL3881@ando.pearwood.info> Message-ID: On Aug 25, 2015, at 09:56, Steven D'Aprano wrote: > >> On Tue, Aug 25, 2015 at 01:35:24AM -0700, Andrew Barnert wrote: >>> On Aug 24, 2015, at 19:52, Steven D'Aprano wrote: >>> >>> I agree that is desirable, but surely many languages have some sort of >>> forward declaration syntax? I know that both the Pascal and C families >>> of languages do. >> >> What would a forward declaration mean in Python? > > I thought it was obvious from context, not to mention from the example > given by the OP. I thought it was obvious, until you brought up C and Pascal, whose forward references are a pretty different thing from what PEP 484 and the OP's example imply, and whose compilation process is radically different from Python's. If you meant the same thing as the PEP, then the shorter answer is: I don't think there's anything useful to learn from C here. I think people have a sense of what it would mean to do what the OP wants, or at least more so than what it would mean to port the vaguely similar idea from C. From abarnert at yahoo.com Tue Aug 25 23:34:20 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 25 Aug 2015 14:34:20 -0700 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: References: <55DA9B63.3010208@uni-wuppertal.de> Message-ID: <4287F220-6B33-431E-B2E1-61D12C078D3B@yahoo.com> On Aug 25, 2015, at 09:19, Erik Bray wrote: > >> On Tue, Aug 25, 2015 at 11:15 AM, Terry Reedy wrote: >> 2. 
Class statements would initially create an empty class bound to the class >> name. This could break back compatibility, and would require cleanup in >> case of a syntax error in the body. This would be similar to import >> statements initially putting a empty module in sys.modules to support >> circular imports. This is messy and still bug prone is use. > > I have been thinking about this lately in a different context, and I > would very much favor this approach. I think in large part because it > works this way for modules it would make sense for it to work for > classes as well. The fact that ClassName is bound to an object that > will *eventually* become the class as soon as the parser has read in: > > class ClassName: > > represents, to me (and I would suspect to many students as well), the > least astonishment. The problem here is, what if someone writes this: def __init__(self, left: Tree, right: Tree): # something with left.left Or: @classmethod def maketree(cls): return Tree(None, None) Here, Tree is "defined", but the type checker can't actually infer the type of left.left or the arguments of Tree's constructor (even if __init__ was defined before maketree). There are various ways you could special-case things to deal with this problem. The simplest would be that a forward-declared class just has no methods or other attributes, or maybe that it has only the ones inherited from superclasses or metaclasses, until the definition is completed, but my naive intuition says that it's obvious what both of the above mean, and the only reason I'd expect it to be an error is by understanding how it has to work under the covers. -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed Aug 26 00:11:09 2015 From: srkunze at mail.de (Sven R. 
Kunze) Date: Wed, 26 Aug 2015 00:11:09 +0200 Subject: [Python-ideas] Forward-References & Out-of-order declaration In-Reply-To: References: <55DCADFC.4030806@mail.de> Message-ID: <55DCE7FD.1050105@mail.de> On 25.08.2015 22:48, Guido van Rossum wrote: > On Tue, Aug 25, 2015 at 11:03 AM, Sven R. Kunze > wrote: > > While reading material for implementing some kind of import hooks > and so forth for the xfork package, I came across this one: > http://stackoverflow.com/questions/16907186/python-model-inheritance-and-order-of-model-declaration > > Reactivated in my mind, after reading Prof. Humbert's request of > adding a "more forwarded" referencing of classes, I just wanted to > ask: > > 1) What is the general opinion regarding having a more declarative > style when writing modules? > > > It would be a nice idea for a different language. Python's execution > model would have to be changed dramatically in order to support this > (outside the very narrow case of annotations or the cases where it > already works). This just isn't going to happen -- too much code > relies on the existing execution model. I can totally understand that and I am not a friend of hard cut/dramatic changes and so forth myself. It's just that I was pondering multiple times over this and I am usual sort of: "there must be a way and it must work smoothly". What I asked about is basically a reduction of functionality. Thus, the result of that reduction is a subset of Python and so still valid Python. So, I can think of two ways in order to test it out: - another keyword instead of 'import' (e.g. 'use') that does not actually execute the module but construct its dict from its AST or - mark the modules alike the pyxl magic encoding you referred to lately and thus enforce this sort of declaration model Don't get me wrong. Most of the time, I love Python's spontaneous nature of getting things done on the module level; and I never want that to go away. 
However, once your code base matures, you need to shape your code the right way and massage it in some separate packages and modules. Most of these libs are just of a declarative style (e.g. 5 classes and 3 functions, period); nothing more is necessary here. Also the issue of failing cyclic imports can be solved by some sort of "restricted/declarative importing". Not a huge issue by itself and most of the time, one can fix it by scratching one's head enough; it's time-consuming nevertheless. > 2) That given, what is the general opinion about introducing the > out-of-order declaration asked for in the StackOverflow posts? > > > I couldn't extract an actual proposal from the stackoverflow link you > gave. Not sure if that makes sense anymore, but I think it's worth at least to get the idea across: The issue (not a huge problem again but annoying from time to time) is that the order of declaration in a module actually matters. IIRC other modern languages like C# don't require you do actually care about this anymore. Possible example (for whatever reason an author wants to do that -- also cf. stackoverflow): class UseThis(Base): pass class UseThat(Base): pass class Base: pass In that regard, Python feels a bit rusty. Best, Sven -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-ideas at mgmiller.net Wed Aug 26 00:36:38 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Tue, 25 Aug 2015 15:36:38 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DCBCCF.2010502@mgmiller.net> Message-ID: <55DCEDF6.6030602@mgmiller.net> Ok, I think the automatic-escaping part of this idea is dead. Though well-intentioned, it creates some uncertainty. The e-string object and .escape(escape_function) method could still be useful for manual use though, do you agree? 
-Mike On 08/25/2015 02:19 PM, Paul Moore wrote: From p.f.moore at gmail.com Wed Aug 26 01:12:50 2015 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 26 Aug 2015 00:12:50 +0100 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DCEDF6.6030602@mgmiller.net> References: <55D65E4F.1040608@mgmiller.net> <55DCBCCF.2010502@mgmiller.net> <55DCEDF6.6030602@mgmiller.net> Message-ID: On 25 August 2015 at 23:36, Mike Miller wrote: > The e-string object and .escape(escape_function) method could still be > useful for manual use though, do you agree? I'm not sure. The principle of having something like that makes sense (more than just sense, it's highly useful), but DB-api functions have been more or less doing that for years with the cursor("select * from foo where bar = ?") approach. I'm not clear how much advantage new syntax gives. I'll have to actually read the proposal in more detail to really say. Paul From ncoghlan at gmail.com Wed Aug 26 01:55:24 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 26 Aug 2015 09:55:24 +1000 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DCBCCF.2010502@mgmiller.net> Message-ID: On 26 August 2015 at 07:19, Paul Moore wrote: > Of course, having two functions one of which is e-string aware and > safe, and one of which isn't, and is unsafe, is a pretty bad API. Or > is it? Developers will for quite a long time have to deal with > providing compatibility for versions of Python with and without > e-strings. Consider pyinvoke - invoke.run() runs shell commands, much > like os.system. Suppose version X of pyinvoke adds e-string support. > If I write a program using e-strings and invoke.run, it's safe for > people with pyinvoke version X installed, but unsafe if my users have > version X-1 installed. That's a pretty nasty bug. > > I honestly have no idea how significant this risk is. 
But it's > something that should be considered when claiming that the proposal > makes it "hard to do the wrong thing". Right, injection is number 1 on the OWASP top 10 list for a reason: https://www.owasp.org/index.php/Top_10_2013-A1-Injection The problem is that "things you want to make easy for a developer to do" often necessarily translates to "things you make easy for a developer to do with untrusted user supplied data". Unfortunately, it isn't generally viable to make the paranoid behaviour the default if "empowering and easy to learn" are two of your language design goals, as it means you end up not trusting the *developer*, and make them jump through annoying hoops just to get things done on their own local system. There *are* languages that work that way, but "we're protecting you from problems you don't know you have yet" is generally a poor sales pitch when someone is just trying to write their first "Hello World!" app (it's still a good goal, but it needs to be unobtrusive). Thus, the trick you want to pull off is: 1. Make the wrong thing relatively easy for a security scanner (or the mark 1 human eyeball) to detect 2. Make the right thing a simple mechanical change away from the wrong thing 3. Make the right thing just as easy to read as the wrong thing so folks don't resent having to switch That's the line I now want to walk with f-strings vs i-strings: given a static analyser with a list of APIs that it deems to be security sensitive, it can say "passing an f-string here is wrong, and a plain string is dubious, but an i-string is OK". 
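To make the distinction concrete, here is a rough sketch of an API that dispatches on the type of its argument. Everything in it is made up for illustration: `DeferredTemplate` stands in for an i-string object, `build_command` for a security sensitive API, and the choice of shlex.quote is just one possible quoting rule.

```python
import shlex
from functools import singledispatch


class DeferredTemplate:
    """Hypothetical stand-in for an i-string: template plus raw values."""

    def __init__(self, template, **values):
        self.template = template
        self.values = values


@singledispatch
def build_command(command):
    # Made-up security-sensitive API; unknown types are refused.
    raise TypeError(f'unsupported command type: {type(command).__name__}')


@build_command.register
def _(command: str):
    # Already-rendered text: nothing can be escaped here any more.
    return command


@build_command.register
def _(command: DeferredTemplate):
    # Deferred interpolation: the API applies its own quoting rules.
    quoted = {k: shlex.quote(str(v)) for k, v in command.values.items()}
    return command.template.format(**quoted)


name = 'a b; echo oops'
print(build_command('ls {}'.format(name)))                       # eager: unquoted
print(build_command(DeferredTemplate('ls {name}', name=name)))   # quoted
```

The two calls differ visibly at the call site, which is what lets a reader or a scanner tell eager from deferred interpolation.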
Hiding the difference between eager interpolation and deferred interpolation
from the developer is a non-goal from my perspective - it makes it too hard
to glance at a piece of code and say "yes, that's a security sensitive API,
but it's using deferred interpolation, so it's likely OK (and if not, that's
a bug in the security sensitive API)" or "hmm, that's using eager
interpolation with a sensitive API, that could be an issue, we should look
closer and consider switching to deferred interpolation here".

I'm also going to switch to using completely made up API names, since folks
otherwise anchor on "but that's not the way that API currently works" without
accounting for the fact that APIs can be updated to dispatch to different
behaviours based on the types of their arguments :)

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From stephen at xemacs.org Wed Aug 26 02:26:02 2015
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 26 Aug 2015 09:26:02 +0900
Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal
In-Reply-To: <55DCA3FC.1040105@mail.de>
References: <55DA9B63.3010208@uni-wuppertal.de> <55DC914D.5010603@uni-wuppertal.de> <55DCA3FC.1040105@mail.de>
Message-ID: <874mjm7v51.fsf@uwakimon.sk.tsukuba.ac.jp>

Sven R. Kunze writes:

 > I still think, only looking at recursive ADTs, it is enough for them
 > to write
 >
 > class Tree:
 >     def __init__(self, left_tree, right_tree):
 >         self.left_tree = left_tree
 >         self.right_tree = right_tree
 >
 > This way, you can teach them about code style, proper variable names and
 > so forth.

Personally, I think that's ugly and unnecessarily verbose. YMMV, but I don't
see why anybody else's sense of style should bow to yours.

Also, if you're using Python, your abstraction is broken.
I suspect students will write from tree import Tree my_tree = Tree(1, Tree(2, 3)) *which should fail* because Tree is not a union type, but your "ADTs by naming convention" approach can't catch that. Of course, in a real program "Although practicality beats purity" would argue that's a feature, but if you're teaching ADTs it's a bug. Also of course, under strict typing Prof. Humbert's example cannot be instantiated (since there's no default for the subtrees, you need to create an infinitely deep Tree of Trees of Trees ...). I doubt he intended that, but when students graduate to languages like Haskell, they're going to need to understand that kind of thing. From stephen at xemacs.org Wed Aug 26 04:14:35 2015 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 26 Aug 2015 11:14:35 +0900 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <55DC914D.5010603@uni-wuppertal.de> References: <55DA9B63.3010208@uni-wuppertal.de> <55DC914D.5010603@uni-wuppertal.de> Message-ID: <8737z67q44.fsf@uwakimon.sk.tsukuba.ac.jp> Prof. Dr. L. Humbert writes: > 1st class pedagogical/didactical thinking ? > Consider: there are recursive defined ADTs and we want to enable > students to understand concepts and produce python-code to realize, what > they understood. I think we all understand your point. The problem is that Python as designed is simply not capable of that. You have a choice: python-code with its "class statements bind names to (fully-constructed) class objects" semantics, or pseudo-python with "class statements are declarations" semantics. Python simply isn't a declarative language in that sense. I don't have any objection to a language with different semantics, but I find Python's semantics very consistent (except for module import -- which is improved a lot thanks to Brett -- and the occasional class which creates attributes in __getattr__ -- which I dislike for this reason). 
I doubt Python-Dev will want to give up that consistency; I know I don't
want to.

Although the PEP doesn't explicitly say it's a good practice, AFAICS using
the name of a type (ie, a string) is supported everywhere a type identifier
is. (Explicitly permitted are *forward* references to *undefined* types.
However, in the "Django" example where "A.a" is the name of a type defined in
module A, "B.b" is defined in B, and each uses the other, in

    import A
    import B   # refers to A.a using the string "A.a"

A.a is an existing type. I conclude that unless the implementation is
excessively complicated, it's permitted to refer to already defined types
using the string name.)

In other words, although

    class Leaf():
        def __init__(self, value: int):
            self.value = value

    class Tree:   # with leaves
        def __init__(self, left: Union['Tree', Leaf],
                     right: Union['Tree', Leaf]):
            self.left = left
            self.right = right

is indeed non-orthogonal and ugly, you could declare the constructors

        def __init__(self, value: 'int'):

        def __init__(self, left: 'Union[Tree, Leaf]',
                     right: 'Union[Tree, Leaf]'):

using actual names of types (strings like "'Tree'") instead of bound names
(identifiers like "Tree") everywhere. That may not be quite as clean as
you'd like, but it seems orthogonal enough to me: you have a consistent
syntax for all type annotations.

I'd also point out that your own notation isn't quite orthogonal: self isn't
annotated in your method definitions. If students can handle the special
syntax for "self", I suppose that they can handle a special syntax for
recursively defined types (or you could use Steven d'A's approach of a
placeholder class "RecursivelyDefined", which would require augmenting the
typechecker to recognize it).
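A runnable version of the all-strings spelling, as a sketch: it assumes the classes live at module level and uses typing.get_type_hints to show that the string annotations resolve to the real classes once both are defined.

```python
from typing import Union, get_type_hints


class Leaf:
    def __init__(self, value: 'int'):
        self.value = value


class Tree:
    # 'Tree' is not bound yet while this body runs, so the annotation
    # is written as a string and resolved later, on request.
    def __init__(self, left: 'Union[Tree, Leaf]', right: 'Union[Tree, Leaf]'):
        self.left = left
        self.right = right


hints = get_type_hints(Tree.__init__)
print(hints['left'] == Union[Tree, Leaf])
```

The annotations are only looked up when something asks for them, so the class statement itself never needs the name Tree to exist.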
Steve From ron3200 at gmail.com Wed Aug 26 04:20:42 2015 From: ron3200 at gmail.com (Ron Adam) Date: Tue, 25 Aug 2015 21:20:42 -0500 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> Message-ID: On 08/24/2015 09:42 PM, Eric V. Smith wrote: >> On Aug 24, 2015, at 10:23 PM, Ron >> Adam wrote: >>> >>> On 08/24/2015 06:45 PM, Mike Miller wrote: >>>>>>> - How problematic will it be that an e-string pins all >>>>>>> the interpolated objects in memory for its lifetime? >>>>> >>>>> It will be an object holding a raw template string, and a >>>>> number of variables. In normal usage I don't suspect it to be >>>>> a problem. >>> >>> If an objects __str__ method could have an optional fmt='spec' >>> argument, then an estring, could just hold strings, and not the >>> object references. That also prevent surprises if the object is >>> mutated between the time it's estring is created and when the >>> estring is used as a string. For that matter it prevents an >>> estring from printing one way at one time, and another at another >>> time. >>> >>> I don't know if the fomatting can be split like this... Where an >>> object is formatted to a string representation, and then that is >>> formatted to a field specification. The later being things like >>> width, fill, right, center, and left. These are independent of >>> the object and belong to the string. Things like nubmer of >>> places and sign or to use leading or trailing zeros is part of >>> the object being converted to a string. > It's not possible. For examples, look at all of the number format > options. How would you implement hex conversions? Or datetime %A? I'm not sure which part you are referring to.. But I think adding an optional argument to __str__ methods is probably out. 
As to splitting the format spec, I think it would be possible, but it may
not be needed.

I still think early evaluation is a must here. The issue I have with the
late evaluation is shown in your current example of logging. If the time,
which may be from an actual time() function rather than a fixed time, is not
evaluated until the logged list is printed at the end of the run, all the
times will be set to when it's printed rather than when the logged event
happened.

Another similar reason is the evaluated expression is sensitive to what
object is in the name at the time it is evaluated. If it's evaluated later,
the object from the name lookup may be something entirely unexpected because
that name may have been reused during each iteration of a loop. So all the
logged entries that refer to that name will give the last value rather than
the value at the time the event was logged.

Here's a slightly reworked version to compare to.

Hope this is helpful,
    Ron


import sys
import _string

def interleave(*iters):
    result = []
    for items in zip(*iters):
        for item in items:
            result.append(item)
    return result

# i-string
class i:
    def __init__(self, s):
        self.s = s
        locals = sys._getframe(1).f_locals
        globals = sys._getframe(1).f_globals
        self.literals = []
        self.values = []
        # Evaluate the expressions now, and remember them.
        # This freezes the value at execution time.
        for literal, expr, format_spec, conversion in \
                _string.formatter_parser(self.s):
            self.literals.append(literal)
            if expr:
                # eval() takes globals first, then locals.
                value = eval(expr, globals, locals)
                self.values.append(value.__format__(format_spec))
            else:
                self.values.append('')

    def __str__(self):
        return ''.join(interleave(self.literals, self.values))

# f-string
def f(s):
    return str(i(s))

# logging
def log(istring, echo=True):
    logged = 'log:' + str(istring)
    if echo:
        print(logged)
    return logged

# test
if __name__ == '__main__':
    x = i('Version in caps {sys.version.upper()!r}')
    print(str(x))

    name = 'Eric'
    dog = 'Fido'
    s = f('My name is {name}, my dog is {dog}')
    print(repr(s))
    assert repr(s) == "'My name is Eric, my dog is Fido'"
    assert type(s) == str

    import datetime

    def func(value):
        return i('called func with "{value:10}"')

    logline = 'as of {now:%Y-%m-%d} the value is {400+1:#06x}'
    now = datetime.datetime(2015, 8, 10, 12, 13, 15)
    logged = log(i(logline), echo=True)
    assert logged == "log:as of 2015-08-10 the value is 0x0191"

    now = datetime.datetime(2015, 8, 11, 12, 13, 15)
    logged = log(i(logline), echo=True)
    assert logged == "log:as of 2015-08-11 the value is 0x0191"

    logged = log(i('{func(42)}'))
    assert logged == 'log:called func with "        42"'

    import re
    delimiter = '+'
    trailing_re = re.escape(r'\S+')
    regex = i(r'{delimiter}\d+{delimiter}{trailing_re}')
    print(regex)
    assert str(regex) == r"+\d++\\S\+"

From steve at pearwood.info Wed Aug 26 05:15:12 2015
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 26 Aug 2015 13:15:12 +1000
Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal
In-Reply-To:
References: <55DA9B63.3010208@uni-wuppertal.de> <20150825025209.GK3881@ando.pearwood.info> <3EF5B4F2-FE26-4E5C-BBF5-A10175DDF454@yahoo.com> <20150825165659.GL3881@ando.pearwood.info>
Message-ID: <20150826031511.GM3881@ando.pearwood.info>

On Tue, Aug 25, 2015 at 02:24:45PM -0700, Andrew Barnert wrote:
> On Aug 25, 2015, at 09:56, Steven D'Aprano wrote:
> >
> >> On Tue, Aug 25, 2015 at 01:35:24AM
-0700, Andrew Barnert wrote: > >>> On Aug 24, 2015, at 19:52, Steven D'Aprano wrote: > >>> > >>> I agree that is desirable, but surely many languages have some sort of > >>> forward declaration syntax? I know that both the Pascal and C families > >>> of languages do. > >> > >> What would a forward declaration mean in Python? > > > > I thought it was obvious from context, not to mention from the example > > given by the OP. > > I thought it was obvious, until you brought up C and Pascal, whose > forward references are a pretty different thing from what PEP 484 and > the OP's example imply, and whose compilation process is radically > different from Python's. If you meant the same thing as the PEP, then > the shorter answer is: I don't think there's anything useful to learn > from C here. In context, I was explicitly replying to the OPs comment about "needing" to annotate methods with the class object itself, rather than using a string, because "_one_ concept should work in different circumstances". I was pointing out that other languages make do with two concepts, and have their own ways of dealing with the problem of referring to something which doesn't exist yet. I wasn't suggesting that we copy what C, or any other language, does. To be honest, I thought that my post was pretty clear that far from thinking there is a problem to be solved, the use of string literals like 'Tree' is not just an acceptable solution to the problem, but it is an elegant solution to the problem. As I see it: - adding some sort of complicated, ad hoc special case to allow forward references would be a nasty hack and should be rejected; - large changes to the language (e.g. swapping to a two-pass compile process, to allow function and class hoisting) would eliminate the problem but break backwards compatibility and is a huge change for such a minor issue. 
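As a minimal sketch of what the string-literal approach looks like (the
Tree class and its method names here are assumed purely for
illustration):

```python
class Tree:
    """Hypothetical binary tree node, used only for illustration."""

    def __init__(self, value, left: 'Tree' = None, right: 'Tree' = None):
        self.value = value
        self.left = left
        self.right = right

    def leftmost(self) -> 'Tree':
        # 'Tree' is just a string here; the class object does not
        # exist until the whole class statement has finished running.
        node = self
        while node.left is not None:
            node = node.left
        return node


# The annotations are stored unevaluated, as plain strings.
print(Tree.leftmost.__annotations__)
```

A type checker can resolve the string once the class exists; at runtime
nothing is looked up until somebody asks for it.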
I don't see this as needing anything more than teaching the students how
Python's execution model actually works, plus a simple work-around for
annotations within a class (use the class name as a string).

-- 
Steve

From rosuav at gmail.com Wed Aug 26 06:19:03 2015
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 26 Aug 2015 14:19:03 +1000
Subject: [Python-ideas] Forward-References & Out-of-order declaration
In-Reply-To: <55DCE7FD.1050105@mail.de>
References: <55DCADFC.4030806@mail.de> <55DCE7FD.1050105@mail.de>
Message-ID:

On Wed, Aug 26, 2015 at 8:11 AM, Sven R. Kunze wrote:
> The issue (not a huge problem again but annoying from time to time) is that
> the order of declaration in a module actually matters. IIRC other modern
> languages like C# don't require you to actually care about this anymore.
>
> Possible example (for whatever reason an author wants to do that -- also cf.
> stackoverflow):
>
> class UseThis(Base):
>     pass
>
> class UseThat(Base):
>     pass
>
> class Base:
>     pass
>
> In that regard, Python feels a bit rusty.

Frankly, I don't have a problem with this. You get a mandate that
requires you to do what's good practice anyway: lay things out in a
logical order. In the same way that Python's use of indentation for
block structure is generally just enforcing what you'd have done
regardless of language, this requires that you sort things in
dependency order. That tends to mean that the first use of any name in
a module is its definition/source. Want to know what 'frobnosticate'
means? Go to the top of the file, search for it.

Having that enforced by the language is a restriction, but how often
does good code have to be seriously warped to fit into that model? Not
often, in my experience.
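
A small sketch of what the restriction looks like at runtime, reusing
the class names from the quoted example (the try/except and the error
variable are only there so the snippet runs to completion):

```python
# Class statements execute top to bottom, so a base class must already
# be bound to its name when a subclass statement runs.
try:
    class UseThis(Base):    # Base has not been defined yet
        pass
except NameError as exc:
    error = str(exc)


class Base:
    pass


class UseThat(Base):        # fine: Base exists by now
    pass


print(error)
```

Moving Base above its subclasses is the whole fix, which is the
"dependency order" layout described above.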
ChrisA From tjreedy at udel.edu Wed Aug 26 07:50:35 2015 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 26 Aug 2015 01:50:35 -0400 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <55DCAA77.30903@mail.de> References: <55DA9B63.3010208@uni-wuppertal.de> <55DCAA77.30903@mail.de> Message-ID: On 8/25/2015 1:48 PM, Sven R. Kunze wrote: > I think the main issue here is the gab between intuition and what the > compiler actually does. The following line: > > class MyClass: # first appearance of MyClass > > basically creates MyClass in the mind of the developer reading this > piece of code. I think the gap is less than you think ;-). Or maybe we think differently when reading code. Both human and compiler create the concept 'MyClass' (properly quoted) as an instance of the concept 'class'. In a static language like C, types are only concepts in the minds of programmers and compilers. There are no runtime char, int, float, or struct xyz objects, only the names or concepts. When the compiler is done, there are only bytes in a sense not true of Python. > Thus, he expects to be able to use it after this line. One can use the string 'MyClass' in an annotation, for instance, and eventually dereference it to the object after the object is created. A smart type checker could understand that 'MyClass' in annotations within the class MyClass statement means instances of the future MyClass object. A developer should not expect to use not-yet-existent attributes and methods of the object. -- Terry Jan Reedy From guettliml at thomas-guettler.de Wed Aug 26 09:07:59 2015 From: guettliml at thomas-guettler.de (=?UTF-8?B?VGhvbWFzIEfDvHR0bGVy?=) Date: Wed, 26 Aug 2015 09:07:59 +0200 Subject: [Python-ideas] Properties for classes possible? 
In-Reply-To:
References: <55D57989.1020704@thomas-guettler.de> <55DAECCA.70200@thomas-guettler.de>
Message-ID: <55DD65CF.6000504@thomas-guettler.de>

Here is the created issue: http://bugs.python.org/issue24941

Please let me know if something is missing.

  Thomas Güttler

On 24.08.2015 at 19:17, Terry Reedy wrote:
> On 8/24/2015 6:07 AM, Thomas Güttler wrote:
>>
>> On 20.08.2015 at 17:29, Guido van Rossum wrote:
>>> I think it's reasonable to propose @classproperty as a patch to
>>> CPython. It needs to be C code. Not sure about the
>>> writable version. The lazy=True part is not appropriate for the
>>> stdlib (it's just a memoize pattern).
>>
>> What's the next step?
>
> Open an issue on the tracker. Quote Guido's message above with list name, date, and thread name -- or pipermail archive
> url. Add python code below, or revision thereof, for someone to translate to C.
>
>> My knowledge of the programming language C is very limited. I am not
>> able to write a
>> patch for CPython.
>>
>> I could write a patch which looks like this:
>>
>> {{{
>> # From http://stackoverflow.com/a/5192374/633961
>>
>> class classproperty(object):
>>     def __init__(self, f):
>>         self.f = f
>>     def __get__(self, obj, owner):
>>         return self.f(owner)
>>
>> }}}
>

-- 
Thomas Guettler http://www.thomas-guettler.de/

From p.f.moore at gmail.com Wed Aug 26 10:17:51 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 26 Aug 2015 09:17:51 +0100
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55DCBCCF.2010502@mgmiller.net>
Message-ID:

On 26 August 2015 at 00:55, Nick Coghlan wrote:
> I'm also going to switch to using completely made up API names, since
> folks otherwise anchor on "but that's not the way that API currently
> works" without accounting for the fact that APIs can be updated to
> dispatch to different behaviours based on the types of their arguments
> :)

One advantage of the otherwise unfortunate
obsession about "that's not how os.system works" is that it did flag
up in my mind the issue of backward compatibility, in the form I noted
(what if version X of an API doesn't handle e-strings and so is
unsafe, but version X+1 does handle them and so is safe). Certainly
older versions being worse is a routine issue, but a dependency on
what version of a module is installed very definitely fails your "make
the wrong thing easy to detect" criterion.

One key advantage of the os.system -> subprocess.run migration is that
the wrong thing is easy to detect - if you're using os.system, or
you're not supplying a list, or you have shell=True, you're doing it
wrong.

Your second goal is fairly strongly in conflict with the first one, so
satisfying both of them is the major challenge (I'd personally drop 2
in favour of 1 without a second thought, but I don't have a large
codebase to maintain, so that's an easy choice for me).

Your third goal is fine, but a matter of personal taste. I actually
find subprocess.call([arg, ...]) more readable than
os.system("something or other"). Maybe auto-quoting would change my
mind, but in the first instance I'd probably just think of it as "yet
another quoting syntax whose limitations I have to remember".

Paul

From eric at trueblade.com Wed Aug 26 14:56:51 2015
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 26 Aug 2015 08:56:51 -0400
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To:
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com>
Message-ID: <55DDB793.1070906@trueblade.com>

On 8/25/2015 10:20 PM, Ron Adam wrote:
> On 08/24/2015 09:42 PM, Eric V. Smith wrote:
>>> On Aug 24, 2015, at 10:23 PM, Ron
>>> Adam wrote:
>>>>
>>>> On 08/24/2015 06:45 PM, Mike Miller wrote:
>>>>>>>> - How problematic will it be that an e-string pins all
>>>>>>>> the interpolated objects in memory for its lifetime?
>>>>>>
>>>>>> It will be an object holding a raw template string, and a
>>>>>> number of variables. In normal usage I don't suspect it to be
>>>>>> a problem.
>>>>
>>>> If an object's __str__ method could have an optional fmt='spec'
>>>> argument, then an estring could just hold strings, and not the
>>>> object references. That also prevents surprises if the object is
>>>> mutated between the time its estring is created and when the
>>>> estring is used as a string. For that matter it prevents an
>>>> estring from printing one way at one time, and another at another
>>>> time.
>>>>
>>>> I don't know if the formatting can be split like this... Where an
>>>> object is formatted to a string representation, and then that is
>>>> formatted to a field specification. The latter being things like
>>>> width, fill, right, center, and left. These are independent of
>>>> the object and belong to the string. Things like number of
>>>> places and sign or to use leading or trailing zeros is part of
>>>> the object being converted to a string.
>
>> It's not possible. For examples, look at all of the number format
>> options. How would you implement hex conversions? Or datetime %A?
>
> I'm not sure which part you are referring to.. But I think adding an
> optional argument to __str__ methods is probably out.

The part that's not possible is to have the format_spec always be
interpreted on a string object, even if the format_spec refers to a
different type (such as datetime).

> As to splitting the format spec, I think it would be possible, but It
> may not be needed.
>
> I still think early evaluation is a must here. The issue I have with
> the late evaluation is shown in your current example of logging. If the
> time which may be from an actual time() function rather than a fixed
> time is not evaluated until the logged list is printed at the end of the
> run, all the times will be set to when it's printed rather than when the
> logged event happened.
There are two things being evaluated: the expressions (the things inside the {}'s), and the value of the i-string (or whatever it's called here, I've lost track). The expressions would be evaluated immediately, when the i-string is created. This is identical to what would happen if, instead of being in an i-string, the expressions were written in Python code. The value of the i-string would be evaluated later, such as when str() or log() or whatever evaluated the contents of the string. This is what my example on bitbucket does. See i.__init__ for eval(), where the expressions are evaluated. Then later, i.join() actually evaluates the content of the string. Note that evaluating the i-string need not result in a string as the result. See the regex example. The 'i' class needs better support for this, but it's doable. Adding that is on my list of things to do, once I have a better API thought out. > Another similar reason is the evaluated expression is sensitive to what > object is in the name at the time it is evaluated. If it's evaluated > later, the object from the name look up may be something entirely > unexpected because that name may have been reused during each iteration > of a loop. So all the logged entries that refer to that name will give > the last value rather than the value at the time the event was logged. Sure. Currently: logging.info('the time is %s', datetime.datetime.now()) Evaluates the current time immediately, but builds up the string later. That's equivalent to what this would do in my bitbucket log.py example: msg = i("the time is {datetime.datetime.now()}") log.log(msg) Also, see test_i in simple.py, again on bitbucket. It shows that changing the values after an i-string is created has no effect on the contents of the i-string. This would be different if the values were mutable, of course. I'll add a test for that to show what I mean. I think your example below is a functional subset of what I have on bitbucket. 
The only real distinction is that I can do substitutions from
a different string, using the expressions that were originally evaluated
when the i-string was constructed. This is needed for the i18n case. I
realize i18n might never use this, but it's a useful thought experiment
in any case.

Eric.

> Here's a slightly reworked version to compare to.
>
> Hope this is helpful,
>     Ron
>
>
> import sys
> import _string
>
> def interleave(*iters):
>     result = []
>     for items in zip(*iters):
>         for item in items:
>             result.append(item)
>     return result
>
>
> # i-string
> class i:
>     def __init__(self, s):
>         self.s = s
>         locals = sys._getframe(1).f_locals
>         globals = sys._getframe(1).f_globals
>         self.literals = []
>         self.values = []
>         # Evaluate the expressions now, and remember them.
>         # This freezes the value at execution time.
>         for literal, expr, format_spec, conversion in \
>                 _string.formatter_parser(self.s):
>             self.literals.append(literal)
>             if expr:
>                 # eval() expects globals before locals.
>                 value = eval(expr, globals, locals)
>                 self.values.append(value.__format__(format_spec))
>             else:
>                 self.values.append('')
>
>     def __str__(self):
>         return ''.join(interleave(self.literals, self.values))
>
>
> # f-string
> def f(s):
>     return str(i(s))
>
>
> # logging
> def log(istring, echo=True):
>     logged = 'log:' + str(istring)
>     print(logged)
>     return logged
>
>
> # test
>
> if __name__ == '__main__':
>
>     x = i('Version in caps {sys.version.upper()!r}')
>     print(str(x))
>
>     name = 'Eric'
>     dog = 'Fido'
>     s = f('My name is {name}, my dog is {dog}')
>     print(repr(s))
>     assert repr(s) == "'My name is Eric, my dog is Fido'"
>     assert type(s) == str
>
>     import datetime
>     def func(value):
>         return i('called func with "{value:10}"')
>
>     logline = 'as of {now:%Y-%m-%d} the value is {400+1:#06x}'
>     now = datetime.datetime(2015, 8, 10, 12, 13, 15)
>     logged = log(i(logline), echo=True)
>     assert logged == "log:as of 2015-08-10 the value is 0x0191"
>
>     now = datetime.datetime(2015, 8, 11, 12, 13, 15)
>     logged = log(i(logline), echo=True)
>     assert logged == "log:as of 2015-08-11 the value is 0x0191"
>
>     logged = log(i('{func(42)}'))
>     assert logged == 'log:called func with "        42"'
>
>     import re
>     delimiter = '+'
>     trailing_re = re.escape(r'\S+')
>     regex = i(r'{delimiter}\d+{delimiter}{trailing_re}')
>     print(regex)
>     assert str(regex) == r"+\d++\\S\+"
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

From ron3200 at gmail.com Wed Aug 26 16:51:47 2015
From: ron3200 at gmail.com (Ron Adam)
Date: Wed, 26 Aug 2015 09:51:47 -0500
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DDB793.1070906@trueblade.com>
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> <55DDB793.1070906@trueblade.com>
Message-ID:

On 08/26/2015 07:56 AM, Eric V. Smith wrote:
> I think your example below is a functional subset of what I have on
> bitbucket. The only real distinction is that I can do substitutions from
> a different string, using the expressions that were originally evaluated
> when the i-string was constructed. This is needed for the i18n case. I
> realize i18n might never use this, but it's a useful thought experiment
> in any case.

In my example... the literal and value parts of the strings are stored
as strings in two different lists, so you can still apply an i18n
translator to just the literal parts, or to the value parts, or to both.
It just needs another method.  If it's done as a property it could be
spelled...

    s = 'string'
    i'This {s} will be translated'._

A nice improvement to that would be to add a literal quote ability to
the format language.
i'This {"string":Q} will be translated'.+ It allows marking parts of a string to not translate without needing to set it an external (to the string) variable as the example above does. Adding a raw quote option, RQ, would help in the cases of html and regular expressions. (as your's does), but it seems this would be a good addition to the format language so it would work with regular strings too. I don't have time to test yours this morning, but What happens in this case? x = [1] ix = i('{x}') x = [2] # Mutates i-string content? print(str(ix)) Does this print "[1]" or "[2]"? Cheers, Ron From eric at trueblade.com Wed Aug 26 17:06:56 2015 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 26 Aug 2015 11:06:56 -0400 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> <55DDB793.1070906@trueblade.com> Message-ID: <55DDD610.3030005@trueblade.com> On 08/26/2015 10:51 AM, Ron Adam wrote: > On 08/26/2015 07:56 AM, Eric V. Smith wrote: >> I think your example below is a functional subset of what I have on >> bitbucket. The only real distinction is that I can do substitutions from >> a different string, using the expressions that were originally evaluated >> when the i-string was constructed. This is needed for the i18n case. I >> realize i18n might never use this, but it's a useful thought experiment >> in any case. > > In my example... the literal and value parts of the strings are stored > as strings in two different lists, so you can still apply an i18n > translator to just the literal parts, or to the value parts, or to both. > It just needs another method. If it's done as a property it could be > spelled... > > s = 'string' > i'This {s} will be translated'._ I still think the i18n case is off the table, per Barry. But in any event, you can't translate the literals in pieces. 
I think you need to design something that works with gettext. Since the part of my design that allows this is just an optional parameter to my i.join() method, there's not much cost. I do scan the string again, but that would likely be optimized away in a C version. > A nice improvement to that would be to add a literal quote ability to > the format language. > > i'This {"string":Q} will be translated'.+ That would just work, without the :Q. Expressions cannot be translated, and "string" is an expression. > It allows marking parts of a string to not translate without needing to > set it an external (to the string) variable as the example above does. > Adding a raw quote option, RQ, would help in the cases of html and > regular expressions. (as your's does), but it seems this would be a > good addition to the format language so it would work with regular > strings too. > > > I don't have time to test yours this morning, but What happens in this > case? > > x = [1] > ix = i('{x}') > x = [2] # Mutates i-string content? > print(str(ix)) > > Does this print "[1]" or "[2]"? I added a similar test this morning. My code produces "[2]". I can't imagine a design that could produce a different result, but follow the "delayed evaluation of the string" model. Eric. From eric at trueblade.com Wed Aug 26 17:39:24 2015 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 26 Aug 2015 11:39:24 -0400 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DDD610.3030005@trueblade.com> References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> <55DDB793.1070906@trueblade.com> <55DDD610.3030005@trueblade.com> Message-ID: <55DDDDAC.4030201@trueblade.com> On 08/26/2015 11:06 AM, Eric V. Smith wrote: > On 08/26/2015 10:51 AM, Ron Adam wrote: >> I don't have time to test yours this morning, but What happens in this >> case? 
>>
>>    x = [1]
>>    ix = i('{x}')
>>    x = [2]    # Mutates i-string content?
>>    print(str(ix))
>>
>> Does this print "[1]" or "[2]"?
>
> I added a similar test this morning. My code produces "[2]". I can't
> imagine a design that could produce a different result, but follow the
> "delayed evaluation of the string" model.

Oops, I misread this as mutating x. Mine would produce "[1]". Here are
the tests:

        # changing an immutable value doesn't affect the i-string
        n = 0
        x = i('{n}')
        self.assertEqual(str(x), '0')
        n = 1
        self.assertEqual(str(x), '0')

        # but a mutable value will
        l = [1]
        x = i('{l}')
        self.assertEqual(str(x), '[1]')
        l[0] = 2
        self.assertEqual(str(x), '[2]')
        l = [3]
        self.assertEqual(str(x), '[2]')

Eric.

From tjreedy at udel.edu Wed Aug 26 17:43:42 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 26 Aug 2015 11:43:42 -0400
Subject: [Python-ideas] Properties for classes possible?
In-Reply-To: <55DD65CF.6000504@thomas-guettler.de>
References: <55D57989.1020704@thomas-guettler.de> <55DAECCA.70200@thomas-guettler.de> <55DD65CF.6000504@thomas-guettler.de>
Message-ID:

On 8/26/2015 3:07 AM, Thomas Güttler wrote:
> Here is the created issue: http://bugs.python.org/issue24941
>
> Please let me know if something is missing.

How would a reviewer know that your Python code works properly?
How would a C translator know that the translation is correct?

Write a unittest for the proposed builtin.  (I would start with the
current test for property.)  If possible, submit it as a patch to
whatever file has the unittest for property.  If you cannot create
.diffs, post the code in a message.
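
As a rough starting point, such a test might look like the sketch
below. The classproperty descriptor is the pure-Python snippet quoted
earlier in this thread; the Spam and Eggs classes, the attribute names
and the test names are assumptions made up for illustration, and a real
patch would follow the layout of the existing property tests.

```python
import unittest


class classproperty(object):
    # Read-only descriptor, as in the snippet quoted in the thread.
    def __init__(self, f):
        self.f = f

    def __get__(self, obj, owner):
        return self.f(owner)


class Spam(object):
    # Hypothetical class used only by the tests below.
    _registry = ['a', 'b']

    @classproperty
    def count(cls):
        return len(cls._registry)


class Eggs(Spam):
    _registry = ['a', 'b', 'c']


class ClassPropertyTests(unittest.TestCase):
    def test_access_on_class(self):
        self.assertEqual(Spam.count, 2)

    def test_access_on_instance(self):
        self.assertEqual(Spam().count, 2)

    def test_subclass_uses_its_own_class(self):
        self.assertEqual(Eggs.count, 3)
```

Saved in a file, this can be run with "python -m unittest"; it only
covers the read-only behaviour, since the writable version is still an
open question.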
-- 
Terry Jan Reedy

From Nikolaus at rath.org Wed Aug 26 18:05:37 2015
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Wed, 26 Aug 2015 09:05:37 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: (Nick Coghlan's message of "Wed, 26 Aug 2015 09:55:24 +1000")
References: <55D65E4F.1040608@mgmiller.net> <55DCBCCF.2010502@mgmiller.net>
Message-ID: <87io82ujam.fsf@thinkpad.rath.org>

On Aug 26 2015, Nick Coghlan wrote:
> I'm also going to switch to using completely made up API names, since
> folks otherwise anchor on "but that's not the way that API currently
> works" without accounting for the fact that APIs can be updated to
> dispatch to different behaviours based on the types of their arguments
> :)

If you "update" subprocess.call (I assume this is one of the examples
you have in mind) to perform proper escaping and calling a shell when
receiving an X-string, the caller now needs to check if he's actually
using the right version of the module.

Before:

    subprocess.call(['rm', file])

after:

    if subprocess.__version__ < something:
        subprocess.call(['rm', file])
    else:
        subprocess.call(sh'rm {file}')

is that really an improvement?

In practice you'd probably declare the dependency in setup.py instead,
but this just makes it more likely to go out-of-sync, or to be
completely lost when code is being cargo-culted.

Or are you proposing that sh'rm {file}' wouldn't actually behave like a
str, so str(sh'rm {file}') would fail? I guess that would work, but it
seems that would have other implications - aren't we talking about
*string* interpolation here? If the result isn't even behaving like a
str, this seems like a misnomer.

Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
From kale at thekunderts.net Wed Aug 26 19:15:45 2015 From: kale at thekunderts.net (Kale Kundert) Date: Wed, 26 Aug 2015 10:15:45 -0700 Subject: [Python-ideas] Forward-References & Out-of-order declaration In-Reply-To: References: Message-ID: <55DDF441.1030107@thekunderts.net> > On Wed, Aug 26, 2015 at 8:11 AM, Sven R. Kunze wrote: > > The issue (not a huge problem again but annoying from time to time) is that > > the order of declaration in a module actually matters. IIRC other modern > > languages like C# don't require you do actually care about this anymore. > > > > Possible example (for whatever reason an author wants to do that -- also cf. > > stackoverflow): > > > > class UseThis(Base): > > pass > > > > class UseThat(Base): > > pass > > > > class Base: > > pass > > > > In that regard, Python feels a bit rusty. > > Frankly, I don't have a problem with this. You get a mandate that > requires you to do what's good practice anyway: lay things out in a > logical order. In the same way that Python's use of indentation for > block structure is generally just enforcing what you'd have done > regardless of language, this requires that you sort things in > dependency order. That tends to mean that the first use of any name in > a module is its definition/source. Want to know what 'frobnosticate' > means? Go to the top of the file, search for it. > > Having that enforced by the language is a restriction, but how often > does good code have to be seriously warped to fit into that model? Not > often, in my experience. > > ChrisA Just to provide a concrete example, sqlalchemy's ORM seems to really contort itself (at least from the user's perspective) to get around this problem. The reason in that case is that the dependencies between tables don't have to be directed acyclic graphs, e.g. it's common for two tables to depend on each other. 
I've also run into this problem when working with my home-grown message-passing APIs, which can also form more complicated dependency graphs. So I do think that good code occasionally has to warp itself to fit into python's model. -Kale From python-ideas at mgmiller.net Wed Aug 26 20:20:15 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 26 Aug 2015 11:20:15 -0700 Subject: [Python-ideas] String interpolation: environment variables, command substitution Message-ID: <55DE035F.9070101@mgmiller.net> One of the remaining questions for the string interpolation subject is whether to allow for easy access to environment variables and output-capture of external processes (aka command-substitution) as bash does. While incredibly useful in use-cases such as shell-script replacements, the functionality is perceived to be, if not dangerous. More so than arbitrary expressions? Given that we are talking about string literals and not input, I'm not sure, so am looking for feedback. The idea is not unheard of in Python, there was a module that captured process output called commands in the old days, which was superseded at some point by subprocess.check_output() I believe. Here is some example syntax modeled on bash, though placed inside .format braces. 
Note both start with $ as the signal::

    >>> x'Home folder: {$HOME}'     # environment
    'Home folder: /home/nobody'

    >>> x'Files: {$(/bin/ls .)}'    # capture output
    'Files: foo foo1 foo2'

For safety, command substitution should return output in a way analogous
to the modern equivalent::

    subprocess.check_output(['/bin/ls', '.'], shell=False).decode(encoding)

-Mike

From p.f.moore at gmail.com Wed Aug 26 20:58:29 2015
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 26 Aug 2015 19:58:29 +0100
Subject: [Python-ideas] String interpolation: environment variables, command substitution
In-Reply-To: <55DE035F.9070101@mgmiller.net>
References: <55DE035F.9070101@mgmiller.net>
Message-ID:

On 26 August 2015 at 19:20, Mike Miller wrote:
> One of the remaining questions for the string interpolation subject is
> whether to allow for easy access to environment variables and output-capture
> of external processes (aka command-substitution) as bash does.
>
> While incredibly useful in use-cases such as shell-script replacements, the
> functionality is perceived to be, if not dangerous. More so than arbitrary
> expressions? Given that we are talking about string literals and not input,
> I'm not sure, so am looking for feedback.
>
> The idea is not unheard of in Python, there was a module that captured
> process output called commands in the old days, which was superseded at some
> point by subprocess.check_output() I believe.

This seems like a really good idea for an external module but I see no
reason why it is important enough to deserve built in syntax in the
core language.

Note that cross platform issues are going to be a major issue:

* $HOME is "the wrong way to do it" - Windows has no HOME env
variable. The right way is os.path.expanduser("~")
* $(/bin/ls) is the wrong way - os.listdir(".") is the right way
(windows has no ls command)

Making it easy to write platform specific code when it's not needed is
as much of an antipattern as making it easy to write insecure code
IMO.
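
For comparison, a short sketch of the portable spellings, using only
the two stdlib calls named above (the variable names are arbitrary):

```python
import os

# Works whether or not a HOME environment variable exists.
home = os.path.expanduser("~")

# No external ls process is needed to list a directory.
files = sorted(os.listdir("."))

message = 'Home folder: {}\nFiles: {}'.format(home, ' '.join(files))
print(message)
```

The same two lines run unchanged on Windows and Unix, which is the
point being made about platform-specific syntax.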
Paul From python-ideas at mgmiller.net Wed Aug 26 21:16:38 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 26 Aug 2015 12:16:38 -0700 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: References: <55DE035F.9070101@mgmiller.net> Message-ID: <55DE1096.2070904@mgmiller.net> Hi, The use case is for shell-script replacements, which can be but are often not typically cross-platform. Let's try different examples: >>> x'version: {$(/usr/bin/xdpyinfo -version)}' # capture 'version: xdpyinfo 1.3.1' >>> x'display: {$DISPLAY}' # env 'display: :0.0' -Mike On 08/26/2015 11:58 AM, Paul Moore wrote: > Note that cross platform issues are going to be a major issue From eric at trueblade.com Wed Aug 26 21:21:11 2015 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 26 Aug 2015 15:21:11 -0400 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE035F.9070101@mgmiller.net> References: <55DE035F.9070101@mgmiller.net> Message-ID: <55DE11A7.2070704@trueblade.com> On 08/26/2015 02:20 PM, Mike Miller wrote: > One of the remaining questions for the string interpolation subject is > whether to allow for easy access to environment variables and > output-capture of external processes (aka command-substitution) as bash > does. > > While incredibly useful in use-cases such as shell-script replacements, > the functionality is perceived to be, if not dangerous. More so than > arbitrary expressions? Given that we are talking about string literals > and not input, I'm not sure, so am looking for feedback. > > The idea is not unheard of in Python, there was a module that captured > process output called commands in the old days, which was superseded at > some point by subprocess.check_output() I believe. > > Here is some example syntax modeled on bash, though placed inside > .format braces. 
Note both start with $ as the signal:: > > >>> x'Home folder: {$HOME}' # environment > 'Home folder: /home/nobody' > > >>> x'Files: {$(/bin/ls .)}' # capture output > 'foo foo1 foo2' -1000 for any language syntax that allows access to environment variables or shell output. That said, with PEP-498 you can do: >>> import os >>> f'HOME={os.environ["HOME"]}' 'HOME=/home/eric' Which is about as easy as I'd like to make this. Eric. From python-ideas at mgmiller.net Wed Aug 26 21:44:25 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 26 Aug 2015 12:44:25 -0700 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE1310.3010709@brenbarn.net> References: <55DE035F.9070101@mgmiller.net> <55DE1310.3010709@brenbarn.net> Message-ID: <55DE1719.1070704@mgmiller.net> True, though less readable I think. If we're going to go as far as arbitrary expressions, let's discuss making very common scripting tasks easier. -Mike On 08/26/2015 12:27 PM, Brendan Barnwell wrote: > You can already do this with the existing proposals by interpolating an > expression whose value is an environment variable (e.g., 'My home is > {os.environ["HOME"]}') or whatever other data you want to interpolate. There's > no reason to add special syntax for this. > From python-ideas at mgmiller.net Wed Aug 26 21:53:47 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 26 Aug 2015 12:53:47 -0700 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE11A7.2070704@trueblade.com> References: <55DE035F.9070101@mgmiller.net> <55DE11A7.2070704@trueblade.com> Message-ID: <55DE194B.5000906@mgmiller.net> Hold on, it took f'' strings a while to grow on you, give it a few minutes. ;) I'd like Python to be competitive with other (shell) scripting languages, and appeals to purity stand in the way of that. Sometimes practical is just darn useful. 
We've already acquiesced to arbitrary expressions, so this is a small further step, icing on the cake, no? I believe Guido mentioned something about "half-measures" in one of his messages. -Mike On 08/26/2015 12:21 PM, Eric V. Smith wrote: > Which is about as easy as I'd like to make this. From p.f.moore at gmail.com Wed Aug 26 21:54:42 2015 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 26 Aug 2015 20:54:42 +0100 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE1096.2070904@mgmiller.net> References: <55DE035F.9070101@mgmiller.net> <55DE1096.2070904@mgmiller.net> Message-ID: On 26 August 2015 at 20:16, Mike Miller wrote: > The use case is for shell-script replacements, which can be but are often > not typically cross-platform. Let's try different examples: > > >>> x'version: {$(/usr/bin/xdpyinfo -version)}' # capture > 'version: xdpyinfo 1.3.1' > > >>> x'display: {$DISPLAY}' # env > 'display: :0.0' This is not an important enough use case to warrant language support, IMO. If you want something like this, there are a lot of tools already available on PyPI: https://pypi.python.org/pypi/invoke/0.10.1 if you want to run sets of command lines, grouped together as "tasks" https://pypi.python.org/pypi/sarge/0.1.4 if you're looking for a shell-like syntax (with cross-platform support for constructs like &&, || etc) http://plumbum.readthedocs.org/en/latest/ if you want access to shell commands from within Python code and probably a host of others. Honestly, this is starting to feel like Perl. Sorry, but I don't like this proposal at all. Paul From eric at trueblade.com Wed Aug 26 22:02:13 2015 From: eric at trueblade.com (Eric V. 
Smith) Date: Wed, 26 Aug 2015 16:02:13 -0400 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE194B.5000906@mgmiller.net> References: <55DE035F.9070101@mgmiller.net> <55DE11A7.2070704@trueblade.com> <55DE194B.5000906@mgmiller.net> Message-ID: <55DE1B45.7060701@trueblade.com> On 08/26/2015 03:53 PM, Mike Miller wrote: > Hold on, it took f'' strings a while to grow on you, give it a few > minutes. ;) > > I'd like Python to be competitive with other (shell) scripting > languages, and appeals to purity stand in the way of that. Sometimes > practical is just darn useful. > > We've already acquiesced to arbitrary expressions, so this is a small > further step, icing on the cake, no? I believe Guido mentioned > something about "half-measures" in one of his messages. Python is never going to be bash. >>> env=os.environ.get >>> f'HOME={env("HOME")}' 'HOME=/home/eric' That's good enough. Eric. From python-ideas at mgmiller.net Wed Aug 26 22:02:53 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 26 Aug 2015 13:02:53 -0700 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: References: <55DE035F.9070101@mgmiller.net> <55DE1096.2070904@mgmiller.net> Message-ID: <55DE1B6D.10303@mgmiller.net> Understood, Btw, subprocess.check_output() is already in the standard library. This idea was about further simplifying it even further than that, making Python a contender for shell-scripting. 
-Mike On 08/26/2015 12:54 PM, Paul Moore wrote: From guido at python.org Wed Aug 26 22:14:21 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Aug 2015 13:14:21 -0700 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE1B6D.10303@mgmiller.net> References: <55DE035F.9070101@mgmiller.net> <55DE1096.2070904@mgmiller.net> <55DE1B6D.10303@mgmiller.net> Message-ID: On Wed, Aug 26, 2015 at 1:02 PM, Mike Miller wrote: > Btw, subprocess.check_output() is already in the standard library. This > idea was about further simplifying it even further than that, making Python > a contender for shell-scripting. > No. This is an outright bad idea. Before you know it people are calling out to bash for tasks like removing a file or finding out the current directory, yet claiming to know Python on their resume. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed Aug 26 22:22:23 2015 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 26 Aug 2015 22:22:23 +0200 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: References: <55DA9B63.3010208@uni-wuppertal.de> <55DCAA77.30903@mail.de> Message-ID: <55DE1FFF.7030607@mail.de> On 26.08.2015 07:50, Terry Reedy wrote: > A developer should not expect to use not-yet-existent attributes and > methods of the object. Unfortunately, that is where I disagree. The definition of "not-yet-existent attribute" can vary from developer to developer. 
From abarnert at yahoo.com Wed Aug 26 23:30:50 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 26 Aug 2015 14:30:50 -0700
Subject: [Python-ideas] String interpolation: environment variables, command substitution
In-Reply-To: <55DE1096.2070904@mgmiller.net>
References: <55DE035F.9070101@mgmiller.net> <55DE1096.2070904@mgmiller.net>
Message-ID: <8E88EB62-350E-4B2B-B3F4-8AABB7FEF8A3@yahoo.com>

On Aug 26, 2015, at 12:16, Mike Miller wrote:
>
> Hi,
>
> The use case is for shell-script replacements, which can be but are often not typically cross-platform. Let's try different examples:
>
>     >>> x'version: {$(/usr/bin/xdpyinfo -version)}' # capture
>     'version: xdpyinfo 1.3.1'

People who expect this to work will likely expect to be able to break the results into separate "arguments", with the bash quoting rules. People already ask on StackOverflow why things like this don't work:

    subprocess.call('files=$(ls)', shell=True)
    subprocess.call('cp $files %s' % dest, shell=True)

No matter how many places they try to put the quotes, or braces, or where they add extern, still nothing gets copied.

With your change, they can fix it like this:

    files = x'{$(ls)}'
    subprocess.call(x'cp {files} {dest}')

... and now it seems to work, except that it's actually not copying any files with spaces in the name. They may not even notice, which is bad. But if they do, no matter where you add the quotes or braces, there's no way to fix it. A Python string value is not a bash array value that stringifies itself in different ways depending on the quoting context.

The right answer is still to actually use the shell by cramming it into one line, to get the output as a list of lines and insert each line as a quoted argument, or, best of all, to just use listdir and shutil in the first place instead of trying to translate from Bash to Python one word at a time, which works about as well as one-abstract word one-abstract instance that manner Japanese from English to translating.
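
The "listdir and shutil" route mentioned above can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name copy_all and the temporary directories are made up for the example, and it assumes that only regular files in the top level of the source directory should be copied.

```python
import os
import shutil
import tempfile


def copy_all(src_dir, dest_dir):
    """Copy every regular file in src_dir into dest_dir; return the names copied.

    copy_all is a hypothetical helper for this sketch, not a stdlib function.
    """
    copied = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if os.path.isfile(path):
            # Each name is passed as one argument, so no shell word-splitting
            # happens and spaces in file names are harmless.
            shutil.copy(path, dest_dir)
            copied.append(name)
    return copied


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dest:
        for name in ("plain.txt", "name with spaces.txt"):
            with open(os.path.join(src, name), "w") as handle:
                handle.write("data\n")
        print(copy_all(src, dest))
```

Because the file names never pass through a shell, there is no quoting context to get wrong, which seems to be the point of the advice above.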
From abarnert at yahoo.com Wed Aug 26 23:58:00 2015
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 26 Aug 2015 14:58:00 -0700
Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal
In-Reply-To: <55DE1FFF.7030607@mail.de>
References: <55DA9B63.3010208@uni-wuppertal.de> <55DCAA77.30903@mail.de> <55DE1FFF.7030607@mail.de>
Message-ID: <26FC0EA2-99ED-428D-9569-A1A171C09015@yahoo.com>

On Aug 26, 2015, at 13:22, Sven R. Kunze wrote:
>
>> On 26.08.2015 07:50, Terry Reedy wrote:
>> A developer should not expect to use not-yet-existent attributes and methods of the object.
>
> Unfortunately, that is where I disagree. The definition of "not-yet-existent attribute" can vary from developer to developer.

But it has to mean something, and what it means dramatically affects what the code does. That's why Python has a simple rule: a straightforward imperative execution order that means you can easily tell whether the attribute was created before use.

Also note that in Python, attributes can be added, replaced, and removed later, and their values can be mutated later. So the notion of "later" has to be simple to understand. Just saying "evaluate these statements in some order that's legal" doesn't work when some of those statements can be mutating state. In Python, a statement can create, replace, or destroy attributes of the module or any other object, and even an expression can mutate the values of those attributes. And in fact that's what everything in Python is doing, even declarative-looking statements like class, so you can't just block mutation, you have to deal with it as a fundamental thing.

Besides fully compiler-driven evaluation order, there are two obvious alternatives that let you keep linear order, but make sense of using values before they're created: lazy evaluation, as in Haskell, and dataflow evaluation, as in Oz. Maybe one of those is what you want here.
But trying to fit either of those together with mutable objects sensibly is not trivial, nor is it trivial to fit them together with the kind of dynamic OO that Python provides, much less both in one. I'd love to see what someone could come up with by pursuing either of those, but I suspect it wouldn't feel much like Python. From rosuav at gmail.com Thu Aug 27 01:41:19 2015 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 27 Aug 2015 09:41:19 +1000 Subject: [Python-ideas] Forward-References & Out-of-order declaration In-Reply-To: <55DDF441.1030107@thekunderts.net> References: <55DDF441.1030107@thekunderts.net> Message-ID: On Thu, Aug 27, 2015 at 3:15 AM, Kale Kundert wrote: > Just to provide a concrete example, sqlalchemy's ORM seems to really contort > itself (at least from the user's perspective) to get around this problem. The > reason in that case is that the dependencies between tables don't have to be > directed acyclic graphs, e.g. it's common for two tables to depend on each other. Fair point, but in my experience with SQLAlchemy, it's not that bad to identify tables with string identifiers: class Manufacturer(Base): __tablename__ = 'manufacturers' id = Column(Integer, primary_key=True) name = Column(String, nullable=False) stuff = relationship("Thing", backref="manufacturer") class Thing(Base): __tablename__ = 'things' id = Column(Integer, primary_key=True) name = Column(String, nullable=False) makerid = Column(Integer, ForeignKey('manufacturer.id'), nullable=False) (I might have the specifics a bit wrong, but it's something like this.) > I've also run into this problem when working with my home-grown message-passing > APIs, which can also form more complicated dependency graphs. So I do think > that good code occasionally has to warp itself to fit into python's model. Same sort of thing. You're right that these violate (by necessity) the principle of "first use is definition", but these are incredibly rare cases. 
Just in the example above, and without any real code doing any real work, I have seven names: Manufacturer, Base, Column, Integer, relationship, Thing, String Two of them have a cyclic relationship (Manufacturer and Thing). All of the rest can still follow that principle (and in this case, most of them would be listed in the top-of-file import block). There are specific situations where the graph is more complicated, but I still like being confident that the code will broadly follow that design layout. The two basic solutions still apply: either use string names to identify not-yet-defined objects, or have a "pre-declare" syntax to make things possible. In C, "struct foo;" is enough to let you declare pointers to foo; in Python, you could have "Thing = Table()" prior to defining Manufacturer, and then you could use an unquoted Thing to define the relationship. Either way makes it clear that something unusual is happening. ChrisA From ron3200 at gmail.com Thu Aug 27 03:08:55 2015 From: ron3200 at gmail.com (Ron Adam) Date: Wed, 26 Aug 2015 20:08:55 -0500 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DDDDAC.4030201@trueblade.com> References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> <55DDB793.1070906@trueblade.com> <55DDD610.3030005@trueblade.com> <55DDDDAC.4030201@trueblade.com> Message-ID: On 08/26/2015 10:39 AM, Eric V. Smith wrote: > On 08/26/2015 11:06 AM, Eric V. Smith wrote: >> >On 08/26/2015 10:51 AM, Ron Adam wrote: >>> >>I don't have time to test yours this morning, but What happens in this >>> >>case? >>> >> >>> >> x = [1] >>> >> ix = i('{x}') >>> >> x = [2] # Mutates i-string content? Oops on my part... I should have written what you did below to mutate x and not rebind it. >>> >> print(str(ix)) >>> >> >>> >>Does this print "[1]" or "[2]"? >> > >> >I added a similar test this morning. My code produces "[2]". 
I can't >> >imagine a design that could produce a different result, but follow the >> >"delayed evaluation of the string" model. > Oops, I misread this as mutating x. Mine would produce "[1]". Here are > the tests: Ok... I see, but you understood the direction I was going... > # changing a mutable value doesn't affect the i-string > n = 0 > x = i('{n}') > self.assertEqual(str(x), '0') > n = 1 > self.assertEqual(str(x), '0') Yes, changing the name reference, which isn't the same as changing a mutable, doesn't change the object in the i-string. That's already evaluated in the __init__ method. > # but a mutable value will > l = [1] > x = i('{l}') > self.assertEqual(str(x), '[1]') > l[0] = 2 > self.assertEqual(str(x), '[2]') This was the example I was meaning to write above.. which you figured out. ;-) And you get '[2]'. If you store a string instead of the value, then mutating the object won't effect the i-string. Also you don't get held references to objects that may be more expensive than a string. I think these points need to be in the PEP. Cheers, Ron From wes.turner at gmail.com Thu Aug 27 03:26:24 2015 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 26 Aug 2015 20:26:24 -0500 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE1B45.7060701@trueblade.com> References: <55DE035F.9070101@mgmiller.net> <55DE11A7.2070704@trueblade.com> <55DE194B.5000906@mgmiller.net> <55DE1B45.7060701@trueblade.com> Message-ID: On Aug 26, 2015 3:01 PM, "Eric V. Smith" wrote: > > On 08/26/2015 03:53 PM, Mike Miller wrote: > > Hold on, it took f'' strings a while to grow on you, give it a few > > minutes. ;) > > > > I'd like Python to be competitive with other (shell) scripting > > languages, and appeals to purity stand in the way of that. Sometimes > > practical is just darn useful. > > > > We've already acquiesced to arbitrary expressions, so this is a small > > further step, icing on the cake, no? 
I believe Guido mentioned > > something about "half-measures" in one of his messages. > > Python is never going to be bash. > > >>> env=os.environ.get > >>> f'HOME={env("HOME")}' > 'HOME=/home/eric' How is this wrong/another way to do the wrong thing? * $HOME may contain quotes, single quotes, newlines (which subprocess.call interprets as separate commands), semicolons > > That's good enough. No, tuples (as exec specifies) are good enough. This is the wrong kind of lazy. > > Eric. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Aug 27 03:41:00 2015 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 27 Aug 2015 11:41:00 +1000 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE194B.5000906@mgmiller.net> References: <55DE035F.9070101@mgmiller.net> <55DE11A7.2070704@trueblade.com> <55DE194B.5000906@mgmiller.net> Message-ID: <20150827014100.GN3881@ando.pearwood.info> On Wed, Aug 26, 2015 at 12:53:47PM -0700, Mike Miller wrote: > Hold on, it took f'' strings a while to grow on you, give it a few minutes. > ;) > > I'd like Python to be competitive with other (shell) scripting languages, Why? If you want a shell language, there are many existing shell languages that are far more compact/terse/unreadable/convenient/"easy to use (wrongly)" than Python will ever be, even with f-strings. Even Perl is not a shell language. Python already makes a good scripting language. It just requires more typing and more thought, which encourages writing correct code rather than "easy to type" code which may not be correct. Can we get away from the harmful meme that being "easier" is necessarily always better? That way of thinking leads to PHP. 
> and appeals to purity stand in the way of that. Sometimes practical is > just darn useful. [snark] We've thrown away the rest of the Zen with these f-strings, so why not throw away that one too? *fractional-wink* I don't think there is any "practical beats purity" argument to be made here. It's not like it is hard to get access to environment variables. > We've already acquiesced to arbitrary expressions, so this is a small > further step, icing on the cake, no? I believe Guido mentioned something > about "half-measures" in one of his messages. Perhaps less icing on the cake and more the straw that breaks the camel's back? -- Steve From brenbarn at brenbarn.net Thu Aug 27 03:55:18 2015 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Wed, 26 Aug 2015 18:55:18 -0700 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE194B.5000906@mgmiller.net> References: <55DE035F.9070101@mgmiller.net> <55DE11A7.2070704@trueblade.com> <55DE194B.5000906@mgmiller.net> Message-ID: <55DE6E06.8060807@brenbarn.net> On 2015-08-26 12:53, Mike Miller wrote: > We've already acquiesced to arbitrary expressions, so this is a small further > step, icing on the cake, no? I believe Guido mentioned something about > "half-measures" in one of his messages. There's no comparison between "arbitrary expressions" and "new syntax for shell shortcuts". "Arbitrary expressions" are arbitrary Python expressions, so that just means being able to do what you can already do in Python, with Python syntax, in Python strings. This includes being able to access environment variables, since you can already do that with a Python expression. These shell shortcuts are just a way to open a back door that would bring all of shell syntax into Python, and add new complications to Python's own syntax as well. 
I see quick-and-dirty shell scripting as pretty small potatoes in the scheme of things Python can be used for; it's not worth changing the language in any significant way to a accommodate that. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From guido at python.org Thu Aug 27 03:59:10 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Aug 2015 18:59:10 -0700 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE6E06.8060807@brenbarn.net> References: <55DE035F.9070101@mgmiller.net> <55DE11A7.2070704@trueblade.com> <55DE194B.5000906@mgmiller.net> <55DE6E06.8060807@brenbarn.net> Message-ID: On Wed, Aug 26, 2015 at 6:55 PM, Brendan Barnwell wrote: > On 2015-08-26 12:53, Mike Miller wrote: > >> We've already acquiesced to arbitrary expressions, so this is a small >> further >> step, icing on the cake, no? I believe Guido mentioned something about >> "half-measures" in one of his messages. >> > > There's no comparison between "arbitrary expressions" and "new > syntax for shell shortcuts". "Arbitrary expressions" are arbitrary Python > expressions, so that just means being able to do what you can already do in > Python, with Python syntax, in Python strings. This includes being able to > access environment variables, since you can already do that with a Python > expression. These shell shortcuts are just a way to open a back door that > would bring all of shell syntax into Python, and add new complications to > Python's own syntax as well. I see quick-and-dirty shell scripting as > pretty small potatoes in the scheme of things Python can be used for; it's > not worth changing the language in any significant way to a accommodate > that. 
> I am more and more beginning to believe that Mike is just playing an elaborate prank on us, seeing how far he can go with this before people start noticing the proposal has jumped the shark. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ron3200 at gmail.com Thu Aug 27 04:40:35 2015 From: ron3200 at gmail.com (Ron Adam) Date: Wed, 26 Aug 2015 21:40:35 -0500 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DDD610.3030005@trueblade.com> References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> <55DDB793.1070906@trueblade.com> <55DDD610.3030005@trueblade.com> Message-ID: On 08/26/2015 10:06 AM, Eric V. Smith wrote: >> A nice improvement to that would be to add a literal quote ability to >> >the format language. >> > >> > i'This {"string":Q} will be translated'.+ > That would just work, without the :Q. Expressions cannot be translated, > and "string" is an expression. Not quite.. the {"string":Q} would include the quotes, while the expression {"string"} would not include the quotes. Cheers, Ron From eric at trueblade.com Thu Aug 27 04:41:43 2015 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 26 Aug 2015 22:41:43 -0400 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> <55DDB793.1070906@trueblade.com> <55DDD610.3030005@trueblade.com> <55DDDDAC.4030201@trueblade.com> Message-ID: <55DE78E7.8020500@trueblade.com> On 8/26/2015 9:08 PM, Ron Adam wrote: > If you store a string instead of the value, then mutating the object > won't effect the i-string. Also you don't get held references to > objects that may be more expensive than a string. 
> > I think these points need to be in the PEP. Well, it's Nick's PEP, so you'll have to convince him. Here I'll talk about my ideas on i-strings, which I've been implementing on that bitbucket repo I've posted. Although I believe they're consistent with where Nick is taking PEP 501. As I've said before, it's not possible for an i-string to convert all of its expressions to text when the i-string is first constructed. The entire point of delaying the interpolation until some point after the object is constructed is that you don't know how the string conversion is going to be done. Take this i-string: i'value: {value}' How would you convert value to a string before you know how it's being converted, or even if it's being converted to a string? What if you use a conversion function that converts the i-string to a list, containing the values of the expressions? Or maybe your converter is going to call repr() on each expression. If you convert to a string first, you've destroyed information that the converter needs. n = 10 s = 'text' x = i'{n}:{s}' to_list(x) -> [10, ':', 'text'] to_repr(x) -> '10:"text"' And this doesn't even take into account the format_specs or conversions, which only have meaning to the conversion function. Eric. From steve at pearwood.info Thu Aug 27 04:43:51 2015 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 27 Aug 2015 12:43:51 +1000 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <55DCAA77.30903@mail.de> References: <55DA9B63.3010208@uni-wuppertal.de> <55DCAA77.30903@mail.de> Message-ID: <20150827024348.GP3881@ando.pearwood.info> On Tue, Aug 25, 2015 at 07:48:39PM +0200, Sven R. Kunze wrote: > I think the main issue here is the gab between intuition and what the > compiler actually does. The following line: > > class MyClass: # first appearance of MyClass > > basically creates MyClass in the mind of the developer reading this > piece of code. 
Thus, he expects to be able to use it after this line. Intuition according to whom? Some people expect that. Others do not. People hold all sorts of miscomprehensions and misunderstandings about the languages they use, and Python is no different. > However, Python first assigns the class to the name MyClass at the end > of the class definition. Thus, it is usable only after that. > > People get around this (especially since one doesn't need it thus > often), but it still feels... different. To me, it feels intuitive and natural. Of course you can't use the class until after you have finished creating it. To me, alternatives like Javascript's function hoisting feel weird. This looks like time travel: // print is provided by the Rhino JS interpreter var x = f(); print(x); // multiple pages later function f() {return "Hello World!";}; How can you call a function that doesn't exist yet? There are even stranger examples, but for the sake of brevity let's just say that what seems "intuitive" to one person may be "weird" to another. With one or two minor exceptions, the Python interactive interpreter behaves identically to the non-interactive interpreter. If you have valid Python code, you can run it interactively. The same can't be said for Javascript. You can't run the above example interactively without *actual* time travel, if you try, it fails: [steve at ando ~]$ rhino Rhino 1.7 release 0.7.r2.3.el5_6 2011 05 04 js> var x = f(); js: "", line 2: uncaught JavaScript runtime exception: ReferenceError: "f" is not defined. at :2 A nice, clean, easy to understand execution model is easy to reason about. Predictability is much more important than convenience: I much prefer code which does what I expect over code that saves me a few characters, or lines, of typing, but surprises me by acting in a way I didn't expect. The fewer special cases I have to learn, the more predictable the language and the less often I am surprised. 
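
For comparison with the Rhino session, here is a minimal Python sketch of the same ordering question. It is an illustration only: the helper run and the three source strings are hypothetical, and each snippet is executed in a fresh namespace so that it behaves like top-level module code.

```python
# Three tiny "modules" as source text, so each runs in a fresh namespace.
source_early = "x = f()\ndef f():\n    return 'Hello World!'\n"
source_late = "def f():\n    return 'Hello World!'\nx = f()\n"
source_class = "class MyClass:\n    me = MyClass\n"


def run(source):
    """Execute source top to bottom; return x, or a description of the NameError."""
    namespace = {}
    try:
        exec(source, namespace)
    except NameError as exc:
        # The name was looked up before the statement binding it had executed.
        return "NameError: " + str(exc)
    return namespace.get("x")


if __name__ == "__main__":
    print(run(source_early))  # f is called before its def statement has run
    print(run(source_late))   # def runs first, so the call succeeds
    print(run(source_class))  # MyClass is not bound until the class body finishes
```

The first and third snippets fail with NameError for the same reason: no statement that binds the name has executed yet when the name is looked up.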
Python treats functions and classes as ordinary values bound to ordinary names in the ordinary way: the binding doesn't occur until the statement is executed. I like it that way. -- Steve From python-ideas at mgmiller.net Thu Aug 27 05:00:29 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 26 Aug 2015 20:00:29 -0700 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: References: <55DE035F.9070101@mgmiller.net> <55DE11A7.2070704@trueblade.com> <55DE194B.5000906@mgmiller.net> <55DE6E06.8060807@brenbarn.net> Message-ID: <55DE7D4D.1050406@mgmiller.net> Yes, another way of stating that I'm getting documentation on design decisions. -Mike On 08/26/2015 06:59 PM, Guido van Rossum wrote: > > I am more and more beginning to believe that Mike is just playing an elaborate > prank on us, seeing how far he can go with this before people start noticing the > proposal has jumped the shark. > > -- > --Guido van Rossum (python.org/~guido ) > From tjreedy at udel.edu Thu Aug 27 06:21:01 2015 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 27 Aug 2015 00:21:01 -0400 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <20150827024348.GP3881@ando.pearwood.info> References: <55DA9B63.3010208@uni-wuppertal.de> <55DCAA77.30903@mail.de> <20150827024348.GP3881@ando.pearwood.info> Message-ID: On 8/26/2015 10:43 PM, Steven D'Aprano wrote: > On Tue, Aug 25, 2015 at 07:48:39PM +0200, Sven R. Kunze wrote: > >> I think the main issue here is the gab between intuition and what the >> compiler actually does. The following line: >> >> class MyClass: # first appearance of MyClass >> >> basically creates MyClass in the mind of the developer reading this >> piece of code. Thus, he expects to be able to use it after this line. > > Intuition according to whom? Some people expect that. Others do not. 
> > People hold all sorts of miscomprehensions and misunderstandings about > the languages they use, and Python is no different. > > >> However, Python first assigns the class to the name MyClass at the end >> of the class definition. Thus, it is usable only after that. >> >> People get around this (especially since one doesn't need it thus >> often), but it still feels... different. > > To me, it feels intuitive and natural. Of course you can't use the class > until after you have finished creating it. To me, alternatives like > Javascript's function hoisting feel weird. This looks like time travel: > > // print is provided by the Rhino JS interpreter > var x = f(); > print(x); > // multiple pages later > function f() {return "Hello World!";}; > > How can you call a function that doesn't exist yet? There are even > stranger examples, but for the sake of brevity let's just say that what > seems "intuitive" to one person may be "weird" to another. > > With one or two minor exceptions, the Python interactive interpreter > behaves identically to the non-interactive interpreter. If you have > valid Python code, you can run it interactively. The same can't be said > for Javascript. You can't run the above example interactively without > *actual* time travel, if you try, it fails: > > [steve at ando ~]$ rhino > Rhino 1.7 release 0.7.r2.3.el5_6 2011 05 04 > js> var x = f(); > js: "", line 2: uncaught JavaScript runtime exception: > ReferenceError: "f" is not defined. > at :2 > > > A nice, clean, easy to understand execution model is easy to reason > about. Predictability is much more important than convenience: I much > prefer code which does what I expect over code that saves me a few > characters, or lines, of typing, but surprises me by acting in a way I > didn't expect. The fewer special cases I have to learn, the more > predictable the language and the less often I am surprised. 
>
> Python treats functions and classes as ordinary values bound to ordinary
> names in the ordinary way: the binding doesn't occur until the statement
> is executed. I like it that way.

So do I. The same is true of import statements -- the binding of the name to the module does not happen until the module is built. It happens that the import machinery has a cache where it sticks an initially empty module in case of circular imports. But that is normally invisible to the code with the import statement.

--
Terry Jan Reedy

From python-ideas at mgmiller.net Thu Aug 27 06:40:54 2015
From: python-ideas at mgmiller.net (Mike Miller)
Date: Wed, 26 Aug 2015 21:40:54 -0700
Subject: [Python-ideas] Draft2 PEP on string interpolation
Message-ID: <55DE94D6.50700@mgmiller.net>

With the major design decisions made, behold version 2 of my draft PEP on string interpolation. It's now significantly shorter due to removal of most of the i18n related discussion, pruning, as well as simplification of the prose itself. I don't expect many changes from here on:

    https://bitbucket.org/mixmastamyk/docs/src/default/pep/pep-05XX.rst

TL;DR: Here is a summary table and comparisons with my current understanding of the other proposals, please correct if they are now out of date:

String Interpolation PEP Comparison
===================================

=================  =================  =================  =================
PEP                PEP 498            PEP 501            Draft PEP
=================  =================  =================  =================
Name               Format/f-string    Gen. Purpose Str?  Expression-string
Prefix             f''                i''                e''
Syntax             str.format()+      .format+Template+  str.format()+
Returns            String join expr?  Object             Object
Immediate Render   Yes                No                 Yes
Deferred Render    No                 Yes, str, mutable  Yes
I18n Support       No                 Yes                Input available
Escaping Hook      No                 No                 Yes, manual
=================  =================  =================  =================

The table can be found here and updated via pull-request:

    https://bitbucket.org/mixmastamyk/docs/src/default/pep/pep_comparison.rst

-Mike

From ron3200 at gmail.com Thu Aug 27 08:13:52 2015
From: ron3200 at gmail.com (Ron Adam)
Date: Thu, 27 Aug 2015 01:13:52 -0500
Subject: [Python-ideas] Draft PEP on string interpolation
In-Reply-To: <55DE78E7.8020500@trueblade.com>
References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> <55DDB793.1070906@trueblade.com> <55DDD610.3030005@trueblade.com> <55DDDDAC.4030201@trueblade.com> <55DE78E7.8020500@trueblade.com>
Message-ID:

On 08/26/2015 09:41 PM, Eric V. Smith wrote:
> On 8/26/2015 9:08 PM, Ron Adam wrote:
>> If you store a string instead of the value, then mutating the object
>> won't affect the i-string. Also you don't get held references to
>> objects that may be more expensive than a string.
>>
>> I think these points need to be in the PEP.
> Well, it's Nick's PEP, so you'll have to convince him.
>
> Here I'll talk about my ideas on i-strings, which I've been implementing
> on that bitbucket repo I've posted. Although I believe they're
> consistent with where Nick is taking PEP 501.
>
> As I've said before, it's not possible for an i-string to convert all of
> its expressions to text when the i-string is first constructed. The
> entire point of delaying the interpolation until some point after the
> object is constructed is that you don't know how the string conversion
> is going to be done.
> Take this i-string:
> i'value: {value}'
>
> How would you convert value to a string before you know how it's being
> converted, or even if it's being converted to a string?
What if you use > a conversion function that converts the i-string to a list, containing > the values of the expressions? Or maybe your converter is going to call > repr() on each expression. If you convert to a string first, you've > destroyed information that the converter needs. > > n = 10 > s = 'text' > x = i'{n}:{s}' > > to_list(x) -> [10, ':', 'text'] > to_repr(x) -> '10:"text"' > > And this doesn't even take into account the format_specs or conversions, > which only have meaning to the conversion function. Sure it does, you can access the format spec and apply it manually to each item or not. Is there another choice? Depending on how you want to make the values and specs visible.

    def to_repr(istr):
        return ''.join(repr(item.format(spec))
                       for item, spec in istr.items())

I think an actual repr of an i-string may look like this... repr(x) #-> i'{10}:{"text"}' Or maybe... "i'{10}:{\"text\"}'", so it can be used with eval. What concerns me is how much memory it could take to keep object references around. Consider a logging situation that logs thousands of items. Each i-string could contain references to several objects. And possibly each of those objects contains references to more objects whose memory would have been released hours ago if it weren't for the i-strings. Oops.. my computer is now disk caching so badly it will take days to finish the process it is logging. Meanwhile, it can't process any new input. If one of the use cases is logging, then this is a realistic possibility. I do recognize the added flexibility that keeping the references offers, but I'm not sure it's needed.
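The trade-off argued over above (keep the evaluated objects, or convert to text up front) can be sketched with a made-up stand-in for an i-string. The names Field, to_list, to_repr and to_str are hypothetical and are not the PEP 501 API or the code in either bitbucket repo; the point is only that to_list() cannot be written once the values have already been turned into strings, while the price is that each Field keeps its object alive.

```python
from typing import Any, NamedTuple


class Field(NamedTuple):
    """One replacement field of a hypothetical deferred template."""
    value: Any         # the evaluated expression, kept as an object
    conversion: str    # '', 'r', 's' or 'a'
    format_spec: str   # left uninterpreted until an interpolator runs


def to_list(segments):
    # Needs the original objects: 10 must come back as an int, not '10'.
    return [seg.value if isinstance(seg, Field) else seg for seg in segments]


def to_repr(segments):
    # repr() is applied to the values only, not to the literal text.
    return ''.join(repr(seg.value) if isinstance(seg, Field) else seg
                   for seg in segments)


def to_str(segments):
    # The ordinary rendering: conversion first, then format().
    converters = {'r': repr, 's': str, 'a': ascii}
    parts = []
    for seg in segments:
        if isinstance(seg, Field):
            value = seg.value
            if seg.conversion:
                value = converters[seg.conversion](value)
            parts.append(format(value, seg.format_spec))
        else:
            parts.append(seg)
    return ''.join(parts)


n, s = 10, 'text'
x = [Field(n, '', ''), ':', Field(s, '', '')]   # stands in for i'{n}:{s}'
print(to_list(x))
print(to_repr(x))   # Python's repr uses single quotes for 'text'
print(to_str(x))
```

An eager variant would store [str(n), ':', str(s)] instead; that releases the objects immediately, but to_list() and to_repr() could then only ever see '10'.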
Cheers, Ron From njs at pobox.com Thu Aug 27 08:32:32 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Aug 2015 23:32:32 -0700 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <87pp2cs7e8.fsf@thinkpad.rath.org> Message-ID: On Mon, Aug 24, 2015 at 3:45 PM, Guido van Rossum wrote: > On Mon, Aug 24, 2015 at 3:32 PM, Nathaniel Smith wrote: >> >> [...] >> I mean, it's great that the rise of languages like Python that have >> easy range-checked string manipulation has knocked buffer overflows >> out of the #1 spot, but... :-) >> >> Guido is right that the nice thing about classic string interpolation >> is that its use in many languages gives us tons of data about how it >> works in practice. But one of the things that data tells us is that it >> actually causes a lot of problems! Do we actually want to continue the >> status quo, where one set of people keep designing languages features >> to make it easier and easier to slap strings together, and then >> another set of people spend increasing amounts of energy trying to >> educate all the users about why they shouldn't actually use those >> features? It wouldn't be the end of the world (that's why we call it >> "the status quo" ;-)), and trying to design something new and better >> is always difficult and risky, but this seems like a good moment to >> think very hard about whether there's a better way. > > > Or maybe from the persistence of quoting bugs we could conclude that the > ways people slap strings together have very little effect on this category > of bugs? I was going to say something about how we could learn from the solutions that are regularly deployed for these problems, and just haven't historically influenced language designers so they're less convenient and don't get used enough... 
but then I realized that I had misremembered and jinja2 actually disables automatic escaping by default: http://jinja.pocoo.org/docs/dev/templates/#html-escaping which certainly reduced my enthusiasm for the idea. If someone does want to follow up I guess it might still be worth asking the jinja2 folks (or similar projects) whether there's anything Python could do to help fix the issues they identify... -n -- Nathaniel J. Smith -- http://vorpus.org From brenbarn at brenbarn.net Wed Aug 26 21:27:12 2015 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Wed, 26 Aug 2015 12:27:12 -0700 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE035F.9070101@mgmiller.net> References: <55DE035F.9070101@mgmiller.net> Message-ID: <55DE1310.3010709@brenbarn.net> On 2015-08-26 11:20, Mike Miller wrote: > One of the remaining questions for the string interpolation subject is whether > to allow for easy access to environment variables and output-capture of external > processes (aka command-substitution) as bash does. You can already do this with the existing proposals by interpolating an expression whose value is an environment variable (e.g., 'My home is {os.environ["HOME"]}') or whatever other data you want to interpolate. There's no reason to add special syntax for this. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From eric at trueblade.com Thu Aug 27 11:17:29 2015 From: eric at trueblade.com (Eric V. 
Smith) Date: Thu, 27 Aug 2015 05:17:29 -0400 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> <55DDB793.1070906@trueblade.com> <55DDD610.3030005@trueblade.com> <55DDDDAC.4030201@trueblade.com> <55DE78E7.8020500@trueblade.com> Message-ID: <55DED5A9.8090202@trueblade.com> On 8/27/2015 2:13 AM, Ron Adam wrote: > On 08/26/2015 09:41 PM, Eric V. Smith wrote: >> On 8/26/2015 9:08 PM, Ron Adam wrote: >>> >If you store a string instead of the value, then mutating the object >>> >won't effect the i-string. Also you don't get held references to >>> >objects that may be more expensive than a string. >>> > >>> >I think these points need to be in the PEP. >> Well, it's Nick's PEP, so you'll have to convince him. >> >> Here I'll talk about my ideas on i-strings, which I've been implementing >> on that bitbucket repo I've posted. Although I believe they're >> consistent with where Nick is taking PEP 501. >> >> As I've said before, it's not possible for an i-string to convert all of >> its expressions to text when the i-string is first constructed. The >> entire point of delaying the interpolation until some point after the >> object is constructed is that you don't know how the string conversion >> is going to be done. > >> Take this i-string: >> i'value: {value}' >> >> How would you convert value to a string before you know how it's being >> converted, or even if it's being converted to a string? What if you use >> a conversion function that converts the i-string to a list, containing >> the values of the expressions? Or maybe your converter is going to call >> repr() on each expression. If you convert to a string first, you've >> destroyed information that the converter needs. 
>> >> n = 10 >> s = 'text' >> x = i'{n}:{s}' >> >> to_list(x) -> [10, ':', 'text'] >> to_repr(x) -> '10:"text"' >> >> And this doesn't even take into account the format_specs or conversions, >> which only have meaning to the conversion function. > > Sure it does, you can access the format spec and apply it manually to > each item or not. Is there another choice? You're not reading what I'm writing. Using your proposal of immediately converting to strings, how would you write the version of "to_list" whose output I show above? > Depending on how you want to make the values and specs visible. > > def to_repr(istr): > return ''.join(repr(item.format(spec)) for item, spec > in istr.items()) > > > I think an actual repr of an i-string may look like this... > > repr(x) #-> i'{10}:{"text"}' > > Or maybe... "i'{10}:{\"text\"}'", so it can be used with eval. Again, that's not what I'm talking about. How would you write the "to_repr" function whose output I show above? > What concerns me is how much memory it could take to keep object > references arround. Considder a logging situation that logs thousands > of items. Each i-string could contains references to several objects. > And possibly each of those objects contains references to more objects > of which memory would have been released hours ago if it weren't for the > i-strings. Oops.. my computer is now disc caching so bad it will take > days to finish the process it is logging. Meanwhile, it can't process > any new input. > > If one of the use cases is logging, then this is a realistic possibility. Logging is already passed the object references. This is how logging is called today: logging.info('the values are %d and %f', an_int, get_a_float()) As you can see, it's passed a string and some objects. That's what an i-string is! But with a nicer syntax and a more flexible way to convert objects to strings. 
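That logging already holds on to the objects it is passed can be checked with a handler that simply keeps each LogRecord. ListHandler and the logger name "istring_demo" are made up for this sketch; LogRecord.msg, LogRecord.args and LogRecord.getMessage() are the standard attributes.

```python
import logging


class ListHandler(logging.Handler):
    """Keep every LogRecord so it can be inspected afterwards."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("istring_demo")
logger.setLevel(logging.INFO)
logger.propagate = False          # keep the demo away from the root logger
handler = ListHandler()
logger.addHandler(handler)

an_int, a_float = 3, 2.5
logger.info('the values are %d and %f', an_int, a_float)

record = handler.records[0]
print(record.msg)           # the unformatted template
print(record.args)          # the original objects, not strings
print(record.getMessage())  # formatting happens only here
```

The record carries the template and the argument objects separately, and the % formatting is applied only when getMessage() is called.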
If logging were instead passed an i-string: logging.info(i'the values are {an_int} and {get_a_float()}') and if logging were changed so that where it currently builds a string using "msg = str(msg), msg = msg % self.args" [1], it instead said: if (isinstance(msg, types.InterpolationTemplate)): msg = str(msg) else: msg = str(msg) % self.args then there would be zero change in the memory usage of the logging module [2]. Anyway, that's my last input on the subject. You can either follow the code in my bitbucket repo and show how you'd implement its use cases with your approach, or we can just wait for Nick to update the PEP. Eric. [1]: https://hg.python.org/cpython/file/tip/Lib/logging/__init__.py#l328 [2]: Sadly, it's not quite so simple since logging has a pluggable setLogRecordFactory architecture. But the point on memory usage stands. > I do recognize the added flexibility that keeping the references offers, > but I'm not sure it's needed. > > Cheers, > Ron From cody.piersall at gmail.com Thu Aug 27 15:36:04 2015 From: cody.piersall at gmail.com (Cody Piersall) Date: Thu, 27 Aug 2015 08:36:04 -0500 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: <55DE94D6.50700@mgmiller.net> References: <55DE94D6.50700@mgmiller.net> Message-ID: On Wed, Aug 26, 2015 at 11:40 PM, Mike Miller wrote: > > With the major design decisions made, behold version 2 of my draft PEP on string interpolation. > > It's now significantly shorter due to removal of most of the i18n related discussion, pruning, as well as simplification of the prose itself. 
I don't expect many changes from here on: > > https://bitbucket.org/mixmastamyk/docs/src/default/pep/pep-05XX.rst > > TL;DR: Here is a summary table and comparisons with my current understanding of the other proposals, please correct if they are now out of date: > > > String Interpolation PEP Comparison > =================================== > > > ================= ================= ================= ================= > PEP PEP 498 PEP 501 Draft PEP > ================= ================= ================= ================= > Name Format/f-string Gen. Purpose Str? Expression-string > Prefix f'' i'' e'' > Syntax str.format()+ .format+Template+ str.format()+ > Returns String join expr? Object Object > Immediate Render Yes No Yes > Deferred Render No Yes, str, mutable Yes > I18n Support No Yes Input available > Escaping Hook No No Yes, manual > ================= ================= ================= ================= > Is the Draft PEP column of the table supposed to have both "Immediate Render" and "Deferred Render" as "Yes"? I'm hoping that's a typo, otherwise I don't understand what it means at all. > The table can be found here and updated via pull-request: > > https://bitbucket.org/mixmastamyk/docs/src/default/pep/pep_comparison.rst > > -Mike > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ Cody -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From python-ideas at mgmiller.net Thu Aug 27 17:22:40 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Thu, 27 Aug 2015 08:22:40 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: References: <55DE94D6.50700@mgmiller.net> Message-ID: <55DF2B40.7030307@mgmiller.net> On 08/27/2015 06:36 AM, Cody Piersall wrote: > > Is the Draft PEP column of the table supposed to have both "Immediate Render" > and "Deferred Render" as "Yes"? I'm hoping that's a typo, otherwise I don't > understand what it means at all. Yes, it supports both. By storing all inputs to the object, it can be rendered again, with optional changes such as overriding a value, or escaping the input. -Mike From cody.piersall at gmail.com Thu Aug 27 18:19:45 2015 From: cody.piersall at gmail.com (Cody Piersall) Date: Thu, 27 Aug 2015 11:19:45 -0500 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: <55DF2B40.7030307@mgmiller.net> References: <55DE94D6.50700@mgmiller.net> <55DF2B40.7030307@mgmiller.net> Message-ID: On Thu, Aug 27, 2015 at 10:22 AM, Mike Miller wrote: > On 08/27/2015 06:36 AM, Cody Piersall wrote: >> Is the Draft PEP column of the table supposed to have both "Immediate Render" >> and "Deferred Render" as "Yes"? I'm hoping that's a typo, otherwise I don't >> understand what it means at all. > > Yes, it supports both. By storing all inputs to the object, it can be rendered again, with optional changes such as overriding a value, or escaping the input. > > -Mike When you say immediate, do you mean that it only takes a call to str()? Or is there some way to have e'this is an e-string' evaluate to a string without doing anything else? Cody -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From python-ideas at mgmiller.net Thu Aug 27 18:51:19 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Thu, 27 Aug 2015 09:51:19 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: References: <55DE94D6.50700@mgmiller.net> <55DF2B40.7030307@mgmiller.net> Message-ID: <55DF4007.3070606@mgmiller.net> On 08/27/2015 09:19 AM, Cody Piersall wrote: > When you say immediate, do you mean that it only takes a call to str()? Or is > there some way to have e'this is an e-string' evaluate to a string without doing > anything else? There is no need to call str() manually with e'', the .rendered member is returned by default: >>> print(estr('Hello {friend}.')) 'Hello John' Here is the example implementation: https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py -Mike From eric at trueblade.com Thu Aug 27 19:27:05 2015 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 27 Aug 2015 13:27:05 -0400 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: <55DF2B40.7030307@mgmiller.net> References: <55DE94D6.50700@mgmiller.net> <55DF2B40.7030307@mgmiller.net> Message-ID: <55DF4869.8000303@trueblade.com> On 08/27/2015 11:22 AM, Mike Miller wrote: > > On 08/27/2015 06:36 AM, Cody Piersall wrote: >> >> Is the Draft PEP column of the table supposed to have both "Immediate >> Render" >> and "Deferred Render" as "Yes"? I'm hoping that's a typo, otherwise I >> don't >> understand what it means at all. > > Yes, it supports both. By storing all inputs to the object, it can be > rendered again, with optional changes such as overriding a value, or > escaping the input. The problem with this auto-rendering is that the format_spec and conversion character have to make sense to __format__. For example, you couldn't do this, if value were a string: to_html(e'

{value:raw}

') Imagine that ":raw" is interpreted by to_html() to mean that the string does not get html escaped. With PEP 501, the format_spec and conversion are opaque to the i-string machinery, and are only interpreted by the custom interpolation function (here, to_html()). Eric. From python-ideas at mgmiller.net Thu Aug 27 20:15:32 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Thu, 27 Aug 2015 11:15:32 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: <55DF4869.8000303@trueblade.com> References: <55DE94D6.50700@mgmiller.net> <55DF2B40.7030307@mgmiller.net> <55DF4869.8000303@trueblade.com> Message-ID: <55DF53C4.9090906@mgmiller.net> On 08/27/2015 10:27 AM, Eric V. Smith wrote: > The problem with this auto-rendering is that the format_spec and Hmm, I believe this is a design choice, one that should be made depending on whether this use-case is important and/or common. The estr provides for this situation instead by allowing for additional renderings if/when needed, but doesn't require str() in the common case. This is the first I've seen of directives passed inside the format spec, I'll add it to the comparison table. -Mike From ron3200 at gmail.com Thu Aug 27 20:21:52 2015 From: ron3200 at gmail.com (Ron Adam) Date: Thu, 27 Aug 2015 13:21:52 -0500 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: <55DED5A9.8090202@trueblade.com> References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> <55DDB793.1070906@trueblade.com> <55DDD610.3030005@trueblade.com> <55DDDDAC.4030201@trueblade.com> <55DE78E7.8020500@trueblade.com> <55DED5A9.8090202@trueblade.com> Message-ID: On 08/27/2015 04:17 AM, Eric V. 
Smith wrote: >>> n = 10 >>> >>s = 'text' >>> >>x = i'{n}:{s}' >>> >> >>> >>to_list(x) -> [10, ':', 'text'] >>> >>to_repr(x) -> '10:"text"' >>> >> >>> >>And this doesn't even take into account the format_specs or conversions, >>> >>which only have meaning to the conversion function. >> > >> >Sure it does, you can access the format spec and apply it manually to >> >each item or not. Is there another choice? > You're not reading what I'm writing. Using your proposal of immediately > converting to strings, how would you write the version of "to_list" > whose output I show above? > >> >Depending on how you want to make the values and specs visible. >> > >> > def to_repr(istr): >> > return ''.join(repr(item.format(spec)) for item, spec >> > in istr.items()) Well, what you have above is applying repr to the values but not the literal parts. It's doable, but that should really be part of the format spec rather than applying a function from outside, or it should be part of the expression in the i-strings. i'{repr(n)}:{repr(s)}' i'{n!r}:{s!r}' Yes, this topic is getting too drawn out. Possibly I'm not seeing a finer point in your examples. I think it will sort itself out as the implementation progresses, so I'm not too worried about it. I'll look at the code in your repository when I have time and try to keep things to concrete examples that you can use in your tests. Cheers, Ron From guido at python.org Thu Aug 27 20:27:33 2015 From: guido at python.org (Guido van Rossum) Date: Thu, 27 Aug 2015 11:27:33 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: <55DE94D6.50700@mgmiller.net> References: <55DE94D6.50700@mgmiller.net> Message-ID: On Wed, Aug 26, 2015 at 9:40 PM, Mike Miller wrote: > With the major design decisions made, behold version 2 of my draft PEP on > string interpolation. > > It's now significantly shorter due to removal of most of the i18n related > discussion, pruning, as well as simplification of the prose itself.
I > don't expect many changes from here on: > > https://bitbucket.org/mixmastamyk/docs/src/default/pep/pep-05XX.rst > I'm confused by this proposal. There are many paragraphs about motivation, philosophy, other languages, etc., but the proposal itself seems to be poorly specified. E.g. I couldn't figure out what code should be produced by: a = e"Sliced {n} onions in {t1-t0:.3f} seconds." Generalizing from the only example in the specification, this would become: a = est("Sliced {n} onions in {t1-t0:.3f} seconds", n=n, t1-t0=t1-t0) which is invalid syntax. Similarly, I don't see how e.g. the following could be rendered correctly: a = e"Three random numbers: {rand()}, {rand()}, {rand()}." I also don't understand the claim that no str(estr) is necessary to render the result -- the estr implementation given at https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py has a __str__ method that renders the .rendered attribute, but without the str() call the type of 'a' in the above examples would not be str, and various things that operate on strings (e.g. regular expression searches) would not work. A solution might be to make estr a subclass of str, but nothing in the PEP suggests that you have even considered this problem. (The only hint I can find is the comment "more magic-methods to be implemented here, to improve str compatibility" in your demo implementation, but without subclassing str this is not enough.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Thu Aug 27 20:30:11 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 27 Aug 2015 11:30:11 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: <55DF4869.8000303@trueblade.com> References: <55DE94D6.50700@mgmiller.net> <55DF2B40.7030307@mgmiller.net> <55DF4869.8000303@trueblade.com> Message-ID: On Aug 27, 2015, at 10:27, Eric V. 
Smith wrote: > >> On 08/27/2015 11:22 AM, Mike Miller wrote: >> >>> On 08/27/2015 06:36 AM, Cody Piersall wrote: >>> >>> Is the Draft PEP column of the table supposed to have both "Immediate >>> Render" >>> and "Deferred Render" as "Yes"? I'm hoping that's a typo, otherwise I >>> don't >>> understand what it means at all. >> >> Yes, it supports both. By storing all inputs to the object, it can be >> rendered again, with optional changes such as overriding a value, or >> escaping the input. > > The problem with this auto-rendering is that the format_spec and > conversion character have to make sense to __format__. For example, you > couldn't do this, if value were a string: > > to_html(e'

{value:raw}

') > > Imagine that ":raw" is interpreted by to_html() to mean that the string > does not get html escaped. > > With PEP 501, the format_spec and conversion are opaque to the i-string > machinery, and are only interpreted by the custom interpolation function > (here, to_html()). With str.format, it's the type of value that decides how to interpret the format spec. If you leave it up to the consumer of the i-string (the interpolation function) instead of the value's type, how do you handle things like numeric formats and datetime formats and so on? Would I need to do something like this: to_html(e'

{str(e"{value:05}"):raw}

') Or would to_html (and every other custom interpolator) have to take things like :raw05 and parse out a format spec to pass to value.__format__? Maybe what you really want is !raw rather than :raw. If there is no conversion, __format__ gets called and the result passed around as part of the i-string object; if there is one, the value, conversion, and format spec get passed instead (and then to_html could decide that conversion 'raw' means to call format(value, format_spec) and then not escape the result). Although that's pretty different from how the standard conversions work (call repr or ascii on the value, then format the resulting string with the format spec). So maybe replacement fields need another subpart separate from both the conversion and the format spec that's only used by custom interpolators? From eric at trueblade.com Thu Aug 27 21:08:13 2015 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 27 Aug 2015 15:08:13 -0400 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: References: <55DE94D6.50700@mgmiller.net> <55DF2B40.7030307@mgmiller.net> <55DF4869.8000303@trueblade.com> Message-ID: <55DF601D.8030801@trueblade.com> On 08/27/2015 02:30 PM, Andrew Barnert via Python-ideas wrote: > On Aug 27, 2015, at 10:27, Eric V. Smith wrote: >> >>> On 08/27/2015 11:22 AM, Mike Miller wrote: >>> >>>> On 08/27/2015 06:36 AM, Cody Piersall wrote: >>>> >>>> Is the Draft PEP column of the table supposed to have both "Immediate >>>> Render" >>>> and "Deferred Render" as "Yes"? I'm hoping that's a typo, otherwise I >>>> don't >>>> understand what it means at all. >>> >>> Yes, it supports both. By storing all inputs to the object, it can be >>> rendered again, with optional changes such as overriding a value, or >>> escaping the input. >> >> The problem with this auto-rendering is that the format_spec and >> conversion character have to make sense to __format__. For example, you >> couldn't do this, if value were a string: >> >> to_html(e'

{value:raw}

') >> >> Imagine that ":raw" is interpreted by to_html() to mean that the string >> does not get html escaped. >> >> With PEP 501, the format_spec and conversion are opaque to the i-string >> machinery, and are only interpreted by the custom interpolation function >> (here, to_html()). > > With str.format, it's the type of value that decides how to interpret the format spec. If you leave it up to the consumer of the i-string (the interpolation function) instead of the value's type, how do you handle things like numeric formats and datetime formats and so on? Would I need to do something like this: > > to_html(e'

{str(e"{value:05}"):raw}

') > > Or would to_html (and every other custom interpolator) have to take things like :raw05 and parse out a format spec to pass to value.__format__? Your interpolator would need to decide. It might never call value.__format__. It might invent some other protocol, like __html_escape__(fmt_spec). Or, it might bake-in knowledge of how to convert whatever types it cares about. Or another good choice would be to use the singledispatch module. In fact, I like singledispatch so much that I'm going to have to use it in an example. > Maybe what you really want is !raw rather than :raw. If there is no conversion, __format__ gets called and the result passed around as part of the i-string object; if there is one, the value, conversion, and format spec get passed instead (and then to_html could decide that conversion 'raw' means to call format(value, format_spec) and then not escape the result). You could do that in addition or in place of format_spec. Except currently conversions are only allowed to be a single character, but I don't see any reason not to relax that. The take away is that the PEP 501 i-string machinery applies zero significance to format_spec and conversion. It just parses them out of the template string. It's left up to the interpolator to apply some meaning to them. > Although that's pretty different from how the standard conversions work (call repr or ascii on the value, then format the resulting string with the format spec). So maybe replacement fields need another subpart separate from both the conversion and the format spec that's only used by custom interpolators? There's nothing from stopping you from doing this. You could decide that your format_spec, for some interpolator, is composed of "part1^part2", and do something based on part1 and part2. See: https://bitbucket.org/ericvsmith/istring/src/d92e47c96609eed44ed57b7d3c1932b5a156c01a/istring.py?at=default#istring.py-13 for how my i-string str() interpolator applies format_spec and conversion. 
Another interpolator could do something different (for example, https://bitbucket.org/ericvsmith/istring/src/d92e47c96609eed44ed57b7d3c1932b5a156c01a/regex.py?at=default#regex.py-6 for regex escaping). Eric. From abarnert at yahoo.com Thu Aug 27 21:25:42 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 27 Aug 2015 12:25:42 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: <55DF601D.8030801@trueblade.com> References: <55DE94D6.50700@mgmiller.net> <55DF2B40.7030307@mgmiller.net> <55DF4869.8000303@trueblade.com> <55DF601D.8030801@trueblade.com> Message-ID: <99D8B79E-61E2-44FB-84DB-AE90CFAFBB9E@yahoo.com> On Aug 27, 2015, at 12:08, Eric V. Smith wrote: > >> On 08/27/2015 02:30 PM, Andrew Barnert via Python-ideas wrote: >>> On Aug 27, 2015, at 10:27, Eric V. Smith wrote: >>> >>>>> On 08/27/2015 11:22 AM, Mike Miller wrote: >>>>> >>>>> On 08/27/2015 06:36 AM, Cody Piersall wrote: >>>>> >>>>> Is the Draft PEP column of the table supposed to have both "Immediate >>>>> Render" >>>>> and "Deferred Render" as "Yes"? I'm hoping that's a typo, otherwise I >>>>> don't >>>>> understand what it means at all. >>>> >>>> Yes, it supports both. By storing all inputs to the object, it can be >>>> rendered again, with optional changes such as overriding a value, or >>>> escaping the input. >>> >>> The problem with this auto-rendering is that the format_spec and >>> conversion character have to make sense to __format__. For example, you >>> couldn't do this, if value were a string: >>> >>> to_html(e'

{value:raw}

') >>> >>> Imagine that ":raw" is interpreted by to_html() to mean that the string >>> does not get html escaped. >>> >>> With PEP 501, the format_spec and conversion are opaque to the i-string >>> machinery, and are only interpreted by the custom interpolation function >>> (here, to_html()). >> >> With str.format, it's the type of value that decides how to interpret the format spec. If you leave it up to the consumer of the i-string (the interpolation function) instead of the value's type, how do you handle things like numeric formats and datetime formats and so on? Would I need to do something like this: >> >> to_html(e'

{str(e"{value:05}"):raw}

') >> >> Or would to_html (and every other custom interpolator) have to take things like :raw05 and parse out a format spec to pass to value.__format__? > > Your interpolator would need to decide. It might never call > value.__format__. It might invent some other protocol, like > __html_escape__(fmt_spec). Or, it might bake-in knowledge of how to > convert whatever types it cares about. Or another good choice would be > to use the singledispatch module. In fact, I like singledispatch so much > that I'm going to have to use it in an example. But even with singledispatch, you have to write formatters for every type that just call the default format; it means you only have N+M functions to write instead of N*M (where N is the number of interpolators and M the number of types to format), but that's still a lot more than just N functions. Also, of course, the fact that you're doing it differently from the usual "just write a __format__ method" means an extra thing for people to learn, and search for. And I think that's functionality people will almost always expect. Whether I'm dealing with a logger, an i18n library, or even a SQL DECIMAL field, I'd expect :3.5 to mean the same thing it does in str.format. So making every project write the identical code to make that true just because a small number of them won't care seems like a bad idea. And similarly, not having a standard way to separate out the interpolator spec (which is obviously unique to every interpolator) and the format spec (which should be the same for almost every interpreter, but is different for each type) seems like it just adds confusion without adding flexibility. Finally, the fact that, by default, handling format specs isn't done at all means it's very easy to design and implement an interpolator that doesn't do it, use it for a while, and only later realize that you need to be able to do the equivalent of :+05 and haven't left any way to do that and now have to find a clumsy way to tack it on. 
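The N+M concern raised above has one possible middle ground, sketched here: a singledispatch-based field renderer whose fallback defers to the value's own __format__, so only types needing special treatment get a registered function. html_field and Raw are hypothetical names for this sketch, not part of PEP 501 or of either demo implementation.

```python
from functools import singledispatch
import html


class Raw(str):
    """Hypothetical marker type: text that is already safe HTML."""


@singledispatch
def html_field(value, format_spec=''):
    # Fallback for every unregistered type: let the value's own
    # __format__ interpret the spec (':05', ':+.2f', ...), then escape.
    return html.escape(format(value, format_spec))


@html_field.register(Raw)
def _(value, format_spec=''):
    # Registered override: format as a plain str, but do not escape.
    return format(str(value), format_spec)


print(html_field(5, '05'))
print(html_field(-3.5, '+.2f'))
print(html_field('<b>bold</b>'))
print(html_field(Raw('<b>bold</b>')))
```

With this shape the standard format specs keep their str.format meaning by default, and an interpolator only writes extra functions for the types it wants to treat differently; whether that default belongs in each interpolator or in the i-string machinery is exactly the open question in this thread.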
From python-ideas at mgmiller.net Thu Aug 27 21:43:41 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Thu, 27 Aug 2015 12:43:41 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: References: <55DE94D6.50700@mgmiller.net> Message-ID: <55DF686D.8000602@mgmiller.net> On 08/27/2015 11:27 AM, Guido van Rossum wrote: > specified. E.g. I couldn't figure out what code should be produced by: > > a = e"Sliced {n} onions in {t1-t0:.3f} seconds." > > Generalizing from the only example in the specification, this would become: > > a = est("Sliced {n} onions in {t1-t0:.3f} seconds", n=n, t1-t0=t1-t0) Yes, the demo does not currently handle arbitrary expressions, only .format() is implemented. The PEP is relying on much of the PEP 498 implementation (which I don't have at hand), so that part is underspecified. I hope to reconcile the details if/when the larger design is chosen. For now, I will add the ability to pass a context dictionary at init as well as keywords to prepare for further implementation. > I also don't understand the claim that no str(estr) is necessary to render the > result -- the estr implementation given at > https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py has a > __str__ method that renders the .rendered attribute, but without the str() call Yes, no explicit call is necessary. When you do things like print(e'') or e''.upper() or '' + e'', you'll get the rendered string without str(). That could be more clear, yes, and maybe not substantially different than i''. > A solution > might be to make estr a subclass of str, but nothing in the PEP suggests that I originally subclassed it from str, but others on the list advised against it. Which is preferred? Under what name is the string in a string object actually held (if any)? 
-Mike From python-ideas at mgmiller.net Thu Aug 27 22:38:08 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Thu, 27 Aug 2015 13:38:08 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: References: <55DE94D6.50700@mgmiller.net> Message-ID: <55DF7530.4030400@mgmiller.net> I've addressed those questions, updated the demo script, and added your examples in a separate examples section. I'll write more after lunch. There is the question of how much detail to copy from PEP 498, I wonder if it will change any further? -Mike On 08/27/2015 11:27 AM, Guido van Rossum wrote: From guido at python.org Thu Aug 27 23:02:40 2015 From: guido at python.org (Guido van Rossum) Date: Thu, 27 Aug 2015 14:02:40 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: <55DF7530.4030400@mgmiller.net> References: <55DE94D6.50700@mgmiller.net> <55DF7530.4030400@mgmiller.net> Message-ID: On Thu, Aug 27, 2015 at 1:38 PM, Mike Miller wrote: > I've addressed those questions, updated the demo script, and added your > examples in a separate examples section. I'll write more after lunch. > Looking through your latest commit ( https://bitbucket.org/mixmastamyk/docs/commits/760274613d8c306cd688385fbbcf2a73a8bb3165?at=default) I think you've painted yourself into an impossible corner, and haven't thought through the consequences in all cases enough. Apparently my Socratic questions didn't help enough, so I feel compelled to give you the answers. :-) The interpreter can't and shouldn't be passing in the values of all the variables involved in an expression. It should only be passing in the final evaluated result of each "slot" in the e-string. 
For my second example (with three rand() calls) it should pass the three different random values returned by the three rand() calls into the estr() constructor, e.g.: b = estr("Three random numbers: {rand()}, {rand()}, {rand()}.", rand(), rand(), rand()) Your entire formatting machinery should be rewritten using positional values instead of keyword args. Regarding subclassing str, there are indeed many problems with that (e.g. what is the type of str(...)+estr(...)), but without it, you will never be able to claim that str() calls are never needed, because quite a few built-in operations and stdlib modules in Python *require* that their arguments are str subclasses (or they treat str subclasses different than other classes). An important example is the re module. > There is the question of how much detail to copy from PEP 498, I wonder if > it will change any further? Undoubtedly PEP 498 will evolve. You're better off not depending on it directly (despite being an alternative or variant). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Thu Aug 27 21:13:19 2015 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 27 Aug 2015 12:13:19 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: References: <55DE94D6.50700@mgmiller.net> Message-ID: On Aug 27, 2015, at 11:27, Guido van Rossum wrote: > > I also don't understand the claim that no str(estr) is necessary to render the result -- the estr implementation given at https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py has a __str__ method that renders the .rendered attribute, but without the str() call the type of 'a' in the above examples would not be str, and various things that operate on strings (e.g. regular expression searches) would not work. A solution might be to make estr a subclass of str, but nothing in the PEP suggests that you have even considered this problem. 
(The only hint I can find is the comment "more magic-methods to be implemented here, to improve str compatibility" in your demo implementation, but without subclassing str this is not enough.) Even subclassing str doesn't really help, because there's plenty of code (including, I believe, regex searches) that just looks at the raw string storage that gets created at str.__new__ and can never be mutated or replaced later. So, anything that's delayed-rendered is not a str, or at least it's not the right str. (I know someone earlier in the discussion suggested that at the C level you could replace PyUnicode_READY with a function that, if it's an estr, first calls self.__str__ and then initializes the string storage to the result and then does the normal READY stuff, but I don't think that actually works, does it?) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Aug 27 23:49:00 2015 From: guido at python.org (Guido van Rossum) Date: Thu, 27 Aug 2015 14:49:00 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: References: <55DE94D6.50700@mgmiller.net> Message-ID: On Thu, Aug 27, 2015 at 12:13 PM, Andrew Barnert wrote: > On Aug 27, 2015, at 11:27, Guido van Rossum wrote: > > I also don't understand the claim that no str(estr) is necessary to render > the result -- the estr implementation given at > https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py > has a __str__ method that renders the .rendered attribute, but without the > str() call the type of 'a' in the above examples would not be str, and > various things that operate on strings (e.g. regular expression searches) > would not work. A solution might be to make estr a subclass of str, but > nothing in the PEP suggests that you have even considered this problem. 
> (The only hint I can find is the comment "more magic-methods to be > implemented here, to improve str compatibility" in your demo > implementation, but without subclassing str this is not enough.) > > > Even subclassing str doesn't really help, because there's plenty of code > (including, I believe, regex searches) that just looks at the raw string > storage that gets created at str.__new__ and can never be mutated or > replaced later. So, anything that's delayed-rendered is not a str, or at > least it's not the right str. > > (I know someone earlier in the discussion suggested that at the C level > you could replace PyUnicode_READY with a function that, if it's an estr, > first calls self.__str__ and then initializes the string storage to the > result and then does the normal READY stuff, but I don't think that > actually works, does it?) > I think subclassing would be enough -- the raw string should be the default rendered string (not the template), and any code that wants to do something *different* will have to extract the template and the list of values (and whatever else is extracted) from other attributes. (Note that it shouldn't be *necessary* to store anything besides the template and the list of values, since the rest of the info can be recovered by parsing the template. But might be *convenient* to store some other things, like the actual text of the expression that produced each value, and the format spec (or perhaps integers into the template that let you find these -- the details would be up to the implementer and a matter of QoI). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From python-ideas at mgmiller.net Fri Aug 28 02:59:36 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Thu, 27 Aug 2015 17:59:36 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: References: <55DE94D6.50700@mgmiller.net> <55DF7530.4030400@mgmiller.net> Message-ID: <55DFB278.8040209@mgmiller.net> Ok, sorry for the noise, I had to run out earlier and rushed my response. I've decided to try the ES6 pattern, of strings and values tuples instead. Seems to be working so far. -Mike On 08/27/2015 02:02 PM, Guido van Rossum wrote: > On Thu, Aug 27, 2015 at 1:38 PM, Mike Miller > wrote: > > I've addressed those questions, updated the demo script, and added your > examples in a separate examples section. I'll write more after lunch. From random832 at fastmail.us Fri Aug 28 03:55:19 2015 From: random832 at fastmail.us (random832 at fastmail.us) Date: Thu, 27 Aug 2015 21:55:19 -0400 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: <55DF4869.8000303@trueblade.com> References: <55DE94D6.50700@mgmiller.net> <55DF2B40.7030307@mgmiller.net> <55DF4869.8000303@trueblade.com> Message-ID: <1440726919.1072611.368146729.713ACCD3@webmail.messagingengine.com> On Thu, Aug 27, 2015, at 13:27, Eric V. Smith wrote: > The problem with this auto-rendering is that the format_spec and > conversion character have to make sense to __format__. For example, you > couldn't do this, if value were a string: > > to_html(e'

{value:raw}

') I feel like this would be better done with something like {value!raw}.

From humbert at uni-wuppertal.de Fri Aug 28 08:59:58 2015
From: humbert at uni-wuppertal.de (Prof. Dr. L. Humbert)
Date: Fri, 28 Aug 2015 08:59:58 +0200
Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal
In-Reply-To: <55DA9B63.3010208@uni-wuppertal.de>
References: <55DA9B63.3010208@uni-wuppertal.de>
Message-ID: <55E006EE.4040209@uni-wuppertal.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dear colleagues,
for me the arguments are quite clear and I don't want to change much of
the underlying work to get my preferred notation ;-)

So I thought of an alternative orthogonal approach for learning and being
orthogonal. But I found this is another showstopper for being orthogonal:

It is indeed possible to run the following code
/
from typing import List
class Tree:
    def __init__(self, left: 'Tree', right: 'Tree'):
        self.left = left
        self.right = right
    def leaves(self) -> List['Tree']:
        return []
def greeting(name: 'str') -> 'str':
    return 'Hello ' + name
\

but not
?
    def leaves(self) -> 'List'['Tree']:
?
which would be orthogonal, when deciding to put all used types in '?'

Perhaps there will be a chance to make this a valid construction?
be orthogonal ;-) Ludger - -- https://twitter.com/n770 http://ddi.uni-wuppertal.de/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEARECAAYFAlXgBu4ACgkQJQsN9FQ+jJ/ueQCdH7RUdpJ4DZd0/12AbP6dLF+E 8NgAn1XFIuRRCIC+Bas68qPXi0SVwgtT =QF2Z -----END PGP SIGNATURE----- From p.f.moore at gmail.com Fri Aug 28 09:52:44 2015 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 28 Aug 2015 08:52:44 +0100 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: <1440726919.1072611.368146729.713ACCD3@webmail.messagingengine.com> References: <55DE94D6.50700@mgmiller.net> <55DF2B40.7030307@mgmiller.net> <55DF4869.8000303@trueblade.com> <1440726919.1072611.368146729.713ACCD3@webmail.messagingengine.com> Message-ID: On 28 August 2015 at 02:55, wrote: > On Thu, Aug 27, 2015, at 13:27, Eric V. Smith wrote: >> The problem with this auto-rendering is that the format_spec and >> conversion character have to make sense to __format__. For example, you >> couldn't do this, if value were a string: >> >> to_html(e'

{value:raw}

') > > I feel like this would be better done with something like {value!raw}. While I appreciate that people are still in a design phase with this, I'd like to point out that in the end, people will need to teach and remember this stuff. Currently ! introduces a conversion, which is one of r, s, or a (and is rarely used except for the occasional !r). Whereas : introduces a format spec, which is a mini-language for describing how to format the value and is specific to the type of the value. The "raw" thing above feels like neither of those things. It feels more like a conversion, but if so then conversions are currently single letters, and language-defined. I'd also expect (for that reason) that any conversions would be valid in any type of formatting (str.format, f-strings, e-strings, whatever). I'm not saying you *have* to follow those rules, just that it feels like we're setting up for a huge teachability nightmare (and a feature that no-one will ever use, because they can't remember how[1]) if we don't at least try to adhere to some level of consistency here. Paul [1] I already tend to ignore most of the features of format strings beyond putting in a simple field number or name, because I don't remember the details and would have to look them up. I'm pretty sure I'd never use something like !raw, no matter how it was spelt, for exactly the same reason. From python-ideas at mgmiller.net Fri Aug 28 10:34:55 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Fri, 28 Aug 2015 01:34:55 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: References: <55DE94D6.50700@mgmiller.net> <55DF7530.4030400@mgmiller.net> Message-ID: <55E01D2F.6060508@mgmiller.net> Hi, I was able to get this done tonight. There's still simplifications to be done, perhaps keeping the string fragments in one piece? Also, a lot of things need to be passed to the constructor to avoid parsing the template twice. 
Positional arguments are working however, and we're back to inheriting from str. It is now rendering into the "real" string, and everything seems to work without the magic methods. ;) Thanks, -Mike On 08/27/2015 02:02 PM, Guido van Rossum wrote: From skrah at bytereef.org Fri Aug 28 15:17:54 2015 From: skrah at bytereef.org (Stefan Krah) Date: Fri, 28 Aug 2015 13:17:54 +0000 (UTC) Subject: [Python-ideas] =?utf-8?q?pep-0484_-_Forward_references_and_Didact?= =?utf-8?q?ics_-_be=09orthogonal?= References: <55DA9B63.3010208@uni-wuppertal.de> <55E006EE.4040209@uni-wuppertal.de> Message-ID: Prof. Dr. L. Humbert writes: > It is indeed possible to run the following code > / > from typing import List > class Tree: > def __init__(self, left: 'Tree', right: 'Tree'): > self.left = left > self.right = right > def leaves(self) -> List['Tree']: > return [] > def greeting(name: 'str') -> 'str': > return 'Hello ' + name > \ > > but not > ? > def leaves(self) -> 'List'['Tree']: > ? > which would be orthogonal, when deciding to put all used types in '?' > > Perhaps there will be a chance to make this a valid construction? The issue is that Python does not have separate type/value universes. 'Tree' is just a type hint, not a type in the conventional sense. I would very much like if Python *did* have separate types/values, so that one could write (OCaml): class tree (left : tree) (right : tree) = object val left = left val right = right end ;; Which is an uninhabited type, since you need a tree to construct a tree! 
:) Thus: class tree (left : tree option) (right : tree option) = object val left = left val right = right end ;; Stefan Krah From encukou at gmail.com Fri Aug 28 15:58:50 2015 From: encukou at gmail.com (Petr Viktorin) Date: Fri, 28 Aug 2015 15:58:50 +0200 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: <55E006EE.4040209@uni-wuppertal.de> References: <55DA9B63.3010208@uni-wuppertal.de> <55E006EE.4040209@uni-wuppertal.de> Message-ID: On Fri, Aug 28, 2015 at 8:59 AM, Prof. Dr. L. Humbert wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Dear collegues, > for me the arguments are quite clear and I don't want to change much of > the underlying work to get my prefered notation ;-) > > So I though of an alternative orthogonal approach for learning and being > orthogonal. But I found, this is another showstopper for being orthogona > l: > > It is indeed possible to run the following code > / > from typing import List > class Tree: > def __init__(self, left: 'Tree', right: 'Tree'): > self.left = left > self.right = right > def leaves(self) -> List['Tree']: > return [] > def greeting(name: 'str') -> 'str': > return 'Hello ' + name > \ > > but not > ? > def leaves(self) -> 'List'['Tree']: > ? > which would be orthogonal, when deciding to put all used types in '?' > > Perhaps there will be a chance to make this a valid construction? 
You can put the entire hint in a string:

    def leaves(self) -> 'List[Tree]':

From random832 at fastmail.us Fri Aug 28 16:18:49 2015
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Fri, 28 Aug 2015 10:18:49 -0400
Subject: [Python-ideas] Draft2 PEP on string interpolation
In-Reply-To:
References: <55DE94D6.50700@mgmiller.net> <55DF2B40.7030307@mgmiller.net> <55DF4869.8000303@trueblade.com> <1440726919.1072611.368146729.713ACCD3@webmail.messagingengine.com>
Message-ID: <1440771529.1185379.368580889.24B700FA@webmail.messagingengine.com>

On Fri, Aug 28, 2015, at 03:52, Paul Moore wrote:
> On 28 August 2015 at 02:55, wrote:
> > On Thu, Aug 27, 2015, at 13:27, Eric V. Smith wrote:
> >> The problem with this auto-rendering is that the format_spec and
> >> conversion character have to make sense to __format__. For example,
> >> you couldn't do this, if value were a string:
> >>
> >> to_html(e'

{value:raw}

') > > > > I feel like this would be better done with something like > > {value!raw}. > > While I appreciate that people are still in a design phase with this, > I'd like to point out that in the end, people will need to teach and > remember this stuff. > > Currently ! introduces a conversion, which is one of r, s, or a (and > is rarely used except for the occasional !r). Whereas : introduces a > format spec, which is a mini-language for describing how to format the > value and is specific to the type of the value. Yes, but the format spec mini-language belongs to the type of the value. Depending on what value is, "raw" could _already_ have a meaning. > The "raw" thing above feels like neither of those things. It feels > more like a conversion, but if so then conversions are currently > single letters, and language-defined. Well, at the time I posted that I thought we were moving away from things being language-defined, because my mind was still on the "user- defined string prefixes" proposal from a while back. Them currently being single letters isn't really a compelling argument. > I'd also expect (for that reason) that any conversions would be valid > in any type of formatting (str.format, f-strings, e-strings, > whatever). And what if the type of value expects to be able to process a format specifier of "raw", rather than it being used by to_html for its own purpose? The advantage of conversion specifiers is that they're currently a closed set. ---- One thing that I don't think *either* version successfully expresses is that while in many cases (including the to_html example) we want a string, that won't always be the case. If we have a syntax for inserting something as, e.g., a SQL parameter, it should be able to accept a double, but I'm not convinced it shouldn't _also_ be able to describe putting the string result of converting the double with ".05d" as a varchar. 
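
The distinction argued over above (a `!` conversion is a closed, language-defined set, while a `:` format spec is interpreted by the value's own type) can be shown with today's `str.format`. This is only a sketch: `Snippet` is a hypothetical type invented here to show that a spec such as `raw` could already mean something to a value.

```python
class Snippet:
    """Hypothetical type whose format mini-language already defines 'raw'."""

    def __init__(self, text):
        self.text = text

    def __format__(self, format_spec):
        # The type, not the caller, decides what a format spec means.
        if format_spec == "raw":
            return self.text
        if format_spec == "":
            return self.text.strip()
        raise ValueError("unknown format spec: %r" % format_spec)

    def __repr__(self):
        return "Snippet(%r)" % self.text


s = Snippet("  hi  ")
with_spec = "{:raw}".format(s)   # handled by Snippet.__format__
with_conv = "{!r}".format(s)     # conversion: repr() is applied first

plain_str_error = None
try:
    "{:raw}".format("hi")        # str has no 'raw' format spec
except ValueError as exc:
    plain_str_error = exc
```

An interpolator that claims `:raw` for its own purposes would therefore collide with any type like `Snippet`, which is the ambiguity the message above points at.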
From guido at python.org Fri Aug 28 17:11:14 2015 From: guido at python.org (Guido van Rossum) Date: Fri, 28 Aug 2015 08:11:14 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: <55E01D2F.6060508@mgmiller.net> References: <55DE94D6.50700@mgmiller.net> <55DF7530.4030400@mgmiller.net> <55E01D2F.6060508@mgmiller.net> Message-ID: Thanks, this looks much better, if we ever want to go in this direction. (Though I think you may want to separate the field names into two parts, the expression text and the format spec.) Can you work with the team at peps at python.org to get a PEP number for this? On Fri, Aug 28, 2015 at 1:34 AM, Mike Miller wrote: > Hi, > > I was able to get this done tonight. There's still simplifications to be > done, perhaps keeping the string fragments in one piece? Also, a lot of > things need to be passed to the constructor to avoid parsing the template > twice. > > Positional arguments are working however, and we're back to inheriting > from str. It is now rendering into the "real" string, and everything seems > to work without the magic methods. ;) > > Thanks, > > -Mike > > On 08/27/2015 02:02 PM, Guido van Rossum wrote: > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-ideas at mgmiller.net Fri Aug 28 19:13:33 2015 From: python-ideas at mgmiller.net (Mike Miller) Date: Fri, 28 Aug 2015 10:13:33 -0700 Subject: [Python-ideas] Draft2 PEP on string interpolation In-Reply-To: References: <55DE94D6.50700@mgmiller.net> <55DF7530.4030400@mgmiller.net> <55E01D2F.6060508@mgmiller.net> Message-ID: <55E096BD.7010802@mgmiller.net> On 08/28/2015 08:11 AM, Guido van Rossum wrote: > Thanks, this looks much better, if we ever want to go in this direction. (Though > I think you may want to separate the field names into two parts, the expression > text and the format spec.) 
It does currently separate the expression texts from the format specs internally, it felt excessive to pass them in separately in the constructor, as I'm sort of complaining about below. But, the end-developer won't see this typically, so it could be done before or after I suppose. Anyone have a preference? > Can you work with the team at peps at python.org to get a > PEP number for this? > Yes, nice to have a third alternative to choose from. -Mike From humbert at uni-wuppertal.de Fri Aug 28 21:21:23 2015 From: humbert at uni-wuppertal.de (Prof. Dr. L. Humbert) Date: Fri, 28 Aug 2015 21:21:23 +0200 Subject: [Python-ideas] pep-0484 - Forward references and Didactics - be orthogonal In-Reply-To: References: <55DA9B63.3010208@uni-wuppertal.de> <55E006EE.4040209@uni-wuppertal.de> Message-ID: <55E0B4B3.5050901@uni-wuppertal.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 28.08.2015 15:58, Petr Viktorin wrote: ? >> It is indeed possible to run the following code >> / >> from typing import List >> class Tree: >> def __init__(self, left: 'Tree', right: 'Tree'): >> self.left = left >> self.right = right >> def leaves(self) -> List['Tree']: >> return [] >> def greeting(name: 'str') -> 'str': >> return 'Hello ' + name >> \ >> ? pv> You can put the entire hint in a string: pv> def leaves(self) -> 'List[Tree]': TNX ? solves the problem in this example and makes it orthogonal. Next showstopper will come, when working on/with datastructures, which contains entangled class-structures, perhaps the instantiation of a class, when we have to use self.node = Node(?) but not self.node= 'Node'(?) So I think, we as educators have to live with this pedagogical ?suboptimal? solution(s) and have to communicate those non-orthogonal notation and make clear, what the reason is all about. 
TNX Ludger - -- https://twitter.com/n770 http://ddi.uni-wuppertal.de/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEARECAAYFAlXgtLMACgkQJQsN9FQ+jJ+4cgCfTaer3dFG4CscmNu/yo4AWxti 0AQAoI9nyTaA2hgDfyCEFk5WdkW1/28L =Wp9U -----END PGP SIGNATURE----- From ncoghlan at gmail.com Sat Aug 29 06:50:20 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 29 Aug 2015 14:50:20 +1000 Subject: [Python-ideas] String interpolation: environment variables, command substitution In-Reply-To: <55DE1719.1070704@mgmiller.net> References: <55DE035F.9070101@mgmiller.net> <55DE1310.3010709@brenbarn.net> <55DE1719.1070704@mgmiller.net> Message-ID: On 27 August 2015 at 05:44, Mike Miller wrote: > True, though less readable I think. If we're going to go as far as > arbitrary expressions, let's discuss making very common scripting tasks > easier. There's already a way to make common scripting tasks easy: use the preferred shell for your preferred platform. That said, if anyone really wants to advance the state of the art in Python's "embedded shell scripting" capabilities, then I'd highly recommend exploring Julia's capabilities in that area and seeing how to produce a comparable system using runtime processing of strings in Python: http://julia.readthedocs.org/en/latest/manual/running-external-programs/ Combining a system like that with f-strings (and/or i-strings) would then allow ready interpolation of Python variables into command lines using either *nix or Windows appropriate syntax. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Aug 29 06:41:04 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 29 Aug 2015 14:41:04 +1000 Subject: [Python-ideas] Forward-References & Out-of-order declaration In-Reply-To: References: <55DDF441.1030107@thekunderts.net> Message-ID: On 27 August 2015 at 09:41, Chris Angelico wrote: > The two basic solutions still apply: either use string names to > identify not-yet-defined objects, or have a "pre-declare" syntax to > make things possible. In C, "struct foo;" is enough to let you declare > pointers to foo; in Python, you could have "Thing = Table()" prior to > defining Manufacturer, and then you could use an unquoted Thing to > define the relationship. Either way makes it clear that something > unusual is happening. It's also the case that *circular dependencies hint at a design problem*. They're sometimes an unavoidable problem (because you're modelling a genuinely bidirectional relationship), but they're still a problem, since acyclic models structurally avoid a *lot* of the challenges that come up when cycles may be present (for example, consider how much easier it is to traverse a filesystem tree if you *don't* support following symlinks). Teasing apart a data model (which is what a class hierarchy represents) to either eliminate the circular references, or else limit them to within particular files is actually a pretty good way to figure out which parts of that model are tightly coupled, and which are more loosely related. Regards, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Aug 29 07:06:21 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 29 Aug 2015 15:06:21 +1000 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> <55DDB793.1070906@trueblade.com> <55DDD610.3030005@trueblade.com> <55DDDDAC.4030201@trueblade.com> <55DE78E7.8020500@trueblade.com> <55DED5A9.8090202@trueblade.com> Message-ID: On 28 August 2015 at 04:21, Ron Adam wrote: > On 08/27/2015 04:17 AM, Eric V. Smith wrote: >>>> >>>> n = 10 >>>> >>s = 'text' >>>> >>x = i'{n}:{s}' >>>> >> >>>> >>to_list(x) -> [10, ':', 'text'] >>>> >>to_repr(x) -> '10:"text"' >>>> >> >>>> >>And this doesn't even take into account the format_specs or >>>> >> conversions, >>>> >>which only have meaning to the conversion function. >>> >>> > >>> >Sure it does, you can access the format spec and apply it manually to >>> >each item or not. Is there another choice? >> >> You're not reading what I'm writing. Using your proposal of immediately >> converting to strings, how would you write the version of "to_list" >> whose output I show above? >> >>> >Depending on how you want to make the values and specs visible. >>> > >>> > def to_repr(istr): >>> > return ''.join(repr(item.format(spec)) for item, spec >>> > in istr.items()) > > > > Well, what you have above is applying repr to the values but not the literal > parts. It's doable, but that should really be part of the format spec > rather than applying a function from outside, or it should be part of the > expression in the i-strings. > > i'{repr(n)}:{repr(s}}' > i'{n!r}:{s!r}' > > Yes, this topic is getting too drawn out. Possibly I'm not seeing a finer > point in your examples. I think it will sort it self out as the > implementation progress's, so I'm not too worried about it. 
The key with i-strings is that they introduce the possibility of replacing additional elements in the rendering pipeline: the field interpolator, and the overall renderer. With f-strings, there's only one field interpolator: the format() builtin, which receives both the value to be interpolated (eagerly calculated at the point where the f-string appears in the code) and the format string. Each substitution field is then replaced with the result of "format(field_value, field_spec)". With f-strings, there's also only one overall renderer: "".join. The literal text elements and the substituted fields are combined back together through string concatenation. The *whole point* of i-strings is to make not just the format() call replaceable, but also the overall process whereby the literal elements, the values of the substitution expressions, and the format specifiers for those expressions are rendered into an output object. Guido's not convinced yet that it makes sense to expose that capability to end users, and I think that skepticism is fair. However, if types.InterpolationTemplate is developed as an implementation detail of f-strings (which is the option Eric has been exploring), then we can create those at runtime from normal strings, and see how useful they might be for cases where ''.join() and format() aren't the best choice of rendering primitives. Cheers, Nick. 
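
The pipeline described above can be sketched in a few lines. This is an illustration under stated assumptions, not the proposed `types.InterpolationTemplate`: the `render` helper and its `interpolate` and `combine` parameters are hypothetical names, and the template is assumed to be pre-parsed into literal pieces and evaluated `(value, format_spec)` pairs.

```python
def render(literals, fields, interpolate=format, combine="".join):
    """Sketch of the pipeline: literals has one more item than fields,
    and each field is an already-evaluated (value, format_spec) pair."""
    parts = [literals[0]]
    for (value, spec), text in zip(fields, literals[1:]):
        parts.append(interpolate(value, spec))
        parts.append(text)
    return combine(parts)


n, s = 10, "text"
literals = ["", ":", ""]          # the pieces of '{n}:{s}'
fields = [(n, ""), (s, "")]

# f-string behaviour: format() per field, "".join overall.
as_str = render(literals, fields)

# Replacing the field interpolator: repr() each value, ignore the spec.
as_repr = render(literals, fields, interpolate=lambda v, spec: repr(v))

# Replacing the overall renderer: keep raw values, drop empty literals.
as_list = render(literals, fields,
                 interpolate=lambda v, spec: v,
                 combine=lambda parts: [p for p in parts if p != ""])
```

With the defaults this reproduces what an f-string would give; the other two calls correspond roughly to the to_repr and to_list examples quoted earlier in the thread.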
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ron3200 at gmail.com Sat Aug 29 18:41:55 2015 From: ron3200 at gmail.com (Ron Adam) Date: Sat, 29 Aug 2015 11:41:55 -0500 Subject: [Python-ideas] Draft PEP on string interpolation In-Reply-To: References: <55D65E4F.1040608@mgmiller.net> <55DB8549.3070908@mgmiller.net> <55DB9E81.4060500@mgmiller.net> <872E8096-4007-481C-A5A5-B9E33380ED7E@trueblade.com> <55DDB793.1070906@trueblade.com> <55DDD610.3030005@trueblade.com> <55DDDDAC.4030201@trueblade.com> <55DE78E7.8020500@trueblade.com> <55DED5A9.8090202@trueblade.com> Message-ID: On 08/29/2015 12:06 AM, Nick Coghlan wrote: > The key with i-strings is that they introduce the possibility of > replacing additional elements in the rendering pipeline: the field > interpolator, and the overall renderer. Yes, I'm seeing some cases that might be easier with the delayed formatting, but have been able to avoid that so far. I'm still looking for examples where it's really needed. I've been running the examples in the string format docs, and a few things stand out. One is nested evaluations, but that may be fixable. Nesting arguments and more complex examples: >>> for align, text in zip('<^>', ['left', 'center', 'right']): ... '{0:{fill}{align}16}'.format(text, fill=align, align=align) ... 'left<<<<<<<<<<<<' '^^^^^center^^^^^' '>>>>>>>>>>>right' I haven't gotten this example to work yet. Another is the ability to turn off evaluation for an expression so it becomes the value. That is needed in the cases where the expression is used as key or as is. I've managed to do it by adding a type letter 'q' for quote to the format types. (externally at the moment.) A 'q' type is protected from being evaluated as expressions. >>> f('Coordinates: {latitude:q}, {longitude:q}' ).format(latitude='37.24N', longitude='-115.81W') 'Coordinates: 37.24N, -115.81W' Another example... 
>>> f("int: {0:q:d}; hex: {0:q:x}; oct: {0:q:o}; bin: {0:q:b}").format(42) 'int: 42; hex: 2a; oct: 52; bin: 101010' In my current implementation, the format_spec is split to try out separating the field formatting from the value formatting. That seems to work nicely and is easier to think about for some cases, (and is backwords compatible), but here, you can see it is chaining the format spec, the "q:" gets taken off by the expression evaluation step. The conflict introduced with that is when a time format spec is used... it has ':'s in it, so this will probably need to be changed, or a way to quote the time format spec may work. a '!q' like '!r' is for repr may be an option too. [ Details ;-) ] My current f() type is based on an expression class (e). class f(e, str): def __new__(cls, content): return str.__new__(cls, content) def __repr__(self): return repr(e.__str__(self)) def format(self, *args, **kwds): # Remove the leading 'e("' and ending '")'. return e.__repr__(self)[3:-2].format(*args, **kwds) Not too complex. It could also override an expression evaluation method on e. (I would just need to move it out of the __init__ method.) > With f-strings, there's only one field interpolator: the format() > builtin, which receives both the value to be interpolated (eagerly > calculated at the point where the f-string appears in the code) and > the format string. Each substitution field is then replaced with the > result of "format(field_value, field_spec)". > > With f-strings, there's also only one overall renderer: "".join. The > literal text elements and the substituted fields are combined back > together through string concatenation. > > The *whole point* of i-strings is to make not just the format() call > replaceable, but also the overall process whereby the literal > elements, the values of the substitution expressions, and the format > specifiers for those expressions are rendered into an output object. 

Yes, the concept I'm trying out at the moment is to have a builtin
expression string type that can be sub-classed to make an f-string, or
i-string. (or html-string, regex-string, etc.) So it's still what you
are describing, but I'm attempting to avoid keeping references to
external objects in it.

The f-string in this case could still have sugar f"...." to meet the
common case. The other cases can use the regular class constructors...
e("..."), or i("..."), etc..

> Guido's not convinced yet that it makes sense to expose that
> capability to end users, and I think that skepticism is fair. However,
> if types.InterpolationTemplate is developed as an implementation
> detail of f-strings (which is the option Eric has been exploring),
> then we can create those at runtime from normal strings, and see how
> useful they might be for cases where ''.join() and format() aren't the
> best choice of rendering primitives.

Yes, I agree.

Cheers,
   Ron

From ericsnowcurrently at gmail.com Sat Aug 29 23:21:38 2015
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sat, 29 Aug 2015 15:21:38 -0600
Subject: [Python-ideas] Properties for classes possible?
In-Reply-To: <55D57989.1020704@thomas-guettler.de>
References: <55D57989.1020704@thomas-guettler.de>
Message-ID:

On Thu, Aug 20, 2015 at 12:54 AM, Thomas Güttler wrote:
> I think it would be great to have properties for classes in Python2 and
> Python3

As always there's a rich history to which we can turn: on the mailing
lists and the issue tracker. The addition of a "class property" is not
a new idea, which implies it *might* be worth pursuing, but only if the
previous obstacles/objections are resolved. [1]

Furthermore, composition of abc.abstractmethod and
property/classmethod/staticmethod was added not that long ago. The
evolution of that composition provides some context for what is
appropriate here.
[2] Note that first we added abc.abstractmethod, then
abc.abstractclassmethod, and then proper composition
(abc.abstractmethod + classmethod).

Though it's subtly different than the abstractmethod case, I suggest we
avoid adding "classproperty" and skip straight to getting the
composition approach working in a similar way to abstractmethod [3]:

    class Spam:
        @classmethod
        @property
        def eggs(cls):
            return 42

[Note that the alternate composition order doesn't work out since
property is a descriptor that resolves strictly against instances.
Would that be obvious enough or a point of confusion?]

Unfortunately, there is a problem with applying classmethod onto
property. Obviously a property isn't a function such that
"classmethod" is an accurately described modifier. This is more
concretely problematic because the classmethod implementation directly
wraps the decorated object in a method object. [4] In Python it would
look like this:

    class classmethod:
        def __init__(self, func):
            self.func = func
        def __get__(self, obj, cls):
            return types.MethodType(self.func, cls)

I expect that this is an optimization over calling self.func.__get__,
which optimization was likely supported by the sensible assumption that
only functions would be passed to classmethod. The naive
implementation of classmethod would look more like this:

    class classmethod:
        def __init__(self, func):
            self.func = func
        def __get__(self, obj, cls):
            return self.func.__get__(cls, type(cls))

If that were the actual implementation then we wouldn't be having this
conversation. :) So to get composition to work correctly we have 3
options:

1. switch to the less efficient, naive implementation of
   classmethod.__get__
2. add "classproperty", which does the right thing
3. add "classresolved" (or similarly named), which resolves wrapped
   descriptors to the class rather than the instance

I advocate for #3 over the others.
It provides for broader application while not impacting the
optimizations in classmethod (and it has a more accurate name). It
would work similarly to the naive classmethod implementation (and work
as a less efficient replacement for classmethod):

    class classresolved:
        def __init__(self, wrapped):
            self.wrapped = wrapped
        def __get__(self, obj, cls):
            try:
                getter = self.wrapped.__get__
            except AttributeError:
                return self.wrapped
            return getter(cls, type(cls))

Note that wrapped data descriptors (like property) would behave as
non-data descriptors since classresolved is itself a non-data
descriptor. The case for making it a data descriptor can be treated
separately, but I don't think that case is as strong.

All this leads to some broader observations about useful, generic
descriptors in the stdlib. I'll open a new thread for that
conversation.

>
> There are some "patterns" to get this working:
>
> http://stackoverflow.com/questions/5189699/how-can-i-make-a-class-property-in-python
>
> http://stackoverflow.com/questions/128573/using-property-on-classmethods
>
> ... but an official solution would be more "zen of python".
>
> Do you think properties for classes would be useful?

I think so. Here are use cases off the top of my head:

* read-only class attrs
* dynamically generated class attrs
* lazily generated class attrs
* class attrs that track access
* class attrs that interact with a class registry
* class attrs that replace themselves on the class upon first use
* ...

So basically stick in nearly all the use cases for properties, but
applied to classes. The main difference is that a "class property"
would be a non-data descriptor. Pursuing a data descriptor approach is
debatably overreaching and potentially problematic.

Note that normally bound class attrs still meet most needs, so any
solution here should be cognizant of the possibility of providing an
attractive nuisance here (and avoid it!).

>
> If it works for classes, then it could be used for modules, too?

Module attribute access does not involve the descriptor protocol.
There are ways to work around that to support descriptors (e.g. the
module-replaces-itself-in-sys-modules trick), but that is an orthogonal
issue. Furthermore, using some other mechanism than the descriptor
protocol to achieve module "properties" isn't worth it ("special cases
aren't special enough...").

-eric

[1] Some examples from the history of the idea:

(oct2005) https://mail.python.org/pipermail/python-list/2005-October/321426.html
    an attempt to implement classproperty
(jan2011) https://mail.python.org/pipermail/python-ideas/2011-January/008950.html
    Enumeration of many permutations of decorators; proposal to add
    classproperty; Guido in favor; Michael shows a simple implementation
(feb2014) http://bugs.python.org/issue20659
    "To get the behaviour you're requesting, you need to use a custom
    metaclass and define the property there."

[2] Changes relative to abc.abstractmethod:

(apr2009) http://bugs.python.org/issue5867
    Compose abc.abstractmethod and classmethod (changed to
    abc.abstractclassmethod)...Guido said "I object to making changes
    to the classmethod implementation."
(mar2011) http://bugs.python.org/issue11610
    Compose abc.abstractmethod and property/classmethod/staticmethod/

[3] https://docs.python.org/3/library/abc.html#abc.abstractmethod
[4] https://hg.python.org/cpython/file/default/Objects/funcobject.c (cm_descr_get)

From ericsnowcurrently at gmail.com Sat Aug 29 23:35:06 2015
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sat, 29 Aug 2015 15:35:06 -0600
Subject: [Python-ideas] some useful descriptors + a classtools module (was:
 Properties for classes possible?)
Message-ID:

On Sat, Aug 29, 2015 at 3:21 PM, Eric Snow wrote:
> I advocate for #3 over the others. It provides for broader
> application while not impacting the optimizations in classmethod (and
> it has a more accurate name).
> It would work similarly to the naive
> classmethod implementation (and work as a less efficient replacement
> for classmethod):
>
>     class classresolved:
>         def __init__(self, wrapped):
>             self.wrapped = wrapped
>         def __get__(self, obj, cls):
>             try:
>                 getter = self.wrapped.__get__
>             except AttributeError:
>                 return self.wrapped
>             return getter(cls, type(cls))
>
> Note that wrapped data descriptors (like property) would behave as
> non-data descriptors since classresolved is itself a non-data
> descriptor. The case for making it a data descriptor can be treated
> separately, but I don't think that case is as strong.
>
> All this leads to some broader observations about useful, generic
> descriptors in the stdlib. I'll open a new thread for that
> conversation.

[Just to be clear, I don't have much time to pursue this so it's more
food-for-thought than anything. Feel free to pick up the baton.]

While we're on the topic, there are other generic descriptors that are
worth considering for inclusion in the stdlib. I've written more than
my fair share of descriptors in the past, many to meet real needs. [1]
Along with "classresolved", here are some non-data descriptors that
might be worth adding:

* lazy - late bound attr that replaces itself on the class (or instance)
* Attr - marks an attr as defined on instances of the class, e.g.
  to satisfy an ABC or as programmatic "documentation"; like binding a
  place-holder class attr but supports doc strings
* rawattr - basically a synonym for staticmethod
* classattr - equivalent to simply binding the wrapped object directly
  on the class, except that wrapped descriptors always resolve as
  though called on the class
* classonly - like classattr, but lookup for instances results in
  AttributeError; this is like adding a method to a metaclass without
  using a metaclass
* nondata - turns data descriptors into non-data descriptors
* classunresolved - causes descriptors to resolve only against
  instances; like the inverse of a classonly/classattr combo

[note that I already proposed "classonly" to this list a couple years
ago [2]]

While the implementation of many of these is relatively trivial,
understanding of the descriptor protocol (and the attribute lookup
machinery) is just outside the understanding of your average Python
user. I'm all for changing that but in the meantime I think it's
justifiable to provide useful (even if relatively trivial) descriptors
in the stdlib. Keep in mind that the implementation of property,
classmethod, and staticmethod have about the same level of complexity
as the descriptors I've described above.

Also, if it makes sense to add any of these (or other) useful
descriptors then it might also make sense for them to live in a stdlib
module rather than as builtins (as property/classmethod/staticmethod
do). There really isn't a good fit for them currently. Consequently,
I would advocate for adding a new "classtools" module, inspired by the
existing functools and itertools modules. The new module would also be
an appropriate place for more than descriptors, as there are likely
other things that would fit well there. In fact there are a few things
we've stuck in the stdlib (e.g. in inspect and types) that would have
gone into a classtools module if it had existed.
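
[A rough sketch of how two of the descriptors listed above might look,
assuming Python 3. These are illustrative guesses at the behaviour the
list describes, not the implementations from the referenced
presentation. In particular, this "lazy" only caches on the instance,
and the Spam class is a made-up example.]

```python
class lazy:
    """Non-data descriptor: compute on first access, then cache the value
    in the instance __dict__ so later lookups bypass the descriptor."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, cls):
        if obj is None:
            return self  # accessed on the class: expose the descriptor
        value = self.func(obj)
        obj.__dict__[self.name] = value  # instance attr now shadows us
        return value


class classonly:
    """Non-data descriptor: resolve the wrapped object against the class,
    and raise AttributeError when looked up through an instance."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __get__(self, obj, cls):
        if obj is not None:
            raise AttributeError("only accessible on the class")
        getter = getattr(self.wrapped, "__get__", None)
        if getter is None:
            return self.wrapped  # plain object, no descriptor protocol
        return getter(cls, type(cls))


class Spam:
    calls = 0

    @lazy
    def eggs(self):
        Spam.calls += 1
        return 42

    @classonly
    def create(cls):
        return cls()


spam = Spam.create()           # resolves on the class
print(spam.eggs, spam.eggs)    # computed once, then read from __dict__
print(hasattr(spam, "create"))  # lookup via the instance fails
```

[Both are non-data descriptors (no __set__), which is what lets the
cached instance attribute take precedence over "lazy" on the second
lookup.]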

-eric

[1] https://bitbucket.org/ericsnowcurrently/presentations/src/default/utpy-may2015/
[2] https://mail.python.org/pipermail/python-ideas/2013-March/019848.html

From ericfahlgren at gmail.com Sun Aug 30 20:40:05 2015
From: ericfahlgren at gmail.com (Eric Fahlgren)
Date: Sun, 30 Aug 2015 11:40:05 -0700
Subject: [Python-ideas] Draft PEP on string interpolation
Message-ID: <021001d0e353$4a8b7ab0$dfa27010$@gmail.com>

On Sat Aug 29 18:41:55 CEST 2015, Ron Adam wrote:
> One is nested evaluations, but that may be fixable.
>
> Nesting arguments and more complex examples:
>
>     >>> for align, text in zip('<^>', ['left', 'center', 'right']):
>     ...     '{0:{fill}{align}16}'.format(text, fill=align, align=align)
>     ...
>     'left<<<<<<<<<<<<'
>     '^^^^^center^^^^^'
>     '>>>>>>>>>>>right'
>
> I haven't gotten this example to work yet.

I've deployed an implementation of my interpretation of PEP-498 in our
2.7 production code, tested in 3.4. Here's the function and your test
case working properly:

#!/bin/env python
from __future__ import print_function, division
import sys as _sys

try:
    import _string # Py3
    def _parseFormat(string):
        return _string.formatter_parser(string)
    unicode = str
except ImportError: # Py2
    def _parseFormat(string):
        return string._formatter_parser()

def _stringInterpolater(s, depth=1, evalCallback=None, context=None, **kwds):
    """
    $uuid:4cbb6191-464a-56dd-88e2-52f4f861527e$
    Experimental implementation of the behavior described in
    https://www.python.org/dev/peps/pep-0498/

    The first extension, ``depth``, allows for reimplementation from a
    function that looks deeper into the call stack for the "current"
    context (see ``printi``, below).

    The second extension, ``evalCallback``, allows us to intercept the
    value before formatting so that we can, for example, convert a
    Marker object to an Adams id (see Simulatable.formatFunction for
    details).

    ``context`` allows the user to pass a dictionary to completely
    replace the namespace calculations performed below. It too gets
    superseded by ``kwds``.
    """
    # Local frame is needed for error reporting, so calculate it even
    # when the user supplies a context.
    localFrame = _sys._getframe()
    while depth:
        localFrame = localFrame.f_back
        depth -= 1

    if context:
        localDict = kwds
        globalDict = context.copy()
    else:
        localDict = localFrame.f_locals.copy() # Must copy to avoid side effects.
        localDict.update(kwds) # Our **kwds override locals.
        globalDict = localFrame.f_globals

    def doFormat(formatString):
        """ $uuid:7c45191f-741b-51a0-b75a-a534e99f58cb$ """
        result = list()
        for text, expression, formatSpec, conversion in _parseFormat(formatString):
            if text:
                result.append(text)
            if expression is None:
                # Literal-only chunk: this also happens mid-string after an
                # escaped brace ("{{" or "}}"), so keep going, don't stop.
                continue
            value = eval(expression, globalDict, localDict)
            if evalCallback:
                value = evalCallback(value)
            if conversion == "r":
                if isinstance(value, unicode): # Delete this check in Py3.
                    value = str(value) # Eat the annoying "u" prefix in Py2.
                value = repr(value)
            elif conversion == "s":
                value = str(value)
            formatSpec = doFormat(formatSpec) # Recurse to evaluate embedded formats.
            try:
                result.append(format(value, formatSpec))
            except ValueError as e:
                raise ValueError("{}, object named '{}'".format(str(e), expression), localFrame)
        return "".join(result)

    return doFormat(s)

f = _stringInterpolater

def printi(*args, **kwds):
    """ $uuid:15cccafc-f4ae-58df-ab3e-03553e567bf0$ """
    sep = kwds.pop("sep", " ") # Py2 smell, in Py3 you'd just put them in the signature.
    end = kwds.pop("end", "\n")
    file = kwds.pop("file", _sys.stdout)
    newArgs = list()
    for arg in args:
        if isinstance(arg, (str, unicode)):
            newArgs.append(f(arg, depth=2, **kwds))
    print(*newArgs, sep=sep, end=end, file=file)

if __name__ == "__main__":
    strings = {
        "<" : "left ",
        "^" : " center ",
        ">" : " right",
    }
    width = 2*5 + max(len(s) for s in strings.values())
    for align in strings:
        fill = align
        print(f("{strings[align]:{fill}{align}{width}}"))
        printi("{strings[align]:{fill}{align}{width}}")
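
[The recursion on the format spec in the code above can be seen in
isolation with a small sketch, assuming Python 3. The names dict and
the variable names are made up for illustration. The spec returned by
the parser is itself still a template, so it is rendered first, and
only then handed to format() together with the value.]

```python
from string import Formatter

# Hypothetical values standing in for the caller's namespace.
names = {"text": "center", "fill": "^", "align": "^", "width": 16}

template = "{text:{fill}{align}{width}}"
literal, field, spec, conversion = next(Formatter().parse(template))

# First pass: the spec still contains replacement fields, so render it.
inner_spec = spec.format(**names)

# Second pass: apply the now-plain spec to the field's value.
result = format(names[field], inner_spec)

print(repr(spec), repr(inner_spec), repr(result))
# The builtin str.format performs the same two passes internally.
print(result == template.format(**names))
```

[This is the same two-step evaluation as the doFormat(formatSpec) call,
only with the namespace passed in explicitly instead of taken from a
stack frame.]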