From ncoghlan at gmail.com Thu Dec 1 00:39:52 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 1 Dec 2011 09:39:52 +1000
Subject: [Python-Dev] PEP 402: Simplified Package Layout and Partitioning
In-Reply-To:
References: <4E43E9A6.7020608@netwok.org>
    <20110811183114.701DF3A406B@sparrow.telecommunity.com>
    <4ED1196E.8090505@netwok.org>
Message-ID:

On Thu, Dec 1, 2011 at 1:28 AM, PJ Eby wrote:
> It doesn't help at all that I'm not really in a position to provide an
> implementation, and the persons most likely to implement have been leaning
> somewhat towards 382, or wanting to modify 402 such that it uses .pyp
> directory extensions so that PEP 395 can be supported...

While I was initially a fan of the possibilities of PEP 402, I
eventually decided that we would be trading an easy problem ("you need
an '__init__.py' marker file or a '.pyp' extension to get Python to
recognise your package directory") for a hard one ("What's your
sys.path look like? What did you mean for it to look like?"). Symlinks
(and the fact we implicitly call realpath() during system
initialisation and import) just make things even messier.
*Deliberately* allowing package structures on the filesystem to become
ambiguous is a recipe for future pain (and could potentially undo a
lot of the good work done by PEP 328's elimination of implicit
relative imports).

I acknowledge there is a lot of confusion amongst novices as to how
packages and imports actually work, but my diagnosis of the root cause
of that problem is completely different from that supposed by PEP 402
(as documented in the more recent versions of PEP 395, I've come to
believe it is due to the way we stuff up the default sys.path[0]
initialisation when packages are involved).

So, in the end, I've come to strongly prefer the PEP 382 approach. The
principle of "Explicit is better than implicit" applies to package
detection on the filesystem just as much as it does to any other kind
of API design, and it really isn't that different from the way we
treat actual Python files (i.e. you can *execute* arbitrary files, but
they need to have an appropriate extension if you want to import
them).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From anacrolix at gmail.com Thu Dec 1 01:46:47 2011
From: anacrolix at gmail.com (Matt Joiner)
Date: Thu, 1 Dec 2011 11:46:47 +1100
Subject: [Python-Dev] STM and python
In-Reply-To:
References:

Message-ID:

I did see this, I'm not convinced it's only relevant to PyPy.

On Thu, Dec 1, 2011 at 2:25 AM, Benjamin Peterson wrote:
> 2011/11/30 Matt Joiner :
>> Given GCC's announcement that Intel's STM will be an extension for C
>> and C++ in GCC 4.7, what does this mean for Python, and the GIL?
>>
>> I've seen efforts made to make STM available as a context, and for use
>> in user code. I've also read about the "old attempts way back" that
>> attempted to use finer grain locking. They understandably failed due to
>> the heavy costs involved in both the locking mechanisms used, and the
>> overhead of a reference counting garbage collection system.
>>
>> However given advances in locking and garbage collection in the last
>> decade, what attempts have been made recently to try these new ideas
>> out? In particular, how unlikely is it that all the thread safe
>> primitives, global contexts, and reference counting functions be made
>> __transaction_atomic, and magical parallelism performance boosts
>> ensue?
>
> Have you seen http://morepypy.blogspot.com/2011/08/we-need-software-transactional-memory.html
> ?
>
>
> --
> Regards,
> Benjamin

From solipsis at pitrou.net Thu Dec 1 01:50:12 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 1 Dec 2011 01:50:12 +0100
Subject: [Python-Dev] STM and python
References:
Message-ID: <20111201015012.3a6f1ca2@pitrou.net>

On Thu, 1 Dec 2011 01:31:14 +1100
Matt Joiner wrote:
>
> However given advances in locking and garbage collection in the last
> decade, what attempts have been made recently to try these new ideas
> out? In particular, how unlikely is it that all the thread safe
> primitives, global contexts, and reference counting functions be made
> __transaction_atomic, and magical parallelism performance boosts
> ensue?

IMHO, it sounds a bit too magical to be true.

> I'm aware that C89, platforms without STM/GCC, and single threaded
> performance are concerns. Please ignore these for the sake of
> discussion about possibilities.
>
> http://gcc.gnu.org/wiki/TransactionalMemory

I find it interesting that the only example of hardware transactional
memory mentioned in this page is a Sun CPU project which has been
cancelled. Does Intel have anything similar in the works?

Regards

Antoine.

From greg at krypto.org Thu Dec 1 01:58:29 2011
From: greg at krypto.org (Gregory P. Smith)
Date: Wed, 30 Nov 2011 16:58:29 -0800
Subject: [Python-Dev] STM and python
In-Reply-To:
References:

Message-ID:

Azul has been using hardware transactional memory on their custom CPUs (and
likely STM in their current x86 virtual machine based products) to great
effect for their massively parallel Java VM (700+ cpu cores and gobs of ram)
for over 4 years. I'll leave it to the reader to do the relevant searching
to read more on that.

My point is: This is up to any given Python VM implementation to take
advantage of or not as it sees fit. Shoe horning it into an existing VM may
not make much sense but anyone is welcome to try.

-gps
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com Thu Dec 1 06:41:35 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 1 Dec 2011 15:41:35 +1000
Subject: [Python-Dev] STM and python
In-Reply-To:
References:

Message-ID:

On Thu, Dec 1, 2011 at 10:58 AM, Gregory P. Smith wrote:
> Azul has been using hardware transactional memory on their custom CPUs (and
> likely STM in their current x86 virtual machine based products) to great
> effect for their massively parallel Java VM (700+ cpu cores and gobs of ram)
> for over 4 years. I'll leave it to the reader to do the relevant searching
> to read more on that.
>
> My point is: This is up to any given Python VM implementation to take
> advantage of or not as it sees fit. Shoe horning it into an existing VM may
> not make much sense but anyone is welcome to try.

There's a patch somewhere on the tracker to add an "Armin Rigo hook"
to the CPython eval loop so he can play with STM in Python as well (at
least, I think it was STM he wanted it for - it might have been
something else).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From anacrolix at gmail.com Thu Dec 1 07:06:43 2011
From: anacrolix at gmail.com (Matt Joiner)
Date: Thu, 1 Dec 2011 17:06:43 +1100
Subject: [Python-Dev] STM and python
In-Reply-To:
References:

Message-ID:

I saw this, I believe it just exposes an STM primitive to user code.
It doesn't make use of STM for Python internals.

Explicit STM doesn't seem particularly useful for a language that
doesn't expose raw memory in its normal usage.

On Thu, Dec 1, 2011 at 4:41 PM, Nick Coghlan wrote:
> On Thu, Dec 1, 2011 at 10:58 AM, Gregory P. Smith wrote:
>> Azul has been using hardware transactional memory on their custom CPUs (and
>> likely STM in their current x86 virtual machine based products) to great
>> effect for their massively parallel Java VM (700+ cpu cores and gobs of ram)
>> for over 4 years. I'll leave it to the reader to do the relevant searching
>> to read more on that.
>>
>> My point is: This is up to any given Python VM implementation to take
>> advantage of or not as it sees fit. Shoe horning it into an existing VM may
>> not make much sense but anyone is welcome to try.
>
> There's a patch somewhere on the tracker to add an "Armin Rigo hook"
> to the CPython eval loop so he can play with STM in Python as well (at
> least, I think it was STM he wanted it for - it might have been
> something else).
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From raymond.hettinger at gmail.com Thu Dec 1 07:10:12 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 30 Nov 2011 22:10:12 -0800
Subject: [Python-Dev] Warnings
Message-ID: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com>

When updating the documentation, please don't go overboard with warnings.
The docs need to be worded affirmatively -- say what a tool does and show how to use it correctly.
See http://docs.python.org/documenting/style.html#affirmative-tone

The docs for the subprocess module currently have SEVEN warning boxes on one page:
http://docs.python.org/library/subprocess.html#module-subprocess
The implicit message is that our tools are hazardous and should be avoided.

Please show some restraint and aim for clean looking, high-quality technical writing without the FUD.

Look at the SQLite3 docs for an example of good writing. The prevention of SQL injection attacks is discussed briefly and effectively without big red boxes littering the page.

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From glyph at twistedmatrix.com Thu Dec 1 08:02:25 2011
From: glyph at twistedmatrix.com (Glyph)
Date: Thu, 1 Dec 2011 02:02:25 -0500
Subject: [Python-Dev] PEP 402: Simplified Package Layout and Partitioning
In-Reply-To:
References: <4E43E9A6.7020608@netwok.org>
    <20110811183114.701DF3A406B@sparrow.telecommunity.com>
    <4ED1196E.8090505@netwok.org>
Message-ID:

On Nov 30, 2011, at 6:39 PM, Nick Coghlan wrote:

> On Thu, Dec 1, 2011 at 1:28 AM, PJ Eby wrote:
>> It doesn't help at all that I'm not really in a position to provide an
>> implementation, and the persons most likely to implement have been leaning
>> somewhat towards 382, or wanting to modify 402 such that it uses .pyp
>> directory extensions so that PEP 395 can be supported...
>
> While I was initially a fan of the possibilities of PEP 402, I
> eventually decided that we would be trading an easy problem ("you need
> an '__init__.py' marker file or a '.pyp' extension to get Python to
> recognise your package directory") for a hard one ("What's your
> sys.path look like? What did you mean for it to look like?"). Symlinks
> (and the fact we implicitly call realpath() during system
> initialisation and import) just make things even messier.
> *Deliberately* allowing package structures on the filesystem to become
> ambiguous is a recipe for future pain (and could potentially undo a
> lot of the good work done by PEP 328's elimination of implicit
> relative imports).
>
> I acknowledge there is a lot of confusion amongst novices as to how
> packages and imports actually work, but my diagnosis of the root cause
> of that problem is completely different from that supposed by PEP 402
> (as documented in the more recent versions of PEP 395, I've come to
> believe it is due to the way we stuff up the default sys.path[0]
> initialisation when packages are involved).
>
> So, in the end, I've come to strongly prefer the PEP 382 approach. The
> principle of "Explicit is better than implicit" applies to package
> detection on the filesystem just as much as it does to any other kind
> of API design, and it really isn't that different from the way we
> treat actual Python files (i.e. you can *execute* arbitrary files, but
> they need to have an appropriate extension if you want to import
> them).

I've helped an almost distressing number of newbies overcome their confusion about sys.path and packages. Systems using Twisted are, almost by definition, hairy integration problems, and are frequently being created or maintained by people with little to no previous Python experience.

Given that experience, I completely agree with everything you've written above (except for the part where you initially liked it).

I appreciate the insight that PEP 402 offers about python's package mechanism (and the difficulties introduced by namespace packages). Its statement of the problem is good, but in my opinion its solution points in exactly the wrong direction: packages need to be _more_ explicit about their package-ness and tools need to be stricter about how they're laid out.

It would be great if sys.path[0] were actually correct when running a script inside a package, or at least issued a warning which would explain how to correctly lay out said package. I would love to see a loud alarm every time a module accidentally got imported by the same name twice. I wish I knew, once and for all, whether it was 'import Image' or 'from PIL import Image'.

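As a rough illustration of the kind of "loud alarm" wished for above, here is a minimal sketch of a check that reports source files loaded under more than one module name. It is only an assumption about what such a check could look like, not an existing Python feature: the function name find_double_imports and the demo module objects are hypothetical, and the check compares raw __file__ strings, so it would miss duplicates reached through symlinks.

```python
import sys
import types
from collections import defaultdict


def find_double_imports(modules=None):
    """Return {source_path: [module names]} for files loaded under several names.

    `modules` defaults to sys.modules; passing a dict makes the check testable.
    """
    if modules is None:
        modules = sys.modules
    names_by_path = defaultdict(set)
    for name, module in list(modules.items()):
        # Built-in modules and namespace packages have no usable __file__.
        path = getattr(module, "__file__", None)
        if path:
            names_by_path[path].add(name)
    return {
        path: sorted(names)
        for path, names in names_by_path.items()
        if len(names) > 1
    }


# Demo with two hypothetical module objects that share one source file,
# mimicking 'import Image' versus 'from PIL import Image'.
top_level = types.ModuleType("Image")
top_level.__file__ = "/site-packages/PIL/Image.py"
in_package = types.ModuleType("PIL.Image")
in_package.__file__ = "/site-packages/PIL/Image.py"
demo = {"Image": top_level, "PIL.Image": in_package}

print(find_double_imports(demo))
# prints: {'/site-packages/PIL/Image.py': ['Image', 'PIL.Image']}
```

Calling find_double_imports() with no argument inspects the running interpreter's sys.modules instead of the demo dictionary.
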
My hope is that if Python starts to tighten these things up a bit, or at least communicate better about best practices, editors and IDEs will develop better automatic discovery features and frameworks will start to normalize their sys.path setups and stop depending on accidents of current directory and script location. This will in turn vastly decrease confusion among new python developers taking on large projects with a bunch of libraries, who mostly don't care what the rules for where files are supposed to go are, and just want to put them somewhere that works.

-glyph

From glyph at twistedmatrix.com Thu Dec 1 08:15:01 2011
From: glyph at twistedmatrix.com (Glyph)
Date: Thu, 1 Dec 2011 02:15:01 -0500
Subject: [Python-Dev] Warnings
In-Reply-To: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com>
References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com>
Message-ID: <22C86443-2C02-4D0A-A62A-A1CD75F87D08@twistedmatrix.com>

On Dec 1, 2011, at 1:10 AM, Raymond Hettinger wrote:

> When updating the documentation, please don't go overboard with warnings.
> The docs need to be worded affirmatively -- say what a tool does and show how to use it correctly.
> See http://docs.python.org/documenting/style.html#affirmative-tone
>
> The docs for the subprocess module currently have SEVEN warning boxes on one page:
> http://docs.python.org/library/subprocess.html#module-subprocess
> The implicit message is that our tools are hazardous and should be avoided.
>
> Please show some restraint and aim for clean looking, high-quality technical writing without the FUD.
>
> Look at the SQLite3 docs for an example of good writing. The prevention of SQL injection attacks is discussed briefly and effectively without big red boxes littering the page.

I'm not convinced this is actually a great example of how to outline pitfalls clearly; it doesn't say what an SQL injection attack is, or what the consequences might be.

Also, it's not the best example of a positive tone. The narrative is:

You probably want to do X.
Don't do Y, because it will make you vulnerable to a Q attack.
Instead, do Z.
Here's an example of Y. Don't do it!
Okay, finally, here's an example of Z.

It would be better to say "You probably want to do X. Here's how you do X, with Z. Here's an example of Z." Then, later, discuss why some people want to do Y, and why you should avoid that impulse.

However, what 'subprocess' is doing clearly isn't an improvement, it's not an effective introduction to secure process execution, just a reference document punctuated with ambiguous anxiety. sqlite3 is at least somewhat specific :).

I think both of these documents point to a need for a recommended idiom for discussing security, or at least common antipatterns, within the Python documentation. I like the IETF's "security considerations" section, because it separates things off into a section that can be referred to later, once the developer has had an opportunity to grasp the basics. Any section with security implications can easily say "please refer to the 'security considerations' section for important information on how to avoid common mistakes" without turning into a big security digression on its own.

-glyph
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com Thu Dec 1 08:32:36 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 1 Dec 2011 17:32:36 +1000
Subject: [Python-Dev] Warnings
In-Reply-To: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com>
References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com>
Message-ID:

On Thu, Dec 1, 2011 at 4:10 PM, Raymond Hettinger wrote:
> When updating the documentation, please don't go overboard with warnings.
> The docs need to be worded affirmatively -- say what a tool does and show
> how to use it correctly.
> See http://docs.python.org/documenting/style.html#affirmative-tone
>
> The docs for the subprocess module currently have SEVEN warning boxes on one
> page:
> http://docs.python.org/library/subprocess.html#module-subprocess
> The implicit message is that our tools are hazardous and should be avoided.

I have no problem with eliminating a lot of those specific warnings -
I kept them there in the last rewrite (and added a couple of new ones)
because avoiding shell injection vulnerabilities is such a driving
theme behind the subprocess module design. Since I was already
changing a lot of other things, messing with that aspect really wasn't
high on my priority list. Now that we have the "frequently used
arguments" section, though, the rest of the warnings could fairly
readily be downgraded to notes or inline references to that section.

> Please show some restraint and aim for clean looking, high-quality technical
> writing without the FUD.

I do object to you calling genuine attempts to educate programmers
about security issues FUD, though. It's not FUD - novice programmers
inflict shell injection, script injection and SQL injection
vulnerabilities on the world every day. The multiple warnings are
there in the subprocess docs because people often only look at the
documentation for the specific function they're interested in, not at
the broader context of the page it is part of.

"Overkill" is a legitimate complaint, but calling attempts to
highlight genuinely insecure practices FUD is the kind of attitude
that has given the world so many years of persistent vulnerability to
buffer overflow attacks :P

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Thu Dec 1 08:36:37 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 1 Dec 2011 17:36:37 +1000
Subject: [Python-Dev] Warnings
In-Reply-To: <22C86443-2C02-4D0A-A62A-A1CD75F87D08@twistedmatrix.com>
References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com>
    <22C86443-2C02-4D0A-A62A-A1CD75F87D08@twistedmatrix.com>
Message-ID:

On Thu, Dec 1, 2011 at 5:15 PM, Glyph wrote:
> I think both of these documents point to a need for a recommended idiom for
> discussing security, or at least common antipatterns, within the Python
> documentation. I like the IETF's "security considerations" section, because
> it separates things off into a section that can be referred to later, once
> the developer has had an opportunity to grasp the basics. Any section with
> security implications can easily say "please refer to the 'security
> considerations' section for important information on how to avoid common
> mistakes" without turning into a big security digression on its own.

I like that approach - one of the problems with online docs is the
fact people don't read them in order, hence the proliferation of
warnings for the subprocess module. A clear "Security Considerations"
section with appropriate cross links would allow us to be clear and
explicit about common problems without littering the docs with red
warning boxes for security issues that are inherent in a particular
task rather than being a Python-specific problem.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Thu Dec 1 08:55:19 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 1 Dec 2011 17:55:19 +1000
Subject: [Python-Dev] Warnings
In-Reply-To:
References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com>
    <22C86443-2C02-4D0A-A62A-A1CD75F87D08@twistedmatrix.com>
Message-ID:

On Thu, Dec 1, 2011 at 5:36 PM, Nick Coghlan wrote:
> On Thu, Dec 1, 2011 at 5:15 PM, Glyph wrote:
>> I think both of these documents point to a need for a recommended idiom for
>> discussing security, or at least common antipatterns, within the Python
>> documentation. I like the IETF's "security considerations" section, because
>> it separates things off into a section that can be referred to later, once
>> the developer has had an opportunity to grasp the basics. Any section with
>> security implications can easily say "please refer to the 'security
>> considerations' section for important information on how to avoid common
>> mistakes" without turning into a big security digression on its own.
>
> I like that approach - one of the problems with online docs is the
> fact people don't read them in order, hence the proliferation of
> warnings for the subprocess module. A clear "Security Considerations"
> section with appropriate cross links would allow us to be clear and
> explicit about common problems without littering the docs with red
> warning boxes for security issues that are inherent in a particular
> task rather than being a Python-specific problem.

I created http://bugs.python.org/issue13515 to propose a specific
documentation style guide addition along these lines (expanded a bit to
cover other cross-cutting concerns like the pipe buffer blocking I/O
problem in subprocess).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From arigo at tunes.org Thu Dec 1 12:01:44 2011
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 1 Dec 2011 12:01:44 +0100
Subject: [Python-Dev] STM and python
In-Reply-To:
References:

Message-ID:

Hi,

On Thu, Dec 1, 2011 at 07:06, Matt Joiner wrote:
> I saw this, I believe it just exposes an STM primitive to user code.
> It doesn't make use of STM for Python internals.

That's correct.

> Explicit STM doesn't seem particularly useful for a language that
> doesn't expose raw memory in its normal usage.

In my opinion, that sentence could not be more wrong.

It is true that, as I discuss on the blog post cited a few times in
this thread, the first goal I see is to use STM to replace the GIL as
an internal way of keeping the state of the interpreter consistent.
This could quite possibly be achieved using the new GCC
__transaction_atomic keyword, although I see already annoying issues
(e.g. the keyword can only protect a _syntactically nested_ piece of
code as a transaction).

However there is another aspect: user-exposed STM, which I didn't
explore much. While it is potentially even more important, it is a
language design question, so I'm happy to delegate it to python-dev.
In my opinion, explicit STM (like Clojure) is not only *a* way to
write multithreaded Python programs, but it seems to be *the only* way
that really makes sense in general, for more than small examples and
more than examples where other hacks are enough (see
http://en.wikipedia.org/wiki/Software_transactional_memory#Composable_operations
). In other words, locks are low-level and should not be used in a
high-level language, like direct memory accesses, just because it
forces the programmer to think about increasingly complicated
situations.

And of course there is the background idea that TM might be available
in hardware someday. My own guess is that it will occur, and I bet
that in 5 to 10 years all new Intel and AMD CPUs will have Hybrid TM.
On such hardware, the performance penalty mostly disappears (which is
also, I guess, the reasoning behind GCC 4.7, offering a future path to
use Hybrid TM).

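To make the point about locks not composing a little more concrete, here is a minimal Python sketch. It is not taken from the blog post or from any STM patch; the Account class and the transfer function are hypothetical names used only for illustration. Each account is protected by its own lock, yet combining two accounts in one operation already requires an extra, global lock-ordering rule that every caller must know about. Without the sorted() line, two opposite transfers could each take one lock and wait forever for the other.

```python
import threading


class Account:
    """A balance guarded by its own lock (hypothetical example class)."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    # Always acquire the two locks in name order, regardless of the
    # direction of the transfer, so opposite transfers cannot deadlock.
    first, second = sorted((src, dst), key=lambda account: account.name)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount


a = Account("a", 100)
b = Account("b", 100)

# Fifty transfers in each direction, all running concurrently.
threads = [threading.Thread(target=transfer, args=(a, b, 1)) for _ in range(50)]
threads += [threading.Thread(target=transfer, args=(b, a, 1)) for _ in range(50)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(a.balance, b.balance)
# prints: 100 100
```

With a transactional construct, the body of transfer would simply be marked as one atomic block and the ordering rule would not be the programmer's problem; the standard library offers no such construct, so that half is left as prose.
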
If python-dev people are interested in exploring the language design
space in that direction, I would be most happy to look in more detail
at GCC 4.7. If we manage to make use of it, then we could get a
version of CPython using STM internally with a very minimal patch. If
it seems useful we can then turn that patch into #ifdefs into the
normal CPython. It would of course be off by default because of the
performance hit; still, it would give an optional alternate
"CPythonSTM" to play with in order to come up with good user-level
abstractions. (This is what I'm already trying to do with PyPy
without using GCC 4.7, and it's progressing nicely.) (My existing
patch to CPython emulating user-level STM with the GIL is not really
satisfying, also for the reason that it cannot emulate some other
potentially useful user constructs, like abort_and_retry().)


À bientôt,

Armin.

From g.brandl at gmx.net Thu Dec 1 22:24:54 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 01 Dec 2011 22:24:54 +0100
Subject: [Python-Dev] Warnings
In-Reply-To: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com>
References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com>
Message-ID:

On 01.12.2011 07:10, Raymond Hettinger wrote:
> When updating the documentation, please don't go overboard with warnings.
> The docs need to be worded affirmatively -- say what a tool does and show how to
> use it correctly.
> See http://docs.python.org/documenting/style.html#affirmative-tone
>
> The docs for the subprocess module currently have SEVEN warning boxes on one page:
> http://docs.python.org/library/subprocess.html#module-subprocess
> The implicit message is that our tools are hazardous and should be avoided.
>
> Please show some restraint and aim for clean looking, high-quality technical
> writing without the FUD.
>
> Look at the SQLite3 docs for an example of good writing. The prevention of SQL
> injection attacks is discussed briefly and effectively without big red boxes
> littering the page.

Obviously, +1.

Georg

From anacrolix at gmail.com Fri Dec 2 06:32:59 2011
From: anacrolix at gmail.com (Matt Joiner)
Date: Fri, 2 Dec 2011 16:32:59 +1100
Subject: [Python-Dev] STM and python
In-Reply-To:
References:

Message-ID:

Armin, thanks for weighing in on this. I'm keen to see a CPython
making use of STM, maybe I'll give it a try over Christmas break. I'm
willing to take the single threaded performance hit, as I have several
applications that degrade due to significant contention with the GIL.
The other benefits of STM you describe make it a lot more appealing. I
actually tried out Haskell recently to make use of many of the
advanced features but came crawling back.

If anyone else is keen to try this, I'm happy to receive patches for
testing and review.

On Thu, Dec 1, 2011 at 10:01 PM, Armin Rigo wrote:
> Hi,
>
> On Thu, Dec 1, 2011 at 07:06, Matt Joiner wrote:
>> I saw this, I believe it just exposes an STM primitive to user code.
>> It doesn't make use of STM for Python internals.
>
> That's correct.
>
>> Explicit STM doesn't seem particularly useful for a language that
>> doesn't expose raw memory in its normal usage.
>
> In my opinion, that sentence could not be more wrong.
>
> It is true that, as I discuss on the blog post cited a few times in
> this thread, the first goal I see is to use STM to replace the GIL as
> an internal way of keeping the state of the interpreter consistent.
> This could quite possibly be achieved using the new GCC
> __transaction_atomic keyword, although I see already annoying issues
> (e.g. the keyword can only protect a _syntactically nested_ piece of
> code as a transaction).
>
> However there is another aspect: user-exposed STM, which I didn't
> explore much. While it is potentially even more important, it is a
> language design question, so I'm happy to delegate it to python-dev.
> In my opinion, explicit STM (like Clojure) is not only *a* way to
> write multithreaded Python programs, but it seems to be *the only* way
> that really makes sense in general, for more than small examples and
> more than examples where other hacks are enough (see
> http://en.wikipedia.org/wiki/Software_transactional_memory#Composable_operations
> ). In other words, locks are low-level and should not be used in a
> high-level language, like direct memory accesses, just because it
> forces the programmer to think about increasingly complicated
> situations.
>
> And of course there is the background idea that TM might be available
> in hardware someday. My own guess is that it will occur, and I bet
> that in 5 to 10 years all new Intel and AMD CPUs will have Hybrid TM.
> On such hardware, the performance penalty mostly disappears (which is
> also, I guess, the reasoning behind GCC 4.7, offering a future path to
> use Hybrid TM).
>
> If python-dev people are interested in exploring the language design
> space in that direction, I would be most happy to look in more detail
> at GCC 4.7. If we manage to make use of it, then we could get a
> version of CPython using STM internally with a very minimal patch. If
> it seems useful we can then turn that patch into #ifdefs into the
> normal CPython. It would of course be off by default because of the
> performance hit; still, it would give an optional alternate
> "CPythonSTM" to play with in order to come up with good user-level
> abstractions. (This is what I'm already trying to do with PyPy
> without using GCC 4.7, and it's progressing nicely.) (My existing
> patch to CPython emulating user-level STM with the GIL is not really
> satisfying, also for the reason that it cannot emulate some other
> potentially useful user constructs, like abort_and_retry().)
>
>
> À bientôt,
>
> Armin.

-- 
?_?

From status at bugs.python.org Fri Dec 2 18:07:32 2011
From: status at bugs.python.org (Python tracker)
Date: Fri, 2 Dec 2011 18:07:32 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20111202170732.659371CE85@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2011-11-25 - 2011-12-02)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    3148 (+14)
  closed 22154 (+26)
  total  25302 (+40)

Open issues with patches: 1342


Issues opened (29)
==================

#13483: Use VirtualAlloc to allocate memory arenas
http://bugs.python.org/issue13483  opened by pitrou

#13486: msvc9compiler.py doesn't properly generate manifest files.
http://bugs.python.org/issue13486  opened by Jahangir

#13491: Fixes for sqlite3 doc
http://bugs.python.org/issue13491  opened by Nebelhom

#13492: ./configure --with-system-ffi=LIBFFI-PATH
http://bugs.python.org/issue13492  opened by michael.kraus

#13493: Import error with embedded python on AIX 6.1
http://bugs.python.org/issue13493  opened by python_hu

#13494: 'cast' any value to a Boolean?
http://bugs.python.org/issue13494  opened by mark.dickinson

#13495: IDLE: Regression - Two ColorDelegator instances loaded
http://bugs.python.org/issue13495  opened by serwy

#13496: bisect module: Overflow at index computation
http://bugs.python.org/issue13496  opened by Voo

#13497: Fix for broken nice test on non-broken platforms with pedantic
http://bugs.python.org/issue13497  opened by yaneurabeya

#13498: os.makedirs exist_ok documentation is incorrect, as is some of
http://bugs.python.org/issue13498  opened by r.david.murray

#13499: uuid documentation example uses invalid REPL/doctest syntax
http://bugs.python.org/issue13499  opened by petri.lehtinen

#13500: Hitting EOF gets cmd.py into a infinite EOF on return loop
http://bugs.python.org/issue13500  opened by yaneurabeya

#13501: Make libedit support more generic; port readline / libedit to
http://bugs.python.org/issue13501  opened by yaneurabeya

#13502: Documentation for Event.wait return value is either wrong or i
http://bugs.python.org/issue13502  opened by r.david.murray

#13503: improved efficiency of bytearray pickling by using bytes type
http://bugs.python.org/issue13503  opened by irmen

#13504: Meta-issue for "Invent with Python" IDLE feedback
http://bugs.python.org/issue13504  opened by ncoghlan

#13505: Bytes objects pickled in 3.x with protocol <=2 are unpickled i
http://bugs.python.org/issue13505  opened by pitrou

#13506: IDLE sys.path does not contain Current Working Directory
http://bugs.python.org/issue13506  opened by MarcoScataglini

#13507: Modify OS X installer builds to package liblzma for the new lz
http://bugs.python.org/issue13507  opened by ned.deily

#13508: ctypes' find_library breaks with ARM ABIs
http://bugs.python.org/issue13508  opened by lool

#13510: Clarify that readlines() is not needed to iterate over a file
http://bugs.python.org/issue13510  opened by potten

#13511: ./configure --includedir, --libdir accept multiple
http://bugs.python.org/issue13511  opened by rpq

#13512: ~/.pypirc created insecurely
http://bugs.python.org/issue13512  opened by Vincent.Danen

#13513: IOBase docs incorrectly link to the GNU readline module
http://bugs.python.org/issue13513  opened by meador.inge

#13515: Consistent documentation practices for security concerns and c
http://bugs.python.org/issue13515  opened by ncoghlan

#13516: Gzip old log files in rotating handlers
http://bugs.python.org/issue13516  opened by ramhux

#13518: configparser
http://bugs.python.org/issue13518  opened by mickeyju

#13519: Tkinter rowconfigure and columnconfigure functions crash if mi
http://bugs.python.org/issue13519  opened by aoi.leslie

#13520: Patch to make pickle aware of __qualname__
http://bugs.python.org/issue13520  opened by sbt



Most recent 15 issues with no replies (15)
==========================================

#13520: Patch to make pickle aware of __qualname__
http://bugs.python.org/issue13520

#13519: Tkinter rowconfigure and columnconfigure functions crash if mi
http://bugs.python.org/issue13519

#13516: Gzip old log files in rotating handlers
http://bugs.python.org/issue13516

#13513: IOBase docs incorrectly link to the GNU readline module
http://bugs.python.org/issue13513

#13507: Modify OS X installer builds to package liblzma for the new lz
http://bugs.python.org/issue13507

#13501: Make libedit support more generic; port readline / libedit to
http://bugs.python.org/issue13501

#13499: uuid documentation example uses invalid REPL/doctest syntax
http://bugs.python.org/issue13499

#13498: os.makedirs exist_ok documentation is incorrect, as is some of
http://bugs.python.org/issue13498

#13495: IDLE: Regression - Two ColorDelegator instances loaded
http://bugs.python.org/issue13495

#13478: No documentation for timeit.default_timer
http://bugs.python.org/issue13478

#13476: Simple exclusion filter for unittest autodiscovery
http://bugs.python.org/issue13476

#13464: HTTPResponse is missing an implementation of readinto
http://bugs.python.org/issue13464

#13463: Fix parsing of package_data
http://bugs.python.org/issue13463

#13456: Providing a custom HTTPResponse class to HTTPConnection
http://bugs.python.org/issue13456

#13438: "Delete patch set" review action doesn't work
http://bugs.python.org/issue13438



Most recent 15 issues waiting for review (15)
=============================================

#13520: Patch to make pickle aware of __qualname__
http://bugs.python.org/issue13520

#13516: Gzip old log files in rotating handlers
http://bugs.python.org/issue13516

#13513: IOBase docs incorrectly link to the GNU readline module
http://bugs.python.org/issue13513

#13512: ~/.pypirc created insecurely
http://bugs.python.org/issue13512

#13511: ./configure --includedir, --libdir accept multiple
http://bugs.python.org/issue13511

#13508: ctypes' find_library breaks with ARM ABIs
http://bugs.python.org/issue13508

#13503: improved efficiency of bytearray pickling by using bytes type
http://bugs.python.org/issue13503

#13501: Make libedit support more generic; port readline / libedit to
http://bugs.python.org/issue13501

#13500: Hitting EOF gets cmd.py into a infinite EOF on return loop
http://bugs.python.org/issue13500

#13497: Fix for broken nice test on non-broken platforms with pedantic
http://bugs.python.org/issue13497

#13495: IDLE: Regression - Two ColorDelegator instances loaded
http://bugs.python.org/issue13495

#13491: Fixes for sqlite3 doc
http://bugs.python.org/issue13491

#13486: msvc9compiler.py doesn't properly generate manifest files.
http://bugs.python.org/issue13486

#13483: Use VirtualAlloc to allocate memory arenas
http://bugs.python.org/issue13483

#13473: Add tests for files byte-compiled by distutils[2]
http://bugs.python.org/issue13473



Top 10 most discussed issues (10)
=================================

#6715: xz compressor support
http://bugs.python.org/issue6715  18 msgs

#7652: Merge C version of decimal into py3k.
http://bugs.python.org/issue7652  13 msgs

#11379: Remove "lightweight" from minidom description
http://bugs.python.org/issue11379  13 msgs

#1040439: Missing documentation on how to link with libpython
http://bugs.python.org/issue1040439  10 msgs

#13400: packaging: build command should have options to control byte-c
http://bugs.python.org/issue13400  9 msgs

#13493: Import error with embedded python on AIX 6.1
http://bugs.python.org/issue13493  9 msgs

#12567: curses implementation of Unicode is wrong in Python 3
http://bugs.python.org/issue12567  7 msgs

#13475: Add '-p'/'--path0' command line option to override sys.path[0]
http://bugs.python.org/issue13475  7 msgs

#13496: bisect module: Overflow at index computation
http://bugs.python.org/issue13496  7 msgs

#13405: Add DTrace probes
http://bugs.python.org/issue13405  6 msgs



Issues closed (26)
==================

#6753: Python 3.1.1 test_cmd_line fails on Fedora 11
http://bugs.python.org/issue6753  closed by haypo

#7111: abort when stderr is closed
http://bugs.python.org/issue7111  closed by pitrou

#8414: Add test cases for assert
http://bugs.python.org/issue8414  closed by ezio.melotti

#11427: ctypes from_buffer no longer accepts bytes
http://bugs.python.org/issue11427  closed by haypo

#12307: Inconsistent formatting of section titles in PEP 0
http://bugs.python.org/issue12307  closed by eric.araujo

#12618: py_compile cannot create files in current directory
http://bugs.python.org/issue12618 closed by meador.inge #12850: [PATCH] stm.atomic http://bugs.python.org/issue12850 closed by arigo #12856: tempfile PRNG reuse between parent and child process http://bugs.python.org/issue12856 closed by pitrou #12945: ctypes works incorrectly with _swappedbytes_ = 1 http://bugs.python.org/issue12945 closed by meador.inge #13380: ctypes: add an internal function for reseting the ctypes cache http://bugs.python.org/issue13380 closed by meador.inge #13434: time.xmlrpc.com dead http://bugs.python.org/issue13434 closed by pitrou #13448: PEP 3155 implementation http://bugs.python.org/issue13448 closed by pitrou #13452: PyUnicode_EncodeDecimal: reject error handlers different than http://bugs.python.org/issue13452 closed by haypo #13467: Typo in doc for library/sysconfig http://bugs.python.org/issue13467 closed by eric.araujo #13471: setting access time beyond Jan. 2038 on remote share failes on http://bugs.python.org/issue13471 closed by Thorsten.Simons #13481: Use an accurate clock in timeit http://bugs.python.org/issue13481 closed by pitrou #13482: _tkinter.TclError: invalid command name "tixDirSelectBox" http://bugs.python.org/issue13482 closed by Martin.Unzner #13484: mail rejected: tutor at python.org http://bugs.python.org/issue13484 closed by eric.araujo #13485: tcl question http://bugs.python.org/issue13485 closed by amaury.forgeotdarc #13487: inspect.getmodule fails when module imports change sys.modules http://bugs.python.org/issue13487 closed by eric.araujo #13488: Some old preprocessors have problem with "#define" not in the http://bugs.python.org/issue13488 closed by jcea #13489: collections.Counter doc does not list added version http://bugs.python.org/issue13489 closed by ezio.melotti #13490: broken downloads counting on pypi.python.org http://bugs.python.org/issue13490 closed by loewis #13509: On uninstallation, distutils bdist_wininst fails to run post i http://bugs.python.org/issue13509 closed by eric.araujo #13514: 
PIL does not support iTXt PNG chunks [patch] http://bugs.python.org/issue13514 closed by ezio.melotti #13517: readdir() in os.listdir not threadsafe on OSX 10.6.8 http://bugs.python.org/issue13517 closed by thouis From solipsis at pitrou.net Sat Dec 3 21:39:03 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 3 Dec 2011 21:39:03 +0100 Subject: [Python-Dev] Style guide for FAQs? Message-ID: <20111203213903.1ebfe7c5@pitrou.net> Hello, I notice that some FAQs are not only outdated but seem to favour a writing style that's quite lengthy and full of anecdotal details. It seems to me that there is value in giving terse answers in FAQs (we have - or should have - reference documentation where things are explained in more detail). One primary example is the performance question: file:///home/antoine/cpython/32/Doc/build/html/faq/programming.html#my-program-is-too-slow-how-do-i-speed-it-up It mixes a couple of generalities with incredibly specific suggestions such as early binding of methods or use of default argument values to fold constants. I think a beginner reading this entry won't get any meaningful information out of it. Any advice on whether it's ok to hack and slash into the fat? :) Regards Antoine. From solipsis at pitrou.net Sat Dec 3 21:58:01 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 3 Dec 2011 21:58:01 +0100 Subject: [Python-Dev] Style guide for FAQs? References: <20111203213903.1ebfe7c5@pitrou.net> Message-ID: <20111203215801.74ea1209@pitrou.net> On Sat, 3 Dec 2011 21:39:03 +0100 Antoine Pitrou wrote: > > One primary example is the performance question: > file:///home/antoine/cpython/32/Doc/build/html/faq/programming.html#my-program-is-too-slow-how-do-i-speed-it-up Woohoo. This should of course be: http://docs.python.org/dev/faq/programming.html#my-program-is-too-slow-how-do-i-speed-it-up cheers Antoine. 
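
For readers who have not met the two micro-optimisations named in the first message above (early binding of methods, and default argument values used to fix a constant once), here is a minimal sketch of what they look like. The function names are made up for this illustration and are not taken from the FAQ entry; whether either trick pays off in a given program is something only profiling can tell.

```python
import math


def norms_plain(points):
    """Baseline: math.hypot and result.append are looked up on every iteration."""
    result = []
    for x, y in points:
        result.append(math.hypot(x, y))
    return result


def norms_early_bound(points):
    """Early binding: hoist both attribute lookups out of the loop."""
    result = []
    append = result.append  # bound method looked up once
    hypot = math.hypot      # module attribute looked up once
    for x, y in points:
        append(hypot(x, y))
    return result


def degrees_to_radians(deg, _factor=math.pi / 180.0):
    """Default-argument trick: the factor is computed once, at definition time."""
    return deg * _factor
```

Both `norms_*` functions return the same list; the second only avoids repeated attribute lookups, which is the kind of very specific advice the message argues a beginner will get little out of.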
From tjreedy at udel.edu Sun Dec 4 03:55:35 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 03 Dec 2011 21:55:35 -0500 Subject: [Python-Dev] Style guide for FAQs? In-Reply-To: <20111203215801.74ea1209@pitrou.net> References: <20111203213903.1ebfe7c5@pitrou.net> <20111203215801.74ea1209@pitrou.net> Message-ID: On 12/3/2011 3:58 PM, Antoine Pitrou wrote: > On Sat, 3 Dec 2011 21:39:03 +0100 > Antoine Pitrou wrote: >> >> One primary example is the performance question: >> file:///home/antoine/cpython/32/Doc/build/html/faq/programming.html#my-program-is-too-slow-how-do-i-speed-it-up > > Woohoo. This should of course be: > http://docs.python.org/dev/faq/programming.html#my-program-is-too-slow-how-do-i-speed-it-up That looks like a mini-howto ;-), rather than a FAQ entry. The changes you have made so far have looked good to me. -- Terry Jan Reedy From ncoghlan at gmail.com Sun Dec 4 05:11:58 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Dec 2011 14:11:58 +1000 Subject: [Python-Dev] [Python-checkins] cpython (3.2): Issue #13211: Add .reason attribute to HTTPError to implement parent class In-Reply-To: References: Message-ID: On Sun, Dec 4, 2011 at 12:46 AM, jason.coombs wrote:
> +def test_HTTPError_interface():
> +    """
> +    Issue 13211 reveals that HTTPError didn't implement the URLError
> +    interface even though HTTPError is a subclass of URLError.
> +
> +    >>> err = urllib.error.HTTPError(msg='something bad happened', url=None, code=None, hdrs=None, fp=None)
> +    >>> assert hasattr(err, 'reason')
> +    >>> err.reason
> +    'something bad happened'
> +    """
> +
Did you re-run the test suite after forward-porting to 3.3? 
I'm consistently getting failures: $ ./python -m test test_urllib2 [1/1] test_urllib2 ********************************************************************** File "/home/ncoghlan/devel/py3k/Lib/test/test_urllib2.py", line 1457, in test.test_urllib2.test_HTTPError_interface Failed example: err = urllib.error.HTTPError(msg='something bad happened', url=None, code=None, hdrs=None, fp=None) Exception raised: Traceback (most recent call last): File "/home/ncoghlan/devel/py3k/Lib/doctest.py", line 1253, in __run compileflags, 1), test.globs) File "", line 1, in err = urllib.error.HTTPError(msg='something bad happened', url=None, code=None, hdrs=None, fp=None) TypeError: HTTPError does not take keyword arguments ********************************************************************** File "/home/ncoghlan/devel/py3k/Lib/test/test_urllib2.py", line 1458, in test.test_urllib2.test_HTTPError_interface Failed example: assert hasattr(err, 'reason') Exception raised: Traceback (most recent call last): File "/home/ncoghlan/devel/py3k/Lib/doctest.py", line 1253, in __run compileflags, 1), test.globs) File "", line 1, in assert hasattr(err, 'reason') NameError: name 'err' is not defined ********************************************************************** File "/home/ncoghlan/devel/py3k/Lib/test/test_urllib2.py", line 1459, in test.test_urllib2.test_HTTPError_interface Failed example: err.reason Exception raised: Traceback (most recent call last): File "/home/ncoghlan/devel/py3k/Lib/doctest.py", line 1253, in __run compileflags, 1), test.globs) File "", line 1, in err.reason NameError: name 'err' is not defined ********************************************************************** 1 items had failures: 3 of 3 in test.test_urllib2.test_HTTPError_interface ***Test Failed*** 3 failures. 
test test_urllib2 failed -- 3 of 65 doctests failed 1 test failed: test_urllib2 [142313 refs] Now, this failure is quite possibly due to a flaw in the PEP 3151 implementation (see http://bugs.python.org/issue12555), but picking up this kind of thing is the reason we say to always run the tests before committing, even for a simple merge. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From g.brandl at gmx.net Sun Dec 4 09:42:23 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 04 Dec 2011 09:42:23 +0100 Subject: [Python-Dev] Style guide for FAQs? In-Reply-To: References: <20111203213903.1ebfe7c5@pitrou.net> <20111203215801.74ea1209@pitrou.net> Message-ID: On 04.12.2011 03:55, Terry Reedy wrote: > On 12/3/2011 3:58 PM, Antoine Pitrou wrote: >> On Sat, 3 Dec 2011 21:39:03 +0100 >> Antoine Pitrou wrote: >>> >>> One primary example is the performance question: >>> file:///home/antoine/cpython/32/Doc/build/html/faq/programming.html#my-program-is-too-slow-how-do-i-speed-it-up >> >> Woohoo. This should of course be: >> http://docs.python.org/dev/faq/programming.html#my-program-is-too-slow-how-do-i-speed-it-up > > That looks like a mini-howto ;-), > rather than a FAQ entry. > > The changes you have made so far have looked good to me. Definitely. Georg From martin at v.loewis.de Sun Dec 4 10:56:06 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 04 Dec 2011 10:56:06 +0100 Subject: [Python-Dev] STM and python In-Reply-To: References: Message-ID: <4EDB43B6.1080501@v.loewis.de> > However given advances in locking and garbage collection in the last > decade, what attempts have been made recently to try these new ideas > out? If that's the question you want an answer to, it would have been better had you listed the efforts that you are already aware of. 
If you really are unaware of any effort, try googling to find http://www.kamaelia.org/STM http://peak.telecommunity.com/DevCenter/TrellisSTM http://bugs.python.org/issue12850 http://dl.acm.org/citation.cfm?id=1978911 http://www-sal.cs.uiuc.edu/~zilles/papers/python_htm.dls2006.pdf and more Regards, Martin From mail at timgolden.me.uk Sun Dec 4 11:59:13 2011 From: mail at timgolden.me.uk (Tim Golden) Date: Sun, 04 Dec 2011 10:59:13 +0000 Subject: [Python-Dev] Issue 13524: subprocess on Windows Message-ID: <4EDB5281.8040807@timgolden.me.uk> http://bugs.python.org/issue13524 Someone raised issue13524 yesterday to illustrate that a subprocess will crash immediately if an environment block is passed which does not contain a valid SystemRoot environment variable. Note that the calling (Python) process is unaffected; this isn't - strictly - a Python crash. The issue is essentially a Windows one where a fairly unusual cornercase -- passing an empty environment -- has unforeseen effects. The smallest reproducible example is this:

    import os, sys
    import subprocess
    subprocess.Popen(
        [sys.executable],
        env={}
    )

and it can be prevented like this:

    import os, sys
    import subprocess
    subprocess.Popen(
        [sys.executable],
        env={"SystemRoot" : os.environ['SystemRoot']}
    )

There's a blog post here which gives a worked example: http://jpassing.com/2009/12/28/the-hidden-danger-of-forgetting-to-specify-systemroot-in-a-custom-environment-block/ but as the author points out, nowhere on MSDN is there a warning that SystemRoot is mandatory. (And, in effect, it's not, as it would just be possible to write code which had no need of it). So... what's our take on this? As I see it we could: 1) Do nothing: it's the caller's responsibility to understand the complications of the chosen Operating System. 2) Add a doc warning (ironically, considering the recent to-and-fro on doc warnings in this very module). 
3) Add a check into the subprocess.Popen code which would raise some exception if the environment block is empty (or doesn't contain SystemRoot) on the grounds that this probably wasn't what the user thought they were doing. 4) Automatically add an entry for SystemRoot to the env block if it's not present already. It's tempting to opt for (1) and if we were exposing an API called CreateProcess which mimicked the underlying Windows API I would be inclined to go that way. But we're abstracting a little bit away from that and I think that that layer of abstraction carries its own responsibilities. Option (3) seems to give the best balance. It *is* a cornercase, but at the same time it's easy to misunderstand that the env block you're passing in *replaces* rather than *augments* that of the current process. Thoughts? TJG From ncoghlan at gmail.com Sun Dec 4 12:42:14 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Dec 2011 21:42:14 +1000 Subject: [Python-Dev] Issue 13524: subprocess on Windows In-Reply-To: <4EDB5281.8040807@timgolden.me.uk> References: <4EDB5281.8040807@timgolden.me.uk> Message-ID: On Sun, Dec 4, 2011 at 8:59 PM, Tim Golden wrote: > So... what's our take on this? As I see it we could: > > 1) Do nothing: it's the caller's responsibility to understand the > complications of the chosen Operating System. > > 2) Add a doc warning (ironically, considering the recent to-and-fro > on doc warnings in this very module). > > 3) Add a check into the subprocess.Popen code which would raise some > exception if the environment block is empty (or doesn't contain > SystemRoot) on the grounds that this probably wasn't what the user > thought they were doing. > > 4) Automatically add an entry for SystemRoot to the env block if it's > not present already. > > > It's tempting to opt for (1) and if we were exposing an API called > CreateProcess which mimicked the underlying Windows API I would be > inclined to go that way. 
But we're abstracting a little bit away > from that and I think that that layer of abstraction carries its > own responsibilities. > > Option (3) seems to give the best balance. It *is* a cornercase, but at > the same time it's easy to misunderstand that the env block you're > passing in *replaces* rather than *augments* that of the current > process. There's actually two questions to be answered: 1. What should we do in 3.2 and 2.7? 2. Should we do anything more in 3.3? Raising an exception is not really an appropriate response for any of them - running without SystemRoot actually works fine in most cases, so raising an exception could break currently working code. As the blog post noted, it's only some specific modules that don't work if SystemRoot is not set. Should we really be inserting workarounds in subprocess for buggy platform code that doesn't fall back to a sensible default if a particular environment variable isn't set? So, I don't think this is really a subprocess problem at all. It's a platform bug on Windows that means the 'random' module may fail if SystemRoot is not set in the environment. So, I think the right approach is to: 1. Unset 'SystemRoot' in a windows shell 2. Run the test suite and observe the scale of the breakage 3. Then either: - figure out a workaround that allows us to set an appropriate default value for SystemRoot if needed (depending on the scope of the problem, either do this at interpreter startup, or only in affected modules) - if no feasible workaround is found, detect the failures related to this problem and report a more meaningful error message Either way, add explicit tests to the test suite to ensure that affected modules behave as expected when SystemRoot is not set. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | 
Brisbane, Australia From mail at timgolden.me.uk Sun Dec 4 13:20:11 2011 From: mail at timgolden.me.uk (Tim Golden) Date: Sun, 04 Dec 2011 12:20:11 +0000 Subject: [Python-Dev] Issue 13524: subprocess on Windows In-Reply-To: References: <4EDB5281.8040807@timgolden.me.uk> Message-ID: <4EDB657B.3030105@timgolden.me.uk> On 04/12/2011 11:42, Nick Coghlan wrote: > There's actually two questions to be answered: > 1. What should we do in 3.2 and 2.7? > 2. Should we do anything more in 3.3? Agreed. > 1. Unset 'SystemRoot' in a windows shell > 2. Run the test suite and observe the scale of the breakage Sorry; something I should have highlighted in the earlier post. Behaviour varies between Windows versions. On WinXP, if you unset SystemRoot in a cmd shell, you won't be able to run the test suite: Python won't even start up. On Win7 Python will start but, eg, the random module will fail. This is actually a separate issue: how much of Python will work without a valid SystemRoot. The OP's issue was that if you use subprocess to start an arbitrary process (you get the same problem if you try "notepad.exe") and pass it an env block without a valid SystemRoot then that process will likely fail to start up. And it won't be obvious why. The case where someone tries to run Python (in general) without a valid SystemRoot is a tiny cornercase and you'd be quite right to push that back and say "Don't do that". I don't believe we have to test for it or add code to work around it. While I put the idea forward, I agree that an exception is more likely than not to break existing code. I just can't see any clear alternative, apart from option 1: we do nothing. 
TJG From p.f.moore at gmail.com Sun Dec 4 13:41:46 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 4 Dec 2011 12:41:46 +0000 Subject: [Python-Dev] Issue 13524: subprocess on Windows In-Reply-To: <4EDB657B.3030105@timgolden.me.uk> References: <4EDB5281.8040807@timgolden.me.uk> <4EDB657B.3030105@timgolden.me.uk> Message-ID: On 4 December 2011 12:20, Tim Golden wrote: > On 04/12/2011 11:42, Nick Coghlan wrote: >> >> There's actually two questions to be answered: >> 1. What should we do in 3.2 and 2.7? >> 2. Should we do anything more in 3.3? See below... > This is actually a separate issue: how much of Python will work > without a valid SystemRoot. The OP's issue was that if you use > subprocess to start an arbitrary process (you get the same problem > if you try "notepad.exe") and pass it an env block without a valid > SystemRoot then that process will likely fail to start up. And it > won't be obvious why. I'm not 100% clear on the problem here. From how I'm reading things, the problem is that not supplying SystemRoot will cause (some or all) invocations of subprocess.Popen to fail - it's not specific to starting Python. In that case, it seems to me that it's an OS issue, but one that we should work around. My feeling is that option 4 is best - set SystemRoot to its current value if it's not been set by the user. This leaves the user unable to set an environment with SystemRoot missing, but if the OS fails to handle that properly, then I'm OK with that limitation. As regards the version question above, I'd take the view that as an OS issue, it's OK to leave it unchanged in 2.7 and 3.2, but add the above to 3.3. Paul. 
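
As a rough illustration of option 4 as Paul describes it above (fill in SystemRoot from the parent process when the caller's env block lacks it), here is a minimal caller-side sketch. The helper name `env_with_system_root` and its `parent_env` parameter are hypothetical, invented for this example; nothing like it exists in subprocess, and the thread had not agreed that subprocess itself should behave this way.

```python
import os


def env_with_system_root(env, parent_env=None):
    """Return a copy of env, adding SystemRoot from the parent if it is absent.

    Hypothetical helper for illustration only; not part of subprocess.
    """
    if parent_env is None:
        parent_env = os.environ
    result = dict(env)  # never mutate the caller's mapping

    # Windows treats environment variable names case-insensitively,
    # so compare names in upper case on both sides.
    if any(name.upper() == "SYSTEMROOT" for name in result):
        return result  # caller set it explicitly (even to ""): leave it alone

    for name, value in parent_env.items():
        if name.upper() == "SYSTEMROOT":
            result["SystemRoot"] = value
            break
    return result
```

A caller could then write `subprocess.Popen([sys.executable], env=env_with_system_root({}))`. Because the helper copies the mapping and only adds a missing entry, an environment that deliberately sets SystemRoot (including to an empty string) is passed through unchanged.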
From mail at timgolden.me.uk Sun Dec 4 15:08:36 2011 From: mail at timgolden.me.uk (Tim Golden) Date: Sun, 04 Dec 2011 14:08:36 +0000 Subject: [Python-Dev] Issue 13524: subprocess on Windows In-Reply-To: References: <4EDB5281.8040807@timgolden.me.uk> <4EDB657B.3030105@timgolden.me.uk> Message-ID: <4EDB7EE4.3030403@timgolden.me.uk> On 04/12/2011 12:41, Paul Moore wrote: > I'm not 100% clear on the problem here. From how I'm reading things, > the problem is that not supplying SystemRoot will cause (some or all) > invocations of subprocess.Popen to fail - it's not specific to > starting Python. That's basically the situation. > > My feeling is that option 4 is best - set SystemRoot to its current > value if it's not been set by the user. This leaves the user unable to > set an environment with SystemRoot missing, but if the OS fails to > handle that properly, then I'm OK with that limitation. FWIW if we went this route we could set it if it's missing but that still allows the user to set it to blank. I'm just a little bit wary of altering the environment which the user believes has been set. TJG From martin.packman at canonical.com Sun Dec 4 17:48:16 2011 From: martin.packman at canonical.com (Martin Packman) Date: Sun, 4 Dec 2011 16:48:16 +0000 Subject: [Python-Dev] Issue 13524: subprocess on Windows In-Reply-To: <4EDB5281.8040807@timgolden.me.uk> References: <4EDB5281.8040807@timgolden.me.uk> Message-ID: On 04/12/2011, Tim Golden wrote: > > Someone raised issue13524 yesterday to illustrate that a > subprocess will crash immediately if an environment block is > passed which does not contain a valid SystemRoot environment > variable. ... > 2) Add a doc warning (ironically, considering the recent to-and-fro > on doc warnings in this very module). There appears to already be such a warning, added because of a similar earlier bug: Really this is a problem with the subprocess api making a common case harder to do than necessary. 
If you read the documentation, you'll get it right, but that's not ideal: From the bug, the problem with the reporter's code is that he passes a dict with the one value he cares about as `env` to subprocess.Popen without realising that it will prevent the inheriting of the current environment. Your suggested fix for him also has an issue: it changes the environment of the parent process without resetting it. Instead you need something like:

    e = dict(os.environ)
    e['PATH_TO_MY_APPS'] = "path/to/my/apps"

The bzrlib TestCase has a method using subprocess that provides an `env_changes` argument. With that, it's much easier to override or remove just one variable without accidentally clearing the current environment. Martin From ncoghlan at gmail.com Sun Dec 4 21:52:14 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Dec 2011 06:52:14 +1000 Subject: [Python-Dev] Issue 13524: subprocess on Windows In-Reply-To: <4EDB657B.3030105@timgolden.me.uk> References: <4EDB5281.8040807@timgolden.me.uk> <4EDB657B.3030105@timgolden.me.uk> Message-ID: That's why I'm suggesting we look specifically at the cases where *Python* misbehaves in an empty environment on Windows. Those are legitimately our issue. The problem in *general* is a platform one, so I don't think it makes sense for us to modify the environment that has explicitly been passed in (e.g. how would you test running without SystemRoot if subprocess added it automatically?). An extra parameter in the already confusing Popen signature wouldn't be clearer than explicitly copying os.environ and modifying it. -- Nick Coghlan (via Gmail on Android, so likely to be more terse than usual) On Dec 4, 2011 10:22 PM, "Tim Golden" wrote: > On 04/12/2011 11:42, Nick Coghlan wrote: > >> There's actually two questions to be answered: >> 1. What should we do in 3.2 and 2.7? >> 2. Should we do anything more in 3.3? >> > > Agreed. > > 1. Unset 'SystemRoot' in a windows shell >> 2. 
Run the test suite and observe the scale of the breakage >> > > Sorry; something I should have highlighted in the earlier post. > Behaviour varies between Windows versions. On WinXP, if you > unset SystemRoot in a cmd shell, you won't be able to run the > test suite: Python won't even start up. On Win7 Python will > start but, eg, the random module will fail. > > This is actually a separate issue: how much of Python will work > without a valid SystemRoot. The OP's issue was that if you use > subprocess to start an arbitrary process (you get the same problem > if you try "notepad.exe") and pass it an env block without a valid > SystemRoot then that process will likely fail to start up. And it > won't be obvious why. > > The case where someone tries to run Python (in general) without > a valid SystemRoot is a tiny cornercase and you'd be quite right > to push that back and say "Don't do that". I don't believe we have > to test for it or add code to work around it. > > While I put the idea forward, I agree that an exception is more likely > than not to break existing code. I just can't see any clear alternative, > apart from option 1: we do nothing. > > TJG > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tjreedy at udel.edu Sun Dec 4 22:08:33 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 04 Dec 2011 16:08:33 -0500 Subject: [Python-Dev] Issue 13524: subprocess on Windows In-Reply-To: <4EDB5281.8040807@timgolden.me.uk> References: <4EDB5281.8040807@timgolden.me.uk> Message-ID: On 12/4/2011 5:59 AM, Tim Golden wrote: > http://bugs.python.org/issue13524 > > Someone raised issue13524 yesterday to illustrate that a > subprocess will crash immediately if an environment block is > passed which does not contain a valid SystemRoot environment > variable. > > Note that the calling (Python) process is unaffected; this > isn't - strictly - a Python crash. The issue is essentially > a Windows one where a fairly unusual cornercase -- passing > an empty environment -- has unforseen effects. > > The smallest reproducible example is this: > > import os, sys > import subprocess > subprocess.Popen( > [sys.executable], > env={} > ) > > and it can be prevented like this: > > import os, sys > import subprocess > subprocess.Popen( > [sys.executable], > env={"SystemRoot" : os.environ['SystemRoot']} > ) > > There's a blog post here which gives a worked example: > > > http://jpassing.com/2009/12/28/the-hidden-danger-of-forgetting-to-specify-systemroot-in-a-custom-environment-block/ > > > but as the author points out, nowhere on MSDN is there a warning > that SystemRoot is mandatory. (And, in effect, it's not as it > would just be possible to write code which had no need of it). > > So... what's our take on this? As I see it we could: > > 1) Do nothing: it's the caller's responsibility to understand the > complications of the chosen Operating System. > > 2) Add a doc warning (ironically, considering the recent to-and-fro > on doc warnings in this very module). 
> > 3) Add a check into the subprocess.Popen code which would raise some > exception if the environment block is empty (or doesn't contain > SystemRoot) on the grounds that this probably wasn't what the user > thought they were doing. > > 4) Automatically add an entry for SystemRoot to the env block if it's > not present already. > > > It's tempting to opt for (1) and if we were exposing an API called > CreateProcess which mimicked the underlying Windows API I would be > inclined to go that way. But we're abstracting a little bit away > from that and I think that that layer of abstraction carries its > own responsibilities. > > Option (3) seems to give the best balance. It *is* a cornercase, but at > the same time it's easy to misunderstand that the env block you're > passing in *replaces* rather than *augments* that of the current > process. > > Thoughts? My inclination would be #4 on Windows, certainly for 3.3, unless there is a clear reason not to. For 2.7/3.2, at least say (not warn, just say) in the doc that a subprocess on Windows may require that SystemRoot be set. The blog post says the problem is worse on Win 7. So it is not going away. The blog post has a comment from Martin Loewis a year ago linking to http://mail.python.org/pipermail/python-dev/2010-November/105866.html That thread refers to a bug that was not posted on the tracker. This makes at least three (including #3440). -- Terry Jan Reedy From ncoghlan at gmail.com Mon Dec 5 01:16:01 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Dec 2011 10:16:01 +1000 Subject: [Python-Dev] Issue 13524: subprocess on Windows In-Reply-To: References: <4EDB5281.8040807@timgolden.me.uk> Message-ID: On Mon, Dec 5, 2011 at 7:08 AM, Terry Reedy wrote: > My inclination would be #4 on Windows, certainly for 3.3, unless there is a > clear reason not to. Yes, there is: that environment is the *exact* environment that should be passed to the child processes. 
It's not our place to go implicitly adding things to it. If MS aren't willing to add SystemRoot automatically in CreateProcess (despite releasing libraries that require it to be set), there's no way we should be adding it for them. Fixing our stuff (like importing the random module) to work to at least some degree even if SystemRoot isn't set should definitely be done, but beyond that a comment in the docs pointing out the problem (i.e. MS releasing things that require SystemRoot be set without updating CreateProcess to ensure that it *is* set) is as far as we should go. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From martin at v.loewis.de Mon Dec 5 09:10:51 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 05 Dec 2011 09:10:51 +0100 Subject: [Python-Dev] Issue 13524: subprocess on Windows In-Reply-To: <4EDB5281.8040807@timgolden.me.uk> References: <4EDB5281.8040807@timgolden.me.uk> Message-ID: <4EDC7C8B.6040007@v.loewis.de> > Thoughts? Apparently, there are at least two "users" of SystemRoot: - side-by-side (fusion?) apparently uses it to locate the WinSxS folder, at least on some Windows releases, - certain registry keys contain SystemRoot, in particular the path names of crypto providers (this apparently is XP only, and fixed on Windows 7) I agree with Nick that we shouldn't do anything except perhaps for documentation changes. There are many other environment variables whose absence could also cause failures to run the executable, such as PATH, LD_LIBRARY_PATH, etc. Even not passing DISPLAY may cause the subprocess to fail starting. IOW, users should "normally" pass all environment variables, and only augment it with any specific additions and deletions that they know are needed for the subprocess. If a user deliberately passes a small set of environment variables (e.g. none), we must assume that it was deliberate, and that any resulting failures are desired. 
People do such stuff for security reasons, and side-stepping their enforcement is not appropriate for Python to do. Regards, Martin From mail at timgolden.me.uk Mon Dec 5 10:01:17 2011 From: mail at timgolden.me.uk (Tim Golden) Date: Mon, 05 Dec 2011 09:01:17 +0000 Subject: [Python-Dev] Issue 13524: subprocess on Windows In-Reply-To: <4EDC7C8B.6040007@v.loewis.de> References: <4EDB5281.8040807@timgolden.me.uk> <4EDC7C8B.6040007@v.loewis.de> Message-ID: <4EDC885D.5030708@timgolden.me.uk> On 05/12/2011 08:10, "Martin v. Löwis" wrote: > I agree with Nick that we shouldn't do anything except perhaps > for documentation changes. There are many other environment variables > whose absence could also cause failures to run the executable, > such as PATH, LD_LIBRARY_PATH, etc. Even not passing DISPLAY may > cause the subprocess to fail starting. > > IOW, users should "normally" pass all environment variables, and > only augment it with any specific additions and deletions that > they know are needed for the subprocess. If a user deliberately > passes a small set of environment variables (e.g. none), we must > assume that it was deliberate, and that any resulting failures > are desired. People do such stuff for security reasons, and > side-stepping their enforcement is not appropriate for Python > to do. Having slept on this I must confess that this is pretty much the conclusion I'd come to: we can't do anything in code which is guaranteed to be correct in every case. The best we can do is document. And, as Martin Packman pointed out (and I had missed), this particular condition is already documented, at least enough to point a user to. We could probably do with a HOWTO (or blog post or whatever) on using subprocess on Windows, not least because a fair amount of the docs are Unix-centric and actually very slightly confusing for naive Windows-based developers. I think my proposal now is: do nothing.
I'm aware that Nick Coghlan has been making fairly extensive changes to the subprocess docs recently and I don't think I can propose anything on this matter which amounts to more than shuffling the pieces around. TJG From ncoghlan at gmail.com Mon Dec 5 10:41:18 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Dec 2011 19:41:18 +1000 Subject: [Python-Dev] Issue 13524: subprocess on Windows In-Reply-To: <4EDC885D.5030708@timgolden.me.uk> References: <4EDB5281.8040807@timgolden.me.uk> <4EDC7C8B.6040007@v.loewis.de> <4EDC885D.5030708@timgolden.me.uk> Message-ID: On Mon, Dec 5, 2011 at 7:01 PM, Tim Golden wrote: > We could probably do with a HOWTO (or blog post or whatever) on using > subprocess on Windows, not least because a fair amount of the docs > are Unix-centric and actually very slightly confusing for naive > Windows-based developers. > > I think my proposal now is: do nothing. I'm aware that Nick Coghlan > has been making fairly extensive changes to the subprocess docs > recently and I don't think I can propose anything on this matter which > amounts to more than shuffling the pieces around. The subprocess module could probably do with a HOWTO, full stop. Subprocess invocation is something where platform details are always going to matter a lot, and there are subtle details even on Unix that are confusing (e.g. I have a command in my current project that I've only managed to get working by running it via the shell - I still don't know why direct invocation of the binary with the appropriate arguments doesn't work). At the moment, we're still trying to cram an entire essay on subprocess invocation into the subprocess.Popen constructor definition, which is far from optimal. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From mathieu.malaterre at gmail.com Mon Dec 5 16:26:50 2011 From: mathieu.malaterre at gmail.com (Mathieu Malaterre) Date: Mon, 5 Dec 2011 16:26:50 +0100 Subject: [Python-Dev] ImportError: No module named multiarray (is back) In-Reply-To: <4ED127C5.1060004@in.waw.pl> References: <4ECBFF19.8080100@in.waw.pl> <4ECD1D31.7080802@netwok.org> <4ED127C5.1060004@in.waw.pl> Message-ID: Hi Zbyszek, See my comment below. 2011/11/26 Zbigniew Jędrzejewski-Szmek : > Hi, > I apologize in advance for the length of this mail. > > sys.path > ======== > When a script or a module is executed by invoking python with proper > arguments, sys.path is extended. When a path to script is given, the > directory containing the script is prepended. When '-m' or '-c' is used, > $CWD is prepended. This is documented in > http://docs.python.org/dev/using/cmdline.html, so far ok. > > sys.path and $PYTHONPATH are like $PATH -- if you can convince someone to put > a directory under your control in any of them, you can execute code as this > someone. Therefore, sys.path is dangerous and important. Unfortunately, > sys.path manipulations are only described very briefly, and without any > commentary, in the on-line documentation. python(1) manpage doesn't even > mention them. > > The problem: each of the commands below is insecure: > > python /tmp/script.py (when script.py is safe by itself) > ('/tmp' is added to sys.path, so an attacker can override any > module imported in /tmp/script.py by writing to /tmp/module.py) > > cd /tmp && python -mtimeit -s 'import numpy' 'numpy.test()' > (UNIX users are accustomed to being able to safely execute > programs in any directory, e.g. ls, or gcc, or something. > > Here '' is added to sys.path, so it is not secure to run > python in other-user-writable directories.) > > cd /tmp/ && python -c 'import numpy; print(numpy.version.version)' >
(The same as above, '' is added to sys.path.) > > cd /tmp && python > (The same as above). > > IMHO, if this (long-lived) behaviour is necessary, it should at least be > prominently documented. Also in the manpage. > > Prepending realpath(dirname(scriptname)) > ======================================== > Before adding a directory to sys.path as described above, Python actually > runs os.path.realpath over it. This means that if the path to a script given > on the commandline is actually a symlink, the directory containing the real > file will be executed. This behaviour is not really documented (the > documentation only says "the directory containing that file is added to the > start of sys.path"), but since the integrity of sys.path is so important, it > should be, IMHO. > > Using realpath instead of the (expected) path specified by the user breaks > imports of non-pure-python (mixed .py and .so) modules from modules executed > as scripts on Debian. This is because Debian installs > architecture-independent python files in /usr/share/pyshared, and symlinks > those files into /usr/lib/pymodules/pythonX.Y/. The architecture-dependent > .so and python-version-dependent .pyc files are installed in > /usr/lib/pymodules/pythonX.Y/. When a script, e.g. > /usr/lib/pymodules/pythonX.Y/script.py, is executed, the directory > /usr/share/pyshared is prepended to sys.path. If the script tries to import > a module which has architecture-dependent parts (e.g. numpy) it first sees > the incomplete module in /usr/share/pyshared and fails. > > This happens for example in parallel python > (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=620551) and recently when > packaging CellProfiler for Debian. > > Again, if this is on purpose, it should be documented. > > PEP 395 (Qualified Names for Modules) > ===================================== > > PEP 395 proposes another sys.path manipulation.
When running a script, the > directory tree will be walked upwards as long as there are __init__.py > files, and then the first directory without will be added. > > This is of course a fine idea, but it makes a scenario, which was previously > safe, insecure. More precisely, when executing a script in a directory in a > parent directory-writable-by-other-users, the parent directory will be added > to sys.path. > > So the (safe) operation of downloading an archive with a package, unzipping > it in /tmp, changing into the created directory, checking that the script > doesn't do anything bad, and running a script is now insecure if there is > __init__.py in the archive root. > > > I guess that it would be useful to have an option to turn off those sys.path > manipulations. Thanks very much for the detailed explanation. Given this, I believe I can safely give up on CellProfiler packaging until this issue is addressed upstream (either in CellProfiler using an indirection, or in python). Thanks, -- Mathieu From arigo at tunes.org Tue Dec 6 10:55:58 2011 From: arigo at tunes.org (Armin Rigo) Date: Tue, 6 Dec 2011 10:55:58 +0100 Subject: [Python-Dev] STM and python In-Reply-To: References:

Message-ID: Hi, Actually, not even one month ago, Intel announced that its processors will offer Hardware Transactional Memory in 2013: http://www.h-online.com/newsticker/news/item/Processor-Whispers-About-Haskell-and-Haswell-1389507.html So yes, obviously, it's going to happen. A bientôt, Armin. From anacrolix at gmail.com Tue Dec 6 13:28:42 2011 From: anacrolix at gmail.com (Matt Joiner) Date: Tue, 6 Dec 2011 23:28:42 +1100 Subject: [Python-Dev] STM and python In-Reply-To: References:

Message-ID: This is very interesting, cheers for the link. On Tue, Dec 6, 2011 at 8:55 PM, Armin Rigo wrote: > Hi, > > Actually, not even one month ago, Intel announced that its processors > will offer Hardware Transactional Memory in 2013: > > http://www.h-online.com/newsticker/news/item/Processor-Whispers-About-Haskell-and-Haswell-1389507.html > > So yes, obviously, it's going to happen. > > > A bientôt, > > Armin. -- ?_? From jaraco at jaraco.com Tue Dec 6 23:34:07 2011 From: jaraco at jaraco.com (Jason R. Coombs) Date: Tue, 6 Dec 2011 22:34:07 +0000 Subject: [Python-Dev] [Python-checkins] cpython (2.7): PDB now will properly escape backslashes in the names of modules it executes. In-Reply-To: <4EC67559.90409@netwok.org> References: <4EC67559.90409@netwok.org> Message-ID: <7E79234E600438479EC119BD241B48D6A246E8@CH1PRD0602MB098.namprd06.prod.outlook.com> Éric, These are all good suggestions. I'll make them at some point. Thanks. > -----Original Message----- > From: python-dev-bounces+jaraco=jaraco.com at python.org [mailto:python- > dev-bounces+jaraco=jaraco.com at python.org] On Behalf Of Éric Araujo > Sent: Friday, 18 November, 2011 10:10 > To: python-dev at python.org > Subject: Re: [Python-Dev] [Python-checkins] cpython (2.7): PDB now will > properly escape backslashes in the names of modules it executes. > > Hi Jason, > > > http://hg.python.org/cpython/rev/f7dd5178f36a > > branch: 2.7 > > user: Jason R. Coombs > > date: Thu Nov 17 18:03:24 2011 -0500 > > summary: > > PDB now will properly escape backslashes in the names of modules it > > executes. Fixes #7750 > > > diff --git a/Lib/test/test_pdb.py b/Lib/test/test_pdb.py > > +class Tester7750(unittest.TestCase): > I think we have an unwritten rule that test class and method names should > tell something about what they test. (We do have things like TestWeirdBugs > and test_12345, but I don't think it's a useful pattern to follow :) Not a big > deal anyway.
> > > + # if the filename has something that resolves to a python > > + # escape character (such as \t), it will fail > > + test_fn = '.\\test7750.py' > > + > > + msg = "issue7750 only applies when os.sep is a backslash" > > + @unittest.skipUnless(os.path.sep == '\\', msg) > > + def test_issue7750(self): > > + with open(self.test_fn, 'w') as f: > > + f.write('print("hello world")') > > + cmd = [sys.executable, '-m', 'pdb', self.test_fn,] > > + proc = subprocess.Popen(cmd, > > + stdout=subprocess.PIPE, > > + stdin=subprocess.PIPE, > > + stderr=subprocess.STDOUT, > > + ) > > + stdout, stderr = proc.communicate('quit\n') > > + self.assertNotIn('IOError', stdout, "pdb munged the > > + filename") > Why not check for assertIn(filename, stdout)? (In other words, check for > intended behavior rather than implementation of the erstwhile bug.) > > BTW, I've just tested that giving a message argument to assertNotIn (the > third argument), unittest still displays the other arguments to allow for easier > debugging. I didn't know that, it's cool! > > > + def tearDown(self): > > + if os.path.isfile(self.test_fn): > > + os.remove(self.test_fn) > In my own tests, I've become fond of using "self.addCleanup(os.remove, > filename)": It's shorter than a tearDown and is right there on the line that > follows or precedes the file creation. > > > if __name__ == '__main__': > > test_main() > > + unittest.main() > This looks strange. > > Regards > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python- > dev/jaraco%40jaraco.com -------------- next part -------------- A non-text attachment was scrubbed...
Name: smime.p7s Type: application/pkcs7-signature Size: 6662 bytes Desc: not available URL: From cs at zip.com.au Wed Dec 7 02:23:12 2011 From: cs at zip.com.au (Cameron Simpson) Date: Wed, 7 Dec 2011 12:23:12 +1100 Subject: [Python-Dev] Warnings In-Reply-To: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com> References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com> Message-ID: <20111207012312.GA7566@cskk.homeip.net> On 30Nov2011 22:10, Raymond Hettinger wrote: | When updating the documentation, please don't go overboard with warnings. | The docs need to be worded affirmatively -- say what a tool does and show how to use it correctly. | See http://docs.python.org/documenting/style.html#affirmative-tone I come to this late, but if we're going after the docs... At the above link one finds this text: This assures that files are flushed [...] It does not. It _ensures_ that files are flushed. The doco style "affirmative tone" _assures_. The coding practice _ensures_! Pedanticly, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ There is one evil which...should never be passed over in silence but be continually publicly attacked, and that is corruption of the language... - W.H. Auden From raymond.hettinger at gmail.com Wed Dec 7 07:40:31 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Wed, 7 Dec 2011 00:40:31 -0600 Subject: [Python-Dev] Warnings In-Reply-To: <20111207012312.GA7566@cskk.homeip.net> References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com> <20111207012312.GA7566@cskk.homeip.net> Message-ID: <9B227A4E-9788-4E6D-B415-0DF7CED47455@gmail.com> On Dec 6, 2011, at 7:23 PM, Cameron Simpson wrote: > This assures that files are flushed [...] > > It does not. It _ensures_ that files are flushed. The doco style "affirmative > tone" _assures_. The coding practice _ensures_! 
> > Pedanticly, > -- > Cameron Simpson I can assure you that I've ensured that you're fully insured ;-) Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Wed Dec 7 19:22:56 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 07 Dec 2011 19:22:56 +0100 Subject: [Python-Dev] Warnings In-Reply-To: <20111207012312.GA7566@cskk.homeip.net> References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com> <20111207012312.GA7566@cskk.homeip.net> Message-ID: Am 07.12.2011 02:23, schrieb Cameron Simpson: > On 30Nov2011 22:10, Raymond Hettinger wrote: > | When updating the documentation, please don't go overboard with warnings. > | The docs need to be worded affirmatively -- say what a tool does and show how to use it correctly. > | See http://docs.python.org/documenting/style.html#affirmative-tone > > I come to this late, but if we're going after the docs... > > At the above link one finds this text: > > This assures that files are flushed [...] > > It does not. It _ensures_ that files are flushed. The doco style "affirmative > tone" _assures_. The coding practice _ensures_! > > Pedanticly, Oh, come on, surely this doesn't effect the casual reader? Georg From martin at v.loewis.de Wed Dec 7 19:33:57 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 07 Dec 2011 19:33:57 +0100 Subject: [Python-Dev] [Python-checkins] cpython (2.7): PDB now will properly escape backslashes in the names of modules it executes. In-Reply-To: <4EC67559.90409@netwok.org> References: <4EC67559.90409@netwok.org> Message-ID: <4EDFB195.60607@v.loewis.de> > I think we have an unwritten rule that test class and method names > should tell something about what they test. (We do have things like > TestWeirdBugs and test_12345, but I don't think it's a useful pattern to > follow :) I completely disagree.
test_12345 is a very good name for a test case, in particular if it tests the value of a tau constant in the math module. There can't be any more precise documentation of the test purpose. Regards, Martin From steve at holdenweb.com Wed Dec 7 19:40:56 2011 From: steve at holdenweb.com (Steve Holden) Date: Wed, 7 Dec 2011 10:40:56 -0800 Subject: [Python-Dev] Python Best Again Message-ID: <48E6CE91-AA36-427D-A1C5-FFC4B9A4690E@holdenweb.com> I've just added a news item to the python.org home page noting that Linux Journal readers have voted Python the Best Programming Language for the third year in a row. This is excellent news, though I find it hard to believe that coming up on the outside we see C++. While it demonstrates that Linux Journal readers like object-oriented programming, it shows an uncomfortable tendency towards masochism :) and implies we can't necessarily trust their judgment. ;-) Attempted humor aside, here I am taking the opportunity as PSF chairman to say a big "thank you" to all developers and everyone else who helps to keep putting out releases that gain the kind of popularity that this most recent vote indicates. I know we do it to create a great programming environment, not for popularity, but the Foundation's mission involves encouraging the growth of the international Python community. Please pass this on to other members of your developer community who may not receive this message directly. Seriously, thanks. Having quality releases of a great language really does make it easier to promote Python! 
regards Steve -- Steve Holden steve at holdenweb.com, Holden Web, LLC http://holdenweb.com/ Python classes (and much more) through the web http://oreillyschool.com/ From massimo.dipierro at gmail.com Wed Dec 7 19:45:31 2011 From: massimo.dipierro at gmail.com (Massimo Di Pierro) Date: Wed, 7 Dec 2011 12:45:31 -0600 Subject: [Python-Dev] [PSF-Members] Python Best Again In-Reply-To: <48E6CE91-AA36-427D-A1C5-FFC4B9A4690E@holdenweb.com> References: <48E6CE91-AA36-427D-A1C5-FFC4B9A4690E@holdenweb.com> Message-ID: <37B986D5-DE20-473F-A438-D99AFB7FF7C4@gmail.com> Hello Steve, congratulations to all of you in the foundation who work hard to make Python the success that it is. Massimo On Dec 7, 2011, at 12:40 PM, Steve Holden wrote: > I've just added a news item to the python.org home page noting that Linux Journal readers have voted Python the Best Programming Language for the third year in a row. > > This is excellent news, though I find it hard to believe that coming up on the outside we see C++. While it demonstrates that Linux Journal readers like object-oriented programming, it shows an uncomfortable tendency towards masochism :) and implies we can't necessarily trust their judgment. ;-) > > Attempted humor aside, here I am taking the opportunity as PSF chairman to say a big "thank you" to all developers and everyone else who helps to keep putting out releases that gain the kind of popularity that this most recent vote indicates. I know we do it to create a great programming environment, not for popularity, but the Foundation's mission involves encouraging the growth of the international Python community. Please pass this on to other members of your developer community who may not receive this message directly. > > Seriously, thanks. Having quality releases of a great language really does make it easier to promote Python! 
> > regards > Steve > -- > Steve Holden steve at holdenweb.com, Holden Web, LLC http://holdenweb.com/ > Python classes (and much more) through the web http://oreillyschool.com/ > > > > _______________________________________________ > PSF-Members mailing list > PSF-Members at python.org > http://mail.python.org/mailman/listinfo/psf-members From ethan at stoneleaf.us Wed Dec 7 20:00:41 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 07 Dec 2011 11:00:41 -0800 Subject: [Python-Dev] Warnings In-Reply-To: References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com> <20111207012312.GA7566@cskk.homeip.net> Message-ID: <4EDFB7D9.6010206@stoneleaf.us> Georg Brandl wrote: > Am 07.12.2011 02:23, schrieb Cameron Simpson: >> On 30Nov2011 22:10, Raymond Hettinger wrote: >> | When updating the documentation, please don't go overboard with warnings. >> | The docs need to be worded affirmatively -- say what a tool does and show how to use it correctly. >> | See http://docs.python.org/documenting/style.html#affirmative-tone >> >> I come to this late, but if we're going after the docs... >> >> At the above link one finds this text: >> >> This assures that files are flushed [...] >> >> It does not. It _ensures_ that files are flushed. The doco style "affirmative >> tone" _assures_. The coding practice _ensures_! >> >> Pedanticly, > > Oh, come on, surely this doesn't effect the casual reader? No, of course not -- although it might /affect/ said reader by causing him/her to think, "I don't think that word means what you think it means..." ;) Seriously, it's best to use the correct words with the correct meanings. If someone is willing to fix it, let them. 
~Ethan~ From wolfson at gmail.com Wed Dec 7 21:01:52 2011 From: wolfson at gmail.com (Ben Wolfson) Date: Wed, 7 Dec 2011 12:01:52 -0800 Subject: [Python-Dev] Warnings In-Reply-To: <4EDFB7D9.6010206@stoneleaf.us> References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com> <20111207012312.GA7566@cskk.homeip.net> <4EDFB7D9.6010206@stoneleaf.us> Message-ID: On Wed, Dec 7, 2011 at 11:00 AM, Ethan Furman wrote: > > No, of course not -- although it might /affect/ said reader by causing > him/her to think, "I don't think that word means what you think it means..." > ;) > > Seriously, it's best to use the correct words with the correct meanings. If > someone is willing to fix it, let them. I'm sure this hypothetical reader will then look "assure" up in the OED and find this: 5. To make certain the occurrence or arrival of (an event); to ensure. -- Ben Wolfson "Human kind has used its intelligence to vary the flavour of drinks, which may be sweet, aromatic, fermented or spirit-based. ... Family and social life also offer numerous other occasions to consume drinks for pleasure." [Larousse, "Drink" entry] From ben+python at benfinney.id.au Wed Dec 7 21:15:18 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 08 Dec 2011 07:15:18 +1100 Subject: [Python-Dev] Warnings References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com> <20111207012312.GA7566@cskk.homeip.net> Message-ID: <87k4682l61.fsf@benfinney.id.au> Georg Brandl writes: > Am 07.12.2011 02:23, schrieb Cameron Simpson: > > This assures that files are flushed [...] > > > > It does not. It _ensures_ that files are flushed. The doco style > > "affirmative tone" _assures_. The coding practice _ensures_! > > Oh, come on, surely this doesn't effect the casual reader? Some readers could of been confused irregardless. -- \ “We must find our way to a time when faith, without evidence, | `\ disgraces anyone who would claim it.”
-- Sam Harris, _The End of | _o__) Faith_, 2004 | Ben Finney From tseaver at palladion.com Wed Dec 7 21:16:24 2011 From: tseaver at palladion.com (Tres Seaver) Date: Wed, 07 Dec 2011 15:16:24 -0500 Subject: [Python-Dev] Warnings In-Reply-To: References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com> <20111207012312.GA7566@cskk.homeip.net> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 12/07/2011 01:22 PM, Georg Brandl wrote: > Am 07.12.2011 02:23, schrieb Cameron Simpson: >> On 30Nov2011 22:10, Raymond Hettinger >> wrote: | When updating the documentation, please don't go overboard >> with warnings. | The docs need to be worded affirmatively -- say >> what a tool does and show how to use it correctly. | See >> http://docs.python.org/documenting/style.html#affirmative-tone >> >> I come to this late, but if we're going after the docs... >> >> At the above link one finds this text: >> >> This assures that files are flushed [...] >> >> It does not. It _ensures_ that files are flushed. The doco style >> "affirmative tone" _assures_. The coding practice _ensures_! >> >> Pedanticly, > > Oh, come on, surely this doesn't effect the casual reader? /me presumes an ironic mispeling there. ;) Tres.
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk7fyZgACgkQ+gerLs4ltQ5eaQCeL+E4CVxa1BWhm/MsPw29u/Ym QnUAoKBOY37dNA9aT5TZkv4hu9ixZjBn =jg86 -----END PGP SIGNATURE----- From victor.stinner at haypocalc.com Thu Dec 8 02:43:40 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 08 Dec 2011 02:43:40 +0100 Subject: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues Message-ID: <1504453.f4XqDVp2GQ@ned> Hi, I would like to deny the creation of an Unicode string containing characters outside the range [U+0000; U+10FFFF]. The check is already present in some places (e.g. the builtin chr() function), but not everywhere. The last important function is PyUnicode_FromWideChar, function used to decode text from the OS. The problem is that test_locale fails on Solaris with such checks. I would like to know how to handle Solaris issues. One possible solution is to not handle issues, and just raise exceptions and skip the failing tests on Solaris ;-) Another solution is to modify locale.strxfrm() on all platforms to return a list of int, instead of a str. The type of the result is not really important, we just have to be able to compare two results (equal, greater, lesser or equal, etc.). Another solution? -- The two Solaris issues: - in the hu_HU locale, localeconv() returns U+30000020 for the thousands separator - locale.strxfrm() calls wcsxfrm() which returns characters in the range [0x1000000; 0x1FFFFFF] For localeconv(), it is the b'\xA0' byte string decoded from an encoding looking like ISO-8859-?? (b'\xA0' is not decodable from UTF-8). It looks like a bug in the decoder. 
It also looks like OpenIndiana doesn't use ISO-8859 locale anymore, only UTF-8 locales (which is much better!). I'm unable to reproduce the issue on my OpenIndiana VM. For wcsxfrm(), I'm not sure of the range. Example of a result: {0x1010163, 0x1010101, 0x1010103, 0x1010101, 0x1010103, 0x1010101, 0x1010101}. It looks like wcsxfrm() uses the result of strxfrm() by grouping bytes 3 by 3 and add 0x1000000 to each group. Example of strxfrm() output for the same input: {0x01, 0x01, 0x63, 0x01, 0x01, 0x01, ...}. See http://bugs.python.org/issue13441 for more information. Victor From stephen at xemacs.org Thu Dec 8 03:13:30 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 08 Dec 2011 11:13:30 +0900 Subject: [Python-Dev] Warnings In-Reply-To: References: <17CC15CD-539C-4214-ADD5-E85322259C64@gmail.com> <20111207012312.GA7566@cskk.homeip.net> Message-ID: <87d3bz24l1.fsf@uwakimon.sk.tsukuba.ac.jp> Georg Brandl writes: > Oh, come on, surely this doesn't effect the casual reader? Casual readers aren't effective in any case; you want to hear the opinions of those who care. From chrism at plope.com Thu Dec 8 06:08:39 2011 From: chrism at plope.com (Chris McDonough) Date: Thu, 08 Dec 2011 00:08:39 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? Message-ID: <1323320919.2710.24.camel@thinko> On the heels of Armin's blog post about the troubles of making the same codebase run on both Python 2 and Python 3, I have a concrete suggestion. It would help a lot for code that straddles both Py2 and Py3 to be able to make use of u'' literals. It would seem to be an easy thing to reenable (see http://www.reddit.com/r/Python/comments/n3q7q/thoughts_on_python_3_armin_ronachers_thoughts_and/c36397t ) . It would seem to cost very little in terms of maintenance, and not much in docs. It would make it possible to share code like this across py2 and py3: a = u'foo' Instead of (with e.g. 
six): a = u('foo') Or: from __future__ import unicode_literals a = 'foo' I recognize that the last option is probably the way "it's meant to be done", but in reality it's just more practical to not fail when literal notation is more specific than strictly necessary. - C From benjamin at python.org Thu Dec 8 07:02:22 2011 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 8 Dec 2011 01:02:22 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323320919.2710.24.camel@thinko> References: <1323320919.2710.24.camel@thinko> Message-ID: 2011/12/8 Chris McDonough : > On the heels of Armin's blog post about the troubles of making the same > codebase run on both Python 2 and Python 3, I have a concrete > suggestion. > > It would help a lot for code that straddles both Py2 and Py3 to be able > to make use of u'' literals. Helpful or not helpful, I think that ship has sailed. The earliest it could see the light of day is 3.3, which would leave people trying to support 3.1 and 3.2 in a bind. -- Regards, Benjamin From chrism at plope.com Thu Dec 8 07:10:44 2011 From: chrism at plope.com (Chris McDonough) Date: Thu, 08 Dec 2011 01:10:44 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> Message-ID: <1323324644.2710.28.camel@thinko> On Thu, 2011-12-08 at 01:02 -0500, Benjamin Peterson wrote: > 2011/12/8 Chris McDonough : > > On the heels of Armin's blog post about the troubles of making the same > > codebase run on both Python 2 and Python 3, I have a concrete > > suggestion. > > > > It would help a lot for code that straddles both Py2 and Py3 to be able > > to make use of u'' literals. > > Helpful or not helpful, I think that ship has sailed. The earliest it > could see the light of day is 3.3, which would leave people trying to > support 3.1 and 3.2 in a bind. Right.. the title does say "readd ... support in 3.3".
Are you suggesting "the ship has sailed" for eternity because it can't be supported in Python < 3.3? - C From benjamin at python.org Thu Dec 8 07:18:06 2011 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 8 Dec 2011 01:18:06 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323324644.2710.28.camel@thinko> References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> Message-ID: 2011/12/8 Chris McDonough : > On Thu, 2011-12-08 at 01:02 -0500, Benjamin Peterson wrote: >> 2011/12/8 Chris McDonough : >> > On the heels of Armin's blog post about the troubles of making the same >> > codebase run on both Python 2 and Python 3, I have a concrete >> > suggestion. >> > >> > It would help a lot for code that straddles both Py2 and Py3 to be able >> > to make use of u'' literals. >> >> Helpful or not helpful, I think that ship has sailed. The earliest it >> could see the light of day is 3.3, which would leave people trying to >> support 3.1 and 3.2 in a bind. > > Right.. the title does say "readd ... support in 3.3". Are you > suggesting "the ship has sailed" for eternity because it can't be > supported in Python < 3.3? I'm questioning the real utility of it. -- Regards, Benjamin From chrism at plope.com Thu Dec 8 07:31:56 2011 From: chrism at plope.com (Chris McDonough) Date: Thu, 08 Dec 2011 01:31:56 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko> On Thu, 2011-12-08 at 01:18 -0500, Benjamin Peterson wrote: > 2011/12/8 Chris McDonough : > > On Thu, 2011-12-08 at 01:02 -0500, Benjamin Peterson wrote: > >> 2011/12/8 Chris McDonough : > >> > On the heels of Armin's blog post about the troubles of making the same > >> > codebase run on both Python 2 and Python 3, I have a concrete > >> > suggestion.
> >> > > >> > It would help a lot for code that straddles both Py2 and Py3 to be able > >> > to make use of u'' literals. > >> > >> Helpful or not helpful, I think that ship has sailed. The earliest it > >> could see the light of day is 3.3, which would leave people trying to > >> support 3.1 and 3.2 in a bind. > > > > Right.. the title does say "readd ... support in 3.3". Are you > > suggesting "the ship has sailed" for eternity because it can't be > > supported in Python < 3.3? > > I'm questioning the real utility of it. All I can really offer is my own experience here based on writing code that needs to straddle Python 2.5, 2.6, 2.7 and 3.2 without use of 2to3. Having u'' work across all of these would mean porting would not require as much eyeballing as code modified via "from future import unicode_literals", it would let more code work on 2.5 unchanged, and the resulting code would execute faster than code that required us to use a u() function. What's the case against? - C From ncoghlan at gmail.com Thu Dec 8 08:33:29 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 8 Dec 2011 17:33:29 +1000 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323325916.2710.39.camel@thinko> References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko> Message-ID: Such code still won't work on 3.2, hence restoring the redundant notation would be ultimately pointless. -- Nick Coghlan (via Gmail on Android, so likely to be more terse than usual) On Dec 8, 2011 4:34 PM, "Chris McDonough" wrote: > On Thu, 2011-12-08 at 01:18 -0500, Benjamin Peterson wrote: > > 2011/12/8 Chris McDonough : > > > On Thu, 2011-12-08 at 01:02 -0500, Benjamin Peterson wrote: > > >> 2011/12/8 Chris McDonough : > > >> > On the heels of Armin's blog post about the troubles of making the > same > > >> > codebase run on both Python 2 and Python 3, I have a concrete > > >> > suggestion. 
> > >> > > > >> > It would help a lot for code that straddles both Py2 and Py3 to be > able > > >> > to make use of u'' literals. > > >> > > >> Helpful or not helpful, I think that ship has sailed. The earliest it > > >> could see the light of day is 3.3, which would leave people trying to > > >> support 3.1 and 3.2 in a bind. > > > > > > Right.. the title does say "readd ... support in 3.3". Are you > > > suggesting "the ship has sailed" for eternity because it can't be > > > supported in Python < 3.3? > > > > I'm questioning the real utility of it. > > All I can really offer is my own experience here based on writing code > that needs to straddle Python 2.5, 2.6, 2.7 and 3.2 without use of 2to3. > Having u'' work across all of these would mean porting would not require > as much eyeballing as code modified via "from future import > unicode_literals", it would let more code work on 2.5 unchanged, and the > resulting code would execute faster than code that required us to use a > u() function. > > What's the case against? > > - C From chrism at plope.com Thu Dec 8 08:45:08 2011 From: chrism at plope.com (Chris McDonough) Date: Thu, 08 Dec 2011 02:45:08 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko> Message-ID: <1323330308.2710.52.camel@thinko> On Thu, 2011-12-08 at 17:33 +1000, Nick Coghlan wrote: > Such code still won't work on 3.2, hence restoring the redundant > notation would be ultimately pointless.
None of the code I've written which straddles Python 2/3 supports anything except Python 3.2+, and likewise I expect that for the next crop of porters/straddlers, their code won't support anything but Python 3.3+. So there is a point, which is to make it easier for people to port code that can straddle the most recent Python 3 release as well as 2.7/2.6. In that context, I don't see much relevance of having no support for u'' in Python 3.2. - C From lukasz at langa.pl Thu Dec 8 08:54:18 2011 From: lukasz at langa.pl (Łukasz Langa) Date: Thu, 8 Dec 2011 08:54:18 +0100 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323320919.2710.24.camel@thinko> References: <1323320919.2710.24.camel@thinko> Message-ID: On 8 Dec 2011, at 06:08, Chris McDonough wrote: > It would make it possible to share code like this across py2 and py3: > > a = u'foo' > As Armin himself wrote, py3k-compatible code ported from 2.x is often very ugly. This kind of change would only deepen the problem. -1 > Or: > > from __future__ import unicode_literals > a = 'foo' > > I recognize that the last option is probably the way "its meant to be > done" Yes, that's the reason 2.x has b''. If Python 2.8 ever came to be, making this __future__ work with the standard library would be the right way to do it. -- Best regards, Łukasz Langa Senior Systems Architecture Engineer IT Infrastructure Department Grupa Allegro Sp. z o.o. Please consider the environment before printing out this e-mail. From stefan at bytereef.org Thu Dec 8 10:17:52 2011 From: stefan at bytereef.org (Stefan Krah) Date: Thu, 8 Dec 2011 10:17:52 +0100 Subject: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues In-Reply-To: <1504453.f4XqDVp2GQ@ned> References: <1504453.f4XqDVp2GQ@ned> Message-ID: <20111208091752.GA29901@sleipnir.bytereef.org> Victor Stinner wrote: > For localeconv(), it is the b'\xA0' byte string decoded from an encoding > looking like ISO-8859-?? (b'\xA0' is not decodable from UTF-8). It looks like > a bug in the decoder. It also looks like OpenIndiana doesn't use ISO-8859 > locale anymore, only UTF-8 locales (which is much better!). I'm unable to > reproduce the issue on my OpenIndiana VM. I think that b'\xA0' is a valid thousands separator. The 'fi_FI' locale also uses that. Decimal.__format__() has to handle the 'n' specifier, which takes the thousands separator directly from localeconv(). Currently I have this horrible function to deal with the problem:

/* Convert decimal_point or thousands_sep, which may be multibyte or in
   the range [128, 255], to a UTF8 string. */
static PyObject *
dotsep_as_utf8(const char *s)
{
    PyObject *utf8;
    PyObject *tmp;
    wchar_t buf[2];
    size_t n;

    n = mbstowcs(buf, s, 2);
    if (n != 1) { /* Issue #7442 */
        PyErr_SetString(PyExc_ValueError,
            "invalid decimal point or unsupported "
            "combination of LC_CTYPE and LC_NUMERIC");
        return NULL;
    }
    tmp = PyUnicode_FromWideChar(buf, n);
    if (tmp == NULL) {
        return NULL;
    }
    utf8 = PyUnicode_AsUTF8String(tmp);
    Py_DECREF(tmp);
    return utf8;
}

The main issue is that there is no portable function mbst_to_utf8() that uses the current locale. If possible, it would be great to have such a thing in the C-API. I'm not sure why the b'\xA0' problem only occurs in Solaris. Many systems have this thousands separator.
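For readers following along without the C-API in their head, the conversion that dotsep_as_utf8() performs can be approximated in pure Python. This is only a sketch under stated assumptions: separator_as_utf8() and its explicit encoding argument are made up for illustration (the C code asks the current LC_CTYPE locale through mbstowcs() instead), and nothing like it is part of the standard library.

```python
def separator_as_utf8(raw, encoding):
    """Re-encode a one-character locale separator as UTF-8.

    Hypothetical helper: 'encoding' stands in for the LC_CTYPE encoding
    that mbstowcs() would use implicitly in the C version.
    """
    # Decoding fails loudly if the bytes do not belong to 'encoding',
    # e.g. b'\xa0' is not valid UTF-8 on its own.
    text = raw.decode(encoding)
    # Mirror the "n != 1" check: exactly one character is expected.
    if len(text) != 1:
        raise ValueError("invalid decimal point or unsupported "
                         "combination of LC_CTYPE and LC_NUMERIC")
    return text.encode("utf-8")


# b'\xa0' is NO-BREAK SPACE in ISO-8859-2, i.e. U+00A0.
nbsp_utf8 = separator_as_utf8(b"\xa0", "iso-8859-2")
```

With a matching encoding the byte decodes to a single character; with a mismatched one (LC_NUMERIC says ISO-8859-2, LC_CTYPE says UTF-8) the decode step raises, which is roughly the situation issue #7442 describes.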
Stefan Krah From stefan at bytereef.org Thu Dec 8 10:42:31 2011 From: stefan at bytereef.org (Stefan Krah) Date: Thu, 8 Dec 2011 10:42:31 +0100 Subject: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues In-Reply-To: <20111208091752.GA29901@sleipnir.bytereef.org> References: <1504453.f4XqDVp2GQ@ned> <20111208091752.GA29901@sleipnir.bytereef.org> Message-ID: <20111208094231.GA30187@sleipnir.bytereef.org> Stefan Krah wrote: > I'm not sure why the b'\xA0' problem only occurs in Solaris. Many systems > have this thousands separator. Are LC_CTYPE and LC_NUMERIC set to the same value on the buildbot? Otherwise you encounter http://bugs.python.org/issue7442 . Stefan Krah From tjreedy at udel.edu Thu Dec 8 11:54:28 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 08 Dec 2011 05:54:28 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323325916.2710.39.camel@thinko> References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko> Message-ID: On 12/8/2011 1:31 AM, Chris McDonough wrote: > What's the case against? From a 3.x perspective, an irrelevant 'u' would be pure noise and make the language a bit harder to learn. The intent for 3.x is that one be able to learn 3.x without knowing anything about 2.x. So bridge stuff has been put into 2.6 and even more in 2.7. But it does not really belong in 3.x. -- Terry Jan Reedy From vinay_sajip at yahoo.co.uk Thu Dec 8 12:01:49 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Thu, 8 Dec 2011 11:01:49 +0000 (UTC) Subject: [Python-Dev] readd u'' literal support in 3.3? References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko> <1323330308.2710.52.camel@thinko> Message-ID: Chris McDonough writes: > > In that context, I don't see much relevance of having no support for u'' > in Python 3.2.
> Well, if 3.2 remains in use for a longish time, then it is relevant, in the broader context, isn't it? We know how conservative Linux distributions can be with their Python releases - although most are still releasing 2.x as their system Python, this could change at some point in the future. Even if it doesn't, there might be a fair user base of people stuck with 3.2 for any number of reasons, and to support them, the change you propose won't help, because some variant of a package will still have to use u() and b(), just for 3.2 support. I'm not arguing against your proposed change itself - just against your point about the relevance of 3.2. Regards, Vinay Sajip From stephan.richter at gmail.com Thu Dec 8 12:05:51 2011 From: stephan.richter at gmail.com (Stephan Richter) Date: Thu, 08 Dec 2011 06:05:51 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> Message-ID: <5242067.5aBSYdFaIB@einstein> On Thursday, December 08, 2011 01:18:06 AM Benjamin Peterson wrote: > > Right.. the title does say "readd ... support in 3.3". Are you > > suggesting "the ship has sailed" for eternity because it can't be > > supported in Python < 3.3? > > I'm questioning the real utility of it. The real utility is to make it possible to port libraries to Py3 or at least make it a lot easier. It is somewhat naive to think that you can just tell everyone to upgrade to Python 2.7 and then use the future import. Having to change all that code can also be a big bug magnet. Chris has been a great champion of bringing the Web app community closer to Python 3. His experience with porting code is pretty extensive especially in keeping it compatible with older Python 2 versions (down to 2.5). If the Python Devs want more adoption of Python 3, they should at least throw a bone from time to time and make adoption a bit easier. The arguments against this proposal seem academic and purist to me.
(Mmh, I cannot believe I just wrote that having been accused of that myself in the past.) Regards, Stephan -- Entrepreneur and Software Geek Google me. "Zope Stephan Richter" From anacrolix at gmail.com Thu Dec 8 12:08:17 2011 From: anacrolix at gmail.com (Matt Joiner) Date: Thu, 8 Dec 2011 22:08:17 +1100 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko> Message-ID: Nobody is using 3 yet ;) Sure, I use it for some personal projects, and other people pretend to support it. Not really. The worst of the pain in porting to Python 3000 has yet to even begin! On Thu, Dec 8, 2011 at 6:33 PM, Nick Coghlan wrote: > Such code still won't work on 3.2, hence restoring the redundant notation > would be ultimately pointless. > > -- > Nick Coghlan (via Gmail on Android, so likely to be more terse than usual) > > On Dec 8, 2011 4:34 PM, "Chris McDonough" wrote: >> >> On Thu, 2011-12-08 at 01:18 -0500, Benjamin Peterson wrote: >> > 2011/12/8 Chris McDonough : >> > > On Thu, 2011-12-08 at 01:02 -0500, Benjamin Peterson wrote: >> > >> 2011/12/8 Chris McDonough : >> > >> > On the heels of Armin's blog post about the troubles of making the >> > >> > same >> > >> > codebase run on both Python 2 and Python 3, I have a concrete >> > >> > suggestion. >> > >> > >> > >> > It would help a lot for code that straddles both Py2 and Py3 to be >> > >> > able >> > >> > to make use of u'' literals. >> > >> >> > >> Helpful or not helpful, I think that ship has sailed. The earliest it >> > >> could see the light of day is 3.3, which would leave people trying to >> > >> support 3.1 and 3.2 in a bind. >> > > >> > > Right.. the title does say "readd ... support in 3.3". ?Are you >> > > suggesting "the ship has sailed" for eternity because it can't be >> > > supported in Python < 3.3? >> > >> > I'm questioning the real utility of it. 
>> >> All I can really offer is my own experience here based on writing code >> that needs to straddle Python 2.5, 2.6, 2.7 and 3.2 without use of 2to3. >> Having u'' work across all of these would mean porting would not require >> as much eyeballing as code modified via "from future import >> unicode_literals", it would let more code work on 2.5 unchanged, and the >> resulting code would execute faster than code that required us to use a >> u() function. >> >> What's the case against? >> >> - C -- ?_? From lukasz at langa.pl Thu Dec 8 13:08:31 2011 From: lukasz at langa.pl (Łukasz Langa) Date: Thu, 8 Dec 2011 13:08:31 +0100 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <5242067.5aBSYdFaIB@einstein> References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <5242067.5aBSYdFaIB@einstein> Message-ID: <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> On 8 Dec 2011, at 12:05, Stephan Richter wrote: > It is somewhat naive to think that you can just tell > everyone to upgrade to Python 2.7 and then use the future import. Having to > change all that code can also be a big bug magnet. A big bug magnet is using a Python version that is not getting any fixes whatsoever. When I'm backporting stuff from Python 3, I'm targeting 2.6+ because it's still somewhat supported by us.
What's more important though is that there were tremendous changes in that release in terms of bridging the gap between Python 2 and 3. I'm wondering why developers go to so much trouble to support a Python version that's 5+ years old and was replaced by a newer one in virtually every operating system. Recent versions of Mac OS X, RedHat and Debian all sport Python 2.6+. It seems only GAE and Jython are stuck on Python 2.5. Python 2.6 has ABCs, supports b'' (and even has a "bytes" alias for the str type), forward compatibility __future__ imports (print_function, unicode_literals, division and absolute_import), "except Exception as e", etc. The thing we did miss was making sure the std lib doesn't break when unicode_literals are used. And that's a bummer. -- Best regards, Łukasz Langa Senior Systems Architecture Engineer IT Infrastructure Department Grupa Allegro Sp. z o.o. Please consider the environment before printing out this e-mail. From stephan.richter at gmail.com Thu Dec 8 13:14:09 2011 From: stephan.richter at gmail.com (Stephan Richter) Date: Thu, 08 Dec 2011 07:14:09 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> Message-ID: <3344831.JP9Cfj4Ety@einstein> On Thursday, December 08, 2011 01:08:31 PM Łukasz Langa wrote: > A big bug magnet is using a Python version that is not getting any fixes > whatsoever. When I'm backporting stuff from Python 3, I'm targeting 2.6+ > because it's still somewhat supported by us.
What's more important though > is that there were tremendous changes in that release in terms of bridging > the gap between Python 2 and 3. But you might not have that luxury and updating code to a new Python version is a lot of work. As you can see in my signature, I am very much involved in the Zope community. The entire Zope, Plone and Pyramid ecosystem is extremely large and one can simply not make blanket statements about Python version use. We try very hard to move our libraries up the version ladder but we must also take great care of backwards-compatibility. (We have seen already what happens if we do not with Zope 2 versus 3. And Python is struggling with similar issues, even though the changes were much less drastic.) Regards, Stephan -- Entrepreneur and Software Geek Google me. "Zope Stephan Richter" From barry at python.org Thu Dec 8 13:18:44 2011 From: barry at python.org (Barry Warsaw) Date: Thu, 8 Dec 2011 07:18:44 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323320919.2710.24.camel@thinko> References: <1323320919.2710.24.camel@thinko> Message-ID: <20111208071844.6fe1970c@limelight.wooz.org> On Dec 08, 2011, at 12:08 AM, Chris McDonough wrote: > from __future__ import unicode_literals > a = 'foo' I agree this is an annoying thing to have to change when supporting a dual-Python-version codebase, but it's not the most annoying. print-functions are a little more painful to switch because there's no easy Emacs conversion for them. ;) This one is actually pretty useful because it does make you go through and be very specific about which literals are bytes and which are unicodes. Also, re-adding u'' prefixes doesn't help you much because you might still have byte literals which you have to b'' prefix. Do you really want both 'foo' and u'foo' to be unicode literals?
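For reference, the spellings being weighed in this thread can be put side by side. What follows is a rough sketch, not anyone's actual code: the u() helper is a simplified stand-in for the one shipped by six, and PY3 and text_type are names chosen here for illustration.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str

    def u(s):
        # Python 3: a plain literal is already text, so this is a no-op.
        return s
else:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)

    def u(s):
        # Python 2: turn a native byte-string literal into unicode.
        return s.decode("unicode_escape")

# Spelling 1: helper call. Parses on 2.5+ and 3.x, but costs a function
# call every time the expression is evaluated.
a = u('foo')

# Spelling 2 (not executed here): put
#     from __future__ import unicode_literals
# at the top of the module and write a = 'foo'. Needs 2.6+.

# Either way, byte strings still have to be marked explicitly (2.6+ syntax).
raw = b'foo'
```

The proposal under discussion would add a third spelling, a = u'foo', which is a syntax error on 3.0 through 3.2.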
-1 Cheers, -Barry From barry at python.org Thu Dec 8 13:27:20 2011 From: barry at python.org (Barry Warsaw) Date: Thu, 8 Dec 2011 07:27:20 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko> <1323330308.2710.52.camel@thinko> Message-ID: <20111208072720.0d243557@limelight.wooz.org> On Dec 08, 2011, at 11:01 AM, Vinay Sajip wrote: >Well, if 3.2 remains in use for a longish time, then it is relevant, in the >broader context, isn't it? We know how conservative Linux distributions can >be with their Python releases - although most are still releasing 2.x as >their system Python, this could change at some point in the future. Even if >it doesn't, there might be a fair user base of people stuck with 3.2 for any >number of reasons, and to support them, the change you propose won't help, >because some variant of a package will still have to use u() and b(), just >for 3.2 support. Case in point: Ubuntu 12.04 is a long term support release, meaning 5 years of official support on both the desktop and server. It will ship with Python 2.7 and 3.2 only. -Barry From ncoghlan at gmail.com Thu Dec 8 13:32:43 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 8 Dec 2011 22:32:43 +1000 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <3344831.JP9Cfj4Ety@einstein> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein> Message-ID: If people decide to delay their Py3k migrations until they can drop 2.5 support, they're quite free to do so. The only reason for porting right now is to support 3.2, thus making a future reintroduction of u'' useless. Those that delay their ports can use the forward compatibility in 2.6. 
Having just purged so much cruft from the language, pleas to add some back permanently for a problem that is going to fade from significance within the next couple of years are unlikely to get very far. -- Nick Coghlan (via Gmail on Android, so likely to be more terse than usual) From victor.stinner at haypocalc.com Thu Dec 8 13:24:51 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 08 Dec 2011 13:24:51 +0100 Subject: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues In-Reply-To: <20111208091752.GA29901@sleipnir.bytereef.org> References: <1504453.f4XqDVp2GQ@ned> <20111208091752.GA29901@sleipnir.bytereef.org> Message-ID: <4EE0AC93.5030706@haypocalc.com> On 08/12/2011 10:17, Stefan Krah wrote: > I think that b'\xA0' is a valid thousands separator. I agree, but it's not the point: the problem is that b'\xA0' is decoded to a strange U+30000020 character by mbstowcs(). > Currently I have this horrible function to deal with the problem:
>
> ...
> n = mbstowcs(buf, s, 2);
> ...
> tmp = PyUnicode_FromWideChar(buf, n);
> if (tmp == NULL) {
>     return NULL;
> }
> utf8 = PyUnicode_AsUTF8String(tmp);
> Py_DECREF(tmp);
> return utf8;

It would not help with this specific issue: b'\xA0' is not decodable from UTF-8. > I'm not sure why the b'\xA0' problem only occurs in Solaris. Many systems > have this thousands separator. The problem is not directly in the C localeconv() function, but in mbstowcs() with the hu_HU locale. You can try my test program for this issue: http://bugs.python.org/file23876/localeconv_wchar.c My test is maybe not correct, because it only sets LC_ALL, which is a little bit different than Python tests (see below).
-- I don't remember on which buildbot the issue occurred :-( - "sparc solaris10 gcc 3.x" has "LANG=C" and "TZ=Europe/Berlin" environment variables - "x86 OpenIndiana 3.x" and "AMD64 OpenIndiana 3.x" have "TZ=Europe/London" and no locale variable!? The issue occurred for example in test_lc_numeric_basic() of test__locale which sets LC_NUMERIC and LC_CTYPE locales (but not LC_ALL). LC_ALL and LC_NUMERIC are different in this test, but LC_NUMERIC and LC_CTYPE are the same. -- Stefan: would you accept that locale.localeconv() and locale.strxfrm() stop working (instead of returning invalid data) on Solaris in certain cases (it looks like the issue depends on the locale and the OS version)? It can be a motivation to fix the root of the issue ;-) Victor From stefan at bytereef.org Thu Dec 8 14:42:11 2011 From: stefan at bytereef.org (Stefan Krah) Date: Thu, 8 Dec 2011 14:42:11 +0100 Subject: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues In-Reply-To: <4EE0AC93.5030706@haypocalc.com> References: <1504453.f4XqDVp2GQ@ned> <20111208091752.GA29901@sleipnir.bytereef.org> <4EE0AC93.5030706@haypocalc.com> Message-ID: <20111208134211.GA31211@sleipnir.bytereef.org> Victor Stinner wrote: > The problem is not directly in the C localeconv() function, but in > mbstowcs() with the hu_HU locale. Ah, I see. > You can try my test program for this issue: > http://bugs.python.org/file23876/localeconv_wchar.c Can't test on OpenSolaris, since Oracle removed the package repo and I need the ISO locales. > Stefan: would you accept that locale.localeconv() and locale.strxfrm() > stop working (instead of returning invalid data) on Solaris in certain > cases (it looks like the issue depends on the locale and the OS > version)? It can be a motivation to fix the root of the issue ;-) Yes, if the cause is a broken mbstowcs() that sounds good.
Stefan Krah From vinay_sajip at yahoo.co.uk Thu Dec 8 16:27:57 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Thu, 8 Dec 2011 15:27:57 +0000 (UTC) Subject: [Python-Dev] readd u'' literal support in 3.3? References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko> Message-ID: Matt Joiner gmail.com> writes: > > Nobody is using 3 yet ;) > > Sure, I use it for some personal projects, and other people pretend to > support it. Not really. > > The worst of the pain in porting to Python 3000 has yet to even begin! > The classic chicken-and-egg problem, right? Someone's got to make a start. If you aim for porting with a single codebase and are not too hung up about "practicality beats purity" hacks like e = sys.exc_info()[1], then I think decent progress can be made with little risk, as long as the project has good test coverage (and if it doesn't ... well, that's risky even if you stay on 2.x ...). Django porting took a week of elapsed time (i.e. < 1 person-week of effort) to go from thousands of test failures under 3.x and sqlite to zero test failures. Django is a pretty big project, so I can't imagine "ordinary mortal" projects are going to be too bad (as long as not implemented pathologically). Of course, the Django port has some way to go, but still ... pip and virtualenv are relatively mature single code base ports, too. As additional examples - I've done Babel, Whoosh, Elixir, WTForms and others the same way. Of course, I understand that YMMV. Regards, Vinay Sajip From jannis at leidel.info Thu Dec 8 16:53:22 2011 From: jannis at leidel.info (Jannis Leidel) Date: Thu, 8 Dec 2011 16:53:22 +0100 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko>

Message-ID: On 08.12.2011, at 16:27, Vinay Sajip wrote: > Matt Joiner gmail.com> writes: > >> >> Nobody is using 3 yet ;) >> >> Sure, I use it for some personal projects, and other people pretend to >> support it. Not really. >> >> The worst of the pain in porting to Python 3000 has yet to even begin! >> > > The classic chicken-and-egg problem, right? Someone's got to make a start. If > you aim for porting with a single codebase and are not too hung up about > "practicality beats purity" hacks like e = sys.exc_info()[1], then I think > decent progress can be made with little risk, as long as the project has good > test coverage (and if it doesn't ... well, that's risky even if you stay on 2.x > ...). > > Django porting took a week of elapsed time (i.e. < 1 person-week of effort) to > go from thousands of test failures under 3.x and sqlite to zero test failures. > Django is a pretty big project, so I can't imagine "ordinary mortal" projects > are going to be too bad (as long as not implemented pathologically). Of course, > the Django port has some way to go, but still ... pip and virtualenv are > relatively mature single code base ports, too. As additional examples - I've > done Babel, Whoosh, Elixir, WTForms and others the same way. I don't want to rain on your parade, but even if your port of Django passes all tests, it's not at all near completion. As a framework we not only have to worry about the ability to run on Python 3.X but also how to teach our community to upgrade their projects (if possible at all). That means to reduce the number of hacks needed and thoroughly reviewing to not suddenly lead into a maintenance dead end. E.g. I'm still not sure the one codebase strategy is better than the 2to3 strategy. Also, stating that pip and virtualenv were easy to port like other projects seems to me like only half of the story -- Carl and me had to fix a non trivial part of your port before being able to do the Py3k release. 
I don't mean to diminish your work, it *is* appreciated, but I'm rather careful with generalizations when it comes to changes of a platform on such epic scale. Best, Jannis From vinay_sajip at yahoo.co.uk Thu Dec 8 17:46:31 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Thu, 8 Dec 2011 16:46:31 +0000 (UTC) Subject: [Python-Dev] readd u'' literal support in 3.3? References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko>

Message-ID: Jannis Leidel leidel.info> writes: > I don't want to rain on your parade, Not at all - feel free. I don't feel rained on in the least :-) > but even if your port of Django passes all tests, it's not at all near > completion. As a framework we not only have to worry about the ability to run > on Python 3.X but also how to teach our community to upgrade their projects > (if possible at all). That means to reduce the number of hacks needed and > thoroughly reviewing to not suddenly lead into a maintenance dead end. > E.g. I'm still not sure the one codebase strategy is better than the 2to3 > strategy. Of course, and I did say in the post you're replying to that I know that the Django port has some way to go. But even if you decide that the single code base port is not something you want for Django, nevertheless, I think I've shown that the single port strategy can work for a large project like Django from a purely technical perspective such as passing a very large test suite. Of course, there are many non-technical issues such as documentation, ease of ongoing maintenance etc. which no doubt you will be reviewing in due course. (In the above, I'm using "technical" in a very narrow sense, obviously.) > Also, stating that pip and virtualenv were easy to port like other projects > seems to me like only half of the story -- Carl and me had to fix a > non-trivial part of your port before being able to do the Py3k release. Sure, and I didn't mean to imply that I did all the work - but I did announce it only after I got almost all, if not all, tests passing on 2.x and 3.x from a single code base - just as I did with Django. If the tests didn't cover everything, then more work would certainly have been required, but it's still a respectable milestone to have achieved, IMO. 
But it's the single code base strategy that I wanted to highlight - and AFAIK you haven't had to back-pedal on that (or at least, if you did, it might have been nice to drop me a line to that effect). > I don't mean to diminish your work, it *is* appreciated, but I'm rather > careful with generalizations when it comes to changes of a platform on > such epic scale. I hope I'm not being careless where you're being careful, but where does caution start and timidity begin? You might remember that you brought up the desirability of the Python 3 port on django-developers in September, which got me thinking about it. My view of it is, if everyone thinks of it like eating an elephant, no one is even going to take the first bite, for fear of indigestion. Don't get me wrong - I understand about priorities and commitments, and everyone scratching their own itch. So, I scratched mine, and bet on the hunch that the elephant was only a chocolate elephant, and not a real one. Time will of course tell ;-) Regards, Vinay Sajip From martin at v.loewis.de Thu Dec 8 18:26:59 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 08 Dec 2011 18:26:59 +0100 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323320919.2710.24.camel@thinko> References: <1323320919.2710.24.camel@thinko> Message-ID: <4EE0F363.4060208@v.loewis.de> > It would make it possible to share code like this across py2 and py3: > > a = u'foo' > > Instead of (with e.g. six): > > a = u('foo') > > Or: > > from __future__ import unicode_literals > a = 'foo' > > I recognize that the last option is probably the way "its meant to be > done", but in reality it's just more practical to not fail when literal > notation is more specific than strictly necessary. You are giving these two options already: - The former works for all Python versions. 
Although it may appear tedious to convert existing code to replace all Unicode literals with function calls, it would actually be possible/easy to write an automatic converter that does so for a complete code base, based on lib2to3. - the second version is truly practical for all applications/libraries that only support 2.6+. In addition, there also is another option: - use 2to3, in some form So you have already three solutions which are all transitional in some sense, and you want yet another option? I fail to see why this option is more practical than the options that are already there. Regards, Martin From shane at hathawaymix.org Thu Dec 8 19:21:40 2011 From: shane at hathawaymix.org (Shane Hathaway) Date: Thu, 08 Dec 2011 11:21:40 -0700 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323325916.2710.39.camel@thinko> References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko> Message-ID: <4EE10034.2070809@hathawaymix.org> On 12/07/2011 11:31 PM, Chris McDonough wrote: > All I can really offer is my own experience here based on writing code > that needs to straddle Python 2.5, 2.6, 2.7 and 3.2 without use of 2to3. > Having u'' work across all of these would mean porting would not require > as much eyeballing as code modified via "from future import > unicode_literals", it would let more code work on 2.5 unchanged, and the > resulting code would execute faster than code that required us to use a > u() function. Could you elaborate on why "from __future__ import unicode_literals" is inadequate (other than the Python 2.6 requirement)? Shane From tseaver at palladion.com Thu Dec 8 20:03:15 2011 From: tseaver at palladion.com (Tres Seaver) Date: Thu, 08 Dec 2011 14:03:15 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? 
In-Reply-To: <4EE0F363.4060208@v.loewis.de> References: <1323320919.2710.24.camel@thinko> <4EE0F363.4060208@v.loewis.de> Message-ID: On 12/08/2011 12:26 PM, "Martin v. Löwis" wrote: >> It would make it possible to share code like this across py2 and >> py3: >> >> a = u'foo' >> >> Instead of (with e.g. six): >> >> a = u('foo') >> >> Or: >> >> from __future__ import unicode_literals a = 'foo' >> >> I recognize that the last option is probably the way "its meant to >> be done", but in reality it's just more practical to not fail when >> literal notation is more specific than strictly necessary. > > You are giving these two options already: - The former works for all > Python versions. Although it may appear tedious to convert existing > code to replace all Unicode literals with function calls, it would > actually be possible/easy to write an automatic converter that does so > for a complete code base, based on lib2to3. I guess this could be done to generate "straddling" code from 2-only code. Note that the overhead of the function call is likely significant in some cases: generating a module scope constant is the only sane replacement there, which might be harder to do in a fixer (I haven't tried to write one yet). > - the second version is truly practical for all > applications/libraries that only support 2.6+. Right. The question is whether running more P2 code unmodified in P3 would be a "Good Thing" from the perspective of P3 uptake: developers who run up against such issues tend to hit "camelback-meet-straw" points and bounce off the effort. Such a tiny change (a six line patch and an extra '.. note::' in the language reference section on string literal syntax) might be worth it to avoid that risk. > In addition, there also is another option: - use 2to3, in some form 2to3 is not practical in a "straddling" case: - The script is too slow to use in development mode (like being back in "compile the world" Java / C++ land).
- The transformed code generates tracebacks that don't match the source. > So you have already three solutions which are all transitional in > some sense, and you want yet another option? I fail to see why this > option is more practical than the options that are already there. The "redundant" u'*' spelling would be present in Python3 for the same reason that the equally-redundant b'*' spelling is present in Python 2.6+: it makes writing portable code simpler. Tres. -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com From glyph at twistedmatrix.com Thu Dec 8 21:32:20 2011 From: glyph at twistedmatrix.com (Glyph) Date: Thu, 8 Dec 2011 15:32:20 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein> Message-ID: On Dec 8, 2011, at 7:32 AM, Nick Coghlan wrote: > Having just purged so much cruft from the language, pleas to add some back permanently for a problem that is going to fade from significance within the next couple of years are unlikely to get very far. > This problem is never going to go away. This is not a comment on the success of py3, but rather the persistence of old versions of things. Even assuming an awesomely optimistic schedule for py3k migrations, even assuming that *everything* on PyPI supports Py3 by the end of 2013, consider that all around the world, every day, new code is still being written in FORTRAN.
Much of it is being written in FORTRAN 77, despite the fact that Fortran 90 is now over 20 years old. Efforts still crop up periodically (some successful, some failed) to migrate these "legacy" projects to other languages, some of them as modern as C. There are plenty of proprietary Python 2 systems which exist today for which there will not be a budget for a Python 3 migration this decade. If history is an accurate guide, people will still be hired to work on python 2.x systems in the year 2100. Some of them will be being hired to migrate that python 2.x code to python 3 (or 4, or 5, whatever we have by then). If they're not, it will be because they're being hired to try to migrate it to Javascript instead, not because the Python 3 migration is "done" by then. -glyph From martin at v.loewis.de Thu Dec 8 22:27:06 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 08 Dec 2011 22:27:06 +0100 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

Message-ID: <4EE12BAA.1050601@v.loewis.de> > This is not a comment on the success of py3, but rather the persistence > of old versions of things. Even assuming an awesomely optimistic > schedule for py3k migrations, even assuming that *everything* on PyPI > supports Py3 by the end of 2013, consider that all around the world, > every day, new code is still being written in FORTRAN. While this is true for FORTRAN, it is not for Python 1.5: no new Python 1.5 code is written around the world, at least not every day. Also for FORTRAN, new code that is written every day likely isn't FORTRAN 66, but more likely FORTRAN 90 or newer. The reason for that is that FORTRAN just isn't an obsolete language, by any means, else people wouldn't bother producing new versions of it, porting compilers to new processors, and so on. Contrast this to Python 1, and soon Python 2, which actually *is* obsolete (just as FORTRAN 66 *is* obsolete). > Much of it is being in FORTRAN 77 Can you prove this? I trust that existing code is being maintained in FORTRAN 77. For new code, I'm skeptical. > There are plenty of proprietary Python 2 systems which exist today for > which there will not be a budget for a Python 3 migration this decade. And people using it can happily continue to use Python 2. If they don't have a need to port their code to Python 3, they are not concerned by whether you use a u prefix for strings in Python 3 or not. Regards, Martin From robert.kern at gmail.com Thu Dec 8 22:41:09 2011 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 08 Dec 2011 21:41:09 +0000 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <4EE12BAA.1050601@v.loewis.de> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> Message-ID: On 12/8/11 9:27 PM, "Martin v. Löwis" wrote: [Glyph writes:] >> Much of it is being in FORTRAN 77 > > Can you prove this? I trust that existing code is being maintained > in FORTRAN 77. For new code, I'm skeptical. Personally, I've written more new code in FORTRAN 77 than in Fortran 90+. Even with all of the quirks in FORTRAN 77 compilers, it's still substantially easier to connect FORTRAN 77 code to C and Python than 90+. When they introduced some of the nicer language features, they left the precise details of memory structures of the new types undefined, so compilers chose different ways to implement them. Some of the very latest developments in modern Fortran have begun to standardize the FFI for these features (or at least let you write a standardized shim for them) and compilers are catching up. For people writing new whole programs in Fortran, yes, they are probably mostly using 90+. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From janssen at parc.com Thu Dec 8 23:09:59 2011 From: janssen at parc.com (Bill Janssen) Date: Thu, 8 Dec 2011 14:09:59 PST Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <4EE12BAA.1050601@v.loewis.de> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> Message-ID: <51106.1323382199@parc.com> =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: > While this is true for FORTRAN, it is not for Python 1.5: no new > Python 1.5 code is written around the world, at least not every day. I don't know about that. I've seen a lot of Python 2 code which was apparently written by folks who learned Python 1.5.2 and never needed to learn about newer features. I suspect that's still going on fairly widely. Bill From solipsis at pitrou.net Fri Dec 9 01:35:35 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 9 Dec 2011 01:35:35 +0100 Subject: [Python-Dev] cpython: Document PyUnicode_Copy() and PyUnicode_EncodeCodePage() References: Message-ID: <20111209013535.6fb38068@pitrou.net> On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinner wrote: > > +.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode) > + > + Get a new copy of a Unicode object. > + > + .. versionadded:: 3.3 I'm not sure I understand. Why would you make a copy of an immutable object? From tjreedy at udel.edu Fri Dec 9 01:44:32 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 08 Dec 2011 19:44:32 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko>

Message-ID: On 12/8/2011 10:53 AM, Jannis Leidel wrote: > possible at all). That means to reduce the number of hacks needed and > thoroughly reviewing to not suddenly lead into a maintenance dead > end. E.g. I'm still not sure the one codebase strategy is better than > the 2to3 strategy. One codebase with version compatibility hacks and no use of 2to3 is one pure strategy. Two codebases with no compatibility hacks (at least for 2 versus 3) and use of 2to3 to bridge all differences is another. Perhaps we need something in between, with a mix of compatibility hacks and automatic 2to3 conversions that has not been discovered yet, or that can be customized on a project by project basis. Deleting 'u' prefixes from string literals is something that is easy to do with 2to3 for anyone who cannot use the future import because of supporting 2.5. More than one person has said that *any* use of 2to3 is impractical for rapid-turnaround development because 2to3 is 'too slow'. If so, have the usual methods for speeding up a Python program been applied? Has anyone profiled 2to3? Is most of the time spent in 2to3 itself or some particular module that it uses? Is the time that is spent in 2to3 itself a result of the overall framework or particular fixers? If the latter, can slow fixers be eliminated by using a compatibility hack in the Python 2 code? Has anyone tried to compile 2to3 and prerequisite Python-coded modules? -- Terry Jan Reedy From glyph at twistedmatrix.com Fri Dec 9 01:52:28 2011 From: glyph at twistedmatrix.com (Glyph) Date: Thu, 8 Dec 2011 19:52:28 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <4EE12BAA.1050601@v.loewis.de> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> Message-ID: <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> Zooming back in to the actual issue this thread is about, I think the u""-vs-"" issue is a bit of a red herring, because the _real_ problem here is that 2to3 is slow and buggy and so migration efforts are starting to work around it, and therefore want to run the same code on 3.x and all the way back to 2.5. In my opinion, effort should be spent on optimizing the suggested migration tools and getting them to work properly, not twiddling the syntax so that it's marginally easier to avoid them. On Dec 8, 2011, at 4:27 PM, Martin v. Löwis wrote: >> This is not a comment on the success of py3, but rather the persistence >> of old versions of things. Even assuming an awesomely optimistic >> schedule for py3k migrations, even assuming that *everything* on PyPI >> supports Py3 by the end of 2013, consider that all around the world, >> every day, new code is still being written in FORTRAN. > > While this is true for FORTRAN, it is not for Python 1.5: no new > Python 1.5 code is written around the world, at least not every day. > Also for FORTRAN, new code that is written every day likely isn't > FORTRAN 66, but more likely FORTRAN 90 or newer. That's because Python 1.5 was upward-compatible with 2.x, and pretty much everyone could gently migrate, and start developing on the new versions even while supporting the old ones. That is obviously not true of 3.x, by design; 2to3 requires that you still develop on the old version even if you support a new one, not to mention the substantially increased effort of migration. > The reason for that is that FORTRAN just isn't an obsolete language, > by any means, else people wouldn't bother producing new versions of > it, porting compilers to new processors, and so on. Contrast this to > Python 1, and soon Python 2, which actually *is* obsolete (just as > FORTRAN 66 *is* obsolete).
Much as the Python core team might wish Python 2 would "soon" be obsolete, all of these things are happening for python 2.x now and all indications are that they will continue to happen. PyPy, Jython, ShedSkin, Skulpt, IronPython, and possibly a few others are (to varying degrees) all targeting 2.x right now, because that's where the application code they want to run is. PyPy is even porting the JIT compiler to a new processor (ARM). F66 is indeed obsolete, but it became obsolete because people stopped using it, not because the standards committee declared it so. >> Much of it is being in FORTRAN 77 > > Can you prove this? I trust that existing code is being maintained > in FORTRAN 77. For new code, I'm skeptical. I am not deeply immersed in the world where F77 is still popular, so I don't have any citations for you, but casual conversations with people working in the sciences, especially chemistry and materials science, suggest to me that a lot of people still use F77 and start new projects in it. (I can see someone with more direct experience promptly replied in this thread already, anyway.) >> There are plenty of proprietary Python 2 systems which exist today for >> which there will not be a budget for a Python 3 migration this decade. > > And people using it can happily continue to use Python 2. If they > don't have a need to port their code to Python 3, they are not concerned > by whether you use a u prefix for strings in Python 3 or not. I didn't say they didn't have a need ever, I said they didn't have a budget now. What you are saying to those users here is basically: "if you can't migrate today, then just don't bother, we're never going to make it any easier". Despite the fact that I ultimately agree on u'' (nobody should care about this), it is not a good message to send. -glyph
From solipsis at pitrou.net Fri Dec 9 01:56:00 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 9 Dec 2011 01:56:00 +0100 Subject: [Python-Dev] readd u'' literal support in 3.3? References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> Message-ID: <20111209015600.4cbc5cf1@pitrou.net> On Thu, 8 Dec 2011 19:52:28 -0500 Glyph wrote: > Zooming back in to the actual issue this thread is about, I think the u""-vs-"" issue is a bit of a red herring, because the _real_ problem here is that 2to3 is slow and buggy and so migration efforts are starting to work around it, and therefore want to run the same code on 3.x and all the way back to 2.5. > > In my opinion, effort should be spent on optimizing the suggested migration tools and getting them to work properly, not twiddling the syntax so that it's marginally easier to avoid them. Instead of modifying 2.x code and running 2to3 time after time on it, you can use 2to3 on unmodified 2.x code and fix the generated 3.x code. With proper use of branches and a DVCS, merging later 2.x changes should be mostly painless. (at least it works on https://bitbucket.org/pitrou/t3k/) Regards Antoine. From vinay_sajip at yahoo.co.uk Fri Dec 9 02:39:39 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Fri, 9 Dec 2011 01:39:39 +0000 (UTC) Subject: [Python-Dev] readd u'' literal support in 3.3? References: <1323320919.2710.24.camel@thinko> <1323324644.2710.28.camel@thinko> <1323325916.2710.39.camel@thinko>

Message-ID: Terry Reedy udel.edu> writes: > More that one person has said that *any* use of 2to3 is impractical for > rapid-turnaround development because 2to3 is 'too slow'. If so, have the > usual methods for speeding up a Python program been applied? Has anyone > profiled 2to3? Is most of the time spent in 2to3 itself or some > particular module that it uses? Is the time that is spend in 2to3 itself > a result of the overall framework or particular fixers? If the latter, > can slow fixers be eliminated by using a compatibility hack in the > Python 2 code? Has anyone tried to compile 2to3 and prerequisite > Python-coded modules? > It's not the speed of 2to3 per se; this seems very reasonable for a tool of its type. It's the overall process, which currently involves running 2to3 on an entire codebase (for example, using setup.py with flags to run 2to3 during setup). With a large project like Django, and hundreds or thousands of source files, 2to3 used in this way is on a hiding to nothing; no amount of profiling and tweaking is likely to lead to acceptable turnaround. However, 2to3 tools could be developed which are based on 2to3/lib2to3 and are *incremental* in nature; then as you edit and save a file, its processed version could be available very shortly afterwards (since we only need to translate the file that was saved) - this would be even quicker in an IDE where the 2to3 code (and perhaps the AST of files being worked on) could remain loaded in memory over an entire development session. That, along with some more/smarter fixers, could go some way to addressing the "too slow" issue. Regards, Vinay Sajip From tjreedy at udel.edu Fri Dec 9 03:01:30 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 08 Dec 2011 21:01:30 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? 
In-Reply-To: <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> Message-ID: On 12/8/2011 7:52 PM, Glyph wrote: > Zooming back in to the actual issue this thread is about, I think the > u""-vs-"" issue is a bit of a red herring, because the _real_ problem > here is that 2to3 is slow and buggy and so migration efforts are > starting to work around it, and therefore want to run the same code on > 3.x and all the way back to 2.5. I would expect that running one codebase would push one to only run on 2.6+, which would make one codebase easier, but it does not seem to. > In my opinion, effort should be spent on optimizing the suggested > migration tools and getting them to work properly, not twiddling the > syntax so that it's marginally easier to avoid them. This is what I tried to say in my last post. ... > I didn't say they didn't have a /need ever/, I said they didn't have a > /budget now/. What you are saying to those users here is basically: "if > you can't migrate today, then just don't bother, we're never going to > make it any easier". Despite the fact that I ultimately agree on u'' > (nobody should care about this), it is not a good message to send. I agree that would not be a good message, but a) I do not think that was the intent (I think it was more like "the *current* state of porting tools is a moot point for those not now porting") and b) good messages go both ways. People say "Python 2 is where the money is, it has (almost?) all the production apps, etcetera." Probably (mostly?) true. So where is the support from the vast army of 2.7 users for continuing to polish 2.7 past the normal 2 years (which ended last June)? Or for improving the migration tools? -- Terry Jan Reedy From regebro at gmail.com Fri Dec 9 03:50:16 2011 From: regebro at gmail.com (Lennart Regebro) Date: Fri, 9 Dec 2011 03:50:16 +0100 Subject: [Python-Dev] readd u'' literal support in 3.3?
In-Reply-To: References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> Message-ID: "from __future__ import unicode_literals" is my fault. I'm sorry. It's pretty useless. It was suggested by somebody and I then supported adding it, instead of allowing u'' which I suggested. But it doesn't work. One reason is that you need to be able to say "This should be str in Python 2, and binary in Python 3, that should be Unicode in Python 2 and str in Python 3, and that over there should be str in both versions", and the future import doesn't support that. Adding u'' support solves the problem, but then again, so does having b() and u() functions. I'm not sure of the utility of adding functionality to Python 3 that can be solved with six. //Lennart From guido at python.org Fri Dec 9 03:53:55 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 8 Dec 2011 18:53:55 -0800 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com>

Message-ID: Are you saying that with that future import, b"..." is still a Unicode literal? On Thu, Dec 8, 2011 at 6:50 PM, Lennart Regebro wrote: > "from future import unicode_literals" is my fault. I'm sorry. It's > pretty useless. It was suggested by somebody and I then supported it's > adding, instead of allowing u'' which I suggested. But it doesn't > work. > > One reason is that you need to be able to say "This should be str in > Python 2, and binary in Python 3, that should be Unicode in Python 2 > and str in Python 3, and that over there should be str in both > versions", and the future import doesn't support that. > > Adding u'' support solves the problem, but then again, so does having > a b() and an u() method. I'm not sure of the utility of adding > functionality to Python 3 that can be solved with six. > > //Lennart -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Fri Dec 9 04:11:10 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 9 Dec 2011 13:11:10 +1000 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> Message-ID: On Fri, Dec 9, 2011 at 12:01 PM, Terry Reedy wrote: > On 12/8/2011 7:52 PM, Glyph wrote: >> >> Zooming back in to the actual issue this thread is about, I think the >> u""-vs-"" issue is a bit of a red herring, because the _real_ problem >> here is that 2to3 is slow and buggy and so migration efforts are >> starting to work around it, and therefore want to run the same code on >> 3.x and all the way back to 2.5. > > > I would expect that running one codebase would push one to only run on 2.6+, > which would make one codebase easier, but it does not seem to. Actually, most of the feedback I've heard is that using one codebase is comparatively straightforward if you can drop support for 2.5 and earlier. Mainly because of this:
>>> from __future__ import unicode_literals
>>> from __future__ import print_function
>>> print
<built-in function print>
>>> print(type(''))
<type 'unicode'>
>>> print(type(b''))
<type 'str'>
That's why I'm quite happy to say to people that if they currently have to support 2.5 or earlier, and they're not prepared to fork their codebase or drop support for those earlier Python versions in new releases, then it's *perfectly fine* for them to delay their 3.x support until they *can* use the compatibility tools we provide to make "single source" approaches easier. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at python.org Fri Dec 9 04:34:08 2011 From: barry at python.org (Barry Warsaw) Date: Thu, 8 Dec 2011 22:34:08 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com>

Message-ID: <20111208223408.0e2e8bd1@limelight.wooz.org> On Dec 09, 2011, at 03:50 AM, Lennart Regebro wrote: >One reason is that you need to be able to say "This should be str in >Python 2, and binary in Python 3, that should be Unicode in Python 2 >and str in Python 3, and that over there should be str in both >versions", and the future import doesn't support that. Sorry, I don't understand this. What does it mean to be "str in both versions"? And why would you want that? As for "str in Python 2 and binary in Python 3", b'' prefixes do that in Python >= 2.6 without the future import (if I take "binary" to mean bytes type). As for "Unicode in Python 2 and str in Python 3", unadorned strings with the future import in Python >= 2.6 does that just fine. One of the nice things too is that with #include in Python >= 2.6, changing all your PyStrings to PyBytes, you can get the same behavior in your extension modules. You still need to be clear about what are bytes and what are strings. The problem comes when you aren't or can't be sure, i.e. you have objects that are sometimes one and sometimes the other. Such as email headers. In that case, you're kind of screwed. Python 2's str type let you cheat, but not without consequences. Those consequences are spelled "UnicodeErrors" and I'll be glad to be rid of them. Cheers, -Barry From barry at python.org Fri Dec 9 04:38:16 2011 From: barry at python.org (Barry Warsaw) Date: Thu, 8 Dec 2011 22:38:16 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com>

Message-ID: <20111208223816.2329a110@limelight.wooz.org> On Dec 08, 2011, at 06:53 PM, Guido van Rossum wrote: >Are you saying that with that future import, b"..." is still a Unicode >literal? No, the future import has no impact on b-strings.
-----snip snip-----
from __future__ import print_function
import sys
print(sys.version_info.major, sys.version_info.minor, type(b''))
-----snip snip-----
$ python /tmp/foo.py
2 7 <type 'str'>
$ python3 /tmp/foo.py
3 2 <class 'bytes'>
-----snip snip-----
from __future__ import print_function, unicode_literals
import sys
print(sys.version_info.major, sys.version_info.minor, type(b''))
-----snip snip-----
$ python /tmp/foo.py
2 7 <type 'str'>
$ python3 /tmp/foo.py
3 2 <class 'bytes'>
Cheers, -Barry From chrism at plope.com Fri Dec 9 05:24:33 2011 From: chrism at plope.com (Chris McDonough) Date: Thu, 08 Dec 2011 23:24:33 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <20111208223408.0e2e8bd1@limelight.wooz.org> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com>

<20111208223408.0e2e8bd1@limelight.wooz.org> Message-ID: <1323404673.2710.132.camel@thinko> On Thu, 2011-12-08 at 22:34 -0500, Barry Warsaw wrote: > On Dec 09, 2011, at 03:50 AM, Lennart Regebro wrote: > > >One reason is that you need to be able to say "This should be str in > >Python 2, and binary in Python 3, that should be Unicode in Python 2 > >and str in Python 3, and that over there should be str in both > >versions", and the future import doesn't support that. > > Sorry, I don't understand this. What does it mean to be "str in both > versions"? And why would you want that? > > As for "str in Python 2 and binary in Python 3", b'' prefixes do that in > Python >= 2.6 without the future import (if I take "binary" to mean bytes > type). > > As for "Unicode in Python 2 and str in Python 3", unadorned strings with the > future import in Python >= 2.6 does that just fine. > > One of the nice things too is that with #include in Python >= > 2.6, changing all your PyStrings to PyBytes, you can get the same behavior in > your extension modules. > > You still need to be clear about what are bytes and what are strings. The > problem comes when you aren't or can't be sure, i.e. you have objects that are > sometimes one and sometimes the other. Such as email headers. In that case, > you're kind of screwed. Python 2's str type let you cheat, but not without > consequences. Those consequences are spelled "UnicodeErrors" and I'll be glad > to be rid of them. The PEP 3333 WSGI protocol *requires* that you present its APIs with "native strings" (str on Python 3, str on Python 2). So while the oversimplification "don't do that" sounds great here, in real life, not so much. - C From chrism at plope.com Fri Dec 9 05:33:24 2011 From: chrism at plope.com (Chris McDonough) Date: Thu, 08 Dec 2011 23:33:24 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? 
In-Reply-To: References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com>

Message-ID: <1323405204.2710.139.camel@thinko> On Fri, 2011-12-09 at 03:50 +0100, Lennart Regebro wrote: > "from future import unicode_literals" is my fault. I'm sorry. It's > pretty useless. It was suggested by somebody and I then supported it's > adding, instead of allowing u'' which I suggested. But it doesn't > work. > > One reason is that you need to be able to say "This should be str in > Python 2, and binary in Python 3, that should be Unicode in Python 2 > and str in Python 3, and that over there should be str in both > versions", and the future import doesn't support that. This is also true. But even so, b'' exists as a porting nicety. The argument for supporting u'' is the same as the one that exists for b'', except in the opposite direction. Since popular library code is going to need to run on both Python 2 and Python 3 for the foreseeable future, anything to make this easier helps. Supporting u'' in 3.3 will prevent me from needing to think about the bytes/text distinction again while porting/straddling. Every time I say this to somebody who isn't listening closely they say "AHA! You're *supposed* to think about bytes vs. text, that's the whole point, stupid!" They fail to hear the "again" in that sentence. I've clearly already thought about the distinction between bytes and text at least once: that's *why* I'm using a u'' literal there. I shouldn't have to think about it again to service syntax constraints. Code that is more explicit than strictly necessary should not be needlessly punished. Continuing to not support u'' in Python 3 will be like having an immigration station where folks who have a b'ritish' passport can get through right away, but folks with a u'kranian' passport need to get back on a plane that appears to come from the Ukraine before they receive another tag that says they are indeed from the Ukraine. It's just pointless makework.
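To make the three-way distinction quoted above concrete ("str in 2 / bytes in 3", "Unicode in 2 / str in 3", and "str in both"), here is a minimal sketch of how a straddling module can obtain the third kind without u'' literals. The helper name `native` and the `latin-1` default are assumptions for illustration only; they are not part of any library mentioned in this thread.

```python
from __future__ import unicode_literals
import sys

PY2 = sys.version_info[0] == 2

# With the future import, an unadorned literal is text on both series.
text_type = type("")      # unicode on 2.x, str on 3.x
# A b'' literal is the 8-bit type on both series.
binary_type = type(b"")   # str on 2.x, bytes on 3.x


def native(s, encoding="latin-1"):
    """Hypothetical helper: return s as the interpreter's own str type."""
    if isinstance(s, str):
        return s                    # already a native string
    if PY2:
        return s.encode(encoding)   # unicode -> 8-bit str on 2.x
    return s.decode(encoding)       # bytes -> str on 3.x


header_name = native(b"Content-Type")
header_value = native("text/html")
```

The cost of this approach is a function call per literal at runtime, which is the overhead the u'' proposal is trying to avoid.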
- C From ncoghlan at gmail.com Fri Dec 9 06:30:36 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 9 Dec 2011 15:30:36 +1000 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323405204.2710.139.camel@thinko> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com>

<1323405204.2710.139.camel@thinko> Message-ID: On Fri, Dec 9, 2011 at 2:33 PM, Chris McDonough wrote:
> Continuing to not support u'' in Python 3 will be like having an
> immigration station where folks who have a b'ritish' passport can get
> through right away, but folks with a u'kranian' passport need to get
> back on a plane that appears to come from the Ukraine before they
> receive another tag that says they are indeed from the Ukraine. It's
> just pointless makework.

OK, I think I finally understand your point. You want to be able to write modules, in your Python 2.x code, that use *all three* kinds of string literal:
----------
foo = u"this is a Unicode string in both Python 2.x and 3.x"
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
----------
This is driven by the desire to use APIs (like the PEP 3333 version of WSGI) that are defined in terms of "native strings" in the context of applications that already include a strong binary/text separation.

Currently, in modules shared between the two series, you can't use the "u" marker at all, since Python 3.x leaves it out as being redundant - instead, you have a binary switch (in the form of the future import) that lets you toggle the behaviour of basic string literals between the first two forms:
----------
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
----------
from __future__ import unicode_literals
foo = "this is a Unicode string in both Python 2.x and 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
----------
Currently, to get all 3 kinds of behaviour in a shared codebase without additional function calls at runtime, you need to pick one set of strings (either "always Unicode" or "native string type") and move them out to a separate module.
So, for example, depending on which set you decided to move:
----------
from unicode_strings import foo
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
----------
from __future__ import unicode_literals
foo = "this is a Unicode string in both Python 2.x and 3.x"
from native_strings import bar
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
----------
Or, alternatively, you use 'six' (or a similar compatibility module) and ensure unicode at runtime, using native or binary strings otherwise:
----------
from six import u
foo = u("this is a Unicode string in both Python 2.x and 3.x")
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
----------
If you want to target 3.2, you *have* to use one of those mechanisms - any potential restoration of u'' syntax support won't help you (and even after 3.3 gets released in the latter half of next year, it's still going to be a fair while before it makes its way into the various distros, especially the ones that include long term support from major vendors).

So, instead of attempting to paper over the problem by reintroducing u'', perhaps the discussion we should be having is whether or not PEP 3333's superficially appealing concept of defining an API in terms of "native strings" is a loser in practice, and we should instead be looking more closely at PEP 444 (since that goes the route of using 'str' in 2.x and 'bytes' in 3.x, thus rendering "from __future__ import unicode_literals" an adequate solution for 2.6+ compatibility). The amount of pain that PEP 3333 seems to be causing in the web development world suggests to me we may simply have been *wrong* to think that PEP 3333 would be a workable long term approach.

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From chrism at plope.com Fri Dec 9 06:33:59 2011 From: chrism at plope.com (Chris McDonough) Date: Fri, 09 Dec 2011 00:33:59 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> Message-ID: <1323408839.2710.143.camel@thinko> On Thu, 2011-12-08 at 19:52 -0500, Glyph wrote: > Zooming back in to the actual issue this thread is about, I think the > u""-vs-"" issue is a bit of a red herring, because the _real_ problem > here is that 2to3 is slow and buggy and so migration efforts are > starting to work around it, and therefore want to run the same code on > 3.x and all the way back to 2.5. Even if it weren't slow, I still wouldn't use it to automatically convert code at install time; a single codebase is easier to reason about, and easier to support. Users send me tracebacks all the time; having them match the source is a wonderful thing. - C From ncoghlan at gmail.com Fri Dec 9 06:41:40 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 9 Dec 2011 15:41:40 +1000 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323408839.2710.143.camel@thinko> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> <1323408839.2710.143.camel@thinko> Message-ID: On Fri, Dec 9, 2011 at 3:33 PM, Chris McDonough wrote: > Even if it weren't slow, I still wouldn't use it to automatically > convert code at install time; a single codebase is easier to reason > about, and easier to support. Users send me tracebacks all the time; > having them match the source is a wonderful thing. Yeah, if single source doesn't work, then I think Antoine's suggested way (i.e. convert once, then maintain two distinct branches and builds, the way python-dev did for years with the standard library) is a more sane option. It lets you investigate tracebacks properly, it reduces your cycle times, etc, etc. With a modern DVCS, it should be significantly less painful than it was for us when we were maintaining four branches with only svnmerge to help out. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Fri Dec 9 06:43:35 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 8 Dec 2011 21:43:35 -0800 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323408839.2710.143.camel@thinko> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> <1323408839.2710.143.camel@thinko> Message-ID: On Thu, Dec 8, 2011 at 9:33 PM, Chris McDonough wrote: > On Thu, 2011-12-08 at 19:52 -0500, Glyph wrote: > > Zooming back in to the actual issue this thread is about, I think the > > u""-vs-"" issue is a bit of a red herring, because the _real_ problem > > here is that 2to3 is slow and buggy and so migration efforts are > > starting to work around it, and therefore want to run the same code on > > 3.x and all the way back to 2.5. > > Even if it weren't slow, I still wouldn't use it to automatically > convert code at install time; a single codebase is easier to reason > about, and easier to support. Users send me tracebacks all the time; > having them match the source is a wonderful thing. Even though 2to3 was my idea, I am gradually beginning to appreciate this approach. I skimmed the docs for "six" and liked it. But I think the specific proposal of adding u"..." literals back to 3.3 is not going to do much good. If we had had the foresight way back when, we could have added them back to 3.1 and we would have been okay. But having them in 3.3 but not in 3.2 is just adding insult to injury. I recommend writing b"...".decode('utf-8'); maybe six's u() does the same? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chrism at plope.com Fri Dec 9 07:01:10 2011 From: chrism at plope.com (Chris McDonough) Date: Fri, 09 Dec 2011 01:01:10 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> <1323408839.2710.143.camel@thinko> Message-ID: <1323410470.2710.158.camel@thinko> On Thu, 2011-12-08 at 21:43 -0800, Guido van Rossum wrote: > On Thu, Dec 8, 2011 at 9:33 PM, Chris McDonough > wrote: > On Thu, 2011-12-08 at 19:52 -0500, Glyph wrote: > > Zooming back in to the actual issue this thread is about, I > think the > > u""-vs-"" issue is a bit of a red herring, because the > _real_ problem > > here is that 2to3 is slow and buggy and so migration efforts > are > > starting to work around it, and therefore want to run the > same code on > > 3.x and all the way back to 2.5. > > > Even if it weren't slow, I still wouldn't use it to > automatically > convert code at install time; a single codebase is easier to > reason > about, and easier to support. Users send me tracebacks all > the time; > having them match the source is a wonderful thing. > > Even though 2to3 was my idea, I am gradually beginning to appreciate > this approach. I skimmed the docs for "six" and liked it. > > But I think the specific proposal of adding u"..." literals back to > 3.3 is not going to do much good. If we had had the foresight way back > when, we could have added them back to 3.1 and we would have been > okay. But having them in 3.3 but not in 3.2 is just adding insult to > injury. AFAICT, at the current pace of porting, lots of authors of existing, popular Python 2 libraries won't be releasing a ported/straddled version any time soon; almost certainly many won't even begin work on a port until after 3.3 is final. As a result, on the supplier side, there will be plenty of code that will eventually work only as a straddle across 2.6, 2.7, and 3.3. On the consumer side, folks who want to run 2.6/2.7/3.3-only codebases will have the wherewithal to compile their own Python 3 (or use a PPA or equivalent) until the distros catch up. 
So I'm not sure why 3.2 not having support for u'' should be a real blocker for the change.

> I recommend writing b"...".decode('utf-8'); maybe six's u() does the
> same?

It does this:

def u(s):
    return unicode(s, "unicode_escape")

That's two Python function calls, of course, which is obviously icky if you use a lot of literals at a nonmodule scope.

- C
From ncoghlan at gmail.com Fri Dec 9 07:36:03 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 9 Dec 2011 16:36:03 +1000 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323410470.2710.158.camel@thinko> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> <1323408839.2710.143.camel@thinko> <1323410470.2710.158.camel@thinko> Message-ID: On Fri, Dec 9, 2011 at 4:01 PM, Chris McDonough wrote: > On the consumer side, folks who want to run 2.6/2.7/3.3-only codebases > will have the wherewithal to compile their own Python 3 (or use a PPA or > equivalent) until the distros catch up. > > So I'm not sure why 3.2 not having support for u'' should be a real > blocker for the change. If this argument was valid, people wouldn't be so worried about maintaining 2.5 compatibility in their libraries. Consider if I tried to make this argument to justify everyone dropping 2.5 and earlier support today: """On the consumer side, folks who want to run 2.6+ codebases on older Linux distros have the wherewithal to compile their own more recent Python 2 (or use a PPA or equivalent) until they can move to a more recent version of their distro.""" It's simply not true in the general case - people don't maintain 2.4+ compatibility for fun, they do it because RHEL5 (and CentOS 5, etc) are still reasonably common and ship with 2.4 as the system Python. As soon as you switch away from the system provided Python, you're switching away from the vendors entire pre-packaged Python *stack*, not just the interpreter itself. You then have to install (and generally build) *everything* for yourself. While that is certainly possible these days (and a lot simpler than it used to be), it's still not trivial [1]. Since 3.2 is already quite usable for applications that aren't fighting with the "native strings" problem (which seems to be the common thread running through the complaints I've heard from web framework authors), and with it being included in at least the next Ubuntu LTS, current versions of Fedora, Arch, etc, it's going to be around for a long time. Ignoring 3.1 is a reasonable option. 
Ignoring 3.2 entirely is unlikely to be viable for anyone that is interested in supporting 3.x within the next couple of years - the 3.3 release is at least 9 months away, and it's also going to take a while for it to make its way into distros after the final release gets published on python.org. Hence my suggestion: perhaps the problem is the fact that PEP 3333/WSGI 1.0.1 introduced the "native string" concept as a minimalist hack to try to get a usable gateway interface in Python 3, and that just doesn't work in practice when attempting to straddle 2.x and 3.x (because the values WSGI is dealing with aren't really text, they're bytes, only *some* of which represent text). Perhaps a PEP 444 based model would be less painful and more coherent in the long run? Cheers, Nick. [1] http://readthedocs.org/docs/ncoghlan_devs-python-notes/en/latest/venv_bootstrap.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From chrism at plope.com Fri Dec 9 08:38:05 2011 From: chrism at plope.com (Chris McDonough) Date: Fri, 09 Dec 2011 02:38:05 -0500 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> <1323408839.2710.143.camel@thinko> <1323410470.2710.158.camel@thinko> Message-ID: <1323416285.2710.219.camel@thinko> On Fri, 2011-12-09 at 16:36 +1000, Nick Coghlan wrote: > On Fri, Dec 9, 2011 at 4:01 PM, Chris McDonough wrote: > > On the consumer side, folks who want to run 2.6/2.7/3.3-only codebases > > will have the wherewithal to compile their own Python 3 (or use a PPA or > > equivalent) until the distros catch up. > > > > So I'm not sure why 3.2 not having support for u'' should be a real > > blocker for the change. > > If this argument was valid, people wouldn't be so worried about > maintaining 2.5 compatibility in their libraries. Consider if I tried > to make this argument to justify everyone dropping 2.5 and earlier > support today: > > """On the consumer side, folks who want to run 2.6+ codebases on older > Linux distros have the wherewithal to compile their own more recent > Python 2 (or use a PPA or > equivalent) until they can move to a more recent version of their distro.""" Fair point. That said, personally, I have given up entirely on Python 2.4 and 2.5 support for newer versions of my OSS libraries. I continue to backport fixes and (some) features to older library versions so folks can run those on systems that require older Pythons. I gave up 2.5 support fairly recently across everything new, and I gave up support for 2.4 a year ago or more in new releases with the same intent. In reality, there is only one major platform that requires 2.4: RHEL 5 and folks who use it will just need to also use old versions of popular libraries; trying to support it for all future feature work until it's EOLed is not sane unless someone pays for it. 
Python 2.5 has slightly more compelling platforms (GAE and Jython), but GAE is moving to Python 2.7 and Jython is a bit moribund these days and is not really popular enough that a critical mass of folks will clamor for new-and-shiny releases that run on it. The upshot is that most newly created code only needs to run on Python 2.6 and *some* version of Python 3. And being able to eventually write that code in a nonsucky subset of Python 2/3 is important to me, because I'm going to be developing software in that subset for many years (way past the timeframe we're talking about in which Python 3.2 will rule the roost). > It's simply not true in the general case - people don't maintain 2.4+ > compatibility for fun, they do it because RHEL5 (and CentOS 5, etc) > are still reasonably common and ship with 2.4 as the system Python. As > soon as you switch away from the system provided Python, you're > switching away from the vendors entire pre-packaged Python *stack*, > not just the interpreter itself. You then have to install (and > generally build) *everything* for yourself. While that is certainly > possible these days (and a lot simpler than it used to be), it's still > not trivial [1]. > > Since 3.2 is already quite usable for applications that aren't > fighting with the "native strings" problem (which seems to be the > common thread running through the complaints I've heard from web > framework authors), and with it being included in at least the next > Ubuntu LTS, current versions of Fedora, Arch, etc, it's going to be > around for a long time. Ignoring 3.1 is a reasonable option. Ignoring > 3.2 entirely is unlikely to be viable for anyone that is interested in > supporting 3.x within the next couple of years - the 3.3 release is at > least 9 months away, and it's also going to take a while for it to > make its way into distros after the final release gets published on > python.org. 
> > Hence my suggestion: perhaps the problem is the fact that PEP 3333/WSGI > 1.0.1 introduced the "native string" concept as a minimalist hack to > try to get a usable gateway interface in Python 3, and that just > doesn't work in practice when attempting to straddle 2.x and 3.x > (because the values WSGI is dealing with aren't really text, they're > bytes, only *some* of which represent text). Perhaps a PEP 444 based > model would be less painful and more coherent in the long run? Possibly. I was the original author of PEP 444, with help from Armin (although it has since been taken up by Alice, and I do not support the updates it has received since then). A bytes-oriented WSGI-like protocol was always the saner option. The native string idea optimized in exactly the wrong place, which was to make it easy to write WSGI middleware, where you're required to do lots of textlike manipulation of header values. The idea of using bytes in places where PEP 3333 now mandates native strings was rejected because people were (somewhat justifiably) horrified at what they had to do in order to attempt to treat bytes like strings in this context on Python 3 at the time. It has gotten better, but maybe still not better enough to appease the folks who blocked the idea originally. But all of that is just arguing with the umpire at this point. Promoting and getting consensus about a different protocol will hurt a lot. PEP 3333 was born of months of intense periods of arguing and compromise. It is the way it is now because everyone was too exhausted to argue about it any more. I don't think that has changed much since it was accepted, and asking folks to go back to that particular drawing board is unlikely to have promising results. Folks have already spent many hours, and lots of money, on implementations that implement the current PEP. They may hunt us down and murder us one by one.
;-) PEP 3333, to its credit, is also remarkably backwards compatible with PEP 333, requiring very little change in existing Python 2 WSGI implementations, which helps Python 2 folks a lot. Given an effective choice between enabling six lines of code in Python 3.3 to support u'' and months of political wrangling and code rewriting, I'll choose the former any day. If we were talking about a change to Python that actually required nontrivial effort, had some sort of nominal consequence, or had some sort of non-theoretical downside, I'd be a lot less sanguine about it. But this is just a no-brainer in the long term, AFAICT. - C From stefan_ml at behnel.de Fri Dec 9 09:02:35 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 Dec 2011 09:02:35 +0100 Subject: [Python-Dev] Fixing the XML batteries Message-ID: Hi everyone, I think Py3.3 would be a good milestone for cleaning up the stdlib support for XML. Note upfront: you may or may not know me as the maintainer of lxml, the de-facto non-stdlib standard Python XML tool. This (lengthy) post was triggered by the following kind of conversation that I keep having with new XML users in Python (mostly on c.l.py), which hints at some serious flaw in the stdlib. User: I'm trying to do XML stuff XYZ in Python and have problem ABC. Me: What library are you using? Could you show us some code? User: My code looks like this snippet: ... Me: You are using minidom which is known to be hard to use, slow and uses lots of memory. Use the xml.etree.ElementTree package instead, or rather its C implementation cElementTree, also in the stdlib. User (coming back after a while): thanks, that was exactly what [I didn't know] I was looking for. What does this tell us? 1) MiniDOM is what new users find first. It's highly visible because there are still lots of ancient "Python and XML" web pages out there that date back from the time before Python 2.5 (or rather something like 2.2), when it was the only XML tree library in the stdlib. 
It's also the first hit from the top when you search for "XML" on the stdlib docs page and contains the (to some people) familiar word "DOM", which lets users stop their search and start writing code, not expecting to find a separate alternative in the same stdlib, way further down. And the description as "mini", "simple" and "lightweight" suggests to users that it's going to be easy to use and efficient. 2) MiniDOM is not what users want. It leads to complicated, unpythonic code and lots of problems. It is neither easy to use, nor efficient, nor "lightweight", "simple" or "mini", not in absolute numbers (see http://bugs.python.org/issue11379#msg148584 and following for a recent discussion). It's also badly maintained in the sense that its performance characteristics could likely be improved, but no-one is seriously interested in doing that, because it would not lead to something that actually *is* fast or memory friendly compared to any of the 'real' alternatives that are available right now. 3) ElementTree is what users should use, MiniDOM is not. ElementTree was added to the stdlib in Py2.5 on popular demand, exactly because it is very easy to use, very fast, and very memory friendly. And because users did not want to use MiniDOM any more. Today, ElementTree has a rather straight upgrade path towards lxml.etree if more XML features like validation or XSLT are needed. MiniDOM has nothing like that to offer. It's a dead end. 4) In the stdlib, cElementTree is independent of ElementTree, but totally hidden in the documentation. In conversations like the above, it's unnecessarily complex to explain to users that there is ElementTree (which is documented in the stdlib), but that what they want to use is really cElementTree, which has the same API but does not have a stdlib documentation page that I can send them to. Note that the other Python implementations simply provide cElementTree as an alias for ElementTree. 
That leaves CPython as the only Python implementation that really has these two separate modules. So, there are many problems here. I think they make it unnecessarily complicated for users to process XML in Python, and that the current situation helps turn new users away from Python as a language for XML processing. Python does have impressively great tools for working with XML. It's just that the stdlib and its documentation do not reflect or even appreciate that. What should change? a) The stdlib documentation should help users to choose the right tool right from the start. Instead of using the totally misleading wording that it uses now, it should be honest about the performance characteristics of MiniDOM and should actively suggest that those who don't know what to choose (or even *that* they can choose) should not use MiniDOM in the first place. I created a ticket (issue11379) for a minor step in this direction, but given the responses, I'm rather convinced that there's a lot more that can be done and should be done, and that it should be done now, right for the next release. b) cElementTree should finally lose its "special" status as a separate library and disappear as an accelerator module behind ElementTree. This has been suggested a couple of times already, and AFAIR, there was some opposition because 1) ET was maintained outside of the stdlib and 2) the APIs of both were not identical. However, getting ET 1.3 into Py2.7 and 3.2 was a U-turn. Today, ET is *only* being maintained in the stdlib by Florent Xicluna (who is doing a good job with it), and ET 1.3 has basically made the APIs of both implementations compatible again. So, 3.3 would be the right milestone for fixing the "two libs for one" quirk.
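As an illustration of the "two libs for one" situation in points 4) and b), here is a minimal sketch of the import idiom commonly used to prefer the C accelerator while falling back to the pure-Python module. The sample document and variable names are made up for the example. The fallback branch also covers later Python versions, from which the separate cElementTree module was eventually removed.

```python
# Prefer the C accelerator where it exists as a separate module;
# otherwise fall back to the documented ElementTree module.
try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

# Hypothetical sample document, parsed with the shared ElementTree API.
doc = ET.fromstring(
    "<root><item id='1'>spam</item><item id='2'>eggs</item></root>"
)

# Collect (id, text) pairs from the direct <item> children.
items = [(el.get("id"), el.text) for el in doc.findall("item")]
print(items)
```

Because both modules expose the same API, the rest of the code does not need to know which one was imported, which is exactly why hiding the accelerator behind ElementTree is attractive.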
Given that this is the third time during the last couple of years that I'm suggesting to finally fix the stdlib and its documentation, I won't provide any further patches before it has finally been accepted that a) this is a problem and b) it should be fixed, thus allowing the patches to actually serve a purpose. If we can agree on that, I'll happily help in making this change happen. Stefan From ncoghlan at gmail.com Fri Dec 9 09:09:46 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 9 Dec 2011 18:09:46 +1000 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323416285.2710.219.camel@thinko> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com> <1323408839.2710.143.camel@thinko> <1323410470.2710.158.camel@thinko> <1323416285.2710.219.camel@thinko> Message-ID: Given that WSGI 1.0.1 is defined in terms of native strings and restoring u'' support allows that to be expressed clearly in a shared codebase, I at least understand the point of the suggestion now. I'm not quite convinced restoring u'' is the right answer as yet, but a solid use case is always a nice place to start :) -- Nick Coghlan (via Gmail on Android, so likely to be more terse than usual) On Dec 9, 2011 5:38 PM, "Chris McDonough" wrote: > On Fri, 2011-12-09 at 16:36 +1000, Nick Coghlan wrote: > > On Fri, Dec 9, 2011 at 4:01 PM, Chris McDonough > wrote: > > > On the consumer side, folks who want to run 2.6/2.7/3.3-only codebases > > > will have the wherewithal to compile their own Python 3 (or use a PPA > or > > > equivalent) until the distros catch up. > > > > > > So I'm not sure why 3.2 not having support for u'' should be a real > > > blocker for the change. > > > > If this argument was valid, people wouldn't be so worried about > > maintaining 2.5 compatibility in their libraries. Consider if I tried > > to make this argument to justify everyone dropping 2.5 and earlier > > support today: > > > > """On the consumer side, folks who want to run 2.6+ codebases on older > > Linux distros have the wherewithal to compile their own more recent > > Python 2 (or use a PPA or > > equivalent) until they can move to a more recent version of their > distro.""" > > Fair point. > > That said, personally, I have given up entirely on Python 2.4 and 2.5 > support for newer versions of my OSS libraries. I continue to backport > fixes and (some) features to older library versions so folks can run > those on systems that require older Pythons. 
I gave up 2.5 support > fairly recently across everything new, and I gave up support for 2.4 a > year ago or more in new releases with the same intent. > > In reality, there is only one major platform that requires 2.4: RHEL 5 > and folks who use it will just need to also use old versions of popular > libraries; trying to support it for all future feature work until it's > EOLed is not sane unless someone pays for it. Python 2.5 has slightly > more compelling platforms (GAE and Jython), but GAE is moving to Python > 2.7 and Jython is a bit moribund these days and is not really popular > enough that a critical mass of folks will clamor for new-and-shiny > releases that run on it. > > The upshot is that most newly created code only needs to run on Python > 2.6 and *some* version of Python 3. And being able to eventually write > that code in a nonsucky subset of Python 2/3 is important to me, because > I'm going to be developing software in that subset for many years (way > past the timeframe we're talking about in which Python 3.2 will rule the > roost). > > > It's simply not true in the general case - people don't maintain 2.4+ > > compatibility for fun, they do it because RHEL5 (and CentOS 5, etc) > > are still reasonably common and ship with 2.4 as the system Python. As > > soon as you switch away from the system provided Python, you're > > switching away from the vendors entire pre-packaged Python *stack*, > > not just the interpreter itself. You then have to install (and > > generally build) *everything* for yourself. While that is certainly > > possible these days (and a lot simpler than it used to be), it's still > > not trivial [1]. 
> > > > Since 3.2 is already quite usable for applications that aren't > > fighting with the "native strings" problem (which seems to be the > > common thread running through the complaints I've heard from web > > framework authors), and with it being included in at least the next > > Ubuntu LTS, current versions of Fedora, Arch, etc, it's going to be > > around for a long time. Ignoring 3.1 is a reasonable option. Ignoring > > 3.2 entirely is unlikely to be viable for anyone that is interested in > > supporting 3.x within the next couple of years - the 3.3 release is at > > least 9 months away, and it's also going to take a while for it to > > make its way into distros after the final release gets published on > > python.org. > > > > Hence my suggestion: perhaps the problem is the fact that PEP 3333/WSGI > > 1.0.1 introduced the "native string" concept as a minimalist hack to > > try to get a usable gateway interface in Python 3, and that just > > doesn't work in practice when attempting to straddle 2.x and 3.x > > (because the values WSGI is dealing with aren't really text, they're > > bytes, only *some* of which represent text). Perhaps a PEP 444 based > > model would be less painful and more coherent in the long run? > > Possibly. I was the original author of PEP 444 with help from Armin. > (although it has since been taken up by Alice and I do not support the > updates it has received since then). > > A bytes-oriented WSGI-like protocol was always the saner option. The > native string idea optimized in exactly the wrong place, which was to > make it easy to write WSGI middleware, where you're required to do lots > of textlike manipulation of header values. The idea of using bytes in > places where PEP 3333 now mandates native strings was rejected because > people were (somewhat justifiably) horrified at what they had to do in > order to attempt to treat bytes like strings in this context on Python 3 at > the time.
It has gotten better, but maybe still not better enough to > appease the folks who blocked the idea originally. > > But all of that is just arguing with the umpire at this point. > Promoting and getting consensus about a different protocol will hurt a > lot. PEP 3333 was born of months of intense periods of arguing and > compromise. It is the way it is now because everyone was too exhausted > to argue about it any more. I don't think that has changed much since > it was accepted, and asking folks to go back to that particular drawing > board is unlikely to have promising results. Folks have already spent > many hours and lots of money on implementations of the current PEP. > They may hunt us down and murder us one by one. ;-) PEP 3333, to its > credit, is also remarkably backwards compatible with PEP 333, requiring > very little change in existing Python 2 WSGI implementations, which > helps Python 2 folks a lot. > > Given an effective choice between enabling six lines of code in Python > 3.3 to support u'' and months of political wrangling and code rewriting, > I'll choose the former any day. If we were talking about a change to > Python that actually required nontrivial effort, had some sort of > nominal consequence, or had some sort of non-theoretical downside, I'd > be a lot less sanguine about it. But this is just a no-brainer in the > long term, AFAICT. > > - C > > > From martin at v.loewis.de Fri Dec 9 09:20:42 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 09 Dec 2011 09:20:42 +0100 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <20111208223408.0e2e8bd1@limelight.wooz.org> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com>

<20111208223408.0e2e8bd1@limelight.wooz.org> Message-ID: <4EE1C4DA.9060809@v.loewis.de> > Sorry, I don't understand this. What does it mean to be "str in both > versions"? And why would you want that? One use case (and the only one I'm aware of) is to pass keyword parameters. Python 2 insists that they are str (and doesn't accept unicode), Python 3 insists that they are str (and doesn't accept bytes). This is fairly uncommon as a problem, though, and is also solved in Python 2.6, which does accept Unicode strings as keyword parameter names. Regards, Martin From martin at v.loewis.de Fri Dec 9 09:25:08 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 09 Dec 2011 09:25:08 +0100 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: <1323405204.2710.139.camel@thinko> References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com>

<1323405204.2710.139.camel@thinko> Message-ID: <4EE1C5E4.6090602@v.loewis.de> > They fail to hear the "again" in that sentence. I've clearly already > thought about the distinction between bytes and text at least once: > that's *why* I'm using a u'' literal there. I shouldn't have to think > about it again to service syntax constraints. Code that is more > explicit than strictly necessary should not be needlessly punished. But you don't have to think about this *again* in any of the proposed alternatives (whether you use a u() function, whether you use the future import, or whether you use 2to3). They differ only (slightly) in how you spell Unicode literals, but all provide for explicit spelling of Unicode literals when applied. Regards, Martin From martin at v.loewis.de Fri Dec 9 09:32:03 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 09 Dec 2011 09:32:03 +0100 Subject: [Python-Dev] readd u'' literal support in 3.3? In-Reply-To: References: <1323320919.2710.24.camel@thinko> <5242067.5aBSYdFaIB@einstein> <6EB3EF7C-C742-44BD-9588-B6088282D146@langa.pl> <3344831.JP9Cfj4Ety@einstein>

<4EE12BAA.1050601@v.loewis.de> <37AC50BA-EE24-4CFC-8B16-8A2C567A6F9F@twistedmatrix.com>

<1323405204.2710.139.camel@thinko> Message-ID: <4EE1C783.8050306@v.loewis.de> > Or, alternatively, you use 'six' (or a similar compatibility module) > and ensure unicode at runtime, using native or binary strings > otherwise: > > ---------- > from six import u > foo = u("this is a Unicode string in both Python 2.x and 3.x") > bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x" > baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x" > ---------- An alternative here is to use a function for bar, not foo: from __future__ import unicode_literals from six.next import native_str foo = "this is a Unicode string in both Python 2.x and 3.x" bar = native_str("this is an 8-bit string in Python 2.x" " and a Unicode string in 3.x") baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x" Which of them is "better" depends on which of the two string types is more common. Regards, Martin From martin at v.loewis.de Fri Dec 9 09:41:15 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 09 Dec 2011 09:41:15 +0100 Subject: [Python-Dev] Fixing the XML batteries In-Reply-To: References: Message-ID: <4EE1C9AB.2040301@v.loewis.de> > a) The stdlib documentation should help users to choose the right tool > right from the start. Instead of using the totally misleading wording > that it uses now, it should be honest about the performance > characteristics of MiniDOM and should actively suggest that those who > don't know what to choose (or even *that* they can choose) should not > use MiniDOM in the first place. I disagree. The right approach is not to document performance problems, but to fix them. > b) cElementTree should finally lose its "special" status as a separate > library and disappear as an accelerator module behind ElementTree.
This > has been suggested a couple of times already, and AFAIR, there was some > opposition because 1) ET was maintained outside of the stdlib and 2) the > APIs of both were not identical. However, getting ET 1.3 into Py2.7 and > 3.2 was a U-turn. Unfortunately (?), there is a near-contract-like agreement with Fredrik Lundh that any significant changes to ElementTree in the standard library have to be agreed by him. So whatever change you plan: make sure Fredrik gives his explicit support. Regards, Martin From martin at v.loewis.de Fri Dec 9 09:44:13 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 09 Dec 2011 09:44:13 +0100 Subject: [Python-Dev] cpython: Document PyUnicode_Copy() and PyUnicode_EncodeCodePage() In-Reply-To: <20111209013535.6fb38068@pitrou.net> References: <20111209013535.6fb38068@pitrou.net> Message-ID: <4EE1CA5D.70705@v.loewis.de> Am 09.12.2011 01:35, schrieb Antoine Pitrou: > On Fri, 09 Dec 2011 00:16:02 +0100 > victor.stinner wrote: >> >> +.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode) >> + >> + Get a new copy of a Unicode object. >> + >> + .. versionadded:: 3.3 > > I'm not sure I understand. Why would you make a copy of an immutable > object? It can convert a unicode subtype object into an exact unicode object. I'd rename it to _PyUnicode_AsExactUnicode, and undocument it. Regards, Martin From stefan_ml at behnel.de Fri Dec 9 09:59:24 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 Dec 2011 09:59:24 +0100 Subject: [Python-Dev] Fixing the XML batteries In-Reply-To: <4EE1C9AB.2040301@v.loewis.de> References: <4EE1C9AB.2040301@v.loewis.de> Message-ID: