From guido at python.org Wed Oct 1 00:04:03 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 30 Sep 2008 15:04:03 -0700
Subject: [Python-Dev] Patch for an initial support of bytes filename in Python3
In-Reply-To: <20080930184751.31635.1484325691.divmod.xquotient.520@weber.divmod.com>
References: <200809300247.20349.victor.stinner@haypocalc.com>
 <20080930132151.31635.132601277.divmod.xquotient.434@weber.divmod.com>
 <20080930175932.31635.989735053.divmod.xquotient.478@weber.divmod.com>
 <20080930184751.31635.1484325691.divmod.xquotient.520@weber.divmod.com>
Message-ID:

On Tue, Sep 30, 2008 at 11:47 AM, wrote:
>
> On 05:56 pm, guido at python.org wrote:
>>
>> On Tue, Sep 30, 2008 at 10:59 AM, wrote:
>>>
>>> On 02:32 pm, guido at python.org wrote:
>
>>> In the absence of a 2.6 getcwdb, perhaps the fixer could just drop the
>>> "benefit of the doubt" case? It could always be added to 2.7, and the
>>> parity release of 2to3 could have a --2.7 switch that would modify the
>>> behavior of this and other fixers.
>>
>> I'm not sure what you're proposing. *My* proposal is that 2to3 changes
>> os.getcwdu() calls to os.getcwd() and leaves os.getcwd() calls alone
>> -- there's no way to tell whether os.getcwdb() would be a better
>> match, and for portable code, it won't be (since os.getcwdb() is a
>> Unix-only thing).
>
> My proposal is simply to change getcwd to getcwdb, and getcwdu to getcwd.
> This preserves whatever bytes/text behavior you are expecting from 2.6 into
> 3.0. Granted, the fact that unicode is really always the right thing to do
> on Windows complicates things.

Plus, even on Linux Unicode is *usually* what you should be doing,
unless you're writing a backup tool.

> I already tend to avoid os.getcwd() though, and this is just one more reason
> to avoid it. In the rare cases where I really do need it, it looks like
> os.path.abspath(b".") / os.path.abspath(u".") will provide the clarity that
> I want.

Or os.path.expanduser('~') vs.
os.path.expanduser(b'~'). :-)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rhamph at gmail.com Wed Oct 1 00:04:09 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 30 Sep 2008 16:04:09 -0600
Subject: [Python-Dev] Filename as byte string in python 2.6 or 3.0?
In-Reply-To: <48E29D7E.6080406@gmail.com>
References: <200809271404.25654.victor.stinner@haypocalc.com>
 <200809291250.03291.eckhardt@satorlaser.com>
 <48E0B8FB.9070701@egenix.com>
 <200809291359.06334.eckhardt@satorlaser.com>
 <20080929140133.31635.1170224088.divmod.xquotient.314@weber.divmod.com>
 <48E15B83.9040205@v.loewis.de>
 <48E1F53B.7030901@gmail.com>
 <48E29D7E.6080406@gmail.com>
Message-ID:

On Tue, Sep 30, 2008 at 3:43 PM, Nick Coghlan wrote:
> Guido van Rossum wrote:
>> The callback would either be an extra argument to all
>> system calls (bad, ugly etc., and why not go with the existing unicode
>> encoding and error flags if we're adding extra args?) or would be
>> global, where I'd be worried that it might interfere with the proper
>> operation of library code that is several abstractions away from
>> whoever installed the callback, not under their control, and not
>> expecting the callback.
>>
>> I suppose I may have totally misunderstood your proposal, but in
>> general I find callbacks unwieldy.
>
> Not really - later in the email, I actually pointed out that exposing
> the unicode errors flag for the implicit PyUnicode_Decode invocations
> would be enough to enable a callback mechanism.
>
> However, James's post pointing out that this is a problem that also
> affects environment variables and command line arguments, not just file
> paths completely kills any hope of purely callback based approach - that
> processing needs to "just work" without any additional intervention from
> the application.
>
> Of the suggestions I've seen so far, I like Marcin's Mono-inspired
> NULL-escape codec idea the best.
> Since these strings all come from parts
> of the environment where NULLs are not permitted, a simple "'\0' in
> text" check will immediately identify any strings where decoding failed
> (for applications which care about the difference and want to try to do
> better), while applications which don't care will receive perfectly
> valid Python strings that can be passed around and manipulated as if the
> decoding error never happened.

It avoids the technical problems, but it's still magical behaviour
that users have to learn, whereas bytes/unicode polymorphism uses the
distinctions you should already know about.

There's also a problem of how to turn it on. I'm against
automatically Python changing the filesystem encoding, no matter how
well intentioned. Better to let the app do that, which is easy and
could be done for all apps (not just python!) if someone defined a
libc encoding of "null-escaped UTF-8".

On the whole I'm only -0 on it (compared to -1 for UTF-8b).

--
Adam Olsen, aka Rhamphoryncus

From Jack.Jansen at cwi.nl Wed Oct 1 00:05:22 2008
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Wed, 1 Oct 2008 00:05:22 +0200
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To: <48E29D3B.5030900@v.loewis.de>
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <48E29D3B.5030900@v.loewis.de>
Message-ID:

On 30-Sep-2008, at 23:42 , Martin v. Löwis wrote:
> It's the other way 'round: On Windows, Unicode file names are the
> natural choice, and byte strings have limitations. In a sense, Windows
> got it right - but then, they started later. Unix missed the
> opportunity
> of declaring that all file APIs are UTF-8 (except for Plan-9 and OS X,
> neither being "true" Unix).

How does windows (and Python on windows) handle NFC versus NFD issues?
Can I have two files called "Ümlaut.txt", one in NFD and one NFC form?
And are both of those representable on the Python side (i.e.
can they both be returned from listdir() and passed to open())? If I
compare these two filenames, do they compare differently?

--
Jack Jansen, , http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman

From guido at python.org Wed Oct 1 00:05:36 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 30 Sep 2008 15:05:36 -0700
Subject: [Python-Dev] Patch for an initial support of bytes filename in Python3
In-Reply-To:
References: <200809300247.20349.victor.stinner@haypocalc.com>
 <20080930132151.31635.132601277.divmod.xquotient.434@weber.divmod.com>
 <20080930175932.31635.989735053.divmod.xquotient.478@weber.divmod.com>
Message-ID:

On Tue, Sep 30, 2008 at 12:07 PM, Simon Cross wrote:
> On Tue, Sep 30, 2008 at 7:56 PM, Guido van Rossum wrote:
>> (since os.getcwdb() is a Unix-only thing).
>
> I would be happier if all the Unix byte functions existed on Windows
> and fell back to something like encoding the filenames to/from UTF-8.
> Then at least it would be possible for programs to support reading all
> files on both Unix and Windows without having to perform some sort of
> explicit check to see whether os.getcwdb() and friends are supported.

Actually on Windows the syscalls use the encoding that Microsoft uses
-- when using bytes we use the Windows bytes API and when using str we
use the Windows wide API. That's the most platform-compatible
approach.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com Wed Oct 1 00:18:33 2008
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 01 Oct 2008 08:18:33 +1000
Subject: [Python-Dev] Filename as byte string in python 2.6 or 3.0?
In-Reply-To:
References: <200809271404.25654.victor.stinner@haypocalc.com>
 <200809291250.03291.eckhardt@satorlaser.com>
 <48E0B8FB.9070701@egenix.com>
 <200809291359.06334.eckhardt@satorlaser.com>
 <20080929140133.31635.1170224088.divmod.xquotient.314@weber.divmod.com>
 <48E15B83.9040205@v.loewis.de>
 <48E1F53B.7030901@gmail.com>
 <48E29D7E.6080406@gmail.com>
Message-ID: <48E2A5B9.5090005@gmail.com>

Adam Olsen wrote:
> On Tue, Sep 30, 2008 at 3:43 PM, Nick Coghlan wrote:
>> Of the suggestions I've seen so far, I like Marcin's Mono-inspired
>> NULL-escape codec idea the best. Since these strings all come from parts
>> of the environment where NULLs are not permitted, a simple "'\0' in
>> text" check will immediately identify any strings where decoding failed
>> (for applications which care about the difference and want to try to do
>> better), while applications which don't care will receive perfectly
>> valid Python strings that can be passed around and manipulated as if the
>> decoding error never happened.
>
> It avoids the technical problems, but it's still magical behaviour
> that users have to learn, whereas bytes/unicode polymorphism uses the
> distinctions you should already know about.
>
> There's also a problem of how to turn it on. I'm against
> automatically Python changing the filesystem encoding, no matter how
> well intentioned. Better to let the app do that, which is easy and
> could be done for all apps (not just python!) if someone defined a
> libc encoding of "null-escaped UTF-8".
>
> On the whole I'm only -0 on it (compared to -1 for UTF-8b).

For the decoding side, you wouldn't need to do it as a codec - you
could do it as a 'nullescape' error handler (since NULLs can't be
present in the byte sequences being decoded, there is no need to worry
about escaping anything when decoding is successful).
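A minimal sketch of the error-handler idea described above, using the codecs API of later Python 3 releases. The handler name 'nullescape' comes from the message; the escape format used here (a NUL followed by the offending byte's value as a code point) is only an assumption for illustration and is not claimed to be the scheme Mono or Marcin's proposal uses.

```python
import codecs


def nullescape(exc):
    """Hypothetical handler: escape each undecodable byte as NUL + chr(byte)."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    escaped = "".join("\0" + chr(b) for b in bad)
    # Return the replacement text and the position at which decoding resumes.
    return escaped, exc.end


codecs.register_error("nullescape", nullescape)

raw = b"caf\xe9.txt"                      # Latin-1 bytes, not valid UTF-8
name = raw.decode("utf-8", "nullescape")
clean = b"plain.txt".decode("utf-8", "nullescape")

print("\0" in name)    # True: decoding had to escape something
print("\0" in clean)   # False: the name decoded normally
```

An application that cares can test for "\0" in the result, as the quoted proposal suggests; one that does not care simply passes the string along.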
Converting those NULL escaped strings back into something the
filesystem can understand would obviously need a custom codec though,
but some kind of application level handling of bad filenames is going
to be needed no matter how we deal with bad encoding on the input side.

That said, I don't think this is something we (or, more to the point,
Guido) need to make a decision on right now - for 3.0, having
bytes-level APIs that can see everything, and Unicode APIs that ignore
badly encoded filenames is worth trying. If it proves inadequate, then
we can revisit the idea of some kind of implicit escaping mechanism in
the Unicode APIs for 3.1 when there is more time for a proper PEP.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From guido at python.org Wed Oct 1 00:20:57 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 30 Sep 2008 15:20:57 -0700
Subject: [Python-Dev] Filename as byte string in python 2.6 or 3.0?
In-Reply-To: <48E29D7E.6080406@gmail.com>
References: <200809271404.25654.victor.stinner@haypocalc.com>
 <200809291250.03291.eckhardt@satorlaser.com>
 <48E0B8FB.9070701@egenix.com>
 <200809291359.06334.eckhardt@satorlaser.com>
 <20080929140133.31635.1170224088.divmod.xquotient.314@weber.divmod.com>
 <48E15B83.9040205@v.loewis.de>
 <48E1F53B.7030901@gmail.com>
 <48E29D7E.6080406@gmail.com>
Message-ID:

On Tue, Sep 30, 2008 at 2:43 PM, Nick Coghlan wrote:
> Of the suggestions I've seen so far, I like Marcin's Mono-inspired
> NULL-escape codec idea the best.
> Since these strings all come from parts
> of the environment where NULLs are not permitted, a simple "'\0' in
> text" check will immediately identify any strings where decoding failed
> (for applications which care about the difference and want to try to do
> better), while applications which don't care will receive perfectly
> valid Python strings that can be passed around and manipulated as if the
> decoding error never happened.

I'm not so sure. While it maintains *internal* consistency, printing
and displaying those filenames isn't likely going to give useful
results. E.g. on the terminal emulator I happen to be using right now
null bytes are simply ignored. Another danger might be that the null
character may be seen as the end of a string by some other library.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de Wed Oct 1 00:21:04 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 01 Oct 2008 00:21:04 +0200
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To:
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <200809300202.38574.victor.stinner@haypocalc.com>
 <48E28C31.6060606@v.loewis.de>
Message-ID: <48E2A650.4000108@v.loewis.de>

>> My concern still is that it brings the bytes type into the status of
>> another character string type, which is really bad, and will require
>> further modifications to Python for the lifetime of 3.x.
>
> I'd like to understand why this is "really bad". I thought it was by
> design that the str and bytes types behave pretty similarly. You can
> use both as dict keys.

If they have to behave pretty similarly, they have to be supported in
all APIs that deal with text. For example, people will demand that
printing bytes should just copy them onto the stream (rather than
invoking repr()), and writing them onto a text stream should work the
same way.
GUI libraries should support them, the XML libraries, and so on.

Where will you stop, and tell people that bytes are just not supposed
to do this or that?

>> This is because applications will then regularly use byte strings for
>> file names on Unix, and regular strings on Windows, and then expect
>> the program to work the same without further modifications.
>
> It seems that bytes arguments actually *do* work on Windows -- somehow
> they get decoded. (Unless Terry's report was from 2.x.)

To a limited degree - see my other message. Don't try to listdir a
directory with characters outside CP_ACP (it will give you invalid
file names).

> Actually something like that may not be a bad idea. Ian Bicking's
> webob supports similar double APIs for getting the request parameters
> out of a request object; I believe request.GET['x'] is a text object
> and request.GET_str['x'] is the corresponding uninterpreted bytes
> sequence. I would prefer to have os.environb over os.environ[b"PATH"]
> though.

And would you keep them synchronized?

> I assume at some point we can stop and have sufficiently low-level
> interfaces that everyone can agree are in bytes only. Bytes aren't
> going away. How does Java deal with this? Its File class doesn't seem
> to deal in bytes at all. What would its listFiles() method do with
> undecodable filenames?

Apparently (JDK 1.5.0_16, on Linux), it decodes undecodable bytes/byte
sequences as U+FFFD (REPLACEMENT CHARACTER). Opening such a file will
fail with FileNotFoundException.

IOW, Java hasn't solved the problem in the last 10 years. Marcin
Kowalczyk did a more thorough analysis about a year ago in

http://mail.python.org/pipermail/python-3000/2007-September/010450.html

Regards,
Martin

From guido at python.org Wed Oct 1 00:23:15 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 30 Sep 2008 15:23:15 -0700
Subject: [Python-Dev] Filename as byte string in python 2.6 or 3.0?
In-Reply-To: <48E2A5B9.5090005@gmail.com>
References: <200809271404.25654.victor.stinner@haypocalc.com>
 <200809291359.06334.eckhardt@satorlaser.com>
 <20080929140133.31635.1170224088.divmod.xquotient.314@weber.divmod.com>
 <48E15B83.9040205@v.loewis.de>
 <48E1F53B.7030901@gmail.com>
 <48E29D7E.6080406@gmail.com>
 <48E2A5B9.5090005@gmail.com>
Message-ID:

On Tue, Sep 30, 2008 at 3:18 PM, Nick Coghlan wrote:
> That said, I don't think this is something we (or, more to the point,
> Guido) need to make a decision on right now - for 3.0, having
> bytes-level APIs that can see everything, and Unicode APIs that ignore
> badly encoded filenames is worth trying. If it proves inadequate, then
> we can revisit the idea of some kind of implicit escaping mechanism in
> the Unicode APIs for 3.1 when there is more time for a proper PEP.

Right. Given that most syscalls already support both bytes and
(unicode) str, the simplest thing to do is to take this a bit further,
along the lines of Victor's patches, which I'm reviewing in Rietveld
right now:

http://codereview.appspot.com/3055

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de Wed Oct 1 00:28:22 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 01 Oct 2008 00:28:22 +0200
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To: <83758335-97EA-441B-A783-05F16EBE6D7A@fuhm.net>
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <48E29CB1.5010309@v.loewis.de>
 <83758335-97EA-441B-A783-05F16EBE6D7A@fuhm.net>
Message-ID: <48E2A806.6020607@v.loewis.de>

> Yes! If there is a byte-string access method for Windows, pretty please
> make it decode from UTF-8 internally and call the Unicode version of the
> Windows APIs. The non-unicode windows APIs are pretty much just broken
> -- Ideally, Python should never be calling those.
I don't think we will manage to release Python 3.0 this year if that
change is to be implemented. And then, I don't think the release
manager will agree to such a delay.

I disagree that the ANSI APIs are broken. For most users (and by that,
I mean much more than 99% of the world population with access to
Windows computers), they work just fine. You have to deliberately try
to break them, or work in an environment where you speak multiple
languages (with conflicting scripts) simultaneously.

Practicality beats purity, and I applaud Microsoft for such a
foresighted design (they are guilty of bad designs in other places,
but this one really gives a good tradeoff of all issues, all things
considered).

Regards,
Martin

From martin at v.loewis.de Wed Oct 1 00:32:03 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 01 Oct 2008 00:32:03 +0200
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To:
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <48E29D3B.5030900@v.loewis.de>
Message-ID: <48E2A8E3.3070805@v.loewis.de>

> How does windows (and Python on windows) handle NFC versus NFD issues?

That's left to the application.

> Can I have two files called "Ümlaut.txt", one in NFD and one NFC form?

Yes, you can. It sounds confusing, but only in a theoretical way. You
never have combining characters on Windows (at least, I don't). The
keyboard input defaults to NFC, and users normally don't type file
names, anyways, except when creating the files - later, they just use
the mouse to indicate what file they want to act on.

> And are both of those representable on the Python side (i.e. can they
> both be returned from listdir() and passed to open())?

Certainly!

> If I compare
> these two filenames, do they compare differently?

Certainly!
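The NFC/NFD answers above can be checked at the string level with the standard unicodedata module. This is only a sketch of how Python str objects compare; the file name is taken from the question, the choice of a capital U with diaeresis is an assumption, and nothing here touches a real filesystem.

```python
import unicodedata

# The same visible name in two normalization forms.
nfc_name = unicodedata.normalize("NFC", "\u00dcmlaut.txt")  # precomposed U+00DC
nfd_name = unicodedata.normalize("NFD", nfc_name)           # "U" + U+0308

# Both are valid str objects, but they are different code point sequences.
print(nfc_name == nfd_name)            # False
print(len(nfc_name), len(nfd_name))    # 10 11

# An application that wants them to match has to normalize explicitly.
same_after_normalizing = unicodedata.normalize("NFC", nfd_name) == nfc_name
print(same_after_normalizing)          # True
```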
Regards,
Martin

From guido at python.org Wed Oct 1 00:33:50 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 30 Sep 2008 15:33:50 -0700
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To: <48E2A650.4000108@v.loewis.de>
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <200809300202.38574.victor.stinner@haypocalc.com>
 <48E28C31.6060606@v.loewis.de>
 <48E2A650.4000108@v.loewis.de>
Message-ID:

On Tue, Sep 30, 2008 at 3:21 PM, "Martin v. Löwis" wrote:
>>> My concern still is that it brings the bytes type into the status of
>>> another character string type, which is really bad, and will require
>>> further modifications to Python for the lifetime of 3.x.
>>
>> I'd like to understand why this is "really bad". I thought it was by
>> design that the str and bytes types behave pretty similarly. You can
>> use both as dict keys.
>
> If they have to behave pretty similarly, they have to be supported in
> all APIs that deal with text.

I don't see how you get from "pretty similarly" to "all APIs". :-)

> For example, people will demand that
> printing bytes should just copy them onto the stream (rather than
> invoking repr()), and writing them onto a text stream should work the
> same way. GUI libraries should support them, the XML libraries, and so
> on.
>
> Where will you stop, and tell people that bytes are just not supposed
> to do this or that?

Printing a bytes object already works, and displays its repr(), which
is guaranteed to be pure ASCII (unlike the repr() of a unicode str
object in Py3k). All the others you mention will cause breakage as
they should -- these errors exist to force the programmer to think
about encodings or conversions. I don't see that as a big burden
because the only way there could be bytes here in the first place is
when the user explicitly requested bytes. A program that only ever
passes text strings to the os module is only ever going to get text
strings back.
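The "text in, text out; bytes in, bytes out" rule can be illustrated with os.listdir(). This sketch assumes the behaviour of current Python 3 releases (including os.fsencode(), which did not exist when this thread was written); the temporary directory and the file name example.txt are made up for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Create one file so the directory listing is not empty.
    open(os.path.join(tmpdir, "example.txt"), "w").close()

    # A str path yields str names; a bytes path yields bytes names.
    str_names = os.listdir(tmpdir)
    bytes_names = os.listdir(os.fsencode(tmpdir))

print(str_names)     # ['example.txt']
print(bytes_names)   # [b'example.txt']
```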
>>> This is because applications will then regularly use byte strings for
>>> file names on Unix, and regular strings on Windows, and then expect
>>> the program to work the same without further modifications.
>>
>> It seems that bytes arguments actually *do* work on Windows -- somehow
>> they get decoded. (Unless Terry's report was from 2.x.)
>
> To a limited degree - see my other message. Don't try to listdir a
> directory with characters outside CP_ACP (it will give you invalid
> file names).

Understood.

>> Actually something like that may not be a bad idea. Ian Bicking's
>> webob supports similar double APIs for getting the request parameters
>> out of a request object; I believe request.GET['x'] is a text object
>> and request.GET_str['x'] is the corresponding uninterpreted bytes
>> sequence. I would prefer to have os.environb over os.environ[b"PATH"]
>> though.
>
> And would you keep them synchronized?

Yes, the bytes versions would be the canonical version and the str
version would wrap around that -- though updating the str version
would also update the bytes version. Some keys would be missing from
the str version (or perhaps they would raise exceptions or default to
some other error handler, like ignore or replace).

>> I assume at some point we can stop and have sufficiently low-level
>> interfaces that everyone can agree are in bytes only. Bytes aren't
>> going away. How does Java deal with this? Its File class doesn't seem
>> to deal in bytes at all. What would its listFiles() method do with
>> undecodable filenames?
>
> Apparently (JDK 1.5.0_16, on Linux), it decodes undecodable bytes/byte
> sequences as U+FFFD (REPLACEMENT CHARACTER). Opening such a file will
> fail with FileNotFoundException.
>
> IOW, Java hasn't solved the problem in the last 10 years. Marcin
> Kowalczyk did a more thorough analysis about a year ago in
>
> http://mail.python.org/pipermail/python-3000/2007-September/010450.html

I can't say I like the Java solution.
I would like to be able to write a robust backup tool in Python, even
if the code needed to make it work everywhere isn't going to win any
prizes (due to the need to use bytes on Unix, str on Windows).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From foom at fuhm.net Wed Oct 1 00:36:23 2008
From: foom at fuhm.net (James Y Knight)
Date: Tue, 30 Sep 2008 18:36:23 -0400
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To: <48E2A650.4000108@v.loewis.de>
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <200809300202.38574.victor.stinner@haypocalc.com>
 <48E28C31.6060606@v.loewis.de>
 <48E2A650.4000108@v.loewis.de>
Message-ID: <0DBCA888-43DA-4DE9-952F-A377E96B286D@fuhm.net>

On Sep 30, 2008, at 6:21 PM, Martin v. Löwis wrote:
> IOW, Java hasn't solved the problem in the last 10 years.

Java is already really bad at being a small little language to write
cooperating tools in. I'd never even attempt to write a little
pipeline filter in Java -- I've already pretty much learned to expect
Java applications to be in their own world, so I'd hardly find it
surprising if a Java app could only read files it wrote itself,
nevermind files in odd encodings.

Python, on the other hand, is an awesome tool for writing small little
scripts that interact well with the surrounding environment, Just The
Way It Is, without trying to layer so much abstraction upon it so that
you lose functionality. Moving away from that would be unfortunate.

James

From Jack.Jansen at cwi.nl Wed Oct 1 00:49:57 2008
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Wed, 1 Oct 2008 00:49:57 +0200
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To: <48E2A8E3.3070805@v.loewis.de>
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <48E29D3B.5030900@v.loewis.de>
 <48E2A8E3.3070805@v.loewis.de>
Message-ID: <82D029DA-C218-4631-A68E-CE3DBB03494A@cwi.nl>

On 1-Oct-2008, at 00:32 , Martin v. Löwis wrote:
>
>> How does windows (and Python on windows) handle NFC versus NFD
>> issues?
>
> That's left to the application.
>
>> Can I have two files called "Ümlaut.txt", one in NFD and one NFC
>> form?
>
> Yes, you can. It sounds confusing, but only in a theoretical way. You
> never have combining characters on Windows (at least, I don't). The
> keyboard input defaults to NFC, and users normally don't type file
> names, anyways, except when creating the files - later, they just use
> the mouse to indicate what file they want to act on.
>
>> And are both of those representable on the Python side (i.e. can they
>> both be returned from listdir() and passed to open())?
>
> Certainly!
>
>> If I compare
>> these two filenames, do they compare differently?
>
> Certainly!

Actually, that all sounds pretty non-confusing to me:-)

So, normal users will always have the one form, and if by chance they
get the other form they can still use the file. Also from Python, even
when doing listdir() and then open(), everything will work just as
expected. That there are two files that have a similar visual
representation is not too bad, the same happens with ellipses versus
dot-dot-dot and many other cases.

Which means the only problem area left is unix filesystems (whether on
Linux or mounted remotely on MacOS or whatever), where filenames are
really byte strings with only / and nul illegal.

--
Jack Jansen, , http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman

From steve at pearwood.info Wed Oct 1 01:08:39 2008
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 1 Oct 2008 09:08:39 +1000
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To: <48E29CB1.5010309@v.loewis.de>
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <48E29CB1.5010309@v.loewis.de>
Message-ID: <200810010908.40274.steve@pearwood.info>

On Wed, 1 Oct 2008 07:40:01 am Martin v. Löwis wrote:
> >> On Windows, we might reject bytes filenames for all file
> >> operations: open(), unlink(), os.path.join(), etc. (raise a
> >> TypeError or UnicodeError)
> >
> > Since I've seen no objections to this yet: please no. If we offer a
> > "lower-level" bytes filename API, it should work for all platforms.
>
> Unfortunately, it can't. You cannot represent all possible file names
> in a byte string in Windows (just as you can't do so in a Unicode
> string on Unix).

Sorry, maybe I'm just being thick here, but I don't understand how
that is possible. On the physical disk, each Windows file name must be
represented by a byte string, yes? So how is it possible that there
are Windows files with names that can't be represented as a byte
string? What have I missed?

--
Steven

From guido at python.org Wed Oct 1 01:21:37 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 30 Sep 2008 16:21:37 -0700
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To: <200810010908.40274.steve@pearwood.info>
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <48E29CB1.5010309@v.loewis.de>
 <200810010908.40274.steve@pearwood.info>
Message-ID:

On Tue, Sep 30, 2008 at 4:08 PM, Steven D'Aprano wrote:
> On Wed, 1 Oct 2008 07:40:01 am Martin v. Löwis wrote:
>> >> On Windows, we might reject bytes filenames for all file
>> >> operations: open(), unlink(), os.path.join(), etc.
>> >> (raise a TypeError or UnicodeError)
>> >
>> > Since I've seen no objections to this yet: please no. If we offer a
>> > "lower-level" bytes filename API, it should work for all platforms.
>>
>> Unfortunately, it can't. You cannot represent all possible file names
>> in a byte string in Windows (just as you can't do so in a Unicode
>> string on Unix).
>
> Sorry, maybe I'm just being thick here, but I don't understand how that
> is possible. On the physical disk, each Windows file name must be
> represented by a byte string, yes? So how is it possible that there are
> Windows files with names that can't be represented as a byte string?
> What have I missed?

I believe on disk it uses UTF-16.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steve at pearwood.info Wed Oct 1 02:04:39 2008
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 1 Oct 2008 10:04:39 +1000
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To:
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <200810010908.40274.steve@pearwood.info>
Message-ID: <200810011004.39759.steve@pearwood.info>

On Wed, 1 Oct 2008 09:21:37 am you wrote:
> On Tue, Sep 30, 2008 at 4:08 PM, Steven D'Aprano wrote:
> > On Wed, 1 Oct 2008 07:40:01 am Martin v. Löwis wrote:
> >> >> On Windows, we might reject bytes filenames for all file
> >> >> operations: open(), unlink(), os.path.join(), etc. (raise a
> >> >> TypeError or UnicodeError)
> >> >
> >> > Since I've seen no objections to this yet: please no. If we
> >> > offer a "lower-level" bytes filename API, it should work for all
> >> > platforms.
> >>
> >> Unfortunately, it can't. You cannot represent all possible file
> >> names in a byte string in Windows (just as you can't do so in a
> >> Unicode string on Unix).
> >
> > Sorry, maybe I'm just being thick here, but I don't understand how
> > that is possible.
> > On the physical disk, each Windows file name must
> > be represented by a byte string, yes? So how is it possible that
> > there are Windows files with names that can't be represented as a
> > byte string? What have I missed?
>
> I believe on disk it uses UTF-16.

Which is made up of bytes. There may be byte sequences that are
illegal UTF-16, but that's not what Martin said. I don't understand
how there can be UTF-16 sequences which don't correspond to some
sequence of bytes. How would they be represented in memory? Is this to
do with the endianness of the UTF-16 sequence?

--
Steven

From murman at gmail.com Wed Oct 1 02:16:19 2008
From: murman at gmail.com (Michael Urman)
Date: Tue, 30 Sep 2008 19:16:19 -0500
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To: <200810011004.39759.steve@pearwood.info>
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <200810010908.40274.steve@pearwood.info>
 <200810011004.39759.steve@pearwood.info>
Message-ID:

On Tue, Sep 30, 2008 at 7:04 PM, Steven D'Aprano wrote:
>> I believe on disk it uses UTF-16.
>
> Which is made up of bytes. There may be byte sequences that are illegal
> UTF-16, but that's not what Martin said. I don't understand how there
> can be UTF-16 sequences which don't correspond to some sequence of
> bytes. How would they be represented in memory? Is this to do with the
> endianness of the UTF-16 sequence?

It has to do with the internal mapping between the ANSI and Unicode
functions. On NT systems, CreateFileA will map the ANSI bytestring to
a Unicode filename via the active code page, and call CreateFileW
accordingly. The active code page cannot be set to something as useful
as UTF-8, so given any actual code page (1252, 932, etc.) there are
Unicode strings that cannot be represented with a bytestring provided
to the ANSI function.
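The code page limitation described above can be approximated with Python's own codecs. This is a sketch under assumptions: the helper representable() is hypothetical, the sample names are invented, and Python's strict cp1252/cp932 codecs are used as stand-ins for the Windows ANSI conversion (which may substitute best-fit characters rather than fail).

```python
def representable(name, codepage):
    """Return True if every character of name exists in the given code page."""
    try:
        name.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


japanese = "\u65e5\u672c\u8a9e.txt"   # a Japanese file name
german = "\u00dcmlaut.txt"            # a name with U+00DC

print(representable(japanese, "cp1252"))  # False: no kanji in code page 1252
print(representable(japanese, "cp932"))   # True
print(representable(german, "cp1252"))    # True
```

Whatever single code page is active, some valid Unicode file names fall outside it, which is the point made in the message.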
--
Michael Urman

From victor.stinner at haypocalc.com Wed Oct 1 02:17:33 2008
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 1 Oct 2008 02:17:33 +0200
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To: <48E2A806.6020607@v.loewis.de>
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <83758335-97EA-441B-A783-05F16EBE6D7A@fuhm.net>
 <48E2A806.6020607@v.loewis.de>
Message-ID: <200810010217.33570.victor.stinner@haypocalc.com>

On Wednesday 01 October 2008 00:28:22, Martin v. Löwis wrote:
> I don't think we will manage to release Python 3.0 this year if that
> change is to be implemented. And then, I don't think the release manager
> will agree to such a delay.

The minimum change is to disallow bytes/str mix:
 - os.listdir(unicode)->unicode and ignore invalid files
   (current behaviour is to return unicode and bytes)
 - os.readlink(unicode)->unicode or raise an error
   (current behaviour is to return unicode or bytes)
 - remove os.getcwdu() (use its code -which is better- for getcwd)
   and fix the test_unicode_file.py

listdir() change (ignore invalid filenames) is important to avoid
strange bugs in os.path.*(), glob.*() or on displaying a filename.

I can generate a specific patch for these issues. It's just a subset
of my last patch.

--
Victor Stinner aka haypo
http://www.haypocalc.com/blog/

From greg.ewing at canterbury.ac.nz Wed Oct 1 02:33:51 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 01 Oct 2008 12:33:51 +1200
Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
In-Reply-To: <48E243CA.1090604@egenix.com>
References: <200809291407.55291.victor.stinner@haypocalc.com>
 <200809300202.38574.victor.stinner@haypocalc.com>
 <48E1C097.8030309@v.loewis.de>
 <48E20017.3020405@egenix.com>
 <48E243CA.1090604@egenix.com>
Message-ID: <48E2C56F.50300@canterbury.ac.nz>

M.-A. Lemburg wrote:

> In the end, I think it's better not to be clever and just return
> the filenames that cannot be decoded as bytes objects in os.listdir().

But since it's a rare occurrence, most applications are just going to
ignore the issue, and then fail unexpectedly one day on some
unsuspecting user that doesn't have the inclination to go diving into
the code to fix it.

--
Greg

From glyph at divmod.com Wed Oct 1 03:27:26 2008
From: glyph at divmod.com (glyph at divmod.com)
Date: Wed, 01 Oct 2008 01:27:26 -0000
Subject: [Python-Dev] Filename as byte string in python 2.6 or 3.0?
In-Reply-To:
References: <200809271404.25654.victor.stinner@haypocalc.com>
 <48E15B83.9040205@v.loewis.de>
 <48E1F53B.7030901@gmail.com>
 <1222771976.2598.39.camel@localhost>
 <1222785585.2598.45.camel@localhost>
 <20080930181231.31635.1557225685.divmod.xquotient.503@weber.divmod.com>
 <20080930184258.31635.76988507.divmod.xquotient.511@weber.divmod.com>
Message-ID: <20081001012726.31635.1673198315.divmod.xquotient.605@weber.divmod.com>

On 30 Sep, 09:37 pm, guido at python.org wrote:
>On Tue, Sep 30, 2008 at 11:42 AM, wrote:
>>There are other ways to glean this knowledge; for example, looking at
>>the 'iocharset' or 'nls' mount options supplied to mount various
>>filesystems.

>I know we could do a better job, but absent anyone who knows what
>they're doing we've chosen a fairly conservative approach. I certainly
>hope that someone will contribute some mean encoding-guessing code to
>the stdlib that users can use. I'm not sure if I'll ever endorse doing
>this automatically in io.open(), though I'd be fine with a convention
>like passing encoding="guess".

I think the conservative approach is actually correct, or rather, as
close to correct as it is possible to get in this mess. Inspecting
these fantastically obscure options is only likely to be helpful in a
tool which tries to correct filesystem encoding errors on legacy data.
I wouldn't even know about them if I hadn't written several such tools (well, just little scripts, really) in the past. I was just verifying that I wasn't missing some "right way" which would let someone else do the guesswork for me. In reality, you have two options for filesystem encoding on Linux:
* UTF-8
* fall in a well and die

The OS will happily let you create a completely nonsensical environment where no application can possibly do anything reasonable: set LC_ALL to KOI8R, mount your USB keychain as Shift_JIS and your windows partition as ISO-8859-8. Of course nobody would actually _do_ this, because they want things to work, so everything is gradually evolving to a default of UTF-8 everywhere. In practice, however, there are still problems with CIFS/SMB shares where other clients have different ideas about encoding. I've experienced this most commonly when sharing with Macs, which have very particular and different ideas about normalization, as has already been discussed in this thread.

From glyph at divmod.com Wed Oct 1 04:06:25 2008 From: glyph at divmod.com (glyph at divmod.com) Date: Wed, 01 Oct 2008 02:06:25 -0000 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: References: <200809291407.55291.victor.stinner@haypocalc.com> <200809300202.38574.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E2865A.3010404@v.loewis.de> Message-ID: <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> On 30 Sep, 09:22 pm, guido at python.org wrote: >On Tue, Sep 30, 2008 at 1:04 PM, "Martin v. Löwis" >wrote: >>Guido van Rossum wrote: >>>On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" >>> wrote: >>>Martin, I don't understand why you are in favor of storing raw bytes >>>encoded as Latin-1 in Unicode string objects, which clearly gives >>>rise >>>to mojibake. This is my word of the day, by the way. Reading this whole thread was _totally_ worth it to learn about "mojibake".
Obviously I'm familiar with the phenomenon but somehow I'd never heard this awesome term before. >I am also encouraged by Glyph's support for (a). He has a lot of >practical experience. Thanks for the vote of confidence. I hope for all our sakes that you're not over-valuing that experience ;-). For what it's worth, I can see MvL's point in that I think there is some danger in generating confusion by adding _too many_ string-like functions to the bytes type. I don't want my suggestion to contribute to the confusion between bytes and text. However, Martin, I can promise you that I will _never_ ask for any convenience functions related to bytes as a result of this decision. I want bytes to come back from filesystem APIs because I intend to have a wrapper layer which knows two things about the file: the bytes (which are needed to talk to POSIX filesystem APIs) and the characters (which are computed from those bytes, can be safely renormalized, displayed to users, etc). On Windows this filesystem wrapper will necessarily behave differently, and will not use bytes for anything. Any formatting beyond joining path segments together and possibly splitting extensions off will be done on character strings, not byte strings. The proposal of using U+0000 seems like it would have been almost the same from such a wrapper's perspective, except (A) people using the filesystem APIs without the benefit of such a wrapper would have been even more screwed, and (B) there are a few nasty corner-cases when dealing with surrogate (i.e. invalid, in UTF-8) code points which I'm not quite sure what it would have done with. Guido already mentioned "libraries" as a hypothetical issue, but here's a real-world problem that results from putting NULLs into filenames. 
Consider this program:

    import gtk
    w = gtk.Window()
    b = gtk.Button(u"\u0000/hello/world")
    w.add(b)
    w.show_all()
    gtk.main()

which emits this message:

    TypeError: OGtkButton.__init__() argument 1 must be string without null bytes or None, not unicode

SQLite has a similar problem with NULLs, and I'm definitely sticking paths in there, too. Eventually I'd like to propose such a path type for inclusion in the stdlib, but that will have to wait for issues like to be resolved.

From rhamph at gmail.com Wed Oct 1 04:22:08 2008 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 30 Sep 2008 20:22:08 -0600 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <200809300202.38574.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E2865A.3010404@v.loewis.de> <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> Message-ID: On Tue, Sep 30, 2008 at 8:06 PM, wrote: > The proposal of using U+0000 seems like it would have been almost the same > from such a wrapper's perspective, except (A) people using the filesystem > APIs without the benefit of such a wrapper would have been even more > screwed, and (B) there are a few nasty corner-cases when dealing with > surrogate (i.e. invalid, in UTF-8) code points which I'm not quite sure what > it would have done with. Surrogates in UTF-8 *should* be treated as errors, but current python is far too lax. That actually leads to another problem: improving validation will change what gets escaped and what doesn't.
http://bugs.python.org/issue3297 http://bugs.python.org/issue3672 -- Adam Olsen, aka Rhamphoryncus From foom at fuhm.net Wed Oct 1 05:32:04 2008 From: foom at fuhm.net (James Y Knight) Date: Tue, 30 Sep 2008 23:32:04 -0400 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <200809300202.38574.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E2865A.3010404@v.loewis.de> <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> Message-ID: <22920D6A-8B70-4E6D-BE99-D7447D831B41@fuhm.net> On Sep 30, 2008, at 10:06 PM, glyph at divmod.com wrote: > However, Martin, I can promise you that I will _never_ ask for any > convenience functions related to bytes as a result of this > decision. I want bytes to come back from filesystem APIs because I > intend to have a wrapper layer which knows two things about the > file: the bytes (which are needed to talk to POSIX filesystem APIs) > and the characters (which are computed from those bytes, can be > safely renormalized, displayed to users, etc). On Windows this > filesystem wrapper will necessarily behave differently, and will not > use bytes for anything. Any formatting beyond joining path segments > together and possibly splitting extensions off will be done on > character strings, not byte strings. Can you clarify what proposal you are supporting for Python: 1) Two sets of APIs, one returning unicode strings, and one returning bytestrings. (subpoints: what does the unicode-returning API do when it cannot decode the bytestring into unicode? raise exception, pretend argument/envvar/file didn't exist/?) or 2) All APIs return bytestrings only. Converting to unicode is considered lossy, and would have to be done by applications for display purposes only. I really don't understand the reasoning for (1). 
It seems to me that most software (probably including all of the Python stdlib) would continue to use the unicode string API. Switching all of the Python stdlib to use the bytestring APIs instead would certainly be a large undertaking, and would have all sorts of ripple-on API changes (e.g. __file__). So I can only imagine that if you're proposing (1), you're doing so without the intention of suggesting that Python be converted to use it. And so, of course, that doesn't really fix things (such as getcwd failing if your cwd is a path that is undecodeable in the current locale, or well, currently, python refusing to even start). If you're proposing (2), it's at least as large an undertaking as (1) + converting Python to use the optional bytestring APIs. But at least it avoids exposing an API that people ought not use, and does make it obvious what still needs to be fixed: the unfixed code simply won't run at all. > The proposal of using U+0000 seems like it would have been almost > the same from such a wrapper's perspective, except (A) people using > the filesystem APIs without the benefit of such a wrapper would have > been even more screwed I'm not sure what your "more screwed" is comparing against: current py3k behavior? (aka: decoding to Unicode in locale's specified encoding)? I don't see how you can really be more screwed than that: not only can't you send your filename to display in a Gtk+ button, you can't access it at all, even staying within python. > and (B) there are a few nasty corner-cases when dealing with > surrogate (i.e. invalid, in UTF-8) code points which I'm not quite > sure what it would have done with. The lone-surrogate-pair proposal was a totally different proposal than the U+0000 one. 
James From tjreedy at udel.edu Wed Oct 1 05:46:00 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 30 Sep 2008 23:46:00 -0400 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: References: <200809291407.55291.victor.stinner@haypocalc.com> Message-ID: Guido van Rossum wrote: > No, that's because bytes is missing from the explicit list of > allowable types in io.open. Victor has a one-line trivial patch for > this. Could you try this though? > >>>> import _fileio >>>> _fileio._FileIO(b'tem') >>> import _fileio >>> _fileio._FileIO(b'tem') _fileio._FileIO(3, 'r') >>> From glyph at divmod.com Wed Oct 1 07:19:47 2008 From: glyph at divmod.com (glyph at divmod.com) Date: Wed, 01 Oct 2008 05:19:47 -0000 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <22920D6A-8B70-4E6D-BE99-D7447D831B41@fuhm.net> References: <200809291407.55291.victor.stinner@haypocalc.com> <200809300202.38574.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E2865A.3010404@v.loewis.de> <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> <22920D6A-8B70-4E6D-BE99-D7447D831B41@fuhm.net> Message-ID: <20081001051947.31635.1251804577.divmod.xquotient.807@weber.divmod.com> On 03:32 am, foom at fuhm.net wrote: >On Sep 30, 2008, at 10:06 PM, glyph at divmod.com wrote: >Can you clarify what proposal you are supporting for Python: Sure. Neither of your descriptions is terribly accurate, but I'll try to explain. >1) Two sets of APIs, one returning unicode strings, and one returning >bytestrings. (subpoints: what does the unicode-returning API do when >it cannot decode the bytestring into unicode? raise exception, pretend >argument/envvar/file didn't exist/?) The only API discussed so far which would actually provide two variants is 'getcwd', which would have a 'getcwdb' that gives back bytes instead. Pretty much every other API takes some kind of input. 
listdir(bytes) would give back bytes, while listdir(text) would give back text. listdir(text) would skip undecodable filenames. Similarly for all the other APIs in os and os.path that take pathnames for input. >2) All APIs return bytestrings only. Converting to unicode is >considered lossy, and would have to be done by applications for >display purposes only. This is a bad way to do things, because on Windows, filenames *really are* unicode. Converting to bytes is what's lossy. (See previous discussion of active codepages and CreateFileA/CreateFileW.) >I really don't understand the reasoning for (1). The reasoning is that a lot of software doesn't care if it's wrong for edge cases, it's really hard to come up with something that's correct with respect to all of those edge cases (absurdly difficult, if you need to stay in the straightjacket of string / bytes types, as well as provide a useful library interface - which is why we're having this discussion). But, it should be _possible_ to write software that's correct in the face of those edge cases. And - let's not forget this - the worlds of POSIX and Windows really are different and really do require subtly different inputs. Python can try to paper over this like Java does and make it impossible to write certain classes of application, or it can just provide an ugly, slightly inconsistent API that exposes the ugly, slightly inconsistent reality. Modulo the issues you've raised which I don't think the proposal totally covers yet (abspath with a non-decodable cwd) I think it strikes a nice balance; allow people to live in the delusion of unicode-on-POSIX and have software that mostly works, most of the time, or allow them to face the unpleasantness and spend the effort to get something really solid. I think the _right_ answer to all of this is to (A) make FilePath work completely correctly for every totally insane edge case ever, and (B) include it in the stdlib. One day I think we'll do that. 
But nobody has the time or energy to do even the first part of that *right now*, before 3.0 is released, so I'm just looking for something which it will be possible to build FilePath, or something like it, on top of, without breaking other people's applications who rely on the os module directly too badly. >It seems to me that most software (probably including all of the >Python stdlib) would continue to use the unicode string API. That's true. And that software wouldn't handle these edge cases completely correctly. As Guido put it, "it's a quality of implementation issue". >Switching all of the Python stdlib to use the bytestring APIs instead >would certainly be a large undertaking, and would have all sorts of >ripple-on API changes (e.g. __file__). I am not quite sure what to do about __file__. My preference would probably be to use unicode filename for consistency so it can always be displayed, but provide a second attribute (__open_file__?) that would be sometimes unicode, sometimes bytes, which would be guaranteed to work with open(). I suspect that most software which interacts with __file__ on a deep level would be of the variety which would deal with the edge cases. But where the Python stdlib wants a pathname it should be accepting either bytes or unicode, as all of the os.path functions want. This does kind of suck, but the alternatives are to encode crazy extra information in unicode path names that cannot be exchanged with other programs (or with users: NULL is potentially the worst bogus character from a UI perspective), or revert to bytes for everything (which is a non-solution, c.f. Windows above). >So I can only imagine that if you're proposing (1), you're doing so >without the intention of suggesting that Python be converted to use >it. Maybe updating the stdlib to be correct in the face of such changes is hard, but it doesn't seem intractible. 
Taken together, it looks like there are only about 100 calls in the stdlib to both getcwd and abspath together, and I suspect many of them are for purely aesthetic purposes and could just be eliminated, and many of them are redefinitions of the functions and don't need any changes. All the other path manipulation functions would continue to work as-is, although some of them might skip undecodable files. >And so, of course, that doesn't really fix things (such as getcwd >failing if your cwd is a path that is undecodeable in the current >locale, or well, currently, python refusing to even start). The proposal as I understand it so far doesn't address this specifically, so I'll try to. os.getcwd, os.path.abspath, and os.path.realpath (when called with unicode) will probably need to do something gross if they're called on a non-decodable directory. One thing that comes to mind is to create a temporary symbolic link and return u'/tmp/python-$YOURUID-undecodable/$GUID/something'. I hope someone else has a better idea, especially since that sort of defeats the purpose of realpath. On the other hand, even this strawman answer is correct for pretty much any sane purpose, and if you _really_ care, you need to learn that you have to use and ask for bytes, on POSIX, to deal with such corner cases. >If you're proposing (2), (...) Luckily I'm not. >>The proposal of using U+0000 seems like it would have been almost the >>same from such a wrapper's perspective, except (A) people using the >>filesystem APIs without the benefit of such a wrapper would have been >>even more screwed > >I'm not sure what your "more screwed" is comparing against: current >py3k behavior? (aka: decoding to Unicode in locale's specified >encoding)? I don't see how you can really be more screwed than that: >not only can't you send your filename to display in a Gtk+ button, you >can't access it at all, even staying within python. 
You're screwed if you're trying to access files in a portable way without worrying at all about encodings. There are files you won't be able to access, there are conditions you won't be able to deal with. Sorry, but POSIX sucks and that's life. You're _more_ screwed if you're trying to access those files in a portable way without worrying about encodings, and the API you're using is giving you back invalid, magic path names, with NULLs rather than being slightly lossy and dropping filenames you (obviously, by virtue of the way you requested those filenames) won't be able to deal with. So I was talking here about the default behavior in the case of a naive program that wants to pretend all paths are unicode. >>and (B) there are a few nasty corner-cases when dealing with >>surrogate (i.e. invalid, in UTF-8) code points which I'm not quite >>sure what it would have done with. > >The lone-surrogate-pair proposal was a totally different proposal than >the U+0000 one. I wasn't referring to the lone-surrogate-pair encoding trick, I was referring to the fact that some people are going to want to treat surrogate pairs as encoding errors (i.e. include the NULL byte) and some will want to treat them as valid. If you want them to be valid you have to normalize away the surrogates in order to talk to other software, but you can't do that because then you'll get different bytes when you re- encode them. There's probably a way around that but it would be subtle and controversial no matter how you did it. 
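The "bytes in, bytes out; text in, text out" convention described in this message can be sketched as follows. This is only an illustration, assuming a Python 3 interpreter in which `os.listdir` accepts both `str` and `bytes` and in which `os.fsencode` is available; `list_both` is a hypothetical helper. It does not reproduce the skip-undecodable-names behaviour proposed here, and the interpreter you run it on may treat such names differently.

```python
import os
import tempfile


def list_both(directory: str):
    """List `directory` twice: once with a text path, once with a bytes path."""
    text_names = os.listdir(directory)               # str in, str out
    byte_names = os.listdir(os.fsencode(directory))  # bytes in, bytes out
    return text_names, byte_names


with tempfile.TemporaryDirectory() as tmp:
    # Create one ordinary file so both listings have something to show.
    open(os.path.join(tmp, "hello.txt"), "w").close()
    text_names, byte_names = list_both(tmp)
    print(text_names)
    print(byte_names)
```

The type of the argument selects the type of the result, so a caller who needs the exact on-disk names can stay in bytes throughout.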
From martin at v.loewis.de Wed Oct 1 07:21:44 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 01 Oct 2008 07:21:44 +0200 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <200810010908.40274.steve@pearwood.info> References: <200809291407.55291.victor.stinner@haypocalc.com> <48E29CB1.5010309@v.loewis.de> <200810010908.40274.steve@pearwood.info> Message-ID: <48E308E8.5060302@v.loewis.de> > Sorry, maybe I'm just being thick here, but I don't understand how that > is possible. On the physical disk, each Windows file name must be > represented by a byte string, yes? So how is it possible that there are > Windows files with names that can't be represented as a byte string? > What have I missed? That we are not really free to choose the byte representation when choosing byte strings. Microsoft has defined how char* (i.e. byte strings) are to be interpreted when interpreting them as byte strings, namely in the ANSI code page. That code page is not capable of representing all file names. We could, for example, use the same representation as is used on disk. However,
a) there is no API to find out what that representation is, and
b) it is not null-byte free, a property often desired for file names, and
c) because it contains null bytes, it won't be easy to display such file names on stdout, or in a GUI window.
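Points b) and c) are easy to see from Python. A minimal sketch, assuming (as suggested earlier in the thread, not confirmed here) that the on-disk form is little-endian UTF-16; `utf16_bytes` is a hypothetical helper name.

```python
def utf16_bytes(name: str) -> bytes:
    """Encode a file name as little-endian UTF-16, without a byte order mark."""
    return name.encode("utf-16-le")


raw = utf16_bytes("a.txt")
print(raw)
# Every ASCII character contributes a zero high byte, so the result
# cannot be passed around as a NUL-terminated C string.
print(b"\x00" in raw)
```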
Regards, Martin From martin at v.loewis.de Wed Oct 1 07:27:47 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 01 Oct 2008 07:27:47 +0200 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <200809300202.38574.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E2865A.3010404@v.loewis.de> <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> Message-ID: <48E30A53.5040708@v.loewis.de> > However, Martin, I can promise you that I will _never_ ask for any > convenience functions related to bytes as a result of this decision. :-) Regards, Martin From hodgestar+pythondev at gmail.com Wed Oct 1 09:52:01 2008 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Wed, 1 Oct 2008 09:52:01 +0200 Subject: [Python-Dev] Patch for an initial support of bytes filename in Python3 In-Reply-To: References: <200809300247.20349.victor.stinner@haypocalc.com> <20080930132151.31635.132601277.divmod.xquotient.434@weber.divmod.com> <20080930175932.31635.989735053.divmod.xquotient.478@weber.divmod.com> Message-ID: On Wed, Oct 1, 2008 at 12:05 AM, Guido van Rossum wrote: > Actually on Windows the syscalls use the encoding that Microsoft uses > -- when using bytes we use the Windows bytes API and when using str we > use the Windows wide API. That's the most platform-compatible > approach. Woot. As long as the Python file API is consistent across the two platforms, I'm happy.
:) From eckhardt at satorlaser.com Wed Oct 1 09:54:47 2008 From: eckhardt at satorlaser.com (Ulrich Eckhardt) Date: Wed, 1 Oct 2008 09:54:47 +0200 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <48E20017.3020405@egenix.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E20017.3020405@egenix.com> Message-ID: <200810010954.47564.eckhardt@satorlaser.com> On Tuesday 30 September 2008, M.-A. Lemburg wrote: > On 2008-09-30 08:00, Martin v. Löwis wrote: > >> Change the default file system encoding to store bytes in Unicode is > >> like introducing a new Python type: . > > > > Exactly. Seems like the best solution to me, despite your polemics. > > Not a bad idea... have os.listdir() return Unicode subclasses that work > like file handles, ie. they have an extra buffer that holds the original > bytes value received from the underlying C API. Why does it have to be a Unicode subclass? In my eyes, a Unicode object promises a few things, in particular that it contains a Unicode string. If it now suddenly contains bytes without any further meaning, that would be bad. What I wonder is what the requirements on path handling are. I'll try to list the ones I can see: 1. A path received from the system should be preserved, so it can be given to the system later on. IOW, the internal representation should not lose any information compared to the one used by the OS. 2. Typical operations like joining two path segments or moving to the parent dir should be defined. 3. There must be a way to display the path to the user. IOW, there should be a way to turn the path into a string that the user can recognise, according to some encoding. Note that this is not always possible, so this can fail. 4. There must be a way to receive a path from the user. That means that there must be a way from a user-entered string to a path. Note that this, too, isn't always possible and can fail. 5.
The conversion between a string and a path should be configurable, defaults retrieved from the system. This is so that most operations will just work and do the thing that the user expects. 6. There should be a way to modify the path data itself. This of course requires knowledge about the internals but gives full power to the programmer. For requirement 3, I would say a lossy conversion to a string would be enough, i.e. try to convert the path to a Unicode string and use a question mark or some escaping to mark parts that can't be decoded. It will allow users to recognise the decodeable parts of the path with hopefully just a few characters left without decoding. For requirement 4, a failure to encode a string to a path must result in a loud failure, i.e. an exception. This is because the user entered a path that we can't use, any guessing what the user might have wanted is futile. Are there any points to add? Uli -- Sator Laser GmbH Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
From hodgestar+pythondev at gmail.com Wed Oct 1 10:05:38 2008 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Wed, 1 Oct 2008 10:05:38 +0200 Subject: [Python-Dev] Patch for an initial support of bytes filename in Python3 In-Reply-To: References: <200809300247.20349.victor.stinner@haypocalc.com> <20080930132151.31635.132601277.divmod.xquotient.434@weber.divmod.com> <20080930175932.31635.989735053.divmod.xquotient.478@weber.divmod.com> <20080930184751.31635.1484325691.divmod.xquotient.520@weber.divmod.com> Message-ID: On Wed, Oct 1, 2008 at 12:04 AM, Guido van Rossum wrote: > Plus, even on Linux Unicode is *usually* what you should be doing, > unless you're writing a backup tool. I still find this line of reasoning a bit worrying. Imagine an end user application like a music player. The user discovers that he can't see some .mp3 or .ogg file from the music player that is visible in the file manager. I would expect him to file a bug on the music player. If the bug was closed with "fix the filename" I imagine the user would respond with "but other programs can access it just fine". I'm not unhappy with the solution Victor is proposing, but I imagine that when I start coding projects in 3.0 I'll default to the bytes versions of the filename methods and use b"path".decode(sys.getfilesystemencoding(), "replace") if I need to get Unicode.
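The approach sketched in the last paragraph (keep the bytes for system calls, decode lossily only for display) might look like this. A minimal sketch; `display_name` is a hypothetical helper, and the explicit `encoding` parameter is an addition that makes the behaviour reproducible regardless of the machine's filesystem encoding.

```python
import sys
from typing import Optional


def display_name(raw: bytes, encoding: Optional[str] = None) -> str:
    """Decode a bytes file name for display only.

    Undecodable bytes become U+FFFD, so the result must never be used to
    reopen the file; keep `raw` for that.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    return raw.decode(encoding, "replace")


# A Latin-1 encoded name seen through a UTF-8 filesystem encoding.
raw_name = "café.mp3".encode("latin-1")
print(display_name(raw_name, "utf-8"))
```

The decoded string is lossy by design, which is why the original bytes have to travel alongside it.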
From victor.stinner at haypocalc.com Wed Oct 1 10:43:25 2008 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 1 Oct 2008 10:43:25 +0200 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> Message-ID: <200810011043.25662.victor.stinner@haypocalc.com>

On Wednesday 01 October 2008 04:06:25, glyph at divmod.com wrote:
> b = gtk.Button(u"\u0000/hello/world") > > which emits this message: > TypeError: OGtkButton.__init__() argument 1 must be string without > null bytes or None, not unicode > > SQLite has a similar problem with NULLs, and I'm definitely sticking > paths in there, too.

I think that you can say "all C libraries". Would it be possible to convert the encoded string to bytes just before calling Gtk? (job done by some Python internals, not as an explicit conversion)

I don't know if it would help the discussion, but Java uses its own modified UTF-8 encoding:
* NUL byte is encoded as 0xc0 0x80 instead of 0x00
* Java doesn't support unicode > 0xFFFF (bouuuuh!)

http://java.sun.com/javase/6/docs/api/java/io/DataInput.html#modified-utf-8

-- Victor Stinner aka haypo http://www.haypocalc.com/blog/

From mal at egenix.com Wed Oct 1 11:32:30 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 01 Oct 2008 11:32:30 +0200 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <200810010954.47564.eckhardt@satorlaser.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E20017.3020405@egenix.com> <200810010954.47564.eckhardt@satorlaser.com> Message-ID: <48E343AE.3080009@egenix.com> On 2008-10-01 09:54, Ulrich Eckhardt wrote: > On Tuesday 30 September 2008, M.-A.
Lemburg wrote: >> On 2008-09-30 08:00, Martin v. Löwis wrote: >>>> Change the default file system encoding to store bytes in Unicode is >>>> like introducing a new Python type: . >>> Exactly. Seems like the best solution to me, despite your polemics. >> Not a bad idea... have os.listdir() return Unicode subclasses that work >> like file handles, ie. they have an extra buffer that holds the original >> bytes value received from the underlying C API. > > Why does it have to be a Unicode subclass? In my eyes, a Unicode object > promises a few things, in particular that it contains a Unicode string. If it > now suddenly contains bytes without any further meaning, that would be bad. Please read my entire email. I was proposing to store the underlying non-decodeable byte string value in such a subclass. The Unicode value of the object would then be that underlying value decoded as e.g. Latin-1 in order to be able to work on it as text. Path operations would have to be made aware of such subclasses and operate on the underlying bytes value. However, like Guido mentioned, this only works if all components are indeed aware of such subclasses... and that's likely to fail for code outside the stdlib. -- Marc-Andre Lemburg eGenix.com From stephen at xemacs.org Wed Oct 1 12:25:36 2008 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Wed, 01 Oct 2008 19:25:36 +0900 Subject: [Python-Dev] Patch for an initial support of bytes filename in Python3 In-Reply-To: References: <200809300247.20349.victor.stinner@haypocalc.com> <20080930132151.31635.132601277.divmod.xquotient.434@weber.divmod.com> <20080930175932.31635.989735053.divmod.xquotient.478@weber.divmod.com> <20080930184751.31635.1484325691.divmod.xquotient.520@weber.divmod.com> Message-ID: <873ajgpq73.fsf@xemacs.org> Simon Cross writes: > I still find this line of reasoning a bit worrying. Imagine an end > user application like a music player. The user discovers that he can't > see some .mp3 or .ogg file from the music player that is visibile is > the file manager. I would expect him to file a bug on the music > player. If the bug was closed with "fix the filename" I imagine the > user would respond with "but other programs can access it just fine". And the user would very likely be *wrong*. The file manager is displaying it, but in the nature of things file managers *don't access files*, they access *directories*. The files they pass to other apps to access. That's precisely the kind of situation that Georg Brandl was describing with OpenOffice. > I'm not unhappy with the solution Victor is proposing, but I imagine > that when I start coding projects in 3.0 I'll default to the bytes > versions of the filename methods and use > b"path".decode(sys.getfilesystemencoding(), "replace") if I need to > get Unicode. But now the user will file a bug because in the file opening dialog they can't *read* their Chinese file names on their USB key because they are appearing in (system encoding) Cyrillic. Do you begin to see the nature of the Catch-22 here? I don't expect the user to be very sympathetic when you tell her to fix the filenames, but it's not as easy as you would think to get this right. 
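The trade-off in the message above can be sketched in a few lines. This is only a minimal illustration, assuming Python 3 semantics; the helper name display_name and the sample byte strings are hypothetical and do not come from any patch discussed in this thread. It shows that decoding with "replace" never raises but loses the original bytes, and that decoding with a wrong codec that accepts every byte silently produces unrelated characters (the Chinese-shown-as-Cyrillic case).

```python
def display_name(raw: bytes, encoding: str) -> str:
    """Decode filename bytes for display only (hypothetical helper).

    Undecodable bytes become U+FFFD, so the result cannot be used to
    reopen the file.
    """
    return raw.decode(encoding, "replace")


# Latin-1 bytes for "café.mp3", read on a system that assumes UTF-8.
raw = b"caf\xe9.mp3"
shown = display_name(raw, "utf-8")

# The round trip does not give the original bytes back.
lossy = shown.encode("utf-8") != raw

# GB2312 bytes for a two-character Chinese name, read as KOI8-R:
# no error is raised, the user just sees four unrelated characters.
chinese = "\u4e2d\u6587".encode("gb2312")
mojibake = display_name(chinese, "koi8_r")
```

Keeping the raw bytes next to the display string is what lets a program still open the file even when the displayed name is wrong.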
From hodgestar+pythondev at gmail.com Wed Oct 1 12:48:35 2008 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Wed, 1 Oct 2008 12:48:35 +0200 Subject: [Python-Dev] Patch for an initial support of bytes filename in Python3 In-Reply-To: <873ajgpq73.fsf@xemacs.org> References: <200809300247.20349.victor.stinner@haypocalc.com> <20080930132151.31635.132601277.divmod.xquotient.434@weber.divmod.com> <20080930175932.31635.989735053.divmod.xquotient.478@weber.divmod.com> <20080930184751.31635.1484325691.divmod.xquotient.520@weber.divmod.com> <873ajgpq73.fsf@xemacs.org> Message-ID: On Wed, Oct 1, 2008 at 12:25 PM, Stephen J. Turnbull wrote: > Simon Cross writes: > > > I still find this line of reasoning a bit worrying. Imagine an end > > user application like a music player. The user discovers that he can't > > see some .mp3 or .ogg file from the music player that is visibile is > > the file manager. I would expect him to file a bug on the music > > player. If the bug was closed with "fix the filename" I imagine the > > user would respond with "but other programs can access it just fine". > > And the user would very likely be *wrong*. The file manager is > displaying it, but in the nature of things file managers *don't access > files*, they access *directories*. The files they pass to other apps > to access. Exactly the same reasoning applies to files in a directory with an odd name. > > I'm not unhappy with the solution Victor is proposing, but I imagine > > that when I start coding projects in 3.0 I'll default to the bytes > > versions of the filename methods and use > > b"path".decode(sys.getfilesystemencoding(), "replace") if I need to > > get Unicode. > > But now the user will file a bug because in the file opening dialog > they can't *read* their Chinese file names on their USB key because > they are appearing in (system encoding) Cyrillic. Do you begin to see > the nature of the Catch-22 here? 
> > I don't expect the user to be very sympathetic when you tell her to > fix the filenames, but it's not as easy as you would think to get this > right. a) There is some chance that at least ASCII characters will be displayed correctly if getfilesystemencoding() is similar to the encoding used and corrupted filenames will display correctly except for corrupted characters. b) The user will at least be able to access the file. It's a more graceful degradation of functionality than not being able to work with the file at all. Schiavo Simon From barry at python.org Wed Oct 1 14:23:19 2008 From: barry at python.org (Barry Warsaw) Date: Wed, 1 Oct 2008 08:23:19 -0400 Subject: [Python-Dev] Python 2.6 final today Message-ID: <1954BD33-02D8-4FC1-A027-97DCDD687A32@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I've been out of town since Friday, but I don't yet see anything in the 700 billion email messages I'm now catching up on that leads me to think we need to delay the release. Yay! I will be on irc later today and will be trolling through the tracker and buildbots soon. Don't trust email to get an important issue in front of me today; please use irc or submit a showstopper bug against 2.6 if something /must/ be addressed before today's release. I'm going to make a test release at around 1600UTC today, just to see how building the docs and such goes. I'm still planning on doing the final final release at about 2200UTC. If you need to coordinate with me (e.g. press releases, Windows builds, etc.) please meet me on #python-dev on irc.freenode.net.
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSONrt3EjvBPtnXfVAQJn7wP9HSieFM7daE5vbvsuJGZtHyC2NFmT5Rsm Fd/ce6CvLzGEkUQ5GQs09TtiZZbIYiObUNkbVQBV8Zbu7A9S3fx7PBpHmPOnIIbr Dfw39pphdKE76yoJmC7OkFTlDbOw6rbuD+JLAzCgcjxx1MqL1Cx08vl2/WEJf3Fl izAVTI2Bwwc= =BCvf -----END PGP SIGNATURE----- From ncoghlan at gmail.com Wed Oct 1 14:43:23 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 01 Oct 2008 22:43:23 +1000 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <20081001051947.31635.1251804577.divmod.xquotient.807@weber.divmod.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <200809300202.38574.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E2865A.3010404@v.loewis.de> <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> <22920D6A-8B70-4E6D-BE99-D7447D831B41@fuhm.net> <20081001051947.31635.1251804577.divmod.xquotient.807@weber.divmod.com> Message-ID: <48E3706B.9060308@gmail.com> glyph at divmod.com wrote: > The reasoning is that a lot of software doesn't care if it's wrong for > edge cases, it's really hard to come up with something that's correct > with respect to all of those edge cases (absurdly difficult, if you need > to stay in the straightjacket of string / bytes types, as well as > provide a useful library interface - which is why we're having this > discussion). But, it should be _possible_ to write software that's > correct in the face of those edge cases. I just wanted to highlight this as something to keep in mind during this discussion: we want to keep the easy things easy and make the difficult things possible. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From techtonik at gmail.com Wed Oct 1 14:55:30 2008 From: techtonik at gmail.com (techtonik) Date: Wed, 1 Oct 2008 15:55:30 +0300 Subject: [Python-Dev] Determine minimum required version for a script Message-ID: Can somebody remind how to check script compatibility with old Python versions? I can remember PHP_CompatInfo class for PHP that parses a script or directory to find out the minimum version and extensions required for them to run, and I wonder if there was anything like this for Python? -- --anatoly t. From stephen at xemacs.org Wed Oct 1 15:53:51 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 01 Oct 2008 22:53:51 +0900 Subject: [Python-Dev] Patch for an initial support of bytes filename in Python3 In-Reply-To: References: <200809300247.20349.victor.stinner@haypocalc.com> <20080930132151.31635.132601277.divmod.xquotient.434@weber.divmod.com> <20080930175932.31635.989735053.divmod.xquotient.478@weber.divmod.com> <20080930184751.31635.1484325691.divmod.xquotient.520@weber.divmod.com> <873ajgpq73.fsf@xemacs.org> Message-ID: <87y718o1zk.fsf@xemacs.org> Simon Cross writes: > a) There is some chance that at least ASCII characters will be > displayed correctly if getfilesystemencoding() is similar to the > encoding used and corrupted filenames will display correctly except > for corrupted characters. All you're saying is that the cases *you* can imagine running into work better. All I'm saying is the opposite. We're both right; the point is that that means that Python can't be, not all of the time. We know from experience (Emacs/Mule, Java) that trying to impose a theoretical system on encoding just doesn't work by itself[1], and in fact creates other problems by its very rigidity. I'd like to see Python not fall into that trap, too. 
Footnotes: [1] It needs system-level support as in Windows and Mac OS X. From guido at python.org Wed Oct 1 16:15:02 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 1 Oct 2008 07:15:02 -0700 Subject: [Python-Dev] Patch for an initial support of bytes filename in Python3 In-Reply-To: References: <200809300247.20349.victor.stinner@haypocalc.com> <20080930132151.31635.132601277.divmod.xquotient.434@weber.divmod.com> <20080930175932.31635.989735053.divmod.xquotient.478@weber.divmod.com> <20080930184751.31635.1484325691.divmod.xquotient.520@weber.divmod.com> Message-ID: On Wed, Oct 1, 2008 at 1:05 AM, Simon Cross wrote: > On Wed, Oct 1, 2008 at 12:04 AM, Guido van Rossum wrote: >> Plus, even on Linux Unicode is *usually* what you should be doing, >> unless you're writing a backup tool. > > I still find this line of reasoning a bit worrying. Imagine an end > user application like a music player. The user discovers that he can't > see some .mp3 or .ogg file from the music player that is visibile is > the file manager. I would expect him to file a bug on the music > player. If the bug was closed with "fix the filename" I imagine the > user would respond with "but other programs can access it just fine". I see nothing wrong with this scenario. If undecodable filenames are a common thing then the authors of the music player should be using the bytes variant of the API, and if they get enough bugs like this they will fix their code to do so. OTOH if this is not common the response "rename the file" is totally reasonable -- you have to prioritize your bugs or else you'll never get any software released, and the occasional work-around is a given. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gh at ghaering.de Wed Oct 1 16:50:18 2008 From: gh at ghaering.de (Gerhard Häring) Date: Wed, 01 Oct 2008 16:50:18 +0200 Subject: [Python-Dev] Determine minimum required version for a script In-Reply-To: References: Message-ID: <48E38E2A.6090208@ghaering.de> techtonik wrote: > Can somebody remind how to check script compatibility with old Python versions? > > I can remember PHP_CompatInfo class for PHP that parses a script or directory to > find out the minimum version and extensions required for them to run, > and I wonder > if there was anything like this for Python? You posted on the *python-dev* mailing list. On this list the key *Python developers* discuss the future of the language and its implementation. Topics include Python design issues, release mechanics, and maintenance of existing releases. Please, do not post general Python questions to this list! For help with Python please use the mailing list python-list at python.org or the newsgroup comp.lang.python. Messages are routed between the two, so they're basically the same thing. -- Gerhard From janssen at parc.com Wed Oct 1 17:54:15 2008 From: janssen at parc.com (Bill Janssen) Date: Wed, 1 Oct 2008 08:54:15 PDT Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <48E343AE.3080009@egenix.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E20017.3020405@egenix.com> <200810010954.47564.eckhardt@satorlaser.com> <48E343AE.3080009@egenix.com> Message-ID: <74342.1222876455@parc.com> M.-A. Lemburg wrote: > On 2008-10-01 09:54, Ulrich Eckhardt wrote: > > On Tuesday 30 September 2008, M.-A. Lemburg wrote: > >> On 2008-09-30 08:00, Martin v. Löwis wrote: > >>>> Change the default file system encoding to store bytes in Unicode is > >>>> like introducing a new Python type: . > >>> Exactly.
Seems like the best solution to me, despite your polemics. > >> Not a bad idea... have os.listdir() return Unicode subclasses that work > >> like file handles, ie. they have an extra buffer that holds the original > >> bytes value received from the underlying C API. > > > > Why does it have to be a Unicode subclass? In my eyes, a Unicode object > > promises a few things, in particular that it contains a Unicode string. If it > > now suddenly contains bytes without any further meaning, that would be bad. > > Please read my entire email. I was proposing to store the underlying > non-decodeable byte string value in such a subclass. The Unicode value > of the object would then be that underlying value decoded as e.g. > Latin-1 in order to be able to work on it as text. I'm actually sort of liking this idea. A Pathname class, for convenience a subtype of String, but containing the underlying binary representation used by the OS. Even non-unicode pathnames could be represented. Bill From barry at python.org Wed Oct 1 18:01:33 2008 From: barry at python.org (Barry Warsaw) Date: Wed, 1 Oct 2008 12:01:33 -0400 Subject: [Python-Dev] Python security team In-Reply-To: <48E20D25.4090503@suse.cz> References: <200809271754.29068.victor.stinner@haypocalc.com> <48E0C60B.5060006@novell.com> <48E20D25.4090503@suse.cz> Message-ID: <1CD8E216-42C8-454F-AE8B-ABC2D71FB5BC@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 30, 2008, at 7:27 AM, Jan Matějek wrote: > Thanks for your answer. I guess the process is the real problem then. > From what I could observe, the connection between vendor-sec and > PSRT is > not really working as it should. > (And then of course you need some kind of upstream flow too, because > not > everyone reports to PSRT.) Please remember that the proper way to contact the PSRT is via security at python.org . FWIW, I am in favor of adding a few trusted people to the team, but only if they're willing to actually get stuff done :).
Clearly the current team is too swamped to act effectively, myself included. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSOOe3XEjvBPtnXfVAQJ5JgP/dDg+SPLeQ4yBQ/CYxJEh3/Xm2B+2KV5U 9RUjp7W7z2iC/Bz7qwJlui0Z30KaaZ/whMqTuh+5ZYDlrmUDUh9Tl88OyngHOBxy R/SYmluOlYUPdmjUHQYWXf5Bl9JVX9vtZ3LaFKPUo8KJf+dQDFSK3guxnIr5+Jjt oJjX+52vilM= =nJse -----END PGP SIGNATURE----- From glyph at divmod.com Wed Oct 1 18:20:06 2008 From: glyph at divmod.com (glyph at divmod.com) Date: Wed, 01 Oct 2008 16:20:06 -0000 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <74342.1222876455@parc.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E20017.3020405@egenix.com> <200810010954.47564.eckhardt@satorlaser.com> <48E343AE.3080009@egenix.com> <74342.1222876455@parc.com> Message-ID: <20081001162006.31635.1753470290.divmod.xquotient.824@weber.divmod.com> On 03:54 pm, janssen at parc.com wrote: >I'm actually sort of liking this idea. A Pathname class, for >convenience >a subtype of String, but containing the underlying binary >representation >used by the OS. Even non-unicode pathnames could be represented. On the one hand, I agree with you - except for the part where it's a subtype of String, that doesn't work. In case I haven't mentioned it enough times already: http://twistedmatrix.com/documents/8.1.0/api/twisted.python.filepath.FilePath.html On the other hand, we've all been on this merry-go-round before: http://www.python.org/dev/peps/pep-0355/ Note especially the rejection notice: "Subclassing from str is a particularly bad idea". Again, one day I'd really like to add one of these to Python. Now is not the time. 
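For readers following the Pathname discussion, here is a minimal sketch of a path object that keeps the OS-level bytes and deliberately does not subclass str. It is an illustration only, assuming a much later Python 3 (the __fspath__ protocol used here did not exist in 2008); the class name RawPath and its methods are hypothetical and are not the twisted.python.filepath.FilePath API or the PEP 355 design.

```python
import os


class RawPath:
    """Hypothetical path wrapper: holds raw bytes, is not a str subclass."""

    def __init__(self, raw: bytes):
        self.raw = raw

    def display(self, encoding: str = "utf-8") -> str:
        # Lossy text form, meant for showing to a user only.
        return self.raw.decode(encoding, "replace")

    def child(self, name: bytes) -> "RawPath":
        # Path operations work on the underlying bytes.
        return RawPath(os.path.join(self.raw, name))

    def __fspath__(self) -> bytes:
        # Lets os.fspath() and open() use the exact bytes (assumes Python 3.6+).
        return self.raw

    def __repr__(self) -> str:
        return "RawPath(%r)" % (self.raw,)


p = RawPath(b"/tmp").child(b"caf\xe9")
```

Because the object is not a string, code that needs text has to ask for it explicitly through display(), while anything that touches the filesystem gets the untouched bytes.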
From janssen at parc.com Wed Oct 1 19:14:00 2008 From: janssen at parc.com (Bill Janssen) Date: Wed, 1 Oct 2008 10:14:00 PDT Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <20081001162006.31635.1753470290.divmod.xquotient.824@weber.divmod.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E20017.3020405@egenix.com> <200810010954.47564.eckhardt@satorlaser.com> <48E343AE.3080009@egenix.com> <74342.1222876455@parc.com> <20081001162006.31635.1753470290.divmod.xquotient.824@weber.divmod.com> Message-ID: <75388.1222881240@parc.com> glyph at divmod.com wrote: > > I'm actually sort of liking this idea. A Pathname class, for > > convenience > > a subtype of String, but containing the underlying binary > > representation > >used by the OS. Even non-unicode pathnames could be represented. > > On the one hand, I agree with you - except for the part where it's a > subtype of String, that doesn't work. In case I haven't mentioned it > enough times already: > > http://twistedmatrix.com/documents/8.1.0/api/twisted.python.filepath.FilePath.html > > On the other hand, we've all been on this merry-go-round before: > > http://www.python.org/dev/peps/pep-0355/ > > Note especially the rejection notice: "Subclassing from str is a > particularly bad idea". Yes, the only real justification for it is to not break existing code (otherwise, calling str() is not that much of an ordeal). > On the other hand, we've all been on this merry-go-round before: > > http://www.python.org/dev/peps/pep-0355/ The very existence of os.path seems a good argument that something like this is useful. Perhaps PEP 355 just went too far. 
Bill From martin at v.loewis.de Wed Oct 1 21:08:50 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 01 Oct 2008 21:08:50 +0200 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <200810011043.25662.victor.stinner@haypocalc.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <20081001020625.31635.800517030.divmod.xquotient.681@weber.divmod.com> <200810011043.25662.victor.stinner@haypocalc.com> Message-ID: <48E3CAC2.6010203@v.loewis.de> >> SQLite has a similar problem with NULLs, and I'm definitely sticking >> paths in there, too. > > I think that you can say "all C libraries". Just for the sake of nit-picking: the socket library, and the regular POSIX stream IO library (as well as C standard "unformatted" IO) deal just fine with embedded NULL characters. > * Java doesn't support unicode > 0xFFFF (bouuuuh!) I don't think that is true anymore. Regards, Martin From ncoghlan at gmail.com Wed Oct 1 23:39:42 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 02 Oct 2008 07:39:42 +1000 Subject: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue In-Reply-To: <75388.1222881240@parc.com> References: <200809291407.55291.victor.stinner@haypocalc.com> <48E1C097.8030309@v.loewis.de> <48E20017.3020405@egenix.com> <200810010954.47564.eckhardt@satorlaser.com> <48E343AE.3080009@egenix.com> <74342.1222876455@parc.com> <20081001162006.31635.1753470290.divmod.xquotient.824@weber.divmod.com> <75388.1222881240@parc.com> Message-ID: <48E3EE1E.5000300@gmail.com> Bill Janssen wrote: > Perhaps PEP 355 just went too far. That was certainly one of the major objections to it. A filesystem path object which didn't try to combine a half-dozen different modules into methods on a single object, but instead focused on solving a few specific problems with using raw strings as file paths would have a far greater chance of acceptance. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From wgheath at gmail.com Thu Oct 2 04:06:43 2008 From: wgheath at gmail.com (William Heath) Date: Wed, 1 Oct 2008 19:06:43 -0700 Subject: [Python-Dev] self signing a py2exe winxp executable with signtool Message-ID: Hi All, I am trying to figure out how to self sign a py2exe winxp executable with signtool. Anyone know? I saw this which looked kind of promising: http://markmail.org/message/zj5nzechzgmjuu7c#query:signtool%20python+page:1+mid:s4jrb2hter4zxvg3+state:results -Tim P.S. Python rocks! From barry at python.org Thu Oct 2 05:46:45 2008 From: barry at python.org (Barry Warsaw) Date: Wed, 1 Oct 2008 23:46:45 -0400 Subject: [Python-Dev] RELEASED Python 2.6 final Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On behalf of the Python development team and the Python community, I am happy to announce the release of Python 2.6 final. This is the production-ready version of the latest in the Python 2 series. There are many new features and modules, improvements, bug fixes, and other changes in Python 2.6. Please see the "What's new" page for details http://docs.python.org/dev/whatsnew/2.6.html as well as PEP 361 http://www.python.org/dev/peps/pep-0361/ While Python 2.6 is backward compatible with earlier versions of Python, 2.6 has many tools and features that will help you migrate to Python 3. Wherever possible, Python 3.0 features have been added without affecting existing code. In other cases, the new features can be enabled through the use of __future__ imports and command line switches. Python 3.0 is currently in release candidate and will be available later this year. Both Python 2 and Python 3 will be supported for the foreseeable future.
Source tarballs, Windows installers, and Mac disk images can be downloaded from the Python 2.6 page: http://www.python.org/download/releases/2.6/ (Please note that due to quirks in the earth's time zones, the Windows installers will be available shortly.) Bugs can be reported in the Python bug tracker: http://bugs.python.org Enjoy, - -Barry Barry Warsaw barry at python.org Python 2.6/3.0 Release Manager (on behalf of the entire python-dev team) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSOREJ3EjvBPtnXfVAQLAigP/aEnrdvAqk7wbNQLFbmBonIr2YQbd1vEu TyTr5imYXFWGNfv1/JMeMBjMfwpHi1bgPEDTLEZdhDRNj/G1h4NqqnpfJS0lfIaU 4JBKwnsO80se/RGyupcs5f09UdKxOljhbFKEw46CHDkd9lE+cqy2yhetEwyx3c3+ AVC11sjcO54= =Oxo3 -----END PGP SIGNATURE----- From aahz at pythoncraft.com Thu Oct 2 05:55:40 2008 From: aahz at pythoncraft.com (Aahz) Date: Wed, 1 Oct 2008 20:55:40 -0700 Subject: [Python-Dev] RELEASED Python 2.6 final In-Reply-To: References: Message-ID: <20081002035540.GA3865@panix.com> Huzzah! -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "...if I were on life-support, I'd rather have it run by a Gameboy than a Windows box." --Cliff Wells, comp.lang.python, 3/13/2002 From guido at python.org Thu Oct 2 05:59:02 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 1 Oct 2008 20:59:02 -0700 Subject: [Python-Dev] RELEASED Python 2.6 final In-Reply-To: References: Message-ID: Congratulations, Barry!!! On Wed, Oct 1, 2008 at 8:46 PM, Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On behalf of the Python development team and the Python community, I am > happy to announce the release of Python 2.6 final. This is the > production-ready version of the latest in the Python 2 series. > > There are many new features and modules, improvements, bug fixes, and other > changes in Python 2.6. 
Please see the "What's new" page for details > > http://docs.python.org/dev/whatsnew/2.6.html > > as well as PEP 361 > > http://www.python.org/dev/peps/pep-0361/ > > While Python 2.6 is backward compatible with earlier versions of Python, 2.6 > has many tools and features that will help you migrate to Python 3. > Wherever possible, Python 3.0 features have been added without affecting > existing code. In other cases, the new features can be enabled through the > use of __future__ imports and command line switches. > > Python 3.0 is currently in release candidate and will be available later > this year. Both Python 2 and Python 3 will be supported for the foreseeable > future. > > Source tarballs, Windows installers, and Mac disk images can be downloaded > from the Python 2.6 page: > > http://www.python.org/download/releases/2.6/ > > (Please note that due to quirks in the earth's time zones, the Windows > installers will be available shortly.) Can someone who's still up fix add this note to the website? It looks a little dodgy just linking to a 404 error... 
:-( > Bugs can be reported in the Python bug tracker: > > http://bugs.python.org > > Enjoy, > - -Barry > > Barry Warsaw > barry at python.org > Python 2.6/3.0 Release Manager > (on behalf of the entire python-dev team) > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.9 (Darwin) > > iQCVAwUBSOREJ3EjvBPtnXfVAQLAigP/aEnrdvAqk7wbNQLFbmBonIr2YQbd1vEu > TyTr5imYXFWGNfv1/JMeMBjMfwpHi1bgPEDTLEZdhDRNj/G1h4NqqnpfJS0lfIaU > 4JBKwnsO80se/RGyupcs5f09UdKxOljhbFKEw46CHDkd9lE+cqy2yhetEwyx3c3+ > AVC11sjcO54= > =Oxo3 > -----END PGP SIGNATURE----- > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From steve at pearwood.info Thu Oct 2 06:39:36 2008 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 2 Oct 2008 14:39:36 +1000 Subject: [Python-Dev] RELEASED Python 2.6 final In-Reply-To: References: Message-ID: <200810021439.36718.steve@pearwood.info> On Thu, 2 Oct 2008 01:46:45 pm Barry Warsaw wrote: > On behalf of the Python development team and the Python community, I > am happy to announce the release of Python 2.6 final. This is the > production-ready version of the latest in the Python 2 series. I'd like to thank you all very much for your hard work and for making such a great language. Cheers! -- Steven From divinekid at gmail.com Thu Oct 2 06:19:32 2008 From: divinekid at gmail.com (Haoyu Bai) Date: Thu, 2 Oct 2008 12:19:32 +0800 Subject: [Python-Dev] RELEASED Python 2.6 final In-Reply-To: References: Message-ID: <1d7983e80810012119u6f51b460lbdef685a035c0538@mail.gmail.com> On Thu, Oct 2, 2008 at 11:59 AM, Guido van Rossum wrote: > Congratulations, Barry!!! 
> > On Wed, Oct 1, 2008 at 8:46 PM, Barry Warsaw wrote: >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> On behalf of the Python development team and the Python community, I am >> happy to announce the release of Python 2.6 final. This is the >> production-ready version of the latest in the Python 2 series. >> >> There are many new features and modules, improvements, bug fixes, and other >> changes in Python 2.6. Please see the "What's new" page for details >> >> http://docs.python.org/dev/whatsnew/2.6.html >> >> as well as PEP 361 >> >> http://www.python.org/dev/peps/pep-0361/ >> >> While Python 2.6 is backward compatible with earlier versions of Python, 2.6 >> has many tools and features that will help you migrate to Python 3. >> Wherever possible, Python 3.0 features have been added without affecting >> existing code. In other cases, the new features can be enabled through the >> use of __future__ imports and command line switches. >> >> Python 3.0 is currently in release candidate and will be available later >> this year. Both Python 2 and Python 3 will be supported for the foreseeable >> future. >> >> Source tarballs, Windows installers, and Mac disk images can be downloaded >> from the Python 2.6 page: >> >> http://www.python.org/download/releases/2.6/ >> >> (Please note that due to quirks in the earth's time zones, the Windows >> installers will be available shortly.) > > Can someone who's still up fix add this note to the website? It looks > a little dodgy just linking to a 404 error... 
:-( > >> Bugs can be reported in the Python bug tracker: >> >> http://bugs.python.org >> >> Enjoy, >> - -Barry >> >> Barry Warsaw >> barry at python.org >> Python 2.6/3.0 Release Manager >> (on behalf of the entire python-dev team) >> >> -----BEGIN PGP SIGNATURE----- >> Version: GnuPG v1.4.9 (Darwin) >> >> iQCVAwUBSOREJ3EjvBPtnXfVAQLAigP/aEnrdvAqk7wbNQLFbmBonIr2YQbd1vEu >> TyTr5imYXFWGNfv1/JMeMBjMfwpHi1bgPEDTLEZdhDRNj/G1h4NqqnpfJS0lfIaU >> 4JBKwnsO80se/RGyupcs5f09UdKxOljhbFKEw46CHDkd9lE+cqy2yhetEwyx3c3+ >> AVC11sjcO54= >> =Oxo3 >> -----END PGP SIGNATURE----- >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/guido%40python.org >> > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) Now almost all the pages on docs.python.org can't be accessed. For example http://docs.python.org/lib/lib.html returns 403 forbidden. Are the online docs being updated to 2.6, or is something wrong? -- Haoyu Bai From lists at cheimes.de Thu Oct 2 08:08:17 2008 From: lists at cheimes.de (Christian Heimes) Date: Thu, 02 Oct 2008 08:08:17 +0200 Subject: [Python-Dev] RELEASED Python 2.6 final In-Reply-To: References: Message-ID: Nice! Python 2.7 is waiting, let's get started! :) Christian From martin at v.loewis.de Thu Oct 2 08:28:27 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Thu, 02 Oct 2008 08:28:27 +0200 Subject: [Python-Dev] self signing a py2exe winxp executable with signtool In-Reply-To: References: Message-ID: <48E46A0B.6070806@v.loewis.de> > I am trying to figure out how to self sign a py2exe winxp executable > with signtool. Anyone know? Dear William, This list (python-dev) is for the development of Python, not the development with Python. I recommend using either python-list, or the py2exe-users list for this question.
Regards, Martin From itconsense at gmail.com Thu Oct 2 09:57:43 2008 From: itconsense at gmail.com (itconsense at gmail.com) Date: Thu, 02 Oct 2008 09:57:43 +0200 Subject: [Python-Dev] python docs website not accessible! Message-ID: Hi, I'm not sure, if this is the right place to post. The python-docs on www.python.org are not accessible. The overview http://docs.python.org/lib/lib.html works fine, but no link on the page i have tried works. http://docs.python.org/lib/doctest-unittest-api.html http://docs.python.org/lib/profile-instant.html Is python ceasing to be open source with 2.6? ;-) Thanks for caring, -Tom From thomas at python.org Thu Oct 2 11:59:11 2008 From: thomas at python.org (Thomas Wouters) Date: Thu, 2 Oct 2008 11:59:11 +0200 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed Message-ID: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> I hotfixed docs.python.org and www.python.org/doc with some cutesy improv -- the URLs changed from .../lib/ to ../library/, and any HTML pages inside them are completely different. So, any http://docs.python.org/lib/... URL now redirects to the toplevel http://docs.python.org/library/ (and similar for www.python.org/doc/lib.) If anyone feels particularly frustrated by the old URLs breaking, I wouldn't mind adding a redirection for each individual URL as long as I don't have to build that mapping :-) Georg is working on fixing the main www.python.org/doc page, I believe, as well as providing downloadable docs. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Thu Oct 2 12:44:49 2008 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 2 Oct 2008 10:44:49 +0000 (UTC) Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> Message-ID: Thomas Wouters python.org> writes: > > If anyone feels particularly frustrated by the old URLs breaking, I wouldn't mind adding a redirection for each individual URL as long as I don't have to build that mapping Well in general URLs aren't supposed to break (except the ones which are deliberately temporary). Could a RewriteRule do the trick? From thomas at python.org Thu Oct 2 13:28:18 2008 From: thomas at python.org (Thomas Wouters) Date: Thu, 2 Oct 2008 13:28:18 +0200 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> Message-ID: <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> On Thu, Oct 2, 2008 at 12:44, Antoine Pitrou wrote: > Thomas Wouters python.org> writes: > > > > If anyone feels particularly frustrated by the old URLs breaking, I > wouldn't > mind adding a redirection for each individual URL as long as I don't have > to > build that mapping > > Well in general URLs aren't supposed to break (except the ones which are > deliberately temporary). Could a RewriteRule do the trick? > Not a single one, no. The URLs *all* changed. There is not a single one that's the same. We may be able to do a single rewrite rule for most of the module-*.html URLs, but everything else -- and there is quite a lot of 'else' in the 2.5-and-earlier docs -- needs a better mapping. Feel free to send me that mapping :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From doug.hellmann at gmail.com Thu Oct 2 13:52:16 2008 From: doug.hellmann at gmail.com (Doug Hellmann) Date: Thu, 2 Oct 2008 07:52:16 -0400 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> Message-ID: On Oct 2, 2008, at 7:28 AM, Thomas Wouters wrote: > > > On Thu, Oct 2, 2008 at 12:44, Antoine Pitrou > wrote: > Thomas Wouters python.org> writes: > > > > If anyone feels particularly frustrated by the old URLs breaking, > I wouldn't > mind adding a redirection for each individual URL as long as I don't > have to > build that mapping > > Well in general URLs aren't supposed to break (except the ones which > are > deliberately temporary). Could a RewriteRule do the trick? > > Not a single one, no. The URLs *all* changed. There is not a single > one that's the same. We may be able to do a single rewrite rule for > most of the module-*.html URLs, but everything else -- and there is > quite a lot of 'else' in the 2.5-and-earlier docs -- needs a better > mapping. Feel free to send me that mapping :-) Perhaps it has already been suggested and rejected for some reason, but we could include the major/minor version numbers in the URLs. That would make it easier to rewrite old URLs, and I assume there will be 2.x and 3.x documentation available online for some period of time? docs.python.org/lib/* could redirect to docs.python.org/2.5/lib/* docs.python.org/ (note no *) could redirect to docs.python.org/2.6/ and include a link to docs.python.org/3.0/ That way all of the old references (in Google and bookmarks) would still work. Perhaps we should restore the old version of the files until this is resolved?
Being redirected to the top landing page is a little disconcerting if you come to the site through a search engine and aren't familiar with the organization of the manual. For example, I went to look for the documentation on how slots work, and ended up at the top of the reference manual. The local search didn't work (no results), "slots" isn't in the index, and google still has the old URL. Doug From solipsis at pitrou.net Thu Oct 2 13:35:11 2008 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 02 Oct 2008 13:35:11 +0200 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> Message-ID: <1222947311.7054.3.camel@fsol> > > Not a single one, no. The URLs *all* changed. There is not a single > one that's the same. We may be able to do a single rewrite rule for > most of the module-*.html URLs, but everything else -- and there is > quite a lot of 'else' in the 2.5-and-earlier docs -- needs a better > mapping. Feel free to send me that mapping :-) My bad. I thought it was just a matter of doing a generic substitution. Well, then we'll have to live with it I suppose :) Regards Antoine. From barry at python.org Thu Oct 2 14:11:47 2008 From: barry at python.org (Barry Warsaw) Date: Thu, 2 Oct 2008 08:11:47 -0400 Subject: [Python-Dev] RELEASED Python 2.6 final In-Reply-To: <1d7983e80810012119u6f51b460lbdef685a035c0538@mail.gmail.com> References: <1d7983e80810012119u6f51b460lbdef685a035c0538@mail.gmail.com> Message-ID: <7E7417E8-8BA8-4974-94F7-ABC0A143F710@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 2, 2008, at 12:19 AM, Haoyu Bai wrote: > Now almost all the pages on docs.python.org can't be accessed.
For > example http://docs.python.org/lib/lib.html returns 403 forbidden. Thanks to Georg and Thomas, the docs should all be fixed now. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSOS6g3EjvBPtnXfVAQKw5wP+I2L6qPZWp1qDs7qSRdlOE5xSAazhnuzE h7gCUWah0tcewuJC38cE7kNAVkpmp9suBbGgI+FRYTeJJpoO109Io4cF4fRvYd2H NpVfhIOo6VUchNTnsdtP4UzuaIKnkCKgWxMPPjMW9jEZlHPNdOC8stTsxOq1FWFt hlJscML5yQA= =wPLO -----END PGP SIGNATURE----- From g.brandl at gmx.net Thu Oct 2 14:17:37 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 02 Oct 2008 14:17:37 +0200 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> Message-ID: Doug Hellmann schrieb: >> Not a single one, no. The URLs *all* changed. There is not a single >> one that's the same. We may be able to do a single rewrite rule for >> most of the module-*.html URLs, but everything else -- and there is >> quite a lot of 'else' in the 2.5-and-earlier docs -- needs a better >> mapping. Feel free to send me that mapping :-) > > Perhaps it has already been suggested and rejected for some reason, but > we could include the major/minor version numbers in the URLs. That > would make it easier to rewrite old URLs, and I assume there will be 2.x > and 3.x documentation available online for some period of time? > > docs.python.org/lib/* could redirect to docs.python.org/2.5/lib/* That would be possible, but not sensible IMO -- it doesn't make people update their links, instead keeps links to outdated documentation. > docs.pyhton.org/ (note no *) could redirect to docs.python.org/2.6/ and > include a link to docs.python.org/3.0/ We already have archived versioned docs at http://www.python.org/doc/X.Y. > That way all of the old references (in Google and bookmarks) would still > work. 
> > Perhaps we should restore the old version of the files until this is > resolved? Being redirected to the top landing page is a little > disconcerting if you come to the site through a search engine and aren't > familiar with the organization of the manual. For example, I went to > look for the documentation on how slots work, and ended up at the top of > the reference manual. The local search didn't work (no results), > "slots" isn't in the index, and google still has the old URL. __slots__ is in the index (with the underscores). The local search shows me __slots__ as the first result when I search for "__slots__" or "slots". As for Google, I can only assume it will soon update its index. Nevertheless, I will come up with a mapping for the old module URLs, which is relatively easy. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From ncoghlan at gmail.com Thu Oct 2 14:18:02 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 02 Oct 2008 22:18:02 +1000 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> Message-ID: <48E4BBFA.7010800@gmail.com> Doug Hellmann wrote: > Perhaps it has already been suggested and rejected for some reason, but > we could include the major/minor version numbers in the URLs. That > would make it easier to rewrite old URLs, and I assume there will be 2.x > and 3.x documentation available online for some period of time? 
The old doc directories are already kept around (all the way back to 1.4 in fact: http://www.python.org/doc/1.4/). As a quick fix for the old links, a rewrite rule to map such links to the 2.5 docs seems like a very good idea to me. Since old URLs all use abbreviations in the directory name (tut, lib, mac, ref, ext, api, doc, inst, dist), it should be straightforward to redirect them without affecting the links to the new docs (tutorial, using, reference, howto, extending, c-api, install, distutils, documenting). > Perhaps we should restore the old version of the files until this is > resolved? Being redirected to the top landing page is a little > disconcerting if you come to the site through a search engine and aren't > familiar with the organization of the manual. A redirect rule to the 2.5.2 docs for the old naming scheme is probably a better short-term solution. > For example, I went to > look for the documentation on how slots work, and ended up at the top of > the reference manual. The local search didn't work (no results), > "slots" isn't in the index, and google still has the old URL. The quick search is actually working for me these days (it wasn't for a while when the new docs were still in development). For example, the first hit I get when searching for "slots" now is http://www.python.org/doc/2.6/reference/datamodel.html?highlight=slots#__slots__ I believe it's a JavaScript-based search though, so there may be issues with browser compatibility (or users with JS disabled). Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From rhamph at gmail.com Thu Oct 2 14:25:56 2008 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 2 Oct 2008 06:25:56 -0600 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> Message-ID: On Thu, Oct 2, 2008 at 6:17 AM, Georg Brandl wrote: > Doug Hellmann schrieb: > >>> Not a single one, no. The URLs *all* changed. There is not a single >>> one that's the same. We may be able to do a single rewrite rule for >>> most of the module-*.html URLs, but everything else -- and there is >>> quite a lot of 'else' in the 2.5-and-earlier docs -- needs a better >>> mapping. Feel free to send me that mapping :-) >> >> Perhaps it has already been suggested and rejected for some reason, but >> we could include the major/minor version numbers in the URLs. That >> would make it easier to rewrite old URLs, and I assume there will be 2.x >> and 3.x documentation available online for some period of time? >> >> docs.python.org/lib/* could redirect to docs.python.org/2.5/lib/* > > That would be possible, but not sensible IMO -- it doesn't make people > update their links, instead keeps links to outdated documentation. > >> docs.pyhton.org/ (note no *) could redirect to docs.python.org/2.6/ and >> include a link to docs.python.org/3.0/ > > We already have archived versioned docs at http://www.python.org/doc/X.Y. Why not use versioned URLs, but with a link at the top of old pages saying they're outdated, linking to the new version. Either way they should update their links, but this way you don't shoot them in the foot to do it. Breaking old links should be avoided if at all possible. 
-- Adam Olsen, aka Rhamphoryncus From ncoghlan at gmail.com Thu Oct 2 14:33:27 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 02 Oct 2008 22:33:27 +1000 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> Message-ID: <48E4BF97.50706@gmail.com> Georg Brandl wrote: > Nevertheless, I will come up with a mapping for the old module URLs, > which is relatively easy. Best solution of all :) I was actually only suggesting redirecting to the old docs until such a mapping was available - but if that mapping will be available fairly soon, then bumping old links up to the base URL for a day or two won't be too bad. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From g.brandl at gmx.net Thu Oct 2 14:34:19 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 02 Oct 2008 14:34:19 +0200 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> Message-ID: Adam Olsen schrieb: > On Thu, Oct 2, 2008 at 6:17 AM, Georg Brandl wrote: >> Doug Hellmann schrieb: >> >>>> Not a single one, no. The URLs *all* changed. There is not a single >>>> one that's the same. We may be able to do a single rewrite rule for >>>> most of the module-*.html URLs, but everything else -- and there is >>>> quite a lot of 'else' in the 2.5-and-earlier docs -- needs a better >>>> mapping. Feel free to send me that mapping :-) >>> >>> Perhaps it has already been suggested and rejected for some reason, but >>> we could include the major/minor version numbers in the URLs. 
That >>> would make it easier to rewrite old URLs, and I assume there will be 2.x >>> and 3.x documentation available online for some period of time? >>> >>> docs.python.org/lib/* could redirect to docs.python.org/2.5/lib/* >> >> That would be possible, but not sensible IMO -- it doesn't make people >> update their links, instead keeps links to outdated documentation. >> >>> docs.pyhton.org/ (note no *) could redirect to docs.python.org/2.6/ and >>> include a link to docs.python.org/3.0/ >> >> We already have archived versioned docs at http://www.python.org/doc/X.Y. > > Why not use versioned URLs, but with a link at the top of old pages > saying they're outdated, linking to the new version. Either way they > should update their links, but this way you don't shoot them in the > foot to do it. If linking to the new version could be done easily, we could as well directly redirect. The problem is that having that mapping in the first place is hard. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From doug.hellmann at gmail.com Thu Oct 2 14:35:00 2008 From: doug.hellmann at gmail.com (Doug Hellmann) Date: Thu, 2 Oct 2008 08:35:00 -0400 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> Message-ID: <9A6A51AD-6EA2-4CF3-8734-CAFD76CAF421@gmail.com> On Oct 2, 2008, at 8:17 AM, Georg Brandl wrote: > Doug Hellmann schrieb: > >>> Not a single one, no. The URLs *all* changed. There is not a single >>> one that's the same. 
We may be able to do a single rewrite rule for >>> most of the module-*.html URLs, but everything else -- and there is >>> quite a lot of 'else' in the 2.5-and-earlier docs -- needs a better >>> mapping. Feel free to send me that mapping :-) >> >> Perhaps it has already been suggested and rejected for some reason, >> but >> we could include the major/minor version numbers in the URLs. That >> would make it easier to rewrite old URLs, and I assume there will >> be 2.x >> and 3.x documentation available online for some period of time? >> >> docs.python.org/lib/* could redirect to docs.python.org/2.5/lib/* > > That would be possible, but not sensible IMO -- it doesn't make people > update their links, instead keeps links to outdated documentation. The documentation isn't outdated if you're still running Python 2.5, as a lot of people will be. Not everyone gets to upgrade right away when there's a new release. For example, the product we build at work depends on 2.5 and we don't have time in our schedule to upgrade right away. It may be several months before we do. >> docs.pyhton.org/ (note no *) could redirect to docs.python.org/2.6/ >> and >> include a link to docs.python.org/3.0/ > > We already have archived versioned docs at http://www.python.org/doc/X.Y > . Great, so we can just redirect the old links over there. If you can make them point to the correct form of the new docs, that would be even better, but at least sending them to the old docs means they point to *something* useful. >> That way all of the old references (in Google and bookmarks) would >> still >> work. >> >> Perhaps we should restore the old version of the files until this is >> resolved? Being redirected to the top landing page is a little >> disconcerting if you come to the site through a search engine and >> aren't >> familiar with the organization of the manual. For example, I went to >> look for the documentation on how slots work, and ended up at the >> top of >> the reference manual. 
The local search didn't work (no results), >> "slots" isn't in the index, and google still has the old URL. > > __slots__ is in the index (with the underscores). The local search > shows me > __slots__ as the first result when I search for "__slots__" or > "slots". OK, searching for "slots" at http://docs.python.org found several results this time. I don't know why it would have given me no results the last time, but I found what I needed. Doug From doug.hellmann at gmail.com Thu Oct 2 14:46:37 2008 From: doug.hellmann at gmail.com (Doug Hellmann) Date: Thu, 2 Oct 2008 08:46:37 -0400 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> Message-ID: On Oct 2, 2008, at 8:34 AM, Georg Brandl wrote: > > If linking to the new version could be done easily, we could as well > directly > redirect. The problem is that having that mapping in the first place > is hard. I was looking for the easy route. If the layout of the new docs changed completely, anything that starts with the old abbreviations (/ lib/, /tut/, /ref/, etc.) could just go over to the 2.5.2 docs, right? You don't need to map every sub-section to its new URL unless you feel really strongly that links to pages in the old organization should point to the new location. Doug From jnoller at gmail.com Thu Oct 2 15:08:34 2008 From: jnoller at gmail.com (Jesse Noller) Date: Thu, 2 Oct 2008 09:08:34 -0400 Subject: [Python-Dev] Doc nits question Message-ID: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> So, we just released and there are a few doc typo bugs being filed - my question is if all doc-fixes have to wait for 2.6.1/2.7 or if we can hotfix the 2.6 docs? 
-jesse From tseaver at palladion.com Thu Oct 2 15:14:58 2008 From: tseaver at palladion.com (Tres Seaver) Date: Thu, 02 Oct 2008 09:14:58 -0400 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> Message-ID: <48E4C952.7020705@palladion.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Georg Brandl wrote: > Adam Olsen schrieb: >> On Thu, Oct 2, 2008 at 6:17 AM, Georg Brandl wrote: >>> Doug Hellmann schrieb: >>> >>>>> Not a single one, no. The URLs *all* changed. There is not a single >>>>> one that's the same. We may be able to do a single rewrite rule for >>>>> most of the module-*.html URLs, but everything else -- and there is >>>>> quite a lot of 'else' in the 2.5-and-earlier docs -- needs a better >>>>> mapping. Feel free to send me that mapping :-) >>>> Perhaps it has already been suggested and rejected for some reason, but >>>> we could include the major/minor version numbers in the URLs. That >>>> would make it easier to rewrite old URLs, and I assume there will be 2.x >>>> and 3.x documentation available online for some period of time? >>>> >>>> docs.python.org/lib/* could redirect to docs.python.org/2.5/lib/* >>> That would be possible, but not sensible IMO -- it doesn't make people >>> update their links, instead keeps links to outdated documentation. >>> >>>> docs.pyhton.org/ (note no *) could redirect to docs.python.org/2.6/ and >>>> include a link to docs.python.org/3.0/ >>> We already have archived versioned docs at http://www.python.org/doc/X.Y. >> Why not use versioned URLs, but with a link at the top of old pages >> saying they're outdated, linking to the new version. Either way they >> should update their links, but this way you don't shoot them in the >> foot to do it. > > If linking to the new version could be done easily, we could as well directly > redirect. 
The problem is that having that mapping in the first place is hard. Why would you remove the old docs (ones with 2.5 in the URL)? They still provide value for folks who can't yet move to 2.6 / 3.0; forcibly redirecting a versioned URL to "current" can't possibly be sane. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFI5MlR+gerLs4ltQ4RAicWAKC6gxTtxq/CwZXH9SekRs7DD1fFTwCeMyb/ eJqkvkb4zdDGZG8oPvp1GjI= =0Atv -----END PGP SIGNATURE----- From g.brandl at gmx.net Thu Oct 2 15:19:43 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 02 Oct 2008 15:19:43 +0200 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: <48E4C952.7020705@palladion.com> References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> <48E4C952.7020705@palladion.com> Message-ID: Tres Seaver schrieb: > Georg Brandl wrote: >>>>> docs.pyhton.org/ (note no *) could redirect to docs.python.org/2.6/ and >>>>> include a link to docs.python.org/3.0/ >>>> We already have archived versioned docs at http://www.python.org/doc/X.Y. >>> Why not use versioned URLs, but with a link at the top of old pages >>> saying they're outdated, linking to the new version. Either way they >>> should update their links, but this way you don't shoot them in the >>> foot to do it. > >> If linking to the new version could be done easily, we could as well directly >> redirect. The problem is that having that mapping in the first place is hard. > > Why would you remove the old docs (ones with 2.5 in the URL)? They > still provide value for folks who can't yet move to 2.6 / 3.0; forcibly > redirecting a versioned URL to "current" can't possibley be sane.
That's true; it's also not what I meant. The versioned docs will of course always stay there. The question is what to do for URLs that refer to docs.python.org, but with old filenames. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From g.brandl at gmx.net Thu Oct 2 15:21:10 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 02 Oct 2008 15:21:10 +0200 Subject: [Python-Dev] Doc nits question In-Reply-To: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> Message-ID: Jesse Noller schrieb: > So, we just released and there are a few doc typo bugs being filed - > my question is if all doc-fixes have to wait for 2.6.1/2.7 or if we > can hotfix the 2.6 docs? I intend to set things up so that the docs at docs.python.org are continually rebuilt, just like the /dev docs were until now. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. 
From jnoller at gmail.com Thu Oct 2 15:26:10 2008 From: jnoller at gmail.com (Jesse Noller) Date: Thu, 2 Oct 2008 09:26:10 -0400 Subject: [Python-Dev] Doc nits question In-Reply-To: References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> Message-ID: <4222a8490810020626n3b3c75bo39f4b4552d6773d7@mail.gmail.com> On Thu, Oct 2, 2008 at 9:21 AM, Georg Brandl wrote: > Jesse Noller schrieb: >> So, we just released and there are a few doc typo bugs being filed - >> my question is if all doc-fixes have to wait for 2.6.1/2.7 or if we >> can hotfix the 2.6 docs? > > I intend to set things up so that the docs at docs.python.org are continually > rebuilt, just like the /dev docs were until now. > > Georg Fantastic, so the doc updates should go to the 2.6 branch, correct? (Not that I'm suggesting checking in all willy nilly) From ncoghlan at gmail.com Thu Oct 2 15:35:45 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 02 Oct 2008 23:35:45 +1000 Subject: [Python-Dev] Doc nits question In-Reply-To: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> Message-ID: <48E4CE31.2010505@gmail.com> Jesse Noller wrote: > So, we just released and there are a few doc typo bugs being filed - > my question is if all doc-fixes have to wait for 2.6.1/2.7 or if we > can hotfix the 2.6 docs? Well the fixes can definitely all go in to SVN on both the trunk and the maintenance branch. As to when we update docs.python.org from the maintenance branch... I believe historically it has only been done at each new maintenance release, but I don't see any fundamental problems with the idea of updating it more frequently. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From g.brandl at gmx.net Thu Oct 2 15:36:51 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 02 Oct 2008 15:36:51 +0200 Subject: [Python-Dev] Doc nits question In-Reply-To: <4222a8490810020626n3b3c75bo39f4b4552d6773d7@mail.gmail.com> References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> <4222a8490810020626n3b3c75bo39f4b4552d6773d7@mail.gmail.com> Message-ID: Jesse Noller schrieb: > On Thu, Oct 2, 2008 at 9:21 AM, Georg Brandl wrote: >> Jesse Noller schrieb: >>> So, we just released and there are a few doc typo bugs being filed - >>> my question is if all doc-fixes have to wait for 2.6.1/2.7 or if we >>> can hotfix the 2.6 docs? >> >> I intend to set things up so that the docs at docs.python.org are continually >> rebuilt, just like the /dev docs were until now. >> >> Georg > > Fantastic, so the doc updates should go to the 2.6 branch, correct? > (Not that I'm suggesting checking in all willy nilly) This is another thing that needs to be discussed: how to handle backports between 2.6 and 2.7. Up to now, we backported changes from trunk to maint manually, but after the experience we've had with svnmerge, I see several possibilities: 1. Do bugfixes in maint, merge them to trunk via svnmerge. This has the drawback that you have to work in two different repos for fixes vs. new features. The advantage however is that normally all fixes that go into maint apply to trunk as well, so almost no blocks need to be done. However, since Py3k merges are done from trunk, the 3k merge will see merges as single commits, so they aren't easy to block if not applicable. This will mean more conflicts. 2. Do bugfixes in trunk, and merge them to maint via svnmerge. Arguments as for 1, but reversed: many blocks, but less problems with 3k. 3. 
Backport bugfixes manually, like for the previous maintenance branches. cheers, Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From ncoghlan at gmail.com Thu Oct 2 15:38:37 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 02 Oct 2008 23:38:37 +1000 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> <48E4C952.7020705@palladion.com> Message-ID: <48E4CEDD.50109@gmail.com> Georg Brandl wrote: > That's true; it's also not what I meant. The versioned docs will of course > always stay there. The question is what to do for URLs that refer to > docs.python.org, but with old filenames. I still like the idea of redirecting such URLs to the old 2.5.2 docs as a short-term fix, with a more complex remapping to the appropriate 2.6 files when it is available. (Whether or not the first part is worth doing obviously depends on how much time you expect the second part to take). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From skip at pobox.com Thu Oct 2 15:47:34 2008 From: skip at pobox.com (skip at pobox.com) Date: Thu, 2 Oct 2008 08:47:34 -0500 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: <1222947311.7054.3.camel@fsol> References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> <1222947311.7054.3.camel@fsol> Message-ID: <18660.53494.563893.905668@montanaro-dyndns-org.local> >> Not a single one, no. 
The URLs *all* changed. There is not a single >> one that's the same. We may be able to do a single rewrite rule for >> most of the module-*.html URLs, but everything else -- and there is >> quite a lot of 'else' in the 2.5-and-earlier docs -- needs a better >> mapping. Feel free to send me that mapping :-) Antoine> My bad. I thought it was just a matter of doing a generic Antoine> substitution. Well, then we'll have to live with it I suppose Antoine> :) Unfortunately, without some mapping the search engines will toss everything out. They will eventually get around to fetching http://docs.python.org/ and traversing the tree of pages, but that might take a while. I won't have time for the next day or two to scan the docs error log, but if I can come up with a list of the ten most frequent failures I suspect we can easily define RewriteRule directives for them. Skip From thomas at python.org Thu Oct 2 15:51:08 2008 From: thomas at python.org (Thomas Wouters) Date: Thu, 2 Oct 2008 15:51:08 +0200 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: <18660.53494.563893.905668@montanaro-dyndns-org.local> References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> <1222947311.7054.3.camel@fsol> <18660.53494.563893.905668@montanaro-dyndns-org.local> Message-ID: <9e804ac0810020651na3c0a8cv9c94954f0dea6401@mail.gmail.com> On Thu, Oct 2, 2008 at 15:47, wrote: > >> Not a single one, no. The URLs *all* changed. There is not a single > >> one that's the same. We may be able to do a single rewrite rule for > >> most of the module-*.html URLs, but everything else -- and there is > >> quite a lot of 'else' in the 2.5-and-earlier docs -- needs a better > >> mapping. Feel free to send me that mapping :-) > > Antoine> My bad. I thought it was just a matter of doing a generic > Antoine> substitution.
Well, then we'll have to live with it I suppose > Antoine> :) > > Unfortunately, without some mapping the search engines will toss everything > out. While they will eventually get around to fetching > http://docs.python.org/ and traversing the tree of pages, but that might > take awhile. I won't have time for the next day or two to scan the docs > error log, but if I can come up with a list of the ten most frequent > failures I suspect we can easily define RewriteRule directives for them. To be sure, the URLs *are* mapped. They're just mapped to something other than they were mapped to before -- because those pages no longer exist for the 'current version' of the documentation. Pages covering the same or nearly the same thing may exist in some cases, but not in others. We can do a best-effort to redirect the old URLs to something covering the same information, or we can wait a few days to let search engines realize the URLs changed, and let everyone else deal with searching a little further for the information they had bookmarked. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From skip at pobox.com Thu Oct 2 15:51:51 2008 From: skip at pobox.com (skip at pobox.com) Date: Thu, 2 Oct 2008 08:51:51 -0500 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: <48E4BBFA.7010800@gmail.com> References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> <48E4BBFA.7010800@gmail.com> Message-ID: <18660.53751.895687.800610@montanaro-dyndns-org.local> Nick> The old doc directories are already kept around (all the way back Nick> to 1.4 in fact: http://www.python.org/doc/1.4/) Nick> As a quick fix for the old links, a rewrite rule to map such links Nick> to the 2.5 docs seems like a very good idea to me.
Since old URLs Nick> all use abbreviations in the directory name (tut, lib, mac, ref, Nick> ext, api, doc, inst, dist), it should be straightforward to Nick> redirect them without affecting the links to the new docs Nick> (tutorial, using, reference, howto, extending, c-api, install, Nick> distutils, documenting). Yes, we should probably still get the top-level links redirected to the new docs though. The 2.5 tutorial is probably going to get stale over time while the 2.6 version will be updated at least until 2.7 is released. Skip From fdrake at acm.org Thu Oct 2 15:56:25 2008 From: fdrake at acm.org (Fred Drake) Date: Thu, 02 Oct 2008 09:56:25 -0400 Subject: [Python-Dev] Doc nits question In-Reply-To: References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> Message-ID: <18122A2B-CCB0-4850-BDA4-10291780E9D2@acm.org> On Oct 2, 2008, at 9:21 AM, Georg Brandl wrote: > I intend to set things up so that the docs at docs.python.org are > continually > rebuilt, just like the /dev docs were until now. Wonderful! This should help avoid repeat reports of simple typos. At one point, we started to separate the documentation releases so that update releases could be easily pushed at times when there wasn't a corresponding Python release. There are a couple of examples of these in the specific-versions list, IIRC. These have version numbers ending with "p1" (for patch 1; no more than one patched version was ever released for any particular Python version). It may be worth trying this or something like it again as well, if there's enough volunteer time available. Such versions would need to be clearly marked on every page as to when they were updated, so that readers can tell if they have the latest update. -Fred -- Fred L. Drake, Jr. 
From thomas at python.org Thu Oct 2 16:41:39 2008 From: thomas at python.org (Thomas Wouters) Date: Thu, 2 Oct 2008 16:41:39 +0200 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: <18660.53751.895687.800610@montanaro-dyndns-org.local> References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> <48E4BBFA.7010800@gmail.com> <18660.53751.895687.800610@montanaro-dyndns-org.local> Message-ID: <9e804ac0810020741g7d7cb193h647b512a0efec68b@mail.gmail.com> On Thu, Oct 2, 2008 at 15:51, wrote: > > Nick> The old doc directories are already kept around (all the way back > Nick> to 1.4 in fact: http://www.python.org/doc/1.4/) > > Nick> As a quick fix for the old links, a rewrite rule to map such links > Nick> to the 2.5 docs seems like a very good idea to me. Since old URLs > Nick> all use abbreviations in the directory name (tut, lib, mac, ref, > Nick> ext, api, doc, inst, dist), it should be straightforward to > Nick> redirect them without affecting the links to the new docs > Nick> (tutorial, using, reference, howto, extending, c-api, install, > Nick> distutils, documenting). > > Yes, we should probably still get the top-level links redirected to the new > docs though. The 2.5 tutorial is probably going to get stale over time > while the 2.6 version will be updated at least until 2.7 is released. > After discussing on #python-dev (briefly), I made the toplevel directories refer to the new, 2.6 toplevel directories, but deeper URLs in the old directories redirect to www.python.org/doc/2.5.2/. I still think this is the wrong approach, especially in the long term: it means people who just follow old documentation links will not see the new results, and search engines will not realize the pages are effectively stale. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From aahz at pythoncraft.com Thu Oct 2 16:47:13 2008 From: aahz at pythoncraft.com (Aahz) Date: Thu, 2 Oct 2008 07:47:13 -0700 Subject: [Python-Dev] python docs website not accessible! In-Reply-To: References: Message-ID: <20081002144713.GA5977@panix.com> On Thu, Oct 02, 2008, itconsense at gmail.com wrote: > >I'm not sure if this is the right place to post. The python-docs on >www.python.org are not accessible. This is definitely the wrong place to post. As usual for most sites, webmaster at python.org is the right place. But please don't bother, we've already had about twenty other people reporting. ;-) -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "...if I were on life-support, I'd rather have it run by a Gameboy than a Windows box." --Cliff Wells, comp.lang.python, 3/13/2002 From g.brandl at gmx.net Thu Oct 2 19:17:40 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 02 Oct 2008 19:17:40 +0200 Subject: [Python-Dev] Doc nits question In-Reply-To: <18122A2B-CCB0-4850-BDA4-10291780E9D2@acm.org> References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> <18122A2B-CCB0-4850-BDA4-10291780E9D2@acm.org> Message-ID: Fred Drake wrote: > On Oct 2, 2008, at 9:21 AM, Georg Brandl wrote: >> I intend to set things up so that the docs at docs.python.org are >> continually >> rebuilt, just like the /dev docs were until now. > > Wonderful! This should help avoid repeat reports of simple typos. > > At one point, we started to separate the documentation releases so > that update releases could be easily pushed at times when there wasn't > a corresponding Python release. There are a couple of examples of > these in the specific-versions list, IIRC. These have version numbers > ending with "p1" (for patch 1; no more than one patched version was > ever released for any particular Python version).
> > It may be worth trying this or something like it again as well, if > there's enough volunteer time available. Such versions would need to > be clearly marked on every page as to when they were updated, so that > readers can tell if they have the latest update. All Sphinx-generated pages currently have a "last update on:" in the footer. Do you think that suffices for this purpose? Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From fdrake at acm.org Thu Oct 2 19:22:11 2008 From: fdrake at acm.org (Fred Drake) Date: Thu, 02 Oct 2008 13:22:11 -0400 Subject: [Python-Dev] Doc nits question In-Reply-To: References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> <18122A2B-CCB0-4850-BDA4-10291780E9D2@acm.org> Message-ID: <16EFB37E-A1F8-438F-BBB6-9D3A30C50B73@acm.org> On Oct 2, 2008, at 1:17 PM, Georg Brandl wrote: > All Sphinx-generated pages currently have a "last update on:" in the > footer. > Do you think that suffices for this purpose? Yes, I do. -Fred -- Fred L. Drake, Jr. From theller at ctypes.org Thu Oct 2 19:54:23 2008 From: theller at ctypes.org (Thomas Heller) Date: Thu, 02 Oct 2008 19:54:23 +0200 Subject: [Python-Dev] Real segmentation fault handler In-Reply-To: <200809300105.53473.victor.stinner@haypocalc.com> References: <200809300105.53473.victor.stinner@haypocalc.com> Message-ID: Victor Stinner schrieb: > Hi, > > I would like to be able to catch SIGSEGV in my Python code! So I started to > hack Python trunk to support this feature. The idea is to use a signal > handler which call longjmp(), and add setjmp() at Py_EvalFrameEx() enter. On windows, ctypes catches fatal errors (exception violations) in foreign function calls, thanks to windows structured exception handling. 
On other platforms, there is the WAD module by David Beazley which may do something similar: http://www.dabeaz.com/papers/Python2001/python.html I do not know whether the code itself is still available or not. -- Thanks, Thomas From fredrik at pythonware.com Thu Oct 2 19:50:30 2008 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 02 Oct 2008 19:50:30 +0200 Subject: [Python-Dev] c99 comments in the 2.6 code base? Message-ID: http://drj11.wordpress.com/2008/10/02/python-and-bragging-about-c89/ mentions that Objects/frameobject.c contains a C99-style comment, which means that Python 2.6 won't build on AIX. shouldn't we use a suitable gcc option for the buildbots to prevent that from happening? From lists at cheimes.de Thu Oct 2 20:23:18 2008 From: lists at cheimes.de (Christian Heimes) Date: Thu, 02 Oct 2008 20:23:18 +0200 Subject: [Python-Dev] c99 comments in the 2.6 code base? In-Reply-To: References: Message-ID: Fredrik Lundh wrote: > http://drj11.wordpress.com/2008/10/02/python-and-bragging-about-c89/ > > mentions that Objects/frameobject.c contains a C99-style comment, which > means that Python 2.6 won't build on AIX. > > shouldn't we use a suitable gcc option for the buildbots to prevent that > from happening? Ouch! This shouldn't have happend. I'm going to discuss the matter on #python-dev. Perhaps --with-pydebug could add more restrict error checking to the Makefile like -std=c89 -pedantic -Werror Christian From solipsis at pitrou.net Thu Oct 2 20:37:15 2008 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 2 Oct 2008 18:37:15 +0000 (UTC) Subject: [Python-Dev] c99 comments in the 2.6 code base? References: Message-ID: Christian Heimes cheimes.de> writes: > > Ouch! This shouldn't have happend. I'm going to discuss the matter on > #python-dev. Perhaps --with-pydebug could add more restrict error > checking to the Makefile like -std=c89 -pedantic -Werror As discussed on python-dev, I think it should also be added in release mode. 
Some developers probably never compile in debug mode (*), and compiling in release mode is useful when you want to do performance tuning. (*) not thinking of anyone in particular ! Regards Antoine. From g.brandl at gmx.net Thu Oct 2 21:16:00 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 02 Oct 2008 21:16:00 +0200 Subject: [Python-Dev] Bugfix porting policy (was Re: Doc nits question) In-Reply-To: References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> <4222a8490810020626n3b3c75bo39f4b4552d6773d7@mail.gmail.com> Message-ID: Just now, Christian decided for option 2... Georg > This is another thing that needs to be discussed: how to handle backports > between 2.6 and 2.7. Up to now, we backported changes from trunk to maint > manually, but after the experience we've had with svnmerge, I see several > possibilities: > > 1. Do bugfixes in maint, merge them to trunk via svnmerge. This has the > drawback that you have to work in two different repos for fixes vs. > new features. The advantage however is that normally all fixes that > go into maint apply to trunk as well, so almost no blocks need to be done. > However, since Py3k merges are done from trunk, the 3k merge will see > merges as single commits, so they aren't easy to block if not applicable. > This will mean more conflicts. > > 2. Do bugfixes in trunk, and merge them to maint via svnmerge. > Arguments as for 1, but reversed: many blocks, but less problems with 3k. > > 3. Backport bugfixes manually, like for the previous maintenance branches. -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. 
From barry at python.org Thu Oct 2 21:27:10 2008 From: barry at python.org (Barry Warsaw) Date: Thu, 2 Oct 2008 15:27:10 -0400 Subject: [Python-Dev] RELEASED Python 2.6 final In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 1, 2008, at 11:46 PM, Barry Warsaw wrote: > On behalf of the Python development team and the Python community, I > am happy to announce the release of Python 2.6 final. This is the > production-ready version of the latest in the Python 2 series. > > Source tarballs, Windows installers, and Mac disk images can be > downloaded from the Python 2.6 page: > > http://www.python.org/download/releases/2.6/ Due to a minor snafu in our build scripts, the source tgz and tar.bz2 files contained some extra cruft. I have created and uploaded new tarballs but I have /not/ bumped the Python version number since they were made from exactly the same Subversion tag. The new tarballs are identical to the originals except that they don't contain the cruft (.svn files and such). If you have already downloaded the tarballs, you do not need to download the new ones. The new tarballs are about 2MB smaller though. With apologies, - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCUAwUBSOUgjnEjvBPtnXfVAQJ3gQP4mxxW9kaaMlsg7yd1uNcgYa29pitYdF2+ DhFgrCajPZpskc3XlKbPcnPJWT8wtI/EIC5QcPEpAWCHECrTUHzPyGLNeMQz0kFF /ZGCGbef7Mc/JaZvEyF6OATnKhYA5XyUOPdddygx6oar/Y6ZbK2JyLR4pvzh+gtQ SA+u6OPIpQ== =7uu8 -----END PGP SIGNATURE----- From lists at cheimes.de Thu Oct 2 21:47:15 2008 From: lists at cheimes.de (Christian Heimes) Date: Thu, 02 Oct 2008 21:47:15 +0200 Subject: [Python-Dev] c99 comments in the 2.6 code base? In-Reply-To: References: Message-ID: Fredrik Lundh wrote: > http://drj11.wordpress.com/2008/10/02/python-and-bragging-about-c89/ I've found several more occasions of // comments and one usage of inline. 
We *really* should have some way to compile Python with C89 checks. Python doesn't compile with the -pedantic option but it compiles with -std=c89 -Werror after I've applied some patches. I've added a new make target to add extra checks. Maybe the build bots could use "make c89" instead of "make" to build Python? c89: $(MAKE) CFLAGS="$(CFLAGS) -std=c89 -Werror" Christian From martin at v.loewis.de Thu Oct 2 22:13:13 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Thu, 02 Oct 2008 22:13:13 +0200 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> Message-ID: <48E52B59.7060109@v.loewis.de> > Why not use versioned URLs, but with a link at the top of old pages > saying they're outdated, linking to the new version. Either way they > should update their links, but this way you don't shoot them in the > foot to do it. Wouldn't that require changes to the old pages? Regards, Martin From martin at v.loewis.de Thu Oct 2 22:18:31 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Thu, 02 Oct 2008 22:18:31 +0200 Subject: [Python-Dev] Bugfix porting policy (was Re: Doc nits question) In-Reply-To: References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> <4222a8490810020626n3b3c75bo39f4b4552d6773d7@mail.gmail.com> Message-ID: <48E52C97.5030909@v.loewis.de> >> 2. Do bugfixes in trunk, and merge them to maint via svnmerge. >> Arguments as for 1, but reversed: many blocks, but less problems with 3k. I'm not so sure that we need to block all the changes that we don't want, though: it would be sufficient to just not merge them, right?
(of course, somebody could go over it from time to time and block everything older than a month that was still available, just to prevent accidental merging) Regards, Martin From martin at v.loewis.de Thu Oct 2 22:19:17 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Thu, 02 Oct 2008 22:19:17 +0200 Subject: [Python-Dev] c99 comments in the 2.6 code base? In-Reply-To: References: Message-ID: <48E52CC5.8080107@v.loewis.de> > shouldn't we use a suitable gcc option for the buildbots to prevent that > from happening? Which one specifically? Regards, Martin From musiccomposition at gmail.com Thu Oct 2 22:26:20 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Thu, 2 Oct 2008 15:26:20 -0500 Subject: [Python-Dev] Bugfix porting policy (was Re: Doc nits question) In-Reply-To: <48E52C97.5030909@v.loewis.de> References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> <4222a8490810020626n3b3c75bo39f4b4552d6773d7@mail.gmail.com> <48E52C97.5030909@v.loewis.de> Message-ID: <1afaf6160810021326m5b54ad0dh2f890231be451b88@mail.gmail.com> On Thu, Oct 2, 2008 at 3:18 PM, "Martin v. Löwis" wrote: >>> 2. Do bugfixes in trunk, and merge them to maint via svnmerge. >>> Arguments as for 1, but reversed: many blocks, but less problems with 3k. > > I'm not so sure that we need to block all the changes that we don't > want, though: it would be sufficient to just not merge them, right? A large merge queue would accumulate, making it hard for someone to pick out the bugfixes. Of course, people could just merge fixes right after they apply them to the trunk, though.
> > (of course, somebody could go over it from time to time and block > everything older than a month that was still available, just to prevent > accidental merging) > > Regards, > Martin -- Cheers, Benjamin Peterson "There's no place like 127.0.0.1." From martin at v.loewis.de Thu Oct 2 22:31:38 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Thu, 02 Oct 2008 22:31:38 +0200 Subject: [Python-Dev] Bugfix porting policy (was Re: Doc nits question) In-Reply-To: <1afaf6160810021326m5b54ad0dh2f890231be451b88@mail.gmail.com> References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> <4222a8490810020626n3b3c75bo39f4b4552d6773d7@mail.gmail.com> <48E52C97.5030909@v.loewis.de> <1afaf6160810021326m5b54ad0dh2f890231be451b88@mail.gmail.com> Message-ID: <48E52FAA.6000104@v.loewis.de> > A large merge queue would accumulate, making it hard for someone to pick > out the bugfixes. Of course, people could just merge fixes right after > they apply them to the trunk, though. I think they should. To my knowledge, nobody goes through the changelog anymore trying to find out what needs backporting. Tracking what still needs merging should happen in the bug tracker, by leaving the report open until merging has been done. Every change that isn't immediately merged and doesn't have an open issue just won't get merged at all.
Regards, Martin From rhamph at gmail.com Thu Oct 2 22:31:59 2008 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 2 Oct 2008 14:31:59 -0600 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: <48E52B59.7060109@v.loewis.de> References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> <48E52B59.7060109@v.loewis.de> Message-ID: On Thu, Oct 2, 2008 at 2:13 PM, "Martin v. Löwis" wrote: >> Why not use versioned URLs, but with a link at the top of old pages >> saying they're outdated, linking to the new version. Either way they >> should update their links, but this way you don't shoot them in the >> foot to do it. > > Wouldn't that require changes to the old pages? Hopefully just to whatever common templating they're using. I'm not familiar with how they're generated though. -- Adam Olsen, aka Rhamphoryncus From g.brandl at gmx.net Thu Oct 2 22:33:49 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 02 Oct 2008 22:33:49 +0200 Subject: [Python-Dev] Bugfix porting policy (was Re: Doc nits question) In-Reply-To: <48E52FAA.6000104@v.loewis.de> References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> <4222a8490810020626n3b3c75bo39f4b4552d6773d7@mail.gmail.com> <48E52C97.5030909@v.loewis.de> <1afaf6160810021326m5b54ad0dh2f890231be451b88@mail.gmail.com> <48E52FAA.6000104@v.loewis.de> Message-ID: Martin v. Löwis wrote: >> A large merge queue would accumulate, making it hard for someone to pick >> out the bugfixes. Of course, people could just merge fixes right after >> they apply them to the trunk, though. > > I think they should. To my knowledge, nobody goes through the changelog > anymore trying to find out what needs backporting. Tracking what still > needs merging should happen in the bug tracker, by leaving the report > open until merging has been done.
Every change that isn't immediately > merged and doesn't have an open issue just won't get merged at all. This is why it's good to track what was merged and what not via svnmerge, because it cannot miss commits. It also is easy for someone to select which stuff to merge if the commit message on the trunk indicates backportable fixes. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From martin at v.loewis.de Thu Oct 2 22:36:35 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Thu, 02 Oct 2008 22:36:35 +0200 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> <48E52B59.7060109@v.loewis.de> Message-ID: <48E530D3.1070907@v.loewis.de> >> Wouldn't that require changes to the old pages? > > Hopefully just to whatever common templating they're using. I'm not > familiar with how they're generated though. That's exactly the problem: they are generated. I don't think it's feasible to regenerate them, and still expect the output to be the same. Also, I don't think the generator supports templating in the way you might expect it to. To be specific, it's latex2html.
Regards, Martin From g.brandl at gmx.net Thu Oct 2 22:38:42 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 02 Oct 2008 22:38:42 +0200 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: <9e804ac0810020741g7d7cb193h647b512a0efec68b@mail.gmail.com> References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> <48E4BBFA.7010800@gmail.com> <18660.53751.895687.800610@montanaro-dyndns-org.local> <9e804ac0810020741g7d7cb193h647b512a0efec68b@mail.gmail.com> Message-ID: Thomas Wouters schrieb: > After discussing on #python-dev (briefly), I made the toplevel > directories refer to the new, 2.6 toplevel directories, but deeper URLs > in the old directories redirect to www.python.org/doc/2.5.2/ > . I still think this is the wrong > approach, especially in the long term: it means people who just follow > old documentation links will not see the new results, and search engines > will not realize the pages are effectively stale. I'll work on a more thorough redirection in about two weeks' time. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. 
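The redirect behaviour Thomas describes (top-level old directories go to the new 2.6 directories, deeper old URLs go to the frozen 2.5.2 docs) can be sketched in Python as a small lookup. This is only an illustration, not the server configuration actually deployed: the function name `redirect_target` and the old-to-new pairing are assumptions. The directory names come from Nick's two lists, except "library", which is a guess because it does not appear in his list of new names.

```python
# Old abbreviated directory -> new directory. The pairing is an assumption
# built from the two lists in Nick's message; "lib" -> "library" is a guess.
OLD_TO_NEW = {
    "tut": "tutorial",
    "lib": "library",
    "mac": "using",
    "ref": "reference",
    "ext": "extending",
    "api": "c-api",
    "doc": "documenting",
    "inst": "install",
    "dist": "distutils",
}

NEW_ROOT = "http://docs.python.org/"
OLD_ROOT = "http://www.python.org/doc/2.5.2/"


def redirect_target(path):
    """Return the URL an old-style doc path should redirect to, or None.

    Hypothetical helper: top-level old directories map to the new docs,
    deeper old URLs map to the 2.5.2 docs, anything else is left alone.
    """
    parts = path.strip("/").split("/", 1)
    top = parts[0]
    if top not in OLD_TO_NEW:
        return None  # new-style or unknown path: no redirect
    rest = parts[1] if len(parts) > 1 else ""
    if not rest:
        # Top-level directory: send the reader to the current docs.
        return NEW_ROOT + OLD_TO_NEW[top] + "/"
    # Deep link: the page has no known counterpart, keep the 2.5.2 copy.
    return OLD_ROOT + top + "/" + rest
```

Because the old names are all abbreviations and the new names are not, the lookup never touches a new-style path, which is the property Nick points out. A more thorough redirection would replace the deep-link branch with a per-page mapping.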
From ncoghlan at gmail.com Thu Oct 2 23:49:23 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 03 Oct 2008 07:49:23 +1000 Subject: [Python-Dev] www.python.org/doc and docs.python.org hotfixed In-Reply-To: <9e804ac0810020741g7d7cb193h647b512a0efec68b@mail.gmail.com> References: <9e804ac0810020259w4b612cefk7f59f288c52e42d3@mail.gmail.com> <9e804ac0810020428r51a31e5cha910842172db94c6@mail.gmail.com> <48E4BBFA.7010800@gmail.com> <18660.53751.895687.800610@montanaro-dyndns-org.local> <9e804ac0810020741g7d7cb193h647b512a0efec68b@mail.gmail.com> Message-ID: <48E541E3.4080709@gmail.com> Thomas Wouters wrote: > > > On Thu, Oct 2, 2008 at 15:51, > > wrote: > > > Nick> The old doc directories are already kept around (all the > way back > Nick> to 1.4 in fact: http://www.python.org/doc/1.4/) > > Nick> As a quick fix for the old links, a rewrite rule to map > such links > Nick> to the 2.5 docs seems like a very good idea to me. Since > old URLs > Nick> all use abbreviations in the directory name (tut, lib, mac, > ref, > Nick> ext, api, doc, inst, dist), it should be straightforward to > Nick> redirect them without affecting the links to the new docs > Nick> (tutorial, using, reference, howto, extending, c-api, install, > Nick> distutils, documenting). > > Yes, we should probably still get the top-level links redirected to > the new > docs though. The 2.5 tutorial is probably going to get stale over time > while the 2.6 version will be updated at least until 2.7 is released. > > > After discussing on #python-dev (briefly), I made the toplevel > directories refer to the new, 2.6 toplevel directories, but deeper URLs > in the old directories redirect to www.python.org/doc/2.5.2/ > . I still think this is the wrong > approach, especially in the long term: it means people who just follow > old documentation links will not see the new results, and search engines > will not realize the pages are effectively stale. 
Agreed, but I think it's a better near-term solution than dumping deep links back at the top of the relevant document (it always annoys me when web sites do that). Long term, remapping even the deep links to the appropriate part of the new docs should hopefully be possible. For the search engine issue, is there any way we can tell robots to ignore the rewrite rules so they see the broken links? (although even that may not be ideal, since what we really want is to tell the robot the link is broken, and provide the new alternative) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From musiccomposition at gmail.com Fri Oct 3 00:13:16 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Thu, 2 Oct 2008 17:13:16 -0500 Subject: [Python-Dev] 2to3 bug fixes Message-ID: <1afaf6160810021513u78a76355m22637eb20b6e0205@mail.gmail.com> What should the policy on 2to3 bug fixes be for the maintenance branch? I'm asking because I remember vaguely someone suggesting that new 2to3 fixers could fit into that category. So, should I only merge "pure" bug fixes, or do I get to stretch the definition a little? -- Cheers, Benjamin Peterson "There's no place like 127.0.0.1." From guido at python.org Fri Oct 3 00:16:45 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 2 Oct 2008 15:16:45 -0700 Subject: [Python-Dev] 2to3 bug fixes In-Reply-To: <1afaf6160810021513u78a76355m22637eb20b6e0205@mail.gmail.com> References: <1afaf6160810021513u78a76355m22637eb20b6e0205@mail.gmail.com> Message-ID: On Thu, Oct 2, 2008 at 3:13 PM, Benjamin Peterson wrote: > What should the policy on 2to3 bug fixes be for the maintenance > branch? I'm asking because I remember vaguely someone suggesting that > new 2to3 fixers could fit into that category. > > So, should I only merge "pure" bug fixes, or do I get to stretch the > definition a little? 
We agreed that new fixers (and possibly other changes) to 2to3 are fair game for bugfix releases. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From victor.stinner at haypocalc.com Fri Oct 3 00:36:29 2008 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 3 Oct 2008 00:36:29 +0200 Subject: [Python-Dev] 2to3 bug fixes In-Reply-To: <1afaf6160810021513u78a76355m22637eb20b6e0205@mail.gmail.com> References: <1afaf6160810021513u78a76355m22637eb20b6e0205@mail.gmail.com> Message-ID: <200810030036.29727.victor.stinner@haypocalc.com> On Friday 03 October 2008 00:13:16, Benjamin Peterson wrote: > What should the policy on 2to3 bug fixes be for the maintenance > branch? I'm asking because I remember vaguely someone suggesting that > new 2to3 fixers could fit into that category. Python3 removes os.getcwdu() and introduces os.getcwdb(). A fixer should replace "os.getcwdu()" with "os.getcwd()". See for example the attached fixer (which also replaces "getcwdu()" with "getcwd()"). -- Victor Stinner aka haypo http://www.haypocalc.com/blog/ [Attachment scrubbed: fix_getcwdu.py, application/x-python, 572 bytes] From musiccomposition at gmail.com Fri Oct 3 01:06:33 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Thu, 2 Oct 2008 18:06:33 -0500 Subject: [Python-Dev] 2to3 bug fixes In-Reply-To: <200810030036.29727.victor.stinner@haypocalc.com> References: <1afaf6160810021513u78a76355m22637eb20b6e0205@mail.gmail.com> <200810030036.29727.victor.stinner@haypocalc.com> Message-ID: <1afaf6160810021606w2e5a9810gec94097ab5205ff6@mail.gmail.com> On Thu, Oct 2, 2008 at 5:36 PM, Victor Stinner wrote: > On Friday 03 October 2008 00:13:16, Benjamin Peterson wrote: >> What should the policy on 2to3 bug fixes be for the maintenance >> branch?
I'm asking because I remember vaguely someone suggesting that >> new 2to3 fixers could fit into that category. > > Python3 removes os.getcwdu() and introduces os.getcwdb(). A fixer should > replace "os.getcwdu()" with "os.getcwd()". See for example the attached fixer > (which also replaces "getcwdu()" with "getcwd()"). Once again, please post it to the tracker and assign it to me. -- Cheers, Benjamin Peterson "There's nothing quite as beautiful as an oboe... except a chicken stuck in a vacuum cleaner." From lists at cheimes.de Fri Oct 3 01:30:59 2008 From: lists at cheimes.de (Christian Heimes) Date: Fri, 03 Oct 2008 01:30:59 +0200 Subject: [Python-Dev] c99 comments in the 2.6 code base? In-Reply-To: <48E52CC5.8080107@v.loewis.de> References: <48E52CC5.8080107@v.loewis.de> Message-ID: <48E559B3.6020408@cheimes.de> Martin v. Löwis wrote: >> shouldn't we use a suitable gcc option for the buildbots to prevent that >> from happening? > > Which one specifically? I suggest we add "-std=c89" to CFLAGS. We could also add a new target called buildbot to the Makefile that appends "-std=c89 -Werror" to CFLAGS. I don't think it's wise to add "-Werror" to the standard build target. However a new build target with extra checks should help to detect errors much sooner. Christian From martin at v.loewis.de Fri Oct 3 02:11:11 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 03 Oct 2008 02:11:11 +0200 Subject: [Python-Dev] c99 comments in the 2.6 code base? In-Reply-To: <48E559B3.6020408@cheimes.de> References: <48E52CC5.8080107@v.loewis.de> <48E559B3.6020408@cheimes.de> Message-ID: <48E5631F.3000104@v.loewis.de> >>> shouldn't we use a suitable gcc option for the buildbots to prevent that >>> from happening? >> >> Which one specifically? > > I suggest we add "-std=c89" to CFLAGS. That needs thorough testing, in particular across many old Linux distributions.
It might be that some sets of Linux header files rely on GNU C extensions, without using the __extension__ keyword. > We could also add a new target > called buildbot to the Makefile that appends "-std=c89 -Werror" to > CFLAGS. I don't think it's wise to add "-Werror" to the standard build > target. However a new build target with extra checks should help to > detect errors much sooner. That would need to go along with a policy that Python must never cause GCC to emit any warnings. Regards, Martin From skip at pobox.com Fri Oct 3 02:40:20 2008 From: skip at pobox.com (skip at pobox.com) Date: Thu, 2 Oct 2008 19:40:20 -0500 Subject: [Python-Dev] c99 comments in the 2.6 code base? In-Reply-To: <48E5631F.3000104@v.loewis.de> References: <48E52CC5.8080107@v.loewis.de> <48E559B3.6020408@cheimes.de> <48E5631F.3000104@v.loewis.de> Message-ID: <18661.27124.75089.716559@montanaro-dyndns-org.local> >>>> shouldn't we use a suitable gcc option for the buildbots to prevent >>>> that from happening? >>> >>> Which one specifically? >> >> I suggest we add "-std=c89" to CFLAGS. Martin> That needs thorough testing, in particular across many old Linux Martin> distributions. It might be that some sets of Linux header files Martin> rely on GNU C extensions, without using the __extension__ Martin> keyword. Surely we don't need to be that careful with the buildbots do we? If anything, I think it would be a good idea to be more strict for them than the default. 
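Independent of which compiler flags the buildbots end up using, the C99-style comments themselves can be found with a plain text scan. The following is a rough sketch of that idea, not the gcc-based check proposed in this thread. The function name `find_cpp_comments` is hypothetical, and the scanner is deliberately naive: it skips string literals, character literals and /* */ comments, but knows nothing about the preprocessor or about backslash-continued // comments.

```python
def find_cpp_comments(source):
    """Return 1-based line numbers where a // comment starts.

    Naive sketch: // inside string literals, character literals and
    /* ... */ comments is ignored; preprocessor tricks are not handled.
    """
    hits = []
    line = 1
    i = 0
    n = len(source)
    state = "code"  # one of: code, block, string, char
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""
        if ch == "\n":
            line += 1
        if state == "code":
            if ch == "/" and nxt == "*":
                state = "block"
                i += 2
                continue
            if ch == "/" and nxt == "/":
                hits.append(line)
                # Skip the rest of the comment; the newline is counted
                # on the next pass through the loop.
                while i < n and source[i] != "\n":
                    i += 1
                continue
            if ch == '"':
                state = "string"
            elif ch == "'":
                state = "char"
        elif state == "block":
            if ch == "*" and nxt == "/":
                state = "code"
                i += 2
                continue
        else:  # inside a string or character literal
            if ch == "\\":
                if nxt == "\n":
                    line += 1  # escaped newline inside a literal
                i += 2  # skip the escaped character
                continue
            if (state == "string" and ch == '"') or (state == "char" and ch == "'"):
                state = "code"
        i += 1
    return hits
```

Running this over each .c and .h file and reporting any non-empty result would flag the kind of comment found in Objects/frameobject.c. It would not catch other C99 features such as the `inline` usage Christian mentions.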
Skip From tjreedy at udel.edu Fri Oct 3 02:54:48 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 02 Oct 2008 20:54:48 -0400 Subject: [Python-Dev] Doc nits question In-Reply-To: References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> <18122A2B-CCB0-4850-BDA4-10291780E9D2@acm.org> Message-ID: Georg Brandl wrote: > Fred Drake schrieb: >> On Oct 2, 2008, at 9:21 AM, Georg Brandl wrote: >>> I intend to set things up so that the docs at docs.python.org are >>> continually >>> rebuilt, just like the /dev docs were until now. Will you do the same for the 3.0 version? http://docs.python.org/dev/3.0/ The following page has no reference to the 3.0 version http://www.python.org/doc/versions/ The 'unreleased' link at the top goes to a link that only referenced the SVN version and not the built version above. Adding a link to the build version would have made it easier to find ;-). > All Sphinx-generated pages currently have a "last update on:" in the footer. > Do you think that suffices for this purpose? It certainly would limit a search for closed issues not incorporated in the update (to avoiding duplication). From nnorwitz at gmail.com Fri Oct 3 06:04:32 2008 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 2 Oct 2008 21:04:32 -0700 Subject: [Python-Dev] Doc nits question In-Reply-To: References: <4222a8490810020608p7dffab32v53287577dc447edb@mail.gmail.com> Message-ID: On Thu, Oct 2, 2008 at 6:21 AM, Georg Brandl wrote: > Jesse Noller schrieb: >> So, we just released and there are a few doc typo bugs being filed - >> my question is if all doc-fixes have to wait for 2.6.1/2.7 or if we >> can hotfix the 2.6 docs? > > I intend to set things up so that the docs at docs.python.org are continually > rebuilt, just like the /dev docs were until now. The 2.6 docs are now updated similar to how 2.5 was (hourly). 2.5 docs are no longer updated. In case you can't guess the url, it's: http://docs.python.org/dev/2.6/ 3.0 should continue to work. 
Let me know if you have any problems. n > > Georg > > -- > Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. > Four shall be the number of spaces thou shalt indent, and the number of thy > indenting shall be four. Eight shalt thou not indent, nor either indent thou > two, excepting that thou then proceed to four. Tabs are right out. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/nnorwitz%40gmail.com > From adde at trialcode.com Fri Oct 3 12:10:59 2008 From: adde at trialcode.com (Andreas Nilsson) Date: Fri, 3 Oct 2008 12:10:59 +0200 Subject: [Python-Dev] if-syntax for regular for-loops Message-ID: <652F78A2-2DD3-474E-9599-C7B6A3BA1097@trialcode.com> Hi. First post so here it goes. My name is Adde, and I'm a Swedish software developer. I've been programming for about 23 years now since starting with Basic on the C64. I've been through most well known and a couple of lesser known languages in search of the perfect one. At the moment I'm writing a custom ctypes interface to the Firebird database (need access to advanced features, portability to Windows and I definitely don't enjoy setting up GCC on Windows). I've programmed a lot of C/C++ in my days so I thought I'd at least join the list and see if anything piques my interest enough to dive in. With that out of the way, on to todays subject: I use list comprehensions and generator expressions a lot and lately I've found myself writing a lot of code like this: for i in items if i.some_field == some_value: i.do_something() Naturally it won't work but it seems like a pretty straight-forward extension to allow compressing simple loops to fit on one line. The alternative, in my eyes, suggests there's something more happening than a simple include-test which makes it harder to comprehend. 
for i in items: if i.some_field == some_value: i.do_something() One possibility of course is to use a generator-expression but that makes it look like there are two for loops and it feels like a waste setting up a generator just for filtering. for i in (i for i in items if some_field == some_value): i.do_something() Stupid idea? Am I missing some obviously better way of achieving the same result? Thanks, Adde From 00ai99 at gmail.com Fri Oct 3 12:35:12 2008 From: 00ai99 at gmail.com (David Gowers) Date: Fri, 3 Oct 2008 20:05:12 +0930 Subject: [Python-Dev] if-syntax for regular for-loops In-Reply-To: <652F78A2-2DD3-474E-9599-C7B6A3BA1097@trialcode.com> References: <652F78A2-2DD3-474E-9599-C7B6A3BA1097@trialcode.com> Message-ID: <23f4e3390810030335m69f53958v2d3bcf0617a8b65e@mail.gmail.com> Hi Andreas, On Fri, Oct 3, 2008 at 7:40 PM, Andreas Nilsson wrote: > Hi. > First post so here it goes. > My name is Adde, and I'm a Swedish software developer. I've been programming > for about 23 years now since starting with Basic on the C64. I've been > through most well known and a couple of lesser known languages in search of > the perfect one. At the moment I'm writing a custom ctypes interface to the > Firebird database (need access to advanced features, portability to Windows > and I definitely don't enjoy setting up GCC on Windows). > I've programmed a lot of C/C++ in my days so I thought I'd at least join the > list and see if anything piques my interest enough to dive in. > > With that out of the way, on to todays subject: > I use list comprehensions and generator expressions a lot and lately I've > found myself writing a lot of code like this: > > for i in items if i.some_field == some_value: i.do_something() > > Naturally it won't work but it seems like a pretty straight-forward > extension to allow compressing simple loops to fit on one line. 
> The alternative, in my eyes, suggests there's something more happening than a
> simple include-test which makes it harder to comprehend.
>
> for i in items:
>     if i.some_field == some_value: i.do_something()
>
> One possibility of course is to use a generator-expression but that makes it
> look like there are two for loops and it feels like a waste setting up a
> generator just for filtering.
>
> for i in (i for i in items if some_field == some_value):
>     i.do_something()
>
> Stupid idea? Am I missing some obviously better way of achieving the same
> result?

List comprehension.

    [i.do_something() for i in items if i.some_field == some_value]

The restriction is that whatever you put at the front must be an
expression, not a statement. For example, under Python 2 (where print is
a statement)

    [print(i) for i in range(9) if i % 2]

fails with SyntaxError, whereas

    def f(v):
        print(v)
    [f(i) for i in range(9) if i % 2]

correctly prints

    1
    3
    5
    7

HTH,
David

--
Everything has reasons. Nothing has justification.
Ĉio havas kialojn; Neniaĵo havas pravigeron.

From leif.walsh at gmail.com Fri Oct 3 16:29:33 2008
From: leif.walsh at gmail.com (Leif Walsh)
Date: Fri, 3 Oct 2008 10:29:33 -0400
Subject: [Python-Dev] if-syntax for regular for-loops
In-Reply-To: <652F78A2-2DD3-474E-9599-C7B6A3BA1097@trialcode.com>
References: <652F78A2-2DD3-474E-9599-C7B6A3BA1097@trialcode.com>
Message-ID:

On Fri, Oct 3, 2008 at 6:10 AM, Andreas Nilsson wrote:
> With that out of the way, on to todays subject:
> I use list comprehensions and generator expressions a lot and lately I've
> found myself writing a lot of code like this:
>
> for i in items if i.some_field == some_value: i.do_something()
>
> Naturally it won't work but it seems like a pretty straight-forward
> extension to allow compressing simple loops to fit on one line. The
> alternative, in my eyes, suggests there's something more happening than a
> simple include-test which makes it harder to comprehend.
> > for i in items: > if i.some_field == some_value: i.do_something() > > One possibility of course is to use a generator-expression but that makes it > look like there are two for loops and it feels like a waste setting up a > generator just for filtering. > > for i in (i for i in items if some_field == some_value): > i.do_something() > > Stupid idea? Am I missing some obviously better way of achieving the same > result? It's been discussed already. Essentially, all that saves is a newline or two, which, as I think has been generally accepted, tends to hurt readability. Here's the full discussion: http://www.mail-archive.com/python-dev at python.org/msg29276.html Other than that, welcome! -- Cheers, Leif From status at bugs.python.org Fri Oct 3 18:06:35 2008 From: status at bugs.python.org (Python tracker) Date: Fri, 3 Oct 2008 18:06:35 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20081003160635.D8885785E0@psf.upfronthosting.co.za> ACTIVITY SUMMARY (09/26/08 - 10/03/08) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue number. Do NOT respond to this message. 2074 open (+39) / 13779 closed (+16) / 15853 total (+55) Open issues with patches: 678 Average duration of open issues: 712 days. Median duration of open issues: 1835 days. 
Open Issues Breakdown open 2058 (+38) pending 16 ( +1) Issues Created Or Reopened (56) _______________________________ pprint._safe_repr is not general enough in one instance 09/26/08 http://bugs.python.org/issue3976 created erickt Check PyInt_AsSsize_t/PyLong_AsSsize_t error 09/26/08 CLOSED http://bugs.python.org/issue3977 created haypo patch ZipFileExt.read() can be incredibly slow 09/26/08 http://bugs.python.org/issue3978 created lightstruk patch Doctest failing when it should pass 09/26/08 CLOSED http://bugs.python.org/issue3979 created pupeno win32file.GetCommState incorrect handling of DCB 09/26/08 CLOSED http://bugs.python.org/issue3980 created jiaailun Python 3, IDLE does not start 09/27/08 CLOSED http://bugs.python.org/issue3981 created dah support .format for bytes 09/27/08 http://bugs.python.org/issue3982 created benjamin.peterson Typos in Documentation 09/28/08 CLOSED http://bugs.python.org/issue3983 created Bk python interpreter import dependency with disutils/util 09/28/08 http://bugs.python.org/issue3984 created tarek patch removed string module from distutils [patch] 09/28/08 http://bugs.python.org/issue3985 created tarek patch removed string and type usage from distutils.cmd [patch] 09/28/08 http://bugs.python.org/issue3986 created tarek patch removed types from distutils.core [patch] 09/28/08 http://bugs.python.org/issue3987 created tarek patch Byte warning mode and b'' != '' 09/28/08 http://bugs.python.org/issue3988 created christian.heimes patch Tools\Scripts\2to3.py broken under 3.0 rc1 Windows 09/28/08 CLOSED http://bugs.python.org/issue3989 created arnaud.faucher The Linux2 platform definition is incorrect for alpha, hppa, mip 09/28/08 http://bugs.python.org/issue3990 created ths patch urllib.request.urlopen does not handle non-ASCII characters 09/28/08 http://bugs.python.org/issue3991 created a.badger removed custom log from distutils 09/28/08 http://bugs.python.org/issue3992 created tarek patch Convert documentation to python 3. 
09/29/08 CLOSED http://bugs.python.org/issue3993 created LambertDW import fixer misses some symbols 09/29/08 http://bugs.python.org/issue3994 created mhammond iso-xxx/cp1252 inconsistencies in Python 2.* not in 3.* 09/29/08 CLOSED http://bugs.python.org/issue3995 created jmfauth PyOS_CheckStack does not work 09/29/08 http://bugs.python.org/issue3996 created amaury.forgeotdarc patch zipfile and winzip 09/29/08 http://bugs.python.org/issue3997 created vgeorge List.sort docstring has obsolete cmp reference. 09/29/08 CLOSED http://bugs.python.org/issue3998 created tjreedy easy Real segmentation fault handler 09/30/08 http://bugs.python.org/issue3999 created haypo patch Additional 2to3 documentation updates 09/30/08 http://bugs.python.org/issue4000 created LambertDW 2to3 does relative import for modules not in a package. 09/30/08 CLOSED http://bugs.python.org/issue4001 created mhammond patch A Bug in the Documentation 09/30/08 CLOSED http://bugs.python.org/issue4002 created fretai Reference leak in test_cprofile 09/30/08 CLOSED http://bugs.python.org/issue4003 created amaury.forgeotdarc patch, needs review missing newline in "Could not convert argument %s to string" err 09/30/08 http://bugs.python.org/issue4004 created haypo patch pydoc in web server mode tails at initial request 10/01/08 http://bugs.python.org/issue4005 created prologic patch, needs review os.getenv silently discards env variables with non-UTF-8 values 10/01/08 http://bugs.python.org/issue4006 created a.badger make clean fails to delete .a and .so.X.Y files 10/01/08 http://bugs.python.org/issue4007 created skip.montanaro patch, easy IDLE: checksyntax() doesn't support Unicode? 
10/01/08 http://bugs.python.org/issue4008 created haypo patch A Minor Glitch in the Documentation 10/01/08 CLOSED http://bugs.python.org/issue4009 created fretai configure options don't trickle down to distutils 10/01/08 http://bugs.python.org/issue4010 created skip.montanaro patch, easy Create DAG for PEP 101 10/01/08 http://bugs.python.org/issue4011 created brett.cannon Minor errors in multiprocessing docs 10/02/08 http://bugs.python.org/issue4012 created kjohnson Python 2.6 Doc/tools folder bigger than in 2.6rc2 10/02/08 http://bugs.python.org/issue4013 created koen Python-2.6-py2.6.egg-info contains Alpha reference 10/02/08 http://bugs.python.org/issue4014 created koen [patch] make installed scripts executable on windows 10/02/08 http://bugs.python.org/issue4015 created techtonik patch improve linecache: reuse tokenize.detect_encoding() and io.open( 10/02/08 http://bugs.python.org/issue4016 created haypo patch IDLE 2.6 broken on OSX (Leopard) 10/02/08 http://bugs.python.org/issue4017 created fzero "for me" installer problem on x64 Vista 10/02/08 http://bugs.python.org/issue4018 created jpe 2.6 (final) uses old icons in Start Menu 10/02/08 CLOSED http://bugs.python.org/issue4019 created craigneuro hasattr boundary case failure 10/02/08 CLOSED http://bugs.python.org/issue4020 created Xore tokenize.detect_encoding(): raise SyntaxError on codecs.lookup() 10/02/08 http://bugs.python.org/issue4021 created haypo patch, patch, needs review 2.6 dependent on c:\python26\ on windows 10/02/08 http://bugs.python.org/issue4022 created armandorowe convert os.getcwdu() to os.getcwd(), and getcwdu() to getcwd() 10/02/08 http://bugs.python.org/issue4023 created haypo patch float(0.0) singleton 10/03/08 http://bugs.python.org/issue4024 created ldeller patch C99 comments in Python 2.6 break build on AIX 6.1 10/03/08 http://bugs.python.org/issue4025 created drj fcntl extension fails to build on AIX 6.1 10/03/08 http://bugs.python.org/issue4026 created drj wrong page index number in 
reference book of python documentatio 10/03/08 http://bugs.python.org/issue4027 created ray Problem compiling the multiprocessing module on sunos5 10/03/08 http://bugs.python.org/issue4028 created jr244 Documentation displays incorrectly in iexplore. 10/03/08 http://bugs.python.org/issue4029 created LambertDW msi installer does not requires UAC permission on Vista 10/03/08 CLOSED http://bugs.python.org/issue4030 created willwill Fix for bugs relating to ntpath.expanduser() 10/02/08 http://bugs.python.org/issue957650 reopened christian.heimes patch Issues Now Closed (38) ______________________ Line numbers reported by extract_stack are offset by the #-*- en 138 days http://bugs.python.org/issue2832 benjamin.peterson Release notes for 2.6b2 call it an alpha release 61 days http://bugs.python.org/issue3457 akuchling Site-specific configuration hook documentation incorrect 52 days http://bugs.python.org/issue3510 akuchling Ctypes is confused by bitfields of varying integer types 48 days http://bugs.python.org/issue3547 theller patch, needs review super is a built-in type 33 days http://bugs.python.org/issue3736 rhettinger "make check" suggest a testing target under GNU coding standards 31 days http://bugs.python.org/issue3758 ralph.corderoy patch Integer overflow in _hashopenssl.c (CVE-2008-2316) 9 days http://bugs.python.org/issue3886 benjamin.peterson patch, patch, 64bit imageop issue 12 days http://bugs.python.org/issue3894 benjamin.peterson 2.6 regression in socket.ssl method 10 days http://bugs.python.org/issue3910 janssen patch ftplib.FTP.makeport() bug 8 days http://bugs.python.org/issue3911 benjamin.peterson patch Patch to implement a real ftplib test suite 6 days http://bugs.python.org/issue3939 benjamin.peterson patch Help in IDLE should open the chm file 6 days http://bugs.python.org/issue3941 georg.brandl PyObject_CheckReadBuffer crashes on memoryview object 3 days http://bugs.python.org/issue3946 benjamin.peterson patch, needs review Disable 
Py_USING_MEMORY_DEBUGGER! 7 days http://bugs.python.org/issue3951 haypo patch turtle.py - bug in Screen.__init__() 5 days http://bugs.python.org/issue3956 loewis patch Section permalink html anchors are wrong 3 days http://bugs.python.org/issue3960 georg.brandl bytearray().count() 1 days http://bugs.python.org/issue3967 amaury.forgeotdarc patch, needs review s_push: parser stack overflow MemoryError 1 days http://bugs.python.org/issue3971 benjamin.peterson collections.namedtuple uses exec to create new classes 5 days http://bugs.python.org/issue3974 rhettinger patch Check PyInt_AsSsize_t/PyLong_AsSsize_t error 3 days http://bugs.python.org/issue3977 benjamin.peterson patch Doctest failing when it should pass 0 days http://bugs.python.org/issue3979 georg.brandl win32file.GetCommState incorrect handling of DCB 0 days http://bugs.python.org/issue3980 loewis Python 3, IDLE does not start 1 days http://bugs.python.org/issue3981 dah Typos in Documentation 0 days http://bugs.python.org/issue3983 georg.brandl Tools\Scripts\2to3.py broken under 3.0 rc1 Windows 3 days http://bugs.python.org/issue3989 loewis Convert documentation to python 3. 0 days http://bugs.python.org/issue3993 georg.brandl iso-xxx/cp1252 inconsistencies in Python 2.* not in 3.* 2 days http://bugs.python.org/issue3995 rpetrov List.sort docstring has obsolete cmp reference. 0 days http://bugs.python.org/issue3998 benjamin.peterson easy 2to3 does relative import for modules not in a package. 
0 days http://bugs.python.org/issue4001 benjamin.peterson patch A Bug in the Documentation 0 days http://bugs.python.org/issue4002 georg.brandl Reference leak in test_cprofile 0 days http://bugs.python.org/issue4003 brett.cannon patch, needs review A Minor Glitch in the Documentation 0 days http://bugs.python.org/issue4009 georg.brandl 2.6 (final) uses old icons in Start Menu 1 days http://bugs.python.org/issue4019 loewis hasattr boundary case failure 0 days http://bugs.python.org/issue4020 benjamin.peterson msi installer does not requires UAC permission on Vista 0 days http://bugs.python.org/issue4030 loewis Traceback error when compiling Regex 921 days http://bugs.python.org/issue1456280 timehorse Use flush() before os.exevp() 711 days http://bugs.python.org/issue1579477 akuchling asyncore/asynchat patches 476 days http://bugs.python.org/issue1736190 giampaolo.rodola patch Top Issues Most Discussed (10) ______________________________ 27 os.listdir can return byte strings 101 days open http://bugs.python.org/issue3187 18 test_multiprocessing fails on systems with HAVE_SEM_OPEN=0 30 days open http://bugs.python.org/issue3770 12 "for me" installer problem on x64 Vista 1 days open http://bugs.python.org/issue4018 10 Regexp 2.7 (modifications to current re 2.2.2) 171 days open http://bugs.python.org/issue2636 9 Byte warning mode and b'' != '' 5 days open http://bugs.python.org/issue3988 9 support .format for bytes 6 days open http://bugs.python.org/issue3982 8 configure --with-threads on cygwin => crash on thread related t 10 days open http://bugs.python.org/issue3947 8 Multi-process 2to3 70 days open http://bugs.python.org/issue3448 7 Minor errors in multiprocessing docs 2 days open http://bugs.python.org/issue4012 7 configure options don't trickle down to distutils 2 days open http://bugs.python.org/issue4010 From algorias at yahoo.com Fri Oct 3 18:03:20 2008 From: algorias at yahoo.com (Vitor Bosshard) Date: Fri, 3 Oct 2008 09:03:20 -0700 (PDT) Subject: 
[Python-Dev] if-syntax for regular for-loops
Message-ID: <239314.61318.qm@web54401.mail.yahoo.com>

----- Original message ----
> From: Leif Walsh
> To: Andreas Nilsson
> CC: python-dev at python.org
> Sent: Friday, October 3, 2008 10:29:33
> Subject: Re: [Python-Dev] if-syntax for regular for-loops
>
> On Fri, Oct 3, 2008 at 6:10 AM, Andreas Nilsson wrote:
> > With that out of the way, on to todays subject:
> > I use list comprehensions and generator expressions a lot and lately I've
> > found myself writing a lot of code like this:
> >
> > for i in items if i.some_field == some_value: i.do_something()
> >
> > Naturally it won't work but it seems like a pretty straight-forward
> > extension to allow compressing simple loops to fit on one line. The
> > alternative, in my eyes, suggests there's something more happening than a
> > simple include-test which makes it harder to comprehend.
> >
> > for i in items:
> >         if i.some_field == some_value: i.do_something()
> >
> > One possibility of course is to use a generator-expression but that makes it
> > look like there are two for loops and it feels like a waste setting up a
> > generator just for filtering.
> >
> > for i in (i for i in items if some_field == some_value):
> >         i.do_something()
> >
> > Stupid idea? Am I missing some obviously better way of achieving the same
> > result?
>
> It's been discussed already. Essentially, all that saves is a newline
> or two, which, as I think has been generally accepted, tends to hurt
> readability.

The exact same argument could be used for list comprehensions themselves.
They exist anyway, creating inconsistency in the language (being almost
identical to for loops regarding syntax).

Vitor

From musiccomposition at gmail.com Fri Oct 3 23:26:19 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Fri, 3 Oct 2008 16:26:19 -0500
Subject: [Python-Dev] for __future__ import planning
Message-ID: <1afaf6160810031426n21514e81ma213b084aff20648@mail.gmail.com>

So now that we've released 2.6 and are working hard on shepherding 3.0
out the door, it's time to worry about the next set of releases. :)

I propose that we dramatically shorten our release cycle for 2.7/3.1
to roughly a year and put a strong focus on stabilizing all the new
goodies we included in the last release(s). In the 3.x branch, we
should continue to solidify the new code and features that were
introduced. One of 2.7's main objectives should be binding 3.x and 2.x
ever closer.

--
Cheers,
Benjamin Peterson
"There's nothing quite as beautiful as an oboe... except a chicken
stuck in a vacuum cleaner."

From lists at cheimes.de Sat Oct 4 00:00:40 2008
From: lists at cheimes.de (Christian Heimes)
Date: Sat, 04 Oct 2008 00:00:40 +0200
Subject: [Python-Dev] for __future__ import planning
In-Reply-To: <1afaf6160810031426n21514e81ma213b084aff20648@mail.gmail.com>
References: <1afaf6160810031426n21514e81ma213b084aff20648@mail.gmail.com>
Message-ID: <48E69608.2040604@cheimes.de>

Benjamin Peterson wrote:
> I propose that we dramatically shorten our release cycle for 2.7/3.1
> to roughly a year and put a strong focus stabilizing all the new
> goodies we included in the last release(s). In the 3.x branch, we
> should continue to solidify the new code and features that were
> introduced. One 2.7's main objectives should be binding 3.x and 2.x
> ever closer.

Hey! That was my idea! I told you the very same idea on IRC a week ago.
Shame on you! :)

I'm +1 on the proposal. Let's focus on stability and performance for the
next release.

But before we start planning the next release we need to find a way to sync the development.
Soon we will have to apply fixes to up to four (again FOUR) branches: 2.6,
2.7, 3.0 and 3.1. We don't have to merge as much code as we did during
the py3k phase. But it's still lots of work and we need all the
(technical) help we can get.

Christian

From amauryfa at gmail.com Sat Oct 4 00:14:43 2008
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Sat, 4 Oct 2008 00:14:43 +0200
Subject: [Python-Dev] python-checkins seems broken
Message-ID:

Hello,

I consult very regularly (100x a day) the python-checkins and
python-3000-checkins mailing list archives:
http://mail.python.org/pipermail/python-checkins/2008-October/date.html#end
http://mail.python.org/pipermail/python-3000-checkins/2008-October/date.html#end

But they have not received the svn checkins since Friday morning (CEST
timezone). They do display the buildbot failures, however.

I miss these messages; they are for me the best way to keep in sync with
the developments. (I think I have read all the commit diffs for three
years at least.) They are especially important these days, when many
people can work on the same files.

Do other subscribed people receive these commit messages?
Is there a problem with the mailer, or some SVN trigger?

--
Amaury Forgeot d'Arc

From leif.walsh at gmail.com Sat Oct 4 00:24:52 2008
From: leif.walsh at gmail.com (Leif Walsh)
Date: Fri, 3 Oct 2008 18:24:52 -0400
Subject: [Python-Dev] if-syntax for regular for-loops
In-Reply-To: <278E520B-A3D4-4F54-BCA1-49DEF6421B2D@trialcode.com>
References: <652F78A2-2DD3-474E-9599-C7B6A3BA1097@trialcode.com> <278E520B-A3D4-4F54-BCA1-49DEF6421B2D@trialcode.com>
Message-ID:

On Fri, Oct 3, 2008 at 12:33 PM, Andreas Nilsson wrote:
> Thanks for the pointer!
> I don't buy the argument that newlines automagically improves readability
> though. You also get increased nesting suggesting something interesting is
> happening where it isn't and that hurts readability.
> And as Vitor said, all other constructions of the form 'for i in items' can > have if-conditions attached to them, it's really not that far-fetched to > assume that the loop behaves the same way. Consistency good, surprises bad. Yeah, I know what you mean, and I kind of liked the idea of adding the if statement to the for loop (for consistency, if nothing else), but it's been discussed before, and plenty of people have made the same argument. Probably not worth it. -- Cheers, Leif From amauryfa at gmail.com Sat Oct 4 00:56:27 2008 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Sat, 4 Oct 2008 00:56:27 +0200 Subject: [Python-Dev] Suspect intermittent failure in buildbots Message-ID: I've noticed an error that comes up from time to time in python 3.0 buildbots. The error is always similar to this one: Traceback (most recent call last): File "E:\cygwin\home\db3l\buildarea\3.0.bolen-windows\build\lib\test\test_io.py", line 900, in testBasicIO self.assertEquals(f.write("abc"), 3) File "E:\cygwin\home\db3l\buildarea\3.0.bolen-windows\build\lib\io.py", line 1486, in write b = encoder.encode(s) File "E:\cygwin\home\db3l\buildarea\3.0.bolen-windows\build\lib\encodings\ascii.py", line 22, in encode return codecs.ascii_encode(input, self.errors)[0] AttributeError: 'NoneType' object has no attribute 'ascii_encode' The most recent is here: http://www.python.org/dev/buildbot/3.0/AMD64%20W2k8%203.0/builds/843/step-test/0 but this already happened on various buildbots: http://www.google.fr/search?q=%27NoneType%27+object+has+no+attribute+encode+site:mail.python.org/pipermail/python-checkins&filter=0 "x86 XP-4 3.0", "amd64 gentoo 3.0", "AMD64 W2k8 3.0", "x86 W2k8 3.0", "g4 osx.4 3.0", "OS X x86 3.0" "x86 XP-3 trunk" yes, even on trunk! Every time, a "codecs" global module variable has been reset to None, either in a codec module (encoding/ascii.py, encoding/mac_roman.py) or in test_io.py. 
Every time, io.py is not far (which may be normal, it must have the larger usage of encodings written in a .py) I know that modules globals are reset to None on interpreter shutdown, but it does not seem to be the case here: the unit test fail, and fails again when run in verbose mode at the end. I checked that the "codecs" name is a module global: the disassembler shows a LOAD_GLOBAL opcode followed by LOAD_ATTR: 0 LOAD_GLOBAL 0 (codecs) 3 LOAD_ATTR 1 (ascii_decode) ... I fail to imagine a reason, apart from a creeping memory error (in dictionary lookup; chilling idea). Thoughts? -- Amaury Forgeot d'Arc From barry at python.org Sat Oct 4 00:56:29 2008 From: barry at python.org (Barry Warsaw) Date: Fri, 3 Oct 2008 18:56:29 -0400 Subject: [Python-Dev] for __future__ import planning In-Reply-To: <1afaf6160810031426n21514e81ma213b084aff20648@mail.gmail.com> References: <1afaf6160810031426n21514e81ma213b084aff20648@mail.gmail.com> Message-ID: <3DDCFDD1-52DB-487D-AEB4-758CF868945D@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Oct 3, 2008, at 5:26 PM, Benjamin Peterson wrote: > So now that we've released 2.6 and are working hard on shepherding 3.0 > out the door, it's time to worry about the next set of releases. :) > > I propose that we dramatically shorten our release cycle for 2.7/3.1 > to roughly a year and put a strong focus stabilizing all the new > goodies we included in the last release(s). In the 3.x branch, we > should continue to solidify the new code and features that were > introduced. One 2.7's main objectives should be binding 3.x and 2.x > ever closer. There are several things that I would like to see us concentrate on after the 3.0 release. I agree that 3.1 should be primarily a stabilizing release. I suspect that we will find a lot of things that need tweaking only after 3.0 final has been out there for a while. I think 2.7 should continue along the path of convergence toward 3.x. 
The vision some of us talked about at Pycon was that at some point down
the line, maybe there's no difference between "python2.9 -3" and
"python3.3 -2".

I would really like to see us adopt a distributed version control system.

I want our maintenance branches to always be in a releasable state. I
want to be confident enough about the tree to be able to cut a point
release at any time. I want to release a new point release from the
maint branches once a month.

Christian rightly points out that with four active trees, we're going to
have a pretty big challenge on our hands. How do other large open source
projects handle similar situations?

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iQCVAwUBSOajHXEjvBPtnXfVAQJ5qgP+I6k+kHMG2zPTvMIstM2wRmhtAPd7kKz9
S6bXllUBzpxQGMYfqR3Ze5/SVUMEV2HvINPDfg816sGOoxs0fMeori398rU97bkH
tOFHOEi/KLKMdgGdjGWWnV+iPEGF6stPMX/6nGQDhM5NMzj81hBgF+7U+UNbS7iM
dT2wk3vSZHQ=
=q4kW
-----END PGP SIGNATURE-----

From guido at python.org Sat Oct 4 01:02:52 2008
From: guido at python.org (Guido van Rossum)
Date: Fri, 3 Oct 2008 16:02:52 -0700
Subject: [Python-Dev] Suspect intermittent failure in buildbots
In-Reply-To:
References:
Message-ID:

Module globals are also reset when the module *object* is
garbage-collected (e.g. it's removed from sys.modules and not
referenced elsewhere), but the module *dict* is still referenced. This
can happen if all uses of the module are of the form "from <module>
import <name>" where <name> is a class or function, and at least
one of the users survives the garbage-collected module.

I suspect that something is resetting part of sys.modules content. It
is a known problem (in some parts) that encodings modules cannot be
reset that way.

I suspect that there is code in the regrtest.py framework that does
this (resetting part of sys.modules) in order to restore a clean
environment in some cases. Or perhaps it's one of the tests that does
this.
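
As an illustration only (this is not code from the thread), the situation
described above can be sketched as follows: a function is taken out of a
module, the module object is dropped from sys.modules, and the function is
then called. The module name throwaway_demo and the functions make_module
and call_after_module_is_gone are hypothetical. On the 2.x and 3.0
interpreters under discussion the global would be expected to read back as
None; as far as I know, later CPython releases changed module finalization,
so the original value may survive there.

```python
import gc
import sys
import types


def make_module():
    """Build a throwaway module whose function reads a module-level global."""
    mod = types.ModuleType("throwaway_demo")  # hypothetical module name
    source = (
        "helper = 'still here'\n"
        "def read_helper():\n"
        "    return helper\n"
    )
    # Run the source with the module's dict as its globals.
    exec(source, mod.__dict__)
    sys.modules[mod.__name__] = mod
    return mod


def call_after_module_is_gone():
    """Keep only a function from the module, drop the module, call the function."""
    mod = make_module()
    func = mod.read_helper          # like "from throwaway_demo import read_helper"
    del sys.modules[mod.__name__]   # something resets part of sys.modules
    del mod                         # last reference to the module *object*
    gc.collect()
    # func.__globals__ is the module *dict*, which is still alive here.
    return func()


result = call_after_module_is_gone()
# Assumption: None on interpreters that clear the dict when the module
# object dies, 'still here' on interpreters that leave the dict alone.
print(repr(result))
```

If the result is None, any attribute access on such a global fails the way
the buildbot traceback does ('NoneType' object has no attribute ...).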
On Fri, Oct 3, 2008 at 3:56 PM, Amaury Forgeot d'Arc wrote: > I've noticed an error that comes up from time to time in python 3.0 buildbots. > The error is always similar to this one: > > Traceback (most recent call last): > File "E:\cygwin\home\db3l\buildarea\3.0.bolen-windows\build\lib\test\test_io.py", > line 900, in testBasicIO > self.assertEquals(f.write("abc"), 3) > File "E:\cygwin\home\db3l\buildarea\3.0.bolen-windows\build\lib\io.py", > line 1486, in write > b = encoder.encode(s) > File "E:\cygwin\home\db3l\buildarea\3.0.bolen-windows\build\lib\encodings\ascii.py", > line 22, in encode > return codecs.ascii_encode(input, self.errors)[0] > AttributeError: 'NoneType' object has no attribute 'ascii_encode' > > The most recent is here: > http://www.python.org/dev/buildbot/3.0/AMD64%20W2k8%203.0/builds/843/step-test/0 > > but this already happened on various buildbots: > http://www.google.fr/search?q=%27NoneType%27+object+has+no+attribute+encode+site:mail.python.org/pipermail/python-checkins&filter=0 > > "x86 XP-4 3.0", "amd64 gentoo 3.0", "AMD64 W2k8 3.0", "x86 W2k8 3.0", > "g4 osx.4 3.0", "OS X x86 3.0" > "x86 XP-3 trunk" > > yes, even on trunk! > Every time, a "codecs" global module variable has been reset to None, > either in a codec module (encoding/ascii.py, encoding/mac_roman.py) or > in test_io.py. > Every time, io.py is not far (which may be normal, it must have the > larger usage of encodings written in a .py) > > I know that modules globals are reset to None on interpreter shutdown, > but it does not seem to be the case here: the unit test fail, and > fails again when run in verbose mode at the end. > > I checked that the "codecs" name is a module global: the disassembler > shows a LOAD_GLOBAL opcode followed by LOAD_ATTR: > 0 LOAD_GLOBAL 0 (codecs) > 3 LOAD_ATTR 1 (ascii_decode) > ... > > I fail to imagine a reason, apart from a creeping memory error (in > dictionary lookup; chilling idea). > Thoughts? 
> > -- > Amaury Forgeot d'Arc > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From brett at python.org Sat Oct 4 01:34:15 2008 From: brett at python.org (Brett Cannon) Date: Fri, 3 Oct 2008 16:34:15 -0700 Subject: [Python-Dev] for __future__ import planning In-Reply-To: <3DDCFDD1-52DB-487D-AEB4-758CF868945D@python.org> References: <1afaf6160810031426n21514e81ma213b084aff20648@mail.gmail.com> <3DDCFDD1-52DB-487D-AEB4-758CF868945D@python.org> Message-ID: On Fri, Oct 3, 2008 at 3:56 PM, Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Oct 3, 2008, at 5:26 PM, Benjamin Peterson wrote: > >> So now that we've released 2.6 and are working hard on shepherding 3.0 >> out the door, it's time to worry about the next set of releases. :) >> >> I propose that we dramatically shorten our release cycle for 2.7/3.1 >> to roughly a year and put a strong focus stabilizing all the new >> goodies we included in the last release(s). In the 3.x branch, we >> should continue to solidify the new code and features that were >> introduced. One 2.7's main objectives should be binding 3.x and 2.x >> ever closer. > > There are several things that I would like to see us concentrate on after > the 3.0 release. I agree that 3.1 should be primarily a stabilizing > release. I suspect that we will find a lot of things that need tweaking > only after 3.0 final has been out there for a while. > > I think 2.7 should continue along the path of convergence toward 3.x. The > vision some of us talked about at Pycon was that at some point down the > line, maybe there's no difference between "python2.9 -3" and "python3.3 -2". > +1 from me. 
I think 2.7/3.1 should be used as a chance to get our testing framework straightened out and have those releases be extremely rock-solid (especially 2.7 as it might be the last in the 2.x series). Oh, and getting import rewritten in pure Python for 3.1 of course. =)

> I would really like to see us adopt a distributed version control system.
>

Along the lines of making 2.7/3.1 very stable releases, I would love to use the time to clean up our workflow. To me that means cleaning up the workflow on the issue tracker and getting on to a DVCS to make it as easy as possible for people to contribute patches and for us to do reviews.

> I want our maintenance branches to always be in a releasable state. I want
> to be confident enough about the tree to be able to cut a point release at
> any time. I want to release a new point release from the maint branches
> once a month.
>

Wow! I guess release.py is going to get really automated then. =) That or you are going to manage to con more of us to help out (and even cut the release ourselves).

> Christian rightly points out that with four active trees, we're going to have a
> pretty big challenge on our hands. How do other large open source projects
> handle similar situations?
>

Beats me. Are that many projects crazy enough to have that many active branches?

-Brett

From greg.ewing at canterbury.ac.nz Sat Oct 4 01:34:10 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 04 Oct 2008 11:34:10 +1200
Subject: [Python-Dev] if-syntax for regular for-loops
In-Reply-To: <239314.61318.qm@web54401.mail.yahoo.com>
References: <239314.61318.qm@web54401.mail.yahoo.com>
Message-ID: <48E6ABF2.7050205@canterbury.ac.nz>

Vitor Bosshard wrote:
>
>>On Fri, Oct 3, 2008 at 6:10 AM, Andreas Nilsson wrote:
>>Essentially, all that saves is a newline
>>or two, which, as I think has been generally accepted, tends to hurt
>>readability.
>
> The exact same argument could be used for list comprehensions themselves.
No, an LC saves more than newlines -- it saves the code to set up and append to a list. This is a substantial improvement when this code would otherwise swamp the essentials of what's being done.

This doesn't apply to a plain for-loop that's not building a list.

--
Greg

From eric at trueblade.com Sat Oct 4 01:43:00 2008
From: eric at trueblade.com (Eric Smith)
Date: Fri, 03 Oct 2008 19:43:00 -0400
Subject: [Python-Dev] for __future__ import planning
In-Reply-To:
References: <1afaf6160810031426n21514e81ma213b084aff20648@mail.gmail.com> <3DDCFDD1-52DB-487D-AEB4-758CF868945D@python.org>
Message-ID: <48E6AE04.3010302@trueblade.com>

Brett Cannon wrote:
>> Christian rightly points out that with four active trees, we're going to have a
>> pretty big challenge on our hands. How do other large open source projects
>> handle similar situations?
>>
>
> Beats me. Are that many projects crazy enough to have that many active branches?

Is it really that bad? Once 3.0 is released, it's not like we're going to be patching 2.6 and 3.0 all that much. All the "real development" (by which I mean most of the checkins) will be on 2.7 and 3.1.

The biggest challenge I see is the buildbots.

Eric.

From amk at amk.ca Sat Oct 4 02:17:40 2008
From: amk at amk.ca (A.M. Kuchling)
Date: Fri, 3 Oct 2008 20:17:40 -0400
Subject: [Python-Dev] [Python-checkins] python-checkins seems broken
In-Reply-To:
References:
Message-ID: <20081004001740.GA20814@amk.local>

On Sat, Oct 04, 2008 at 12:14:43AM +0200, Amaury Forgeot d'Arc wrote:
> Do other subscribed people receive these commit messages?
> Is there a problem with the mailer, or some SVN trigger?
It looks like mail from dinsdale.python.org to mail.python.org isn't working due to a DNS issue: rcpt to: amk at python.org 550 5.7.1 Client host rejected: cannot find your reverse hostname, [82.94.164.164] data I know there's a transition to new IP addresses going on for the python.org machines, but Thomas or Sean probably needs to do something with the DNS for this. --amk From solipsis at pitrou.net Sat Oct 4 02:18:17 2008 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 4 Oct 2008 00:18:17 +0000 (UTC) Subject: [Python-Dev] =?utf-8?q?for_=5F=5Ffuture=5F=5F_import_planning?= References: <1afaf6160810031426n21514e81ma213b084aff20648@mail.gmail.com> <3DDCFDD1-52DB-487D-AEB4-758CF868945D@python.org> Message-ID: Brett Cannon python.org> writes: > > Beats me. Are that many projects crazy enough to have that many active > branches? Any project using branch-driven development has many active branches. Our specificity is that we must maintain in sync two branches (trunk, py3k) which have widely diverged from each other. Thus, merges are often non-trivial. Regards Antoine. From ncoghlan at gmail.com Sat Oct 4 04:26:30 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 04 Oct 2008 12:26:30 +1000 Subject: [Python-Dev] if-syntax for regular for-loops In-Reply-To: <48E6ABF2.7050205@canterbury.ac.nz> References: <239314.61318.qm@web54401.mail.yahoo.com> <48E6ABF2.7050205@canterbury.ac.nz> Message-ID: <48E6D456.6080004@gmail.com> Greg Ewing wrote: > Vitor Bosshard wrote: >> The exact same argument could be used for list comprehensions themselves. > No, an LC saves more than newlines -- it saves the code > to set up and append to a list. This is a substantial > improvement when this code would otherwise swamp the > essentials of what's being done. > > This doesn't apply to a plain for-loop that's not > building a list. 
Not only do LCs make it obvious to the reader that "all this loop does is build a list", but the speed increases from doing the iteration in native code rather than pure Python are also non-trivial - every pass through the main eval loop that can be safely avoided leads to a fairly substantial time saving.

Generally speaking, syntactic sugar (or new builtins) need to take a construct in idiomatic Python that is fairly obvious to an experienced Python user and make it obvious to even new users, or else take an idiom that is easy to get wrong when writing (or miss when reading) and make it trivial to use correctly. Providing significant performance improvements (usually in the form of reduced memory usage or increased speed) also counts heavily in favour of new constructs.

I strongly suggest browsing through past PEPs (both accepted and rejected ones) before proposing syntax changes, but here are some examples of syntactic sugar proposals that were accepted.

List/set/dict comprehensions
============================
(and the reduction builtins any(), all(), min(), max(), sum())

    target = [op(x) for x in source]

instead of:

    target = []
    for x in source:
        target.append(op(x))

The transformation ("op(x)") is far more prominent in the comprehension version, as is the fact that all the loop does is produce a new list. I include the various reduction builtins here, since they serve exactly the same purpose of taking an idiomatic looping construct and turning it into a single expression.
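A minimal runnable check of the equivalence described above, using a hypothetical op() and source list that are not taken from the thread. It builds the same list both ways, then pairs one reduction builtin with the explicit loop it replaces.

```python
# Hypothetical transformation and input, chosen only for illustration.
def op(x):
    return x * 2

source = [1, 2, 3, 4]

# Comprehension form: the transformation is the first thing the reader sees.
target_lc = [op(x) for x in source]

# Loop form: the same result, with the list setup and append spelled out.
target_loop = []
for x in source:
    target_loop.append(op(x))

# Reduction builtin: a single expression ...
any_large = any(op(x) > 6 for x in source)

# ... versus the looping idiom it stands in for.
found = False
for x in source:
    if op(x) > 6:
        found = True
        break
```

Both pairs produce identical results; the difference is purely in how much bookkeeping surrounds the operation being applied.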
Generator expressions
=====================

    total = sum(x*x for x in source)

instead of:

    def _g(source):
        for x in source:
            yield x*x
    total = sum(_g(source))

or:

    total = sum([x*x for x in source])

Here, the GE version has obvious readability gains over the generator function version (as with comprehensions, it brings the operation being applied to each element front and centre instead of burying it in the middle of the code, as well as allowing reduction operations like sum() to retain their prominence), but doesn't actually improve readability significantly over the second LC-based version. The gain over the latter, of course, is that the GE based version needs a lot less *memory* than the LC version, and, as it consumes the source data incrementally, can work on source iterators of arbitrary (even infinite) length, and can also cope with source iterators with large time gaps between items (e.g. reading from a socket) as each item will be returned as it becomes available.

With statements
===============

    with lock:
        # perform synchronised operations

instead of:

    lock.acquire()
    try:
        # perform synchronised operations
    finally:
        lock.release()

This change was a gain for both readability and writability - there were plenty of ways to get this kind of code wrong (e.g. leave out the try-finally altogether, acquire the resource inside the try block instead of before it, call the wrong method or spell the variable name wrong when attempting to release the resource in the finally block), and it wasn't easy to audit because the lock acquisition and release could be separated by an arbitrary number of lines of code. By combining all of that into a single line of code at the beginning of the block, the with statement eliminated a lot of those issues, making the code much easier to write correctly in the first place, and also easier to audit for correctness later (just make sure the code is using the correct context manager for the task at hand).
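The two locking forms above can be compared directly. This is a small sketch assuming a threading.Lock and a hypothetical shared counter; the function names are invented for the example. It also checks the property that matters most, namely that the lock is released even when the protected block raises.

```python
import threading

lock = threading.Lock()
counter = 0  # hypothetical shared state protected by the lock

def increment_with():
    """with-statement form: acquire and release are implied by the block."""
    global counter
    with lock:
        counter += 1

def increment_try_finally():
    """Older form: the acquire/release pairing is written out by hand."""
    global counter
    lock.acquire()
    try:
        counter += 1
    finally:
        lock.release()

def failing():
    """Raise inside the protected block to show the lock is still released."""
    with lock:
        raise ValueError("boom")

increment_with()
increment_try_finally()
released = not lock.locked()

try:
    failing()
except ValueError:
    pass
released_after_error = not lock.locked()
```

With the with statement, the auditing question reduces to "is this the right context manager?", since there is no separate release call to forget or misspell.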
Function decorators
===================

    @classmethod
    def f(cls):
        # Method body

instead of:

    def f(cls):
        # Method body
    f = classmethod(f)

Easier to write (function name only written once instead of three times), and easier to read (decorator names up top with the function signature instead of buried after the function body). Some folks still dislike the use of the @ symbol, but compared to the drawbacks of the old approach, the dedicated function decorator syntax is a huge improvement.

Conditional expressions
=======================

    x = A if C else B

instead of:

    x = C and A or B

The addition of conditional expressions arguably wasn't a particularly big win for readability, but it *was* a big win for correctness. The and/or based workaround for lack of a true conditional expression was not only hard to read if you weren't already familiar with the construct, but using it was also potentially buggy if A could ever be False while C was True (in such a case, B would be returned from the expression instead of A).

Except clause
=============

    except Exception as ex:

instead of:

    except Exception, ex:

Another example of changing the syntax to eliminate potential bugs (in this case, except clauses like "except TypeError, AttributeError:", that would actually never catch AttributeError, and would locally rebind the name AttributeError to the caught exception if a TypeError was caught).

Cheers,
Nick.

P.S. There's a fractionally better argument to be used in favour of allowing an if condition on the for loop header line: it doesn't just save a newline or improve consistency with comprehensions and generator expressions, it saves an *indentation level*. And that gain is exactly the rationale that was used to begin allowing:

    try:
        ...
    except:
        ...
    else:
        ...
    finally:
        ...

instead of requiring the extra indentation level:

    try:
        try:
            ...
        except:
            ...
        else:
            ...
    finally:
        ...
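For readers who want to confirm that the two try-statement layouts above really are interchangeable, here is a small sketch. The function names and the recorded event strings are hypothetical; each function logs which clauses run, once with an exception raised and once without.

```python
def unified(fail):
    """Single try statement with except, else and finally clauses."""
    events = []
    try:
        if fail:
            raise ValueError("boom")
    except ValueError:
        events.append("except")
    else:
        events.append("else")
    finally:
        events.append("finally")
    return events

def nested(fail):
    """Older layout: try/except/else wrapped in a separate try/finally."""
    events = []
    try:
        try:
            if fail:
                raise ValueError("boom")
        except ValueError:
            events.append("except")
        else:
            events.append("else")
    finally:
        events.append("finally")
    return events

with_error = (unified(True), nested(True))
without_error = (unified(False), nested(False))
```

In both cases the two layouts record the same sequence of clauses, so the only thing the unified form changes is the indentation depth.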
However, even that argument is greatly weakened in the for/if case by the fact that the indentation level is being saved by moving the if condition up and to the right after the for loop details, whereas in the try-statement case there were absolutely no downsides (the redundant try keyword was simply dropped entirely). So I'm personally still -1 when it comes to incorporating an if clause directly into the for loop syntax - it's only necessary in the GE/LC case due to the fact that those don't support statement-based nesting. (Tangent: the above two try/except examples are perfectly legal Py3k code. Do we really need the "pass" statement anymore?) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From steve at pearwood.info Sat Oct 4 05:08:30 2008 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 4 Oct 2008 13:08:30 +1000 Subject: [Python-Dev] if-syntax for regular for-loops In-Reply-To: <48E6D456.6080004@gmail.com> References: <239314.61318.qm@web54401.mail.yahoo.com> <48E6ABF2.7050205@canterbury.ac.nz> <48E6D456.6080004@gmail.com> Message-ID: <200810041308.30800.steve@pearwood.info> On Sat, 4 Oct 2008 12:26:30 pm Nick Coghlan wrote: > (Tangent: the above two try/except examples are perfectly legal Py3k > code. Do we really need the "pass" statement anymore?) I can't imagine why you would think we don't need the pass statement. I often use it: * For subclassing exceptions: class MyTypeError(TypeError): pass * As a placeholder for code I haven't written yet. * As a no-op used in, e.g. the timeit module. And probably a few other places as well. -- Steven From greg at krypto.org Sat Oct 4 08:19:06 2008 From: greg at krypto.org (Gregory P. 
Smith) Date: Fri, 3 Oct 2008 23:19:06 -0700 Subject: [Python-Dev] Real segmentation fault handler In-Reply-To: References: <200809300105.53473.victor.stinner@haypocalc.com> Message-ID: <52dc1c820810032319k2c67a7d1o52604b3d772bca1@mail.gmail.com> On Thu, Oct 2, 2008 at 10:54 AM, Thomas Heller wrote: > Victor Stinner schrieb: >> Hi, >> >> I would like to be able to catch SIGSEGV in my Python code! So I started to >> hack Python trunk to support this feature. The idea is to use a signal >> handler which call longjmp(), and add setjmp() at Py_EvalFrameEx() enter. > > On windows, ctypes catches fatal errors (exception violations) in > foreign function calls, thanks to windows structured exception handling. > > On other platforms, there is the WAD module by David Beazley which > may do something similar: > > http://www.dabeaz.com/papers/Python2001/python.html > > I do not know whether the code itself is still available or not. It appears to be here: http://web.archive.org/web/20030113032725/systems.cs.uchicago.edu/wad/ It may need a bit of attention to get it to work today, I haven't tried. -gps From martin at v.loewis.de Sat Oct 4 09:22:37 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 04 Oct 2008 09:22:37 +0200 Subject: [Python-Dev] for __future__ import planning In-Reply-To: <48E6AE04.3010302@trueblade.com> References: <1afaf6160810031426n21514e81ma213b084aff20648@mail.gmail.com> <3DDCFDD1-52DB-487D-AEB4-758CF868945D@python.org>