From tim.one@home.com Sun Jul 1 02:58:29 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 30 Jun 2001 21:58:29 -0400 Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <3B3E4487.40054EAE@ActiveState.com> Message-ID: [Paul Prescod] > "The Energy is the mass of the object times the speed of light times > two." [David Ascher] > Actually, it's "squared", not times two. At least in my universe =) This is something for Guido to Pronounce on, then. Who's going to write the PEP? The threat of nuclear war seems almost laughable in Paul's universe, so it's certainly got attractions. OTOH, it's got to be a lot colder too. energy-will-do-what-guido-tells-it-to-do-ly y'rs - tim From paulp@ActiveState.com Sun Jul 1 04:59:02 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sat, 30 Jun 2001 20:59:02 -0700 Subject: [Python-Dev] Support for "wide" Unicode characters References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> Message-ID: <3B3EA006.14882609@ActiveState.com> David Ascher wrote: > > > "The Energy is the mass of the object times the speed of light times > > two." > > Actually, it's "squared", not times two. At least in my universe =) Pedant. Next you're going to claim that these silly equations effect my life somehow. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From paulp@ActiveState.com Sun Jul 1 05:04:49 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sat, 30 Jun 2001 21:04:49 -0700 Subject: [Python-Dev] Support for "wide" Unicode characters References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> Message-ID: <3B3EA161.1375F74C@ActiveState.com> "M.-A. Lemburg" wrote: > >... > > The term "character" in Python should really only be used for > the 8-bit strings. 
Are we going to change chr() and unichr() to one_element_string() and unicode_one_element_string() u[i] is a character. If u is Unicode, then u[i] is a Python Unicode character. No Python user will find that confusing no matter how Unicode knuckle-dragging, mouth-breathing, wife-by-hair-dragging they are. > In Unicode a "character" can mean any of: Mark Davis said that "people" can use the word to mean any of those things. He did not say that it was imprecisely defined in Unicode. Nevertheless I'm not using the Unicode definition anymore than our standard library uses an ancient Greek definition of integer. Python has a concept of integer and a concept of character. > > It has been proposed that there should be a module for working > > with UTF-16 strings in narrow Python builds through some sort of > > abstraction that handles surrogates for you. If someone wants > > to implement that, it will be another PEP. > > Uhm, narrow builds don't support UTF-16... it's UCS-2 which > is supported (basically: store everything in range(0x10000)); > the codecs can map code points to surrogates, but it is solely > their responsibility and the responsibility of the application > using them to take care of dealing with surrogates. The user can view the data as UCS-2, UTF-16, Base64, ROT-13, XML, .... Just as we have a base64 module, we could have a UTF-16 module that interprets the data in the string as UTF-16 and does surrogate manipulation for you. Anyhow, if any of those is the "real" encoding of the data, it is UTF-16. After all, if the codec reads in four non-BMP characters in, let's say, UTF-8, we represent them as 8 narrow-build Python characters. That's the definition of UTF-16! But it's easy enough for me to take that word out so I will. >... > Also, the module will be useful for both narrow and wide builds, > since the notion of an encoded character can involve multiple code > points. 
In that sense Unicode is always a variable length > encoding for characters and that's the application field of > this module. I wouldn't advise that you do all different types of normalization in a single module but I'll wait for your PEP. > Here's the adjusted text: > > It has been proposed that there should be a module for working > with Unicode objects using character-, word- and line- based > indexing. The details of the implementation is left to > another PEP. It has been proposed that there should be a module that handles surrogates in narrow Python builds for programmers. If someone wants to implement that, it will be another PEP. It might also be combined with features that allow other kinds of character-, word- and line- based indexing. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From DavidA@ActiveState.com Sun Jul 1 07:09:40 2001 From: DavidA@ActiveState.com (David Ascher) Date: Sat, 30 Jun 2001 23:09:40 -0700 Subject: [Python-Dev] Support for "wide" Unicode characters References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> Message-ID: <3B3EBEA4.3EC84EAF@ActiveState.com> Paul Prescod wrote: > > David Ascher wrote: > > > > > "The Energy is the mass of the object times the speed of light times > > > two." > > > > Actually, it's "squared", not times two. At least in my universe =) > > Pedant. Next you're going to claim that these silly equations effect my > life somehow. Although one stretch the argument to say that the equations _effect_ your life, I'd limit the claim to stating that they _affect_ your life. 
pedantly y'rs, --dr david From paulp@ActiveState.com Sun Jul 1 07:15:46 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sat, 30 Jun 2001 23:15:46 -0700 Subject: [Python-Dev] Support for "wide" Unicode characters References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com> Message-ID: <3B3EC012.A3A05E64@ActiveState.com> David Ascher wrote: > > Paul Prescod wrote: > > > > David Ascher wrote: > > > > > > > "The Energy is the mass of the object times the speed of light times > > > > two." > > > > > > Actually, it's "squared", not times two. At least in my universe =) > > > > Pedant. Next you're going to claim that these silly equations effect my > > life somehow. > > Although one stretch the argument to say that the equations _effect_ ^ might ----- > your life, I'd limit the claim to stating that they _affect_ your life. And you just bought such a shiny, new glass, house. Pity. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From nhodgson@bigpond.net.au Sun Jul 1 14:00:15 2001 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Sun, 1 Jul 2001 23:00:15 +1000 Subject: [Python-Dev] Support for "wide" Unicode characters References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> Message-ID: <00dd01c1022d$c61e4160$0acc8490@neil> Paul Prescod: The problem I have with this PEP is that it is a compile time option which makes it hard to work with both 32 bit and 16 bit strings in one program. Can not the 32 bit string type be introduced as an additional type? > Are we going to change chr() and unichr() to one_element_string() and > unicode_one_element_string() > > u[i] is a character. If u is Unicode, then u[i] is a Python Unicode > character. 
This wasn't usefully true in the past for DBCS strings and is not the right way to think of either narrow or wide strings now. The idea that strings are arrays of characters gets in the way of dealing with many encodings and is the primary difficulty in localising software for Japanese. Iteration through the code units in a string is a problem waiting to bite you and string APIs should encourage behaviour which is correct when faced with variable width characters, both DBCS and UTF style. Iteration over variable width characters should be performed in a way that preserves the integrity of the characters. M.-A. Lemburg's proposed set of iterators could be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to provide for the various intended string uses such as "for c in s.inVisualOrder()" reversing the receipt of right-to-left substrings. Neil From guido@digicool.com Sun Jul 1 14:44:29 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 01 Jul 2001 09:44:29 -0400 Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: Your message of "Sun, 01 Jul 2001 23:00:15 +1000." <00dd01c1022d$c61e4160$0acc8490@neil> References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> <00dd01c1022d$c61e4160$0acc8490@neil> Message-ID: <200107011344.f61DiTM03548@odiug.digicool.com> > > > The problem I have with this PEP is that it is a compile time option > which makes it hard to work with both 32 bit and 16 bit strings in one > program. Can not the 32 bit string type be introduced as an additional type? Not without an outrageous amount of additional coding (every place in the code that currently uses PyUnicode_Check() would have to be bifurcated in a 16-bit and a 32-bit variant). 
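(The compile-time choice Guido describes was observable from Python code via sys.maxunicode; a minimal sketch, noting that since Python 3.3 and PEP 393 the narrow/wide distinction is gone and every build reports the full range:)

```python
import sys

# sys.maxunicode reflects the build option PEP 261 introduced:
# 0xFFFF on a "narrow" (UCS-2) build, 0x10FFFF on a "wide" (UCS-4)
# build.  Python 3.3+ always reports 0x10FFFF.
if sys.maxunicode == 0xFFFF:
    print("narrow build: non-BMP characters are stored as surrogate pairs")
else:
    print("wide build (or Python 3.3+): the full code-point range is addressable")
```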
I doubt that the desire to work with both 16- and 32-bit characters in one program is typical for folks using Unicode -- that's mostly limited to folks writing conversion tools. Python will offer the necessary codecs so you shouldn't have this need very often. You can use the array module to manipulate 16- and 32-bit arrays, and you can use the various Unicode encodings to do the necessary encodings. > > u[i] is a character. If u is Unicode, then u[i] is a Python Unicode > > character. > > This wasn't usefully true in the past for DBCS strings and is not the > right way to think of either narrow or wide strings now. The idea that > strings are arrays of characters gets in the way of dealing with many > encodings and is the primary difficulty in localising software for Japanese. Can you explain the kind of problems encountered in some more detail? > Iteration through the code units in a string is a problem waiting to bite > you and string APIs should encourage behaviour which is correct when faced > with variable width characters, both DBCS and UTF style. But this is not the Unicode philosophy. All the variable-length character manipulation is supposed to be taken care of by the codecs, and then the application can deal in arrays of characters. Alternatively, the application can deal in opaque objects representing variable-length encodings, but then it should be very careful with concatenation and even more so with slicing. > Iteration over > variable width characters should be performed in a way that preserves the > integrity of the characters. M.-A. Lemburg's proposed set of iterators could > be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to > provide for the various intended string uses such as "for c in > s.inVisualOrder()" reversing the receipt of right-to-left substrings. I think it's a good idea to provide a set of higher-level tools as well. However nobody seems to know what these higher-level tools should do yet. 
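(Earlier in this message Guido suggests the array module plus the codecs for working with raw code units; a sketch of that route on a modern Python, where the utf-16-le codec produces the same surrogate pair a narrow build would have stored internally:)

```python
import array
import sys

# View the 16-bit code units behind a non-BMP character by encoding with
# a UTF-16 codec and reinterpreting the bytes as an array of 16-bit ints.
s = "\U00010400"                          # DESERET CAPITAL LETTER LONG I
units = array.array("H", s.encode("utf-16-le"))
if sys.byteorder == "big":
    units.byteswap()                      # type code 'H' uses native byte order
assert list(units) == [0xD801, 0xDC00]    # the surrogate pair for U+10400
```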
PEP 261 is specifically focused on getting the lower-level foundations right (i.e. the objects that represent arrays of code units), so that the authors of higher level tools will have a solid base. If you want to help author a PEP for such higher-level tools, you're welcome! --Guido van Rossum (home page: http://www.python.org/~guido/) From loewis@informatik.hu-berlin.de Sun Jul 1 14:52:58 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Sun, 1 Jul 2001 15:52:58 +0200 (MEST) Subject: [Python-Dev] Support for "wide" Unicode characters Message-ID: <200107011352.PAA27645@pandora.informatik.hu-berlin.de> > The problem I have with this PEP is that it is a compile time option > which makes it hard to work with both 32 bit and 16 bit strings in > one program. Can you elaborate why you think this is a problem? > Can not the 32 bit string type be introduced as an additional type? Yes, but not just "like that". You'd have to define an API for creating values of this type, you'd have to teach all functions which ought to accept it to process it, you'd have to define conversion operations and all that: In short, you'd have to go through all the trouble that introduction of the Unicode type gave us once again. Also, I cannot see any advantages in introducing yet another type. Implementing this PEP is straight forward, and with almost no visible effect to Python programs. People have suggested to make it a run-time decision, having the internal representation switch on demand, but that would give an API nightmare for C code that has to access such values. > u[i] is a character. If u is Unicode, then u[i] is a Python Unicode > character. > This wasn't usefully true in the past for DBCS strings and is not the > right way to think of either narrow or wide strings now. The idea > that strings are arrays of characters gets in the way of dealing > with many encodings and is the primary difficulty in localising > software for Japanese. 
While I don't know much about localising software for Japanese (*), I agree that 'u[i] is a character' isn't useful to say in many cases. If this is the old Python string type, I'd much prefer calling u[i] a 'byte'. Regards, Martin (*) Methinks that the primary difficulty still is translating all the documentation, and messages. Actually, keeping the translations up-to-date is even more challenging. From aahz@rahul.net Sun Jul 1 15:19:41 2001 From: aahz@rahul.net (Aahz Maruch) Date: Sun, 1 Jul 2001 07:19:41 -0700 (PDT) Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <3B3EC012.A3A05E64@ActiveState.com> from "Paul Prescod" at Jun 30, 2001 11:15:46 PM Message-ID: <20010701141941.A323099C80@waltz.rahul.net> Paul Prescod wrote: > David Ascher wrote: >> Paul Prescod wrote: >>> David Ascher wrote: >>>>> >>>>> "The Energy is the mass of the object times the speed of light times >>>>> two." >>>> >>>> Actually, it's "squared", not times two. At least in my universe =) >>> >>> Pedant. Next you're going to claim that these silly equations effect my >>> life somehow. >> >> Although one stretch the argument to say that the equations _effect_ > ^ > might ----- > >> your life, I'd limit the claim to stating that they _affect_ your life. > > And you just bought such a shiny, new glass, house. Pity. All speeling falmes contain at least one erorr. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. 
From just@letterror.com Sun Jul 1 15:43:08 2001 From: just@letterror.com (Just van Rossum) Date: Sun, 1 Jul 2001 16:43:08 +0200 Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <200107011344.f61DiTM03548@odiug.digicool.com> Message-ID: <20010701164315-r01010600-c2d5b07d@213.84.27.177> Guido van Rossum wrote: > > > > > > The problem I have with this PEP is that it is a compile time option > > which makes it hard to work with both 32 bit and 16 bit strings in one > > program. Can not the 32 bit string type be introduced as an additional type? > > Not without an outrageous amount of additional coding (every place in > the code that currently uses PyUnicode_Check() would have to be > bifurcated in a 16-bit and a 32-bit variant). Alternatively, a Unicode object could *internally* be either 8, 16 or 32 bits wide (to be clear: not per character, but per string). Also a lot of work, but it'll be a lot less wasteful. > I doubt that the desire to work with both 16- and 32-bit characters in > one program is typical for folks using Unicode -- that's mostly > limited to folks writing conversion tools. Python will offer the > necessary codecs so you shouldn't have this need very often. Not a lot of people will want to work with 16 or 32 bit chars directly, but I think a less wasteful solution to the surrogate pair problem *will* be desired by people. Why use 32 bits for all strings in a program when only a tiny percentage actually *needs* more than 16? (Or even 8...) > > Iteration through the code units in a string is a problem waiting to bite > > you and string APIs should encourage behaviour which is correct when faced > > with variable width characters, both DBCS and UTF style. > > But this is not the Unicode philosophy. All the variable-length > character manipulation is supposed to be taken care of by the codecs, > and then the application can deal in arrays of characteres. Right: this is the way it should be. 
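(The philosophy being endorsed here -- codecs absorb the variable-length encoding, the application sees characters -- can be illustrated on a wide build, or any Python 3.3+:)

```python
# U+1D11E (MUSICAL SYMBOL G CLEF) takes four bytes in UTF-8, but once the
# codec has decoded it the application sees a single character.  A narrow
# build would instead have reported a length of 2 (a surrogate pair).
data = b"\xf0\x9d\x84\x9e"                # UTF-8 encoding of U+1D11E
s = data.decode("utf-8")
assert len(s) == 1                        # one character, not four code units
assert ord(s) == 0x1D11E
assert s.encode("utf-8") == data          # the codec round-trips
```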
My difficulty with PEP 261 is that I'm afraid few people will actually enable 32-bit support (*what*?! all unicode strings become 32 bits wide? no way!), therefore making programs non-portable in very subtle ways. Just From DavidA@ActiveState.com Sun Jul 1 18:13:30 2001 From: DavidA@ActiveState.com (David Ascher) Date: Sun, 01 Jul 2001 10:13:30 -0700 Subject: [Python-Dev] Support for "wide" Unicode characters References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com> <3B3EC012.A3A05E64@ActiveState.com> Message-ID: <3B3F5A3A.A88B54B2@ActiveState.com> Paul: > And you just bought such a shiny, new glass, house. Pity. What kind of comma placement is that? --david From paulp@ActiveState.com Sun Jul 1 19:08:10 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 01 Jul 2001 11:08:10 -0700 Subject: [Python-Dev] Support for "wide" Unicode characters References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> <00dd01c1022d$c61e4160$0acc8490@neil> Message-ID: <3B3F670A.B5396D61@ActiveState.com> Neil Hodgson wrote: > > Paul Prescod: > > > The problem I have with this PEP is that it is a compile time option > which makes it hard to work with both 32 bit and 16 bit strings in one > program. Can not the 32 bit string type be introduced as an additional type? The two solutions are not mutually exclusive. If you (or someone) supplies a 32-bit type and Guido accepts it, then the compile option might fall into disuse. But this solution was chosen because it is much less work. Really though, I think that having 16-bit and 32-bit types is extra confusion for very little gain. I would much rather have a single space-efficient type that hid the details of its implementation. 
But nobody has volunteered to code it and Guido might not accept it even if someone did. >... > This wasn't usefully true in the past for DBCS strings and is not the > right way to think of either narrow or wide strings now. The idea that > strings are arrays of characters gets in the way of dealing with many > encodings and is the primary difficulty in localising software for Japanese. The whole benefit of moving to 32-bit character strings is to allow people to think of strings as arrays of characters. Forcing them to consider variable-length encodings is precisely what we are trying to avoid. > Iteration through the code units in a string is a problem waiting to bite > you and string APIs should encourage behaviour which is correct when faced > with variable width characters, both DBCS and UTF style. Iteration over > variable width characters should be performed in a way that preserves the > integrity of the characters. On wide Python builds there is no such thing as variable width Unicode characters. It doesn't make sense to combine two 32-bit characters to get a 64-bit one. On narrow Python builds you might want to treat a surrogate pair as a single character but I would strongly advise against it. If you want wide characters, move to a wide build. Even if a narrow build is more space efficient, you'll lose a ton of performance emulating wide characters in Python code. > ... M.-A. Lemburg's proposed set of iterators could > be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to > provide for the various intended string uses such as "for c in > s.inVisualOrder()" reversing the receipt of right-to-left substrings. A floor wax and a dessert topping. <0.5 wink> I don't think that the average Python programmer would want s.asCharacters('utf-8') when they already have s.decode('utf-8'). We decided a long time ago that the model for standard users would be fixed-length (1!), abstract characters. 
That's the way Python's Unicode subsystem has always worked. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From paulp@ActiveState.com Sun Jul 1 19:19:17 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 01 Jul 2001 11:19:17 -0700 Subject: [Python-Dev] Support for "wide" Unicode characters References: <20010701164315-r01010600-c2d5b07d@213.84.27.177> Message-ID: <3B3F69A5.D7CE539D@ActiveState.com> Just van Rossum wrote: > > Guido van Rossum wrote: > > > > > > > > > > The problem I have with this PEP is that it is a compile time option > > > which makes it hard to work with both 32 bit and 16 bit strings in one > > > program. Can not the 32 bit string type be introduced as an additional type? > > > > Not without an outrageous amount of additional coding (every place in > > the code that currently uses PyUnicode_Check() would have to be > > bifurcated in a 16-bit and a 32-bit variant). > > Alternatively, a Unicode object could *internally* be either 8, 16 or 32 bits > wide (to be clear: not per character, but per string). Also a lot of work, but > it'll be a lot less wasteful. I hope this is where we end up one day. But the compile-time option is better than where we are today. Even though PEP 261 is not my favorite solution, it buys us a couple of years of wait-and-see time. Consider that computer memory is growing much faster than textual data. People's text processing techniques get more and more "wasteful" because it is now almost always possible to load the entire "text" into memory at once. I remember how some text editors used to boast that they only loaded your text "on demand". Maybe so much data will be passed to us from UCS-4 APIs that trying to "compress it" will actually be inefficient. Maybe two years from now Guido will make UCS-4 the default and only a tiny minority will notice or care. > ... > My difficulty with PEP 261 is that I'm afraid few people will actually enable > 32-bit support (*what*?! 
all unicode strings become 32 bits wide? no way!), > therefore making programs non-portable in very subtle ways. It really depends on what the default build option is. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From paulp@ActiveState.com Sun Jul 1 19:22:01 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 01 Jul 2001 11:22:01 -0700 Subject: [Python-Dev] Support for "wide" Unicode characters References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com> <3B3EC012.A3A05E64@ActiveState.com> <3B3F5A3A.A88B54B2@ActiveState.com> Message-ID: <3B3F6A49.6E82B7DE@ActiveState.com> David Ascher wrote: > > Paul: > > And you just bought such a shiny, new glass, house. Pity. > > What kind of comma placement is that? I had to leave you something to complain about; -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From guido@digicool.com Sun Jul 1 19:37:48 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 01 Jul 2001 14:37:48 -0400 Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: Your message of "Sun, 01 Jul 2001 16:43:08 +0200." <20010701164315-r01010600-c2d5b07d@213.84.27.177> References: <20010701164315-r01010600-c2d5b07d@213.84.27.177> Message-ID: <200107011837.f61IbmZ03645@odiug.digicool.com> > Alternatively, a Unicode object could *internally* be either 8, 16 > or 32 bits wide (to be clear: not per character, but per > string). Also a lot of work, but it'll be a lot less wasteful. Depending on what you prefer to waste: developers' time or computer resources. I bet that if you try the measure the wasted space you'll find that it wastes very little compared to all the other overheads in a typical Python program: CPU time compared to writing your code in C, memory overhead for integers, etc. 
It so happened that the Unicode support was written to make it very easy to change the compile-time code unit size; but making this a per-string (or even global) run-time variable is much harder without touching almost every place that uses Unicode (not to mention slowing down the common case). Nobody was enthusiastic about fixing this, so our choice was really between staying with 16 bits or making 32 bits an option for those who need it. > Not a lot of people will want to work with 16 or 32 bit chars > directly, How do you know? There are more Chinese than Americans and Europeans together, and they will soon all have computers. :-) > but I think a less wasteful solution to the surrogate pair > problem *will* be desired by people. Why use 32 bits for all strings > in a program when only a tiny percentage actually *needs* more than > 16? (Or even 8...) So work in UTF-8 -- a lot of work can be done in UTF-8. > > But this is not the Unicode philosophy. All the variable-length > > character manipulation is supposed to be taken care of by the codecs, > > and then the application can deal in arrays of characters. > > Right: this is the way it should be. > > My difficulty with PEP 261 is that I'm afraid few people will > actually enable 32-bit support (*what*?! all unicode strings become > 32 bits wide? no way!), therefore making programs non-portable in > very subtle ways. My hope and expectation is that those folks who need 32-bit support will enable it. If this solution is not sufficient, we may have to provide something else in the future, but given that the implementation effort for PEP 261 was very minimal (certainly less than the time expended in discussing it) I am very happy with it. It will take quite a while until lots of folks will need the 32-bit support (there aren't that many characters defined outside the basic plane yet). In the mean time, those that need the 32-bit support should be happy that we allow them to rebuild Python with 32-bit support. 
In the next 5-10 years, the 32-bit support requirement will become more common -- as will be the memory upgrades to make it painless. It's not like Python is making this decision in a vacuum either: Linux already has 32-bit wchar_t. 32-bit characters will eventually be common (even in Windows, which probably has the largest investment in 16-bit Unicode at the moment of any system). Like IPv6, we're trying to enable uncommon uses of Python without breaking things for the not-so-early adopters. Again, don't see PEP 261 as the ultimate answer to all your 32-bit Unicode questions. Just consider that realistically we have two choices: stick with 16-bit support only or make 32-bit support an option. Other approaches (more surrogate support, run-time choices, transparent variable-length encodings) simply aren't realistic -- no-one has the time to code them. It should be easy to write portable Python programs that work correctly with 16-bit Unicode characters on a "narrow" interpreter and also work correctly with 21-bit Unicode on a "wide" interpreter: just avoid using surrogates. If you *need* to work with surrogates, try to limit yourself to very simple operations like concatenations of valid strings, and splitting strings at known delimiters only. There's a lot you can do with this. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Sun Jul 1 19:52:36 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 1 Jul 2001 14:52:36 -0400 Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <3B3F69A5.D7CE539D@ActiveState.com> Message-ID: [Paul Prescod] > ... > Consider that computer memory is growing much faster than textual data. > People's text processing techniques get more and more "wasteful" because > it is now almost always possible to load the entire "text" into memory > at once. Indeed, the entire text of the Bible fits in a corner of my year-old box's RAM, even at 32 bits per character. 
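(Tim's back-of-the-envelope claim checks out; taking a ballpark figure of roughly four and a half million characters for the Bible -- an assumed figure, for illustration only:)

```python
# Rough size of the Bible at 32 bits (4 bytes) per character.
chars = 4_500_000            # assumed ballpark character count
mib = chars * 4 / 2**20
print(round(mib, 1))         # ~17 MiB: a corner of RAM even on a 2001 box
```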
> I remember how some text editors used to boast that they only loaded > your text "on demand". Well, they still do -- fancy editors use fancy data structures, so that, e.g., inserting characters at the start of the file doesn't cause a 50Mb memmove each time. Response time is still important, but I'd wager relatively insensitive to basic character size (you need tricks that cut factors of 1000s off potential worst cases to give the appearance of instantaneous results; a factor of 2 or 4 is in the noise compared to what's needed regardless). From aahz@rahul.net Sun Jul 1 20:21:26 2001 From: aahz@rahul.net (Aahz Maruch) Date: Sun, 1 Jul 2001 12:21:26 -0700 (PDT) Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <3B3F670A.B5396D61@ActiveState.com> from "Paul Prescod" at Jul 01, 2001 11:08:10 AM Message-ID: <20010701192126.9EB8299C80@waltz.rahul.net> Paul Prescod wrote: > > On wide Python builds there is no such thing as variable width Unicode > characters. It doesn't make sense to combine two 32-bit characters to > get a 64-bit one. On narrow Python builds you might want to treat a > surrogate pair as a single character but I would strongly advise against > it. If you want wide characters, move to a wide build. Even if a narrow > build is more space efficient, you'll lose a ton of performance > emulating wide characters in Python code. This needn't go into the PEP, I think, but I'd like you to say something about what you expect the end result of this PEP to look like under Windows, where "rebuild" isn't really a valid option for most Python users. Are we simply committing to make two builds available? If so, what happens the next time we run into a situation like this? -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. 
From paulp@ActiveState.com Sun Jul 1 20:21:09 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 01 Jul 2001 12:21:09 -0700 Subject: [Python-Dev] Text editors References: Message-ID: <3B3F7825.CA3D1B5B@ActiveState.com> Tim Peters wrote: > >... > > > I remember how some text editors used to boast that they only loaded > > your text "on demand". > > Well, they still do -- fancy editors use fancy data structures, so that, > e.g., inserting characters at the start of the file doesn't cause a 50Mb > memmove each time. Yes, but most modern text editors take O(n) time to open the file. There was a time when the more advanced ones did not. Or maybe that was just SGML editors...I can't remember. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From guido@digicool.com Sun Jul 1 20:32:52 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 01 Jul 2001 15:32:52 -0400 Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: Your message of "Sun, 01 Jul 2001 12:21:26 PDT." <20010701192126.9EB8299C80@waltz.rahul.net> References: <20010701192126.9EB8299C80@waltz.rahul.net> Message-ID: <200107011932.f61JWq803843@odiug.digicool.com> > This needn't go into the PEP, I think, but I'd like you to say something > about what you expect the end result of this PEP to look like under > Windows, where "rebuild" isn't really a valid option for most Python > users. Are we simply committing to make two builds available? If so, > what happens the next time we run into a situation like this? I imagine that we will pick a choice (I expect it'll be UCS2) and make only that build available, until there are loud enough cries from folks who have a reasonable excuse not to have a copy of VCC around. Given that the rest of Windows uses 16-bit Unicode, I think we'll be able to get away with this for quite a while. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From paulp@ActiveState.com Sun Jul 1 20:33:20 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 01 Jul 2001 12:33:20 -0700 Subject: [Python-Dev] Support for "wide" Unicode characters References: <20010701192126.9EB8299C80@waltz.rahul.net> Message-ID: <3B3F7B00.29D6832@ActiveState.com> Aahz Maruch wrote: > >... > > This needn't go into the PEP, I think, but I'd like you to say something > about what you expect the end result of this PEP to look like under > Windows, where "rebuild" isn't really a valid option for most Python > users. Are we simply committing to make two builds available? If so, > what happens the next time we run into a situation like this? Windows itself is strongly biased towards 16-bit characters. Therefore I expect that to be the default for a while. Then I expect Guido to announce that 32-bit characters are the new default with version 3000 (perhaps right after Windows 3000 ships) and we'll all change. So most Windows users will not be able to work with 32-bit characters for a while. But since Windows itself doesn't like those characters, they probably won't run into them much. I strongly doubt that we'll ever make two builds available because it would cause a mess of extension module incompatibilities. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From paulp@ActiveState.com Sun Jul 1 20:57:09 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 01 Jul 2001 12:57:09 -0700 Subject: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" Unicode characters Message-ID: <3B3F8095.8D58631D@ActiveState.com> PEP: 261 Title: Support for "wide" Unicode characters Version: $Revision: 1.3 $ Author: paulp@activestate.com (Paul Prescod) Status: Draft Type: Standards Track Created: 27-Jun-2001 Python-Version: 2.2 Post-History: 27-Jun-2001 Abstract Python 2.1 unicode characters can have ordinals only up to 2**16 -1. 
This range corresponds to a range in Unicode known as the Basic Multilingual Plane. There are now characters in Unicode that live on other "planes". The largest addressable character in Unicode has the ordinal 17 * 2**16 - 1 (0x10ffff). For readability, we will call this TOPCHAR and call characters in this range "wide characters". Glossary Character Used by itself, means the addressable units of a Python Unicode string. Code point A code point is an integer between 0 and TOPCHAR. If you imagine Unicode as a mapping from integers to characters, each integer is a code point. But the integers between 0 and TOPCHAR that do not map to characters are also code points. Some will someday be used for characters. Some are guaranteed never to be used for characters. Codec A set of functions for translating between physical encodings (e.g. on disk or coming in from a network) into logical Python objects. Encoding Mechanism for representing abstract characters in terms of physical bits and bytes. Encodings allow us to store Unicode characters on disk and transmit them over networks in a manner that is compatible with other Unicode software. Surrogate pair Two physical characters that represent a single logical character. Part of a convention for representing 32-bit code points in terms of two 16-bit code points. Unicode string A Python type representing a sequence of code points with "string semantics" (e.g. case conversions, regular expression compatibility, etc.) Constructed with the unicode() function. Proposed Solution One solution would be to merely increase the maximum ordinal to a larger value. Unfortunately the only straightforward implementation of this idea is to use 4 bytes per character. This has the effect of doubling the size of most Unicode strings. In order to avoid imposing this cost on every user, Python 2.2 will allow the 4-byte implementation as a build-time option. Users can choose whether they care about wide characters or prefer to preserve memory. 
The 4-byte option is called "wide Py_UNICODE". The 2-byte option is called "narrow Py_UNICODE". Most things will behave identically in the wide and narrow worlds. * unichr(i) for 0 <= i < 2**16 (0x10000) always returns a length-one string. * unichr(i) for 2**16 <= i <= TOPCHAR will return a length-one string on wide Python builds. On narrow builds it will raise ValueError. ISSUE Python currently allows \U literals that cannot be represented as a single Python character. It generates two Python characters known as a "surrogate pair". Should this be disallowed on future narrow Python builds? Pro: Python already allows the construction of a surrogate pair for a large unicode literal character escape sequence. This is basically designed as a simple way to construct "wide characters" even in a narrow Python build. It is also somewhat logical considering that the Unicode-literal syntax is basically a short-form way of invoking the unicode-escape codec. Con: Surrogates could be easily created this way but the user still needs to be careful about slicing, indexing, printing etc. Therefore some have suggested that Unicode literals should not support surrogates. ISSUE Should Python allow the construction of characters that do not correspond to Unicode code points? Unassigned Unicode code points should obviously be legal (because they could be assigned at any time). But code points above TOPCHAR are guaranteed never to be used by Unicode. Should we allow access to them anyhow? Pro: If a Python user thinks they know what they're doing why should we try to prevent them from violating the Unicode spec? After all, we don't stop 8-bit strings from containing non-ASCII characters. Con: Codecs and other Unicode-consuming code will have to be careful of these characters which are disallowed by the Unicode specification.
* ord() is always the inverse of unichr() * There is an integer value in the sys module that describes the largest ordinal for a character in a Unicode string on the current interpreter. sys.maxunicode is 2**16-1 (0xffff) on narrow builds of Python and TOPCHAR on wide builds. ISSUE: Should there be distinct constants for accessing TOPCHAR and the real upper bound for the domain of unichr (if they differ)? There has also been a suggestion of sys.unicodewidth which can take the values 'wide' and 'narrow'. * every Python Unicode character represents exactly one Unicode code point (i.e. Python Unicode Character = Abstract Unicode character). * codecs will be upgraded to support "wide characters" (represented directly in UCS-4, and as variable-length sequences in UTF-8 and UTF-16). This is the main part of the implementation left to be done. * There is a convention in the Unicode world for encoding a 32-bit code point in terms of two 16-bit code points. These are known as "surrogate pairs". Python's codecs will adopt this convention and encode 32-bit code points as surrogate pairs on narrow Python builds. ISSUE Should there be a way to tell codecs not to generate surrogates and instead treat wide characters as errors? Pro: I might want to write code that works only with fixed-width characters and does not have to worry about surrogates. Con: No clear proposal of how to communicate this to codecs. * there are no restrictions on constructing strings that use code points "reserved for surrogates" improperly. These are called "isolated surrogates". The codecs should disallow reading these from files, but you could construct them using string literals or unichr(). 
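The surrogate-pair convention described above is plain arithmetic on the 20 bits a supplementary code point carries above the BMP. A sketch (the function names are illustrative, not part of the PEP or of any Python codec API):

```python
def to_surrogate_pair(cp):
    """Encode a supplementary code point (0x10000..0x10FFFF) as a
    high/low surrogate pair, per the UTF-16 convention."""
    assert 0x10000 <= cp <= 0x10FFFF
    v = cp - 0x10000                 # the 20 significant bits
    high = 0xD800 + (v >> 10)        # top 10 bits -> high surrogate
    low = 0xDC00 + (v & 0x3FF)       # bottom 10 bits -> low surrogate
    return high, low

def from_surrogate_pair(high, low):
    """Recombine a surrogate pair back into one code point."""
    assert 0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

For example, TOPCHAR (0x10FFFF) maps to the pair (0xDBFF, 0xDFFF), which is exactly what a UTF-16 codec emits on a narrow build.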
Implementation There is a new (experimental) define: #define PY_UNICODE_SIZE 2 There is a new configure option: --enable-unicode=ucs2 configures a narrow Py_UNICODE, and uses wchar_t if it fits --enable-unicode=ucs4 configures a wide Py_UNICODE, and uses wchar_t if it fits --enable-unicode same as "=ucs2" The intention is that --disable-unicode, or --enable-unicode=no removes the Unicode type altogether; this is not yet implemented. It is also proposed that one day --enable-unicode will just default to the width of your platform's wchar_t. Windows builds will be narrow for a while based on the fact that there have been few requests for wide characters, those requests are mostly from hard-core programmers with the ability to buy their own Python and Windows itself is strongly biased towards 16-bit characters. Notes This PEP does NOT imply that people using Unicode need to use a 4-byte encoding for their files on disk or sent over the network. It only allows them to do so. For example, ASCII is still a legitimate (7-bit) Unicode-encoding. It has been proposed that there should be a module that handles surrogates in narrow Python builds for programmers. If someone wants to implement that, it will be another PEP. It might also be combined with features that allow other kinds of character-, word- and line- based indexing. Rejected Suggestions More or less the status-quo We could officially say that Python characters are 16-bit and require programmers to implement wide characters in their application logic by combining surrogate pairs. This is a heavy burden because emulating 32-bit characters is likely to be very inefficient if it is coded entirely in Python. Plus these abstracted pseudo-strings would not be legal as input to the regular expression engine. "Space-efficient Unicode" type Another class of solution is to use some efficient storage internally but present an abstraction of wide characters to the programmer.
Any of these would require a much more complex implementation than the accepted solution. For instance consider the impact on the regular expression engine. In theory, we could move to this implementation in the future without breaking Python code. A future Python could "emulate" wide Python semantics on narrow Python. Guido is not willing to undertake the implementation right now. Two types We could introduce a 32-bit Unicode type alongside the 16-bit type. There is a lot of code that expects there to be only a single Unicode type. This PEP represents the least-effort solution. Over the next several years, 32-bit Unicode characters will become more common and that may either convince us that we need a more sophisticated solution or (on the other hand) convince us that simply mandating wide Unicode characters is an appropriate solution. Right now the two options on the table are do nothing or do this. References Unicode Glossary: http://www.unicode.org/glossary/ Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil End: -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From thomas@xs4all.net Sun Jul 1 23:12:48 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 2 Jul 2001 00:12:48 +0200 Subject: [Python-Dev] Python 2.1.1 release 'schedule' Message-ID: <20010702001248.H8098@xs4all.nl> This is just a heads-up to everyone. I plan to release Python 2.1.1c1 (release candidate 1) somewhere on Friday the 13th (of July) and, barring any serious problems, the full release the friday following that, July 20. The python 2.1.1 CVS branch (tagged 'release21-maint') should be stable, and should contain most bugfixes that will be in 2.1.1. 
If you care about 2.1.1's stability and portability, or you found bugs in 2.1 and aren't sure they are fixed, and you can check things out of CVS, please give the CVS branch a try: just 'checkout' python with cvs co -rrelease21-maint python (with the -d option from the SourceForge CVS page that applies to you) and follow the normal compile procedure. Binaries for Windows as well as source tarballs will be provided for the release candidate and the final release (obviously) but the more bugs people point out before the final release, the more bugs will be fixed in 2.1.1 :-) Python 2.1.1 (as well as the CVS branch) will fall under the new GPL-compatible PSF licence, just like Python 2.0.1. The only notable thing missing from the CVS branch is an updated NEWS file -- I'm working on it. I'm also not done searching the open bugs for ones that might need to be addressed in 2.1.1, but feel free to point me to bugs you think are important! 2.1.1-Patch-Czar-ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From greg@cosc.canterbury.ac.nz Mon Jul 2 03:06:50 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Jul 2001 14:06:50 +1200 (NZST) Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <3B3EBEA4.3EC84EAF@ActiveState.com> Message-ID: <200107020206.OAA00427@s454.cosc.canterbury.ac.nz> David Ascher : > I'd limit the claim to stating that they _affect_ your life. If matter didn't have any rest energy, everything would fly about at the speed of light, which would make life very hectic. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Mon Jul 2 03:36:39 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Jul 2001 14:36:39 +1200 (NZST) Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <20010701164315-r01010600-c2d5b07d@213.84.27.177> Message-ID: <200107020236.OAA00432@s454.cosc.canterbury.ac.nz> Just van Rossum : > My difficulty with PEP 261 is that I'm afraid few people will actually enable > 32-bit support (*what*?! all unicode strings become 32 bits wide? no way!), > therefore making programs non-portable in very subtle ways. I agree. This can only be a stopgap measure. Ultimately the Unicode type needs to be made smarter. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Mon Jul 2 03:42:12 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Jul 2001 14:42:12 +1200 (NZST) Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <3B3F5A3A.A88B54B2@ActiveState.com> Message-ID: <200107020242.OAA00436@s454.cosc.canterbury.ac.nz> David Ascher : > > And you just bought such a shiny, new glass, house. Pity. > > What kind of comma placement is that? Obviously it's only the glass that is new, not the whole house. :-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From nhodgson@bigpond.net.au Mon Jul 2 03:42:11 2001 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Mon, 2 Jul 2001 12:42:11 +1000 Subject: [Python-Dev] Support for "wide" Unicode characters References: <200107011352.PAA27645@pandora.informatik.hu-berlin.de> Message-ID: <01d601c102a0$98671580$0acc8490@neil> Martin von Loewis: > > The problem I have with this PEP is that it is a compile time option > > which makes it hard to work with both 32 bit and 16 bit strings in > > one program. > > Can you elaborate why you think this is a problem? A common role for Python is to act as glue between various modules. If Paul produces some interesting code that depends on 32 bit strings and I want to use that in conjunction with some Win32 specific or COM dependent code that wants 16 bit strings then it may not be possible or may require difficult workarounds. > (*) Methinks that the primary difficulty still is translating all the > documentation, and messages. Actually, keeping the translations > up-to-date is even more challenging. Translation of documentation and strings can be performed by almost anyone who writes both languages ("even managers") and can be budgeted by working out the amount of text and applying a conversion rate. Code requires careful thought and can lead to the typical buggy software schedule blowouts. Neil From greg@cosc.canterbury.ac.nz Mon Jul 2 03:49:56 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Jul 2001 14:49:56 +1200 (NZST) Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <200107011837.f61IbmZ03645@odiug.digicool.com> Message-ID: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz> > It so happened that the Unicode support was written to make it very > easy to change the compile-time code unit size What about extension modules that deal with Unicode strings? Will they have to be recompiled too?
If so, is there anything to detect an attempt to import an extension module with an incompatible Unicode character width? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From nhodgson@bigpond.net.au Mon Jul 2 03:52:45 2001 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Mon, 2 Jul 2001 12:52:45 +1000 Subject: [Python-Dev] Support for "wide" Unicode characters References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> <00dd01c1022d$c61e4160$0acc8490@neil> <200107011344.f61DiTM03548@odiug.digicool.com> Message-ID: <01ea01c102a2$128491c0$0acc8490@neil> Guido van Rossum: > > This wasn't usefully true in the past for DBCS strings and is > > not the right way to think of either narrow or wide strings > > now. The idea that strings are arrays of characters gets in > > the way of dealing with many encodings and is the primary > > difficulty in localising software for Japanese. > > Can you explain the kind of problems encountered in some more detail? Programmers used to working with character == indexable code unit will often split double wide characters when performing an action. For example searching for a particular double byte character "bc" may match "abcd" incorrectly where "ab" and "cd" are the characters. DBCS is not normally self synchronising although UTF-8 is. Another common problem is counting characters, for example when filling a line, hitting the line width and forcing half a character onto the next line. > I think it's a good idea to provide a set of higher-level tools as > well. However nobody seems to know what these higher-level tools > should do yet. 
PEP 261 is specifically focused on getting the > lower-level foundations right (i.e. the objects that represent arrays > of code units), so that the authors of higher level tools will have a > solid base. If you want to help author a PEP for such higher-level > tools, you're welcome! Its more likely I'll publish some of the low level pieces of Scintilla/SinkWorld as a Python extension providing some of these facilities in an editable-text class. Then we can see if anyone else finds the code worthwhile. Neil From nhodgson@bigpond.net.au Mon Jul 2 04:00:41 2001 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Mon, 2 Jul 2001 13:00:41 +1000 Subject: [Python-Dev] Support for "wide" Unicode characters References: Message-ID: <020b01c102a3$2dd23440$0acc8490@neil> Tim Peters: > Well, they still do -- fancy editors use fancy data structures, so that, > e.g., inserting characters at the start of the file doesn't cause a 50Mb > memmove each time. Response time is still important, but I'd wager > relatively insensitive to basic character size (you need tricks that cut > factors of 1000s off potential worst cases to give the appearance of > instantaneous results; a factor of 2 or 4 is in the noise compared to what's > needed regardless). I actually have some numbers here. Early versions of some new editor buffer code used UCS-2 on .NET and the JVM. Moving to an 8 bit buffer saved 10-20% of execution time on the insert string, delete string and global replace benchmarks using strings that fit into ASCII. These buffers did have some other overhead for line management and other features but I expect these did not affect the proportions much. Neil From tim.one@home.com Mon Jul 2 05:36:20 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 2 Jul 2001 00:36:20 -0400 Subject: [Python-Dev] RE: Python 2.1.1 release 'schedule' In-Reply-To: <20010702001248.H8098@xs4all.nl> Message-ID: Woo hoo! [Thomas Wouters] > ... > Binaries for Windows as well as source tarballs will be provided ... 
Building a Windows installer isn't straightforward, so you'd better let us do that part (e.g., you need the Wise installer program, Fred needs to supply appropriate HTML docs for the Windows installer to zip up, Tcl/Tk has to get unpacked and rearranged, etc). I just checked in 2.1.1c1 changes to the Windows part of the release21-maint tree, but the rest of it isn't in CVS. From thomas@xs4all.net Mon Jul 2 07:27:24 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 2 Jul 2001 08:27:24 +0200 Subject: [Python-Dev] Re: Python 2.1.1 release 'schedule' In-Reply-To: References: Message-ID: <20010702082724.K32419@xs4all.nl> On Mon, Jul 02, 2001 at 12:36:20AM -0400, Tim Peters wrote: > [Thomas Wouters] > > ... > > Binaries for Windows as well as source tarballs will be provided ... > Building a Windows installer isn't straightforward, so you'd better let us > do that part (e.g., you need the Wise installer program, Fred needs to > supply appropriate HTML docs for the Windows installer to zip up, Tcl/Tk has > to get unpacked and rearranged, etc). I just checked in 2.1.1c1 changes to > the Windows part of the release21-maint tree, but the rest of it isn't in > CVS. Oh yeah, I was entirely going to let you guys do it, or at least find another set of wintendows-weenies to do it :) That's part of why I posted the tentative release dates. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From loewis@informatik.hu-berlin.de Mon Jul 2 08:25:18 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Mon, 2 Jul 2001 09:25:18 +0200 (MEST) Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <01d601c102a0$98671580$0acc8490@neil> (nhodgson@bigpond.net.au) References: <200107011352.PAA27645@pandora.informatik.hu-berlin.de> <01d601c102a0$98671580$0acc8490@neil> Message-ID: <200107020725.JAA25925@pandora.informatik.hu-berlin.de> > > > The problem I have with this PEP is that it is a compile time option > > > which makes it hard to work with both 32 bit and 16 bit strings in > > > one program. > > > > Can you elaborate why you think this is a problem? > > A common role for Python is to act as glue between various modules. If > Paul produces some interesting code that depends on 32 bit strings and I > want to use that in conjunction with some Win32 specific or COM dependent > code that wants 16 bit strings then it may not be possible or may require > difficult workarounds. Neither nor. All it will require is for you to recompile your Python installation to use wide Unicode. On Win32 APIs, this will mean that you cannot directly interpret PyUnicode object representations as WCHAR_T pointers. This is no problem, as you can transparently copy unicode objects into wchar_t strings; it's a matter of coming up with a good C API for doing so conveniently. Regards, Martin From fredrik@pythonware.com Mon Jul 2 09:20:09 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 2 Jul 2001 10:20:09 +0200 Subject: [Python-Dev] Support for "wide" Unicode characters References: <200107020236.OAA00432@s454.cosc.canterbury.ac.nz> Message-ID: <03b301c102cf$e0e3dd00$0900a8c0@spiff> greg wrote: > I agree. This can only be a stopgap measure. Ultimately the > Unicode type needs to be made smarter. PIL uses 8 bits per pixel to store bilevel images, and 32 bits per pixel to store 16- and 24-bit images.
back in 1995, some people claimed that the image type had to be made smarter to be usable. these days, nobody ever notices... From fredrik@pythonware.com Mon Jul 2 09:08:10 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 2 Jul 2001 10:08:10 +0200 Subject: [Python-Dev] Support for "wide" Unicode characters References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> <00dd01c1022d$c61e4160$0acc8490@neil> Message-ID: <03b201c102cf$e0dab540$0900a8c0@spiff> Neil Hodgson wrote: > > u[i] is a character. If u is Unicode, then u[i] is a Python Unicode > > character. > > This wasn't usefully true in the past for DBCS strings and is not the > right way to think of either narrow or wide strings now. The idea that > strings are arrays of characters gets in the way if you stop confusing binary buffers with text strings, all such problems will go away. From mal@egenix.com Mon Jul 2 10:39:55 2001 From: mal@egenix.com (M.-A. Lemburg) Date: Mon, 02 Jul 2001 11:39:55 +0200 Subject: [Python-Dev] Support for "wide" Unicode characters References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz> Message-ID: <3B40416B.6438D1F7@egenix.com> Greg Ewing wrote: > > > It so happened that the Unicode support was written to make it very > > easy to change the compile-time code unit size > > What about extension modules that deal with Unicode strings? > Will they have to be recompiled too? If so, is there anything > to detect an attempt to import an extension module with an > incompatible Unicode character width? That's a good question ! The answer is: yes, extensions which use Unicode will have to be recompiled for narrow and wide builds of Python. The question is however, how to detect cases where the user imports an extension built for narrow Python into a wide build and vice versa. The standard way of looking at the API level won't help. 
We'd need some form of introspection API at the C level... hmm, perhaps looking at the sys module will do the trick for us ?! In any case, this is certainly going to cause trouble one of these days... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Mon Jul 2 11:13:59 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 02 Jul 2001 12:13:59 +0200 Subject: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" Unicode characters References: <3B3F8095.8D58631D@ActiveState.com> Message-ID: <3B404967.14FE180F@lemburg.com> Paul Prescod wrote: > > PEP: 261 > Title: Support for "wide" Unicode characters > Version: $Revision: 1.3 $ > Author: paulp@activestate.com (Paul Prescod) > Status: Draft > Type: Standards Track > Created: 27-Jun-2001 > Python-Version: 2.2 > Post-History: 27-Jun-2001 > > Abstract > > Python 2.1 unicode characters can have ordinals only up to 2**16 > -1. > This range corresponds to a range in Unicode known as the Basic > Multilingual Plane. There are now characters in Unicode that live > on other "planes". The largest addressable character in Unicode > has the ordinal 17 * 2**16 - 1 (0x10ffff). For readability, we > will call this TOPCHAR and call characters in this range "wide > characters". > > Glossary > > Character > > Used by itself, means the addressable units of a Python > Unicode string. Please add: also known as "code unit". > Code point > > A code point is an integer between 0 and TOPCHAR. > If you imagine Unicode as a mapping from integers to > characters, each integer is a code point. But the > integers between 0 and TOPCHAR that do not map to > characters are also code points. Some will someday > be used for characters. Some are guaranteed never > to be used for characters. > > Codec > > A set of functions for translating between physical > encodings (e.g. 
on disk or coming in from a network) > into logical Python objects. > > Encoding > > Mechanism for representing abstract characters in terms of > physical bits and bytes. Encodings allow us to store > Unicode characters on disk and transmit them over networks > in a manner that is compatible with other Unicode software. > > Surrogate pair > > Two physical characters that represent a single logical Eeek... two code units (or have you ever seen a physical character walking around ;-) > character. Part of a convention for representing 32-bit > code points in terms of two 16-bit code points. > > Unicode string > > A Python type representing a sequence of code points with > "string semantics" (e.g. case conversions, regular > expression compatibility, etc.) Constructed with the > unicode() function. > > Proposed Solution > > One solution would be to merely increase the maximum ordinal > to a larger value. Unfortunately the only straightforward > implementation of this idea is to use 4 bytes per character. > This has the effect of doubling the size of most Unicode > strings. In order to avoid imposing this cost on every > user, Python 2.2 will allow the 4-byte implementation as a > build-time option. Users can choose whether they care about > wide characters or prefer to preserve memory. > > The 4-byte option is called "wide Py_UNICODE". The 2-byte option > is called "narrow Py_UNICODE". > > Most things will behave identically in the wide and narrow worlds. > > * unichr(i) for 0 <= i < 2**16 (0x10000) always returns a > length-one string. > > * unichr(i) for 2**16 <= i <= TOPCHAR will return a > length-one string on wide Python builds. On narrow builds it will > raise ValueError. > > ISSUE > > Python currently allows \U literals that cannot be > represented as a single Python character. It generates two > Python characters known as a "surrogate pair". Should this > be disallowed on future narrow Python builds? 
> > Pro: > > Python already the construction of a surrogate pair > for a large unicode literal character escape sequence. > This is basically designed as a simple way to construct > "wide characters" even in a narrow Python build. It is also > somewhat logical considering that the Unicode-literal syntax > is basically a short-form way of invoking the unicode-escape > codec. > > Con: > > Surrogates could be easily created this way but the user > still needs to be careful about slicing, indexing, printing > etc. Therefore some have suggested that Unicode > literals should not support surrogates. > > ISSUE > > Should Python allow the construction of characters that do > not correspond to Unicode code points? Unassigned Unicode > code points should obviously be legal (because they could > be assigned at any time). But code points above TOPCHAR are > guaranteed never to be used by Unicode. Should we allow > access > to them anyhow? > > Pro: > > If a Python user thinks they know what they're doing why > should we try to prevent them from violating the Unicode > spec? After all, we don't stop 8-bit strings from > containing non-ASCII characters. > > Con: > > Codecs and other Unicode-consuming code will have to be > careful of these characters which are disallowed by the > Unicode specification. > > * ord() is always the inverse of unichr() > > * There is an integer value in the sys module that describes the > largest ordinal for a character in a Unicode string on the current > interpreter. sys.maxunicode is 2**16-1 (0xffff) on narrow builds > of Python and TOPCHAR on wide builds. > > ISSUE: Should there be distinct constants for accessing > TOPCHAR and the real upper bound for the domain of > unichr (if they differ)? There has also been a > suggestion of sys.unicodewidth which can take the > values 'wide' and 'narrow'. > > * every Python Unicode character represents exactly one Unicode code > point (i.e. Python Unicode Character = Abstract Unicode > character). 
> > * codecs will be upgraded to support "wide characters" > (represented directly in UCS-4, and as variable-length sequences > in UTF-8 and UTF-16). This is the main part of the implementation > left to be done. > > * There is a convention in the Unicode world for encoding a 32-bit > code point in terms of two 16-bit code points. These are known > as "surrogate pairs". Python's codecs will adopt this convention > and encode 32-bit code points as surrogate pairs on narrow Python > builds. > > ISSUE > > Should there be a way to tell codecs not to generate > surrogates and instead treat wide characters as > errors? > > Pro: > > I might want to write code that works only with > fixed-width characters and does not have to worry about > surrogates. > > Con: > > No clear proposal of how to communicate this to codecs. No need to pass this information to the codec: simply write a new one and give it a clear name, e.g. "ucs-2" will generate errors while "utf-16-le" converts them to surrogates. > * there are no restrictions on constructing strings that use > code points "reserved for surrogates" improperly. These are > called "isolated surrogates". The codecs should disallow reading > these from files, but you could construct them using string > literals or unichr(). > > Implementation > > There is a new (experimental) define: > > #define PY_UNICODE_SIZE 2 > > There is a new configure option: > > --enable-unicode=ucs2 configures a narrow Py_UNICODE, and uses > wchar_t if it fits > --enable-unicode=ucs4 configures a wide Py_UNICODE, and uses > whchar_t if it fits > --enable-unicode same as "=ucs2" > > The intention is that --disable-unicode, or --enable-unicode=no > removes the Unicode type altogether; this is not yet implemented. > > It is also proposed that one day --enable-unicode will just > default to the width of your platforms wchar_t. 
> > Windows builds will be narrow for a while based on the fact that > there have been few requests for wide characters, those requests > are mostly from hard-core programmers with the ability to buy > their own Python and Windows itself is strongly biased towards > 16-bit characters. > > Notes > > This PEP does NOT imply that people using Unicode need to use a > 4-byte encoding for their files on disk or sent over the network. > It only allows them to do so. For example, ASCII is still a > legitimate (7-bit) Unicode-encoding. > > It has been proposed that there should be a module that handles > surrogates in narrow Python builds for programmers. If someone > wants to implement that, it will be another PEP. It might also be > combined with features that allow other kinds of character-, > word- and line- based indexing. > > Rejected Suggestions > > More or less the status-quo > > We could officially say that Python characters are 16-bit and > require programmers to implement wide characters in their > application logic by combining surrogate pairs. This is a heavy > burden because emulating 32-bit characters is likely to be > very inefficient if it is coded entirely in Python. Plus these > abstracted pseudo-strings would not be legal as input to the > regular expression engine. > > "Space-efficient Unicode" type > > Another class of solution is to use some efficient storage > internally but present an abstraction of wide characters to > the programmer. Any of these would require a much more complex > implementation than the accepted solution. For instance consider > the impact on the regular expression engine. In theory, we could > move to this implementation in the future without breaking > Python > code. A future Python could "emulate" wide Python semantics on > narrow Python. Guido is not willing to undertake the > implementation right now. > > Two types > > We could introduce a 32-bit Unicode type alongside the 16-bit > type. 
There is a lot of code that expects there to be only a > single Unicode type. > > This PEP represents the least-effort solution. Over the next > several years, 32-bit Unicode characters will become more common > and that may either convince us that we need a more sophisticated > solution or (on the other hand) convince us that simply > mandating wide Unicode characters is an appropriate solution. > Right now the two options on the table are do nothing or do > this. > > References > > Unicode Glossary: http://www.unicode.org/glossary/ Plus perhaps the Mark Davis paper at: http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/ > Copyright > > This document has been placed in the public domain. Good work, Paul ! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Mon Jul 2 11:08:53 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 02 Jul 2001 12:08:53 +0200 Subject: [Python-Dev] Support for "wide" Unicode characters References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> Message-ID: <3B404835.4CE77C60@lemburg.com> Paul Prescod wrote: > > "M.-A. Lemburg" wrote: > > > >... > > > > The term "character" in Python should really only be used for > > the 8-bit strings. > > Are we going to change chr() and unichr() to one_element_string() and > unicode_one_element_string() No. I am just suggesting to make use of the crispy clear definitions which the Unicode Consortium has developed for us. > u[i] is a character. If u is Unicode, then u[i] is a Python Unicode > character. No Python user will find that confusing no matter how Unicode > knuckle-dragging, mouth-breathing, wife-by-hair-dragging they are. 
Except that u[i] maps to a code unit which may or may not be a code point. Whether a code point matches a grapheme (this is what users tend to regard as character) is yet another story due to combining code points. > > In Unicode a "character" can mean any of: > > Mark Davis said that "people" can use the word to mean any of those > things. He did not say that it was imprecisely defined in Unicode. > Nevertheless I'm not using the Unicode definition anymore than our > standard library uses an ancient Greek definition of integer. Python has > a concept of integer and a concept of character. Ok, I'll stop whining. Just as final remark, let me say that our little discussion is a perfect example of how people can misunderstand each other by using the terms in different ways (Kant tried to solve this for Philosophy and did not succeed; so I guess the Unicode Consortium doesn't stand a chance either ;-) > > > It has been proposed that there should be a module for working > > > with UTF-16 strings in narrow Python builds through some sort of > > > abstraction that handles surrogates for you. If someone wants > > > to implement that, it will be another PEP. > > > > Uhm, narrow builds don't support UTF-16... it's UCS-2 which > > is supported (basically: store everything in range(0x10000)); > > the codecs can map code points to surrogates, but it is solely > > their responsibility and the responsibility of the application > > using them to take care of dealing with surrogates. > > The user can view the data as UCS-2, UTF-16, Base64, ROT-13, XML, .... > Just as we have a base64 module, we could have a UTF-16 module that > interprets the data in the string as UTF-16 and does surrogate > manipulation for you. > > Anyhow, if any of those is the "real" encoding of the data, it is > UTF-16. After all, if the codec reads in four non-BMP characters in, > let's say, UTF-8, we represent them as 8 narrow-build Python characters. > That's the definition of UTF-16! 
But it's easy enough for me to take > that word out so I will. u[i] gives you a code unit and whether this maps to a code point or not is dependent on the implementation which in turn depends on the narrow/wide choice. In UCS-2, I believe, surrogates are regarded as two code points; in UTF-16 they always have to come in pairs. There's a semantic difference here which is for the codecs and these additional tools to be aware of -- not the Unicode type implementation. > >... > > Also, the module will be useful for both narrow and wide builds, > > since the notion of an encoded character can involve multiple code > > points. In that sense Unicode is always a variable length > > encoding for characters and that's the application field of > > this module. > > I wouldn't advise that you do all different types of normalization in a > single module but I'll wait for your PEP. I'll see if I find some time at the Bordeaux Python Meeting next week. > > Here's the adjusted text: > > > > It has been proposed that there should be a module for working > > with Unicode objects using character-, word- and line- based > > indexing. The details of the implementation is left to > > another PEP. > > It has been proposed that there should be a module that handles > surrogates in narrow Python builds for programmers. If someone > wants to implement that, it will be another PEP. It might also be > combined with features that allow other kinds of character-, > word- and line- based indexing. Hmm, I liked my version better, but what the heck ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Mon Jul 2 11:43:38 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 02 Jul 2001 12:43:38 +0200 Subject: [Python-Dev] Unicode Maintenance References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com> Message-ID: <3B40505A.2F03EEC4@lemburg.com> Guido van Rossum wrote: > > Hi Marc-Andre, > > I'm dropping the i18n-sig from the distribution list. > > I hear you: > > > You didn't get my point. I feel responsible for the Unicode > > implementation design and would like to see it become a continued > > success. > > I'm sure we all share this goal! > > > In that sense and taking into account that I am the > > maintainer of all this stuff, I think it is very reasonable to > > ask me before making any significant changes to the implementation > > and also respect any comments I put forward. > > I understand you feel that we've rushed this in without waiting for > your comments. > > Given how close your implementation was, I still feel that the changes > weren't that significant, but I understand that you get nervous. If > Christian were to check in his speed hack changes to the guts of > ceval.c I would be nervous too! (Heck, I got nervous when Eric > checked in his library-wide string method changes without asking.) > > Next time I'll try to be more sensitive to situations that require > your review before going forward. Good. > > Currently, I have to watch the checkins list very closely > > to find out who changed what in the implementation and then to > > take actions only after the fact. Since I'm not supporting Unicode > > as my full-time job this is simply impossible. We have the SF manager > > and there is really no need to rush anything around here. > > Hm, apart from the fact that you ought to be left in charge, I think > that in this case the live checkins were a big win over the usual SF > process. 
At least two people were making changes, sometimes to each > other's code, and many others on at least three continents were > checking out the changes on many different platforms and immediately > reporting problems. We would definitely not have a patch as solid as > the code that's now checked in, after two days of using SF! (We > could've used a branch, but I've found that getting people to actually > check out the branch is not easy.) True, but I was thinking of the concept and design questions which should be resolved *before* taking the direct checkin approach. > So I think that the net result was favorable. Sometimes you just have > to let people work in the spur of the moment to get the results of > their best thinking, otherwise they lose interest or their train of > thought. Understood, but then I'd like to at least receive a summary of the changes in some way, so that I continue to understand how the implementation works after the checkins and which corners to keep in mind for future additions, changes, etc. > > If I am offline or too busy with other things for a day or two, > > then I want to see patches on SF and not find new versions of > > the implementation already checked in. > > That's still the general rule, but in our enthusiasm (and mine was > definitely part of this!) we didn't want to wait. Also, I have to > admit that I mistook your silence for consent -- I didn't think the > main proposed changes (making the size of Py_UNICODE a config choice) > were controversial at all, so I didn't realize you would have a problem > with it. I don't have a problem with it; I was just seeing things slip through my fingers and getting worried about this. > > This has worked just fine during the last year, so I can only explain > > the latest actions in this direction with an urge to bypass my comments > > and any discussion this might cause. > > I think you're projecting your own stuff here. Not really. I have processed many patches on SF, gave comments etc. 
and did the final checkin. This has worked great over the last months and I intend to keep working this way since it is by far the best way to both manage and document the issues and questions which arise during the process. E.g. I'm currently processing a patch by Walter Dörwald which adds support for callback error handlers. He has done some great work there which was the result of many lively discussions. Working like this is fun while staying manageable at the same time... and again, there's really no need to rush things ! > I honestly didn't > think there was much disagreement on your part and thought we were > doing you a favor by implementing the consensus. IMO, Martin and > Fredrik are familiar enough with both the code and the issues to do a > good job. Well, the above was my interpretation of how things went. I may have been wrong (and honestly do hope that I am wrong), but my gut feeling simply said: hey, what are these guys doing there... is this some kind of > > Needless to say that > > quality control is not possible anymore. > > Unclear. Lots of other people looked over the changes in your > absence. And CVS makes code review after it's checked in easy enough. > (Hey, in many other open source projects that's the normal procedure > once the rough characteristics of a feature have been agreed upon: > check in first and review later!) That was not my point: quality control also includes checking the design approach. This is something which should normally be done in design/implementation/design/... phases -- just like I worked with you on the Unicode implementation late in 1999. > > Conclusion: > > I am not going to continue this work if this does not change. > > That would be sad, and I hope you will stay with us. We certainly > don't plan to ignore your comments! > > > Another problem for me is the continued hostility I feel on i18n > > against parts of the design and some of my decisions. 
I am > > not talking about your feedback and the feedback from many other > > people on the list which was excellent and to high standards. > > But reading the postings of the last few months you will > > find notices of what I am referring to here (no, I don't want > > to be specific). > > I don't know what to say about this, and obviously nobody has the time > to go back and read the archives. I'm sure it's not you as a person > that was attacked. If the design isn't perfect -- and hey, since > Python is the 80 percent language, few things in it are quite perfect! > -- then (positive) criticism is an attempt to help, to move it closer > to perfection. > > If people have at times said "the Unicode support sucks", well, that > may hurt. You can't always stay friends with everybody. I get flames > occasionally for features in Python that folks don't like. I get used > to them, and it doesn't affect my confidence any more. Be the same! I'll try. > But sometimes, after saying "it sucks", people make specific > suggestions for improvements, and it's important to be open for those > even from sources that use offending language. (Within reason, of > course. I don't ask you to listen to somebody who is persistently > hostile to you as a person.) Ok. > > If people don't respect my comments or decision, then how can > > I defend the design and how can I stop endless discussions which > > simply don't lead anywhere ? So either I am missing something > > or there is a need for a clear statement from you about > > my status in all this. > > Do you really *want* to be the Unicode BDFL? Being something's BDFL a > full-time job, and you've indicated you're too busy. (Or is that > temporary?) I am currently doing a lot of consulting work, so things sometimes tighten up and are less work intense at other times. Given this setup, I think that I will be able to play the BD (without the FL) for Unicode for some time. 
I will certainly pass on the flag to someone else if I find myself not spending enough time on it. The only thing I'm asking for is some more professional work mentality at times. If people make it hard for me to follow the development, then I cannot manage this task in a satisfying way. > I see you as the original coder, which means that you know that > section of the code better than anyone, and whenever there's a > question that others can't answer about its design, implementation, or > restrictions, I refer to you. But given that you've said you wouldn't > be able to work much on it, I welcome contributions by others as long > as they seem knowledgeable. Same here. > > If I don't have the right to comment on proposals and patches, > > possibly even rejecting them, then I simply don't see any > > ground for keeping the implementation in a state which I can > > maintain. > > Nobody said you couldn't comment, and you know that. If I don't get a chance to comment on a summary of changes (be it before or after a batch of checkins), how am I supposed to follow up on them ? Keeping a close eye on the checkin mailing list doesn't help: it simply doesn't always give you the big picture. We are all professional quality programmers and I respect Fredrik and Martin for their coding quality and ideas. What I am asking for is some more teamwork. > When it comes to rejecting or accepting, I feel that I am still the > final arbiter, even for Unicode, until I get hit by a bus. Since I > don't always understand the implementation or the issues, I'll of > course defer to you in cases where I think I can't make the decision, > but I do reserve the right to be convinced by others to override your > judgement, occasionally, if there's a good reason. And when you're > not responsive, I may try to channel you. (I'll try to be more > explicit about that.) That's perfectly OK (and indeed can be very useful at times). 
> > And last but not least: The fun-factor has faded which was > > the main motor driving me into working on Unicode in the first > > place. Nothing much you can do about this, though :-/ > > Yes, that happens to all of us at times. The fun factor goes up and > down, and sometimes we must look for fun elsewhere for a while. Then > the fun may come back where it appeared lost. Go on vacation, read a > book, tackle a new project in a totally different area! Then come > back and see if you can find some fun in the old stuff again. I'll visit the Bordeaux Python conference later this week. That should give me some time to breathe (and hopefully to write some more PEPs :=). > > > Paul Prescod offered to write a PEP on this issue. My cynical half > > > believes that we'll never hear from him again, but my optimistic half > > > hopes that he'll actually write one, so that we'll be able to discuss > > > the various issues for the users with the users. I encourage you to > > > co-author the PEP, since you have a lot of background knowledge about > > > the issues. > > > > I guess your optimistic half won :-) I think Paul already did all the > > work, so I'll simply comment on what he wrote. > > Your suggestions were very valuable. My opinion of Paul also went up > a notch! > > > > BTW, I think that Misc/unicode.txt should be converted to a PEP, for > > > the historic record. It was very much a PEP before the PEP process > > > was invented. Barry, how much work would this be? No editing needed, > > > just formatting, and assignment of a PEP number (the lower the better). > > > > Thanks for converting the text to PEP format, Barry. > > > > Thanks for reading this far, > > You're welcome, and likewise. > > Just one more thing, Marc-Andre. Please know that I respect your work > very much even if we don't always agree. We would get by without you, > but Python would be hurt if you turned your back on us. Thanks. 
Be assured that I'll stay around for quite some time -- you won't get by that easily ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Mon Jul 2 11:56:00 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 02 Jul 2001 12:56:00 +0200 Subject: [Python-Dev] Bordeaux Python Meeting 04.07.-07.07. Message-ID: <3B405340.31C5AA11@lemburg.com> Hi everybody, I think nobody has posted an announcement for the conference yet, so I'll at least provide a pointer: http://www.lsm.abul.org/program/topic19/ Marc Poinot, who also organized the "First Python Day" in France, is chair of this subtopic at the "Debian One" conference in Bordeaux: http://www.lsm.abul.org/ Cheers, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From fredrik@pythonware.com Mon Jul 2 12:41:51 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 2 Jul 2001 13:41:51 +0200 Subject: [Python-Dev] Unicode Maintenance References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com> <3B40505A.2F03EEC4@lemburg.com> Message-ID: <001e01c102eb$fe4995d0$4ffa42d5@hagrid> mal wrote: > The only thing I'm asking for, is some more professional > work mentality at times. for the record, your recent posts under this subject doesn't strike me as very professional. think about it. 
From paulp@ActiveState.com Mon Jul 2 15:25:55 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Mon, 02 Jul 2001 07:25:55 -0700 Subject: [I18n-sig] Re: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" Unicodecharacters References: <3B3F8095.8D58631D@ActiveState.com> <3B404967.14FE180F@lemburg.com> Message-ID: <3B408473.77AB6C8@ActiveState.com> "M.-A. Lemburg" wrote: > >... > > Character > > > > Used by itself, means the addressable units of a Python > > Unicode string. > > Please add: also known as "code unit". I'm not entirely comfortable with that. As you yourself pointed out, the same Python Unicode object can be interpreted as either a series of single-width code points *or* as a UTF-16 string where the characters are code units. You could also interpret it as a BASE64'd region or an XML document... It all depends on how you look at it. > .... > > Surrogate pair > > > > Two physical characters that represent a single logical > > Eeek... two code units (or have you ever seen a physical character > walking around ;-) No, that's sort of my point. The user can decide to adopt the convention of looking at the two characters as code units or they can ignore that interpretation and look at them as two code points. It's all relative, man. Dig it? That's why I use the word "convention" below: > > character. Part of a convention for representing 32-bit > > code points in terms of two 16-bit code points. "Surrogates are all in your head. Python doesn't know or care about them!" I'll change this to: Surrogate pair Two Python Unicode characters that represent a single logical Unicode code point. Part of a convention for representing 32-bit code points in terms of two 16-bit code points. Python has limited support for reading, writing and constructing strings that use this convention (described below). Otherwise Python ignores the convention. > No need to pass this information to the codec: simply write > a new one and give it a clear name, e.g. 
"ucs-2" will generate > errors while "utf-16-le" converts them to surrogates. That's a good point, but what if I want a UTF-8 codec that doesn't generate surrogates? Or even a UCS4 one? > Plus perhaps the Mark Davis paper at: > > http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/ Okay. > > Copyright > > > > This document has been placed in the public domain. > > Good work, Paul ! Thanks for your help. You did help me to clarify many things even though I argued with you as I was doing it. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From guido@digicool.com Mon Jul 2 16:23:56 2001 From: guido@digicool.com (Guido van Rossum) Date: Mon, 02 Jul 2001 11:23:56 -0400 Subject: [Python-Dev] Unicode Maintenance In-Reply-To: Your message of "Mon, 02 Jul 2001 12:43:38 +0200." <3B40505A.2F03EEC4@lemburg.com> References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com> <3B40505A.2F03EEC4@lemburg.com> Message-ID: <200107021523.f62FNun01807@odiug.digicool.com> Thanks for your response, Marc-Andre. I'd like to close this topic now. I'm not sure how to get you a "summary of changes", but I think you can ask Fredrik directly (Martin announced he's away on vacation). One thing you can do is pipe the output of "cvs log" through tools/scripts/logmerge.py -- this gives you the checkin messages in (reverse?) chronological order. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Mon Jul 2 16:29:39 2001 From: guido@digicool.com (Guido van Rossum) Date: Mon, 02 Jul 2001 11:29:39 -0400 Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: Your message of "Mon, 02 Jul 2001 11:39:55 +0200." 
<3B40416B.6438D1F7@egenix.com> References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz> <3B40416B.6438D1F7@egenix.com> Message-ID: <200107021529.f62FTdx01823@odiug.digicool.com> > Greg Ewing wrote: > > > > > It so happened that the Unicode support was written to make it very > > > easy to change the compile-time code unit size > > > > What about extension modules that deal with Unicode strings? > > Will they have to be recompiled too? If so, is there anything > > to detect an attempt to import an extension module with an > > incompatible Unicode character width? > > That's a good question ! > > The answer is: yes, extensions which use Unicode will have to > be recompiled for narrow and wide builds of Python. The question > is however, how to detect cases where the user imports an > extension built for narrow Python into a wide build and > vice versa. > > The standard way of looking at the API level won't help. We'd > need some form of introspection API at the C level... hmm, > perhaps looking at the sys module will do the trick for us ?! > > In any case, this is certainly going to cause trouble one > of these days... Here are some alternative ways to deal with this: (1) Use the preprocessor to rename all the Unicode APIs to get "Wide" appended to their name in wide mode. This makes any use of a Unicode API in an extension compiled for the wrong Py_UNICODE_SIZE fail with a link-time error. (Which should cause an ImportError for shared libraries.) (2) Ditto but only rename the PyModule_Init function. This is much less work but more coarse: a module that doesn't use any Unicode APIs (and I expect these will be a large majority) still would not be accepted. (3) Change the interpretation of PYTHON_API_VERSION so that a low bit of '1' means wide Unicode. Then you only get a warning (followed by a core dump when actually trying to use Unicode). I mentioned (1) and (3) in an earlier post. 
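A runtime complement to these link-time schemes is the sys-module introspection mentioned earlier in the thread; a minimal sketch (the function name is hypothetical):

```python
# Sketch of the sys-module introspection idea from the thread:
# pure-Python code can branch on the build's character width at
# runtime instead of relying on compile-time detection.
import sys

def build_width():
    # sys.maxunicode is 2**16-1 (0xFFFF) on narrow builds and
    # 0x10FFFF (TOPCHAR) on wide builds.
    return 'wide' if sys.maxunicode > 0xFFFF else 'narrow'

print(build_width())
```

Note that this only helps pure-Python code; a C extension compiled for the wrong Py_UNICODE_SIZE still has to be caught at link or import time, which is what options (1)-(3) address.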
--Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@beowolf.digicool.com Mon Jul 2 16:37:45 2001 From: fdrake@beowolf.digicool.com (Fred Drake) Date: Mon, 2 Jul 2001 11:37:45 -0400 (EDT) Subject: [Python-Dev] [maintenance doc updates] Message-ID: <20010702153745.B304B28929@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/maint-docs/ Updated to reflect the current state of the Python 2.1.1 maintenance release branch. From mal@lemburg.com Mon Jul 2 17:51:58 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 02 Jul 2001 18:51:58 +0200 Subject: [Python-Dev] Support for "wide" Unicode characters References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz> <3B40416B.6438D1F7@egenix.com> <200107021529.f62FTdx01823@odiug.digicool.com> Message-ID: <3B40A6AE.EDE30857@lemburg.com> Guido van Rossum wrote: > > > Greg Ewing wrote: > > > > > > > It so happened that the Unicode support was written to make it very > > > > easy to change the compile-time code unit size > > > > > > What about extension modules that deal with Unicode strings? > > > Will they have to be recompiled too? If so, is there anything > > > to detect an attempt to import an extension module with an > > > incompatible Unicode character width? > > > > That's a good question ! > > > > The answer is: yes, extensions which use Unicode will have to > > be recompiled for narrow and wide builds of Python. The question > > is however, how to detect cases where the user imports an > > extension built for narrow Python into a wide build and > > vice versa. > > > > The standard way of looking at the API level won't help. We'd > > need some form of introspection API at the C level... hmm, > > perhaps looking at the sys module will do the trick for us ?! > > > > In any case, this is certainly going to cause trouble one > > of these days... 
> > Here are some alternative ways to deal with this: > > (1) Use the preprocessor to rename all the Unicode APIs to get "Wide" > appended to their name in wide mode. This makes any use of a > Unicode API in an extension compiled for the wrong Py_UNICODE_SIZE > fail with a link-time error. (Which should cause an ImportError > for shared libraries.) > > (2) Ditto but only rename the PyModule_Init function. This is much > less work but more coarse: a module that doesn't use any Unicode > APIs (and I expect these will be a large majority) still would not > be accepted. > > (3) Change the interpretation of PYTHON_API_VERSION so that a low bit > of '1' means wide Unicode. Then you only get a warning (followed > by a core dump when actually trying to use Unicode). > > I mentioned (1) and (3) in an earlier post. (4) Add a feature flag to PyModule_Init() which then looks up the features in the sys module and uses this as a basis for processing the import request. In this case, I think that (5) would be the best solution, since old code will notice the change in width too. -- Marc-Andre Lemburg ________________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From paulp@ActiveState.com Mon Jul 2 19:15:41 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Mon, 02 Jul 2001 11:15:41 -0700 Subject: [Python-Dev] Support for "wide" Unicode characters References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz> <3B40416B.6438D1F7@egenix.com> <200107021529.f62FTdx01823@odiug.digicool.com> <3B40A6AE.EDE30857@lemburg.com> Message-ID: <3B40BA4D.9C85A202@ActiveState.com> "M.-A. Lemburg" wrote: > >... > > (4) Add a feature flag to PyModule_Init() which then looks up the > features in the sys module and uses this as a basis for > processing the import request. Could an extension be carefully written so that a single binary could be compatible with both types of Python build? 
I'm thinking that it would pass data buffers with the "right width" based on checking a runtime flag... -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From just@letterror.com Mon Jul 2 19:20:38 2001 From: just@letterror.com (Just van Rossum) Date: Mon, 2 Jul 2001 20:20:38 +0200 Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <3B40BA4D.9C85A202@ActiveState.com> Message-ID: <20010702202041-r01010600-d5c62b95@213.84.27.177> Paul Prescod wrote: > Could an extension be carefully written so that a single binary could be > compatible with both types of Python build? I'm thinking that it would > pass data buffers with the "right width" based on checking a runtime > flag... But then it would also be compatible with a unicode object using different internal storage units per string, so I'm sure this is a dead end ;-) Just From mal@lemburg.com Mon Jul 2 19:59:06 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 02 Jul 2001 20:59:06 +0200 Subject: [Python-Dev] Support for "wide" Unicode characters References: <20010702202041-r01010600-d5c62b95@213.84.27.177> Message-ID: <3B40C47A.94317663@lemburg.com> Just van Rossum wrote: > > Paul Prescod wrote: > > > Could an extension be carefully written so that a single binary could be > > compatible with both types of Python build? I'm thinking that it would > > pass data buffers with the "right width" based on checking a runtime > > flag... > > But then it would also be compatible with a unicode object using different > internal storage units per string, so I'm sure this is a dead end ;-) Agreed :-) Extension writer will have to provide two versions of the binary. -- Marc-Andre Lemburg ________________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Mon Jul 2 20:12:45 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 02 Jul 2001 21:12:45 +0200 Subject: [I18n-sig] Re: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" Unicode characters References: <3B3F8095.8D58631D@ActiveState.com> <3B404967.14FE180F@lemburg.com> <3B408473.77AB6C8@ActiveState.com> Message-ID: <3B40C7AD.F2646D56@lemburg.com> Paul Prescod wrote: > > "M.-A. Lemburg" wrote: > > > >... > > > Character > > > > > > Used by itself, means the addressable units of a Python > > > Unicode string. > > > > Please add: also known as "code unit". > > I'm not entirely comfortable with that. As you yourself pointed out, the > same Python Unicode object can be interpreted as either a series of > single-width code points *or* as a UTF-16 string where the characters > are code units. You could also interpret it as a BASE64'd region or an > XML document... It all depends on how you look at it. Well, that's what code unit tries to capture too: it's the basic storage unit used by the implementation for storing characters. Never mind, it's just a detail... > > .... > > > Surrogate pair > > > > > > Two physical characters that represent a single logical > > > > Eeek... two code units (or have you ever seen a physical character > > walking around ;-) > > No, that's sort of my point. The user can decide to adopt the convention > of looking at the two characters as code units or they can ignore that > interpretation and look at them as two code points. It's all relative, > man. Dig it? That's why I use the word "convention" below: Ok. > > > character. Part of a convention for representing 32-bit > > > code points in terms of two 16-bit code points. > > "Surrogates are all in your head. Python doesn't know or care about > them!" > > I'll change this to: > > Surrogate pair > > Two Python Unicode characters that represent a single logical > Unicode code point. Part of a convention for representing > 32-bit code points in terms of two 16-bit code points.
Python > has limited support for reading, writing and constructing > strings > that use this convention (described below). Otherwise Python > ignores the convention. Good. > > No need to pass this information to the codec: simply write > > a new one and give it a clear name, e.g. "ucs-2" will generate > > errors while "utf-16-le" converts them to surrogates. > > That's a good point, but what if I want a UTF-8 codec that doesn't > generate surrogates? Or even a UCS4 one? With Walter's patch for callback error handlers, you should be able to provide handlers which implement whatever you see fit. I think that codecs should work the same on all platforms and always apply the needed conversion for the platform in question; could be wrong though... it's really only a minor issue. > > Plus perhaps the Mark Davis paper at: > > > > http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/ > > Okay. > > > > Copyright > > > > > > This document has been placed in the public domain. > > > > Good work, Paul ! > > Thanks for your help. You did help me to clarify many things even though > I argued with you as I was doing it. Thank you for taking the suggestions into account. -- Marc-Andre Lemburg ________________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fredrik@pythonware.com Mon Jul 2 20:41:33 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 2 Jul 2001 21:41:33 +0200 Subject: [Python-Dev] Unicode Maintenance References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com> <3B40505A.2F03EEC4@lemburg.com> <200107021523.f62FNun01807@odiug.digicool.com> Message-ID: <013101c1032f$022770d0$4ffa42d5@hagrid> guido wrote: > I'm not sure how to get you a "summary of changes", but I think you > can ask Fredrik directly (Martin announced he's away on vacation).
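[The surrogate-pair convention pinned down in the PEP 261 glossary discussion above is plain arithmetic; a minimal modern-Python sketch — the helper names are this editor's, not from the PEP or the thread:]

```python
def to_surrogate_pair(cp):
    # Map a supplementary code point (U+10000..U+10FFFF) onto the
    # two 16-bit code units of the UTF-16 convention.
    assert 0x10000 <= cp <= 0x10FFFF
    cp -= 0x10000
    return 0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)

def from_surrogate_pair(high, low):
    # Inverse: recombine a high/low surrogate pair into one code point.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

# U+10000 is the first code point that needs a pair.
print([hex(u) for u in to_surrogate_pair(0x10000)])  # ['0xd800', '0xdc00']
```

[The 0x3FF mask and 10-bit shift here are the same constants that turn up in the checkin summary below.]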
summary: - portability: made unicode object behave properly also if sizeof(Py_UNICODE) > 2 and >= sizeof(long) (FL) - same for unicode codecs and the unicode database (MvL) - base unicode feature selection on unicode defines, not platform (FL) - wrap surrogate handling in #ifdef Py_UNICODE_WIDE (MvL, FL) - tweaked unit tests to work with wide unicode, by replacing explicit surrogates with \U escapes (MvL) - configure options for narrow/wide unicode (MvL) - removed bogus const and register from some scalars (GvR, FL) - default unicode configuration for PC (Tim, FL) - default unicode configuration for Mac (Jack) - added sys.maxunicode (MvL) most changes were really trivial (e.g. ~0xFC00 => 0x3FF). martin's big patch was reviewed and tested by both me and him before checkin (tim managed to check out and build before I'd gotten around to check in my windows tweaks, but that's what makes distributed egoless development so fun ;-) From greg@cosc.canterbury.ac.nz Tue Jul 3 01:20:37 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 03 Jul 2001 12:20:37 +1200 (NZST) Subject: [Python-Dev] Support for "wide" Unicode characters In-Reply-To: <03b301c102cf$e0e3dd00$0900a8c0@spiff> Message-ID: <200107030020.MAA00584@s454.cosc.canterbury.ac.nz> Fredrik Lundh : > back in 1995, some people claimed that the image type had > to be made smarter to be usable. But at least you can use more than one depth of image in the same program... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From mal@lemburg.com Tue Jul 3 09:31:50 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Tue, 03 Jul 2001 10:31:50 +0200 Subject: [Python-Dev] Unicode Maintenance References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com> <3B40505A.2F03EEC4@lemburg.com> <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid> Message-ID: <3B4182F6.DAC4C1@lemburg.com> Fredrik Lundh wrote: > > guido wrote: > > I'm not sure how to get you a "summary of changes", but I think you > > can ask Fredrik directly (Martin announced he's away on vacation). > > summary: > > - portability: made unicode object behave properly also if > sizeof(Py_UNICODE) > 2 and >= sizeof(long) (FL) > - same for unicode codecs and the unicode database (MvL) > - base unicode feature selection on unicode defines, not platform (FL) > - wrap surrogate handling in #ifdef Py_UNICODE_WIDE (MvL, FL) > - tweaked unit tests to work with wide unicode, by replacing explicit > surrogates with \U escapes (MvL) > - configure options for narrow/wide unicode (MvL) > - removed bogus const and register from some scalars (GvR, FL) > - default unicode configuration for PC (Tim, FL) > - default unicode configuration for Mac (Jack) > - added sys.maxunicode (MvL) Thank you for the summary. Please let me suggest that for the next coding party you prepare a patch which spans all party checkins and upload that patch with a summary like the above to SF. That way we can keep the documentation of the overall changes in one place and make the process more transparent for everybody. Now let's get on with business...
Thanks, -- Marc-Andre Lemburg ________________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fredrik@pythonware.com Tue Jul 3 11:21:27 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 3 Jul 2001 12:21:27 +0200 Subject: [Python-Dev] Unicode Maintenance References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com> <3B40505A.2F03EEC4@lemburg.com> <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid> <3B4182F6.DAC4C1@lemburg.com> Message-ID: <05aa01c103a9$ec29e710$0900a8c0@spiff> mal wrote: > Please let me suggest that for the next coding party you prepare a patch > which spans all party checkins and upload that patch with a summary > like the above to SF. That way we can keep the documentation of the overall > changes in one place and make the process more transparent for everybody. Sorry, but as long as Guido wants an open development approach based on collective code ownership (aka "egoless programming"), that's what he gets. The current environment provides several tools to track changes to the code base. The python-checkins list provides instant info on every single change to the code base; the investment to track that list is a few minutes per day. The CVS history is also easy to access; you can reach it via the viewcvs interface, or from the command line. Using both CVS and SF's patch manager to track development history is a waste of time. A development project manned by volunteers doesn't need bureaucrats; the version control system provides all the accountability we'll ever need. (commercial development projects don't need bureaucrats either, and usually don't have them, but that's another story).
I'd also argue that using many incremental checkins improves quality -- the smaller a change is, the easier it is to understand, and the more likely it is that also non-experts will notice simple mistakes or portability issues. (I regularly comment on checkin messages that look suspicious codewise, even if I don't know anything about the problem area. I'm even right, sometimes). Reviewing big patches on SF is really hard, even for experts. And every hour a patch sits on sourceforge instead of in the code repository is ten hours less burn-in in a heterogeneous testing environment. That's worth a lot. Finally, my experience from this and other projects is that the "visible heartbeat" you get from a continuous flow of checkin messages improves team productivity and team morale. Nothing is more inspiring than seeing others working for a common goal. It's the final product that matters, not who's in charge of what part of it. The end user couldn't care less. I'd prefer if you didn't feel the need to play miniboss on the Python project (I'm sure you have plenty of 'mx' projects where you can use that approach, if you have to). And I'd rather see you at the next party than out there whining over how you missed the last one. Cheers /F From mal@lemburg.com Tue Jul 3 12:30:05 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Tue, 03 Jul 2001 13:30:05 +0200 Subject: [Python-Dev] Unicode Maintenance References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com> <3B40505A.2F03EEC4@lemburg.com> <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid> <3B4182F6.DAC4C1@lemburg.com> <05aa01c103a9$ec29e710$0900a8c0@spiff> Message-ID: <3B41ACBD.9FA8FB25@lemburg.com> Fredrik Lundh wrote: > > > Please let me suggest that for the next coding party you prepare a patch > > which spans all party checkins and upload that patch with a summary > > like the above to SF. That way we can keep the documentation of the overall > > changes in one place and make the process more transparent for everybody. > > Sorry, but as long as Guido wants an open development approach > based on collective code ownership (aka "egoless programming"), > that's what he gets. > > The current environment provides several tools to track changes > to the code base. The python-checkins list provides instant info > on every single change to the code base; the investment to track > that list is a few minutes per day. The CVS history is also easy to > access; you can reach it via the viewcvs interface, or from the > command line. I think you misunderstood my suggestion: I didn't say you can't have a coding party with lots of small checkins, I just suggested that *after* the party someone does a diff before-and-after-the-party.diff and uploads this diff to SF with a description of the overall changes. You simply don't get the big picture from looking at various small checkin messages which are sometimes spread across multiple files/checkins. > Using both CVS and SF's patch manager to track development history > is a waste of time. A development project manned by volunteers > doesn't need bureaucrats; the version control system provides > all the accountability we'll ever need.
> > (commercial development projects don't need bureaucrats > either, and usually don't have them, but that's another story). Wasn't talking about bureaucrats... > I'd also argue that using many incremental checkins improves > quality -- the smaller a change is, the easier it is to understand, > and the more likely it is that also non-experts will notice simple > mistakes or portability issues. (I regularly comment on checkin > messages that look suspicious codewise, even if I don't know > anything about the problem area. I'm even right, sometimes). > Reviewing big patches on SF is really hard, even for experts. It's just for keeping a combined record of changes. Following up on dozens of checkins spanning another dozen files using CVS is harder, IMHO, than looking at one single before/after diff. > And every hour a patch sits on sourceforge instead of in the code > repository is ten hours less burn-in in a heterogeneous testing > environment. That's worth a lot. Agreed. > Finally, my experience from this and other projects is that the > "visible heartbeat" you get from a continuous flow of checkin > messages improves team productivity and team morale. > Nothing is more inspiring than seeing others working for a common > goal. It's the final product that matters, not who's in charge of > what part of it. The end user couldn't care less. > > I'd prefer if you didn't feel the need to play miniboss on the Python > project (I'm sure you have plenty of 'mx' projects where you can use > that approach, if you have to). I have no intention of playing "miniboss" (I have enough of that being the boss of a small company), I'm just trying to keep the task of a code maintainer manageable; that's all. 'nuff said. > And I'd rather see you at the next > party than out there whining over how you missed the last one. Perhaps you can send around invitations first, before starting the party next time ?! BTW, do you have plans to update the Unicode database to the 3.1 version ?
If not, I'll look into this next week. -- Marc-Andre Lemburg ________________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From thomas@xs4all.net Tue Jul 3 12:41:51 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Tue, 3 Jul 2001 13:41:51 +0200 Subject: [Python-Dev] CVS Message-ID: <20010703134151.P8098@xs4all.nl> Slightly off-topic, but I've depleted all my other sources :) I'm trying to get CVS to give me all logentries for all checkins in a specific branch (the 2.1.1 branch) so I can pipe it through logmerge. It seems the one thing I'm missing now is a branchpoint tag (which should translate to a revision with an even number of dots, apparently) but 'release21' and 'release21-maint' both don't qualify. Even the usage logmerge suggests (cvs log -rrelease21) doesn't work, gives me a bunch of "no revision `release21' in " warnings and just all logentries for those files. Am I missing something simple, here, or should I hack logmerge to parse the symbolic names, figure out the even-dotted revision for each file from the uneven-dotted branch-tag, and filter out stuff outside that range ? :P -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From gregor@hoffleit.de Tue Jul 3 13:09:51 2001 From: gregor@hoffleit.de (Gregor Hoffleit) Date: Tue, 3 Jul 2001 14:09:51 +0200 Subject: [Python-Dev] PEP 250, site-python, site-packages Message-ID: <20010703140951.A27647@mediasupervision.de> PEP 250 talks about adopting site-packages for Windows systems. I'd like to discuss the sitedirs as a whole.
Currently, site.py appends the following sitedirs to sys.path: * <prefix>/lib/python<version>/site-packages * <prefix>/lib/site-python If exec-prefix is different from prefix, then also * <exec-prefix>/lib/python<version>/site-packages * <exec-prefix>/lib/site-python From the viewpoint of a Linux distribution, putting pure Python extension packages in <prefix>/lib/python<version>/site-packages is quite awkward: Debian has Python extension packages that would work unmodified with all Python versions since 1.4 up to now; and still, for every new version of Python, we have to make a new package, with only the installation path changed. Due to Python's good tradition of compatibility, this is the vast majority of packages; only packages with binary modules necessarily need to be recompiled anyway for each major new version. What makes me wonder is that nobody seems to use site-python; Distutils is completely unaware of it, and besides a few generic Debian packages (reportbug, dpkg-python), no extension packages on my machine are in site-python. site-packages OTOH is used by Distutils, and this PEP 250 would recommend its use even on Windows systems. I would suggest to turn this upside down: Python extensions should be installed in <prefix>/lib/site-python by default. Only if they contain things that definitely should not be used with any other Python (e.g. binary modules), they might be installed in the version-specific extension directory, <prefix>/lib/python<version>/site-packages. I'm thinking about modifying Debian's distutils in order to install all architecture independent stuff in site-python. This would vastly ease the maintenance of Python packages.
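[The version dependence Gregor describes is easy to see from inside the interpreter; a small illustrative sketch — note that `sysconfig` and `importlib.util` post-date this thread, so this shows today's layout, not what was available in 2001:]

```python
import importlib.util
import sysconfig

# The default install target for pure-Python packages still embeds the
# interpreter version on most layouts, e.g. .../pythonX.Y/site-packages.
print(sysconfig.get_path("purelib"))

# One reason the version-specific directory survived: compiled bytecode
# is not portable across feature releases.  Each release stamps .pyc
# files with its own four-byte magic number.
print(importlib.util.MAGIC_NUMBER)
```

[A shared site-python full of .py files would still need per-version .pyc caches, which is exactly the problem raised in the follow-ups below.]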
Gregor From jepler@mail.inetnebr.com Tue Jul 3 13:38:00 2001 From: jepler@mail.inetnebr.com (Jeff Epler) Date: Tue, 3 Jul 2001 07:38:00 -0500 Subject: [Python-Dev] PEP 250, site-python, site-packages In-Reply-To: <20010703140951.A27647@mediasupervision.de>; from gregor@mediasupervision.de on Tue, Jul 03, 2001 at 02:09:51PM +0200 References: <20010703140951.A27647@mediasupervision.de> Message-ID: <20010703073759.A4972@localhost.localdomain> On Tue, Jul 03, 2001 at 02:09:51PM +0200, Gregor Hoffleit wrote: > Due to Python's good tradition of compatibility, this is the vast > majority of packages; only packages with binary modules necessarily need > to be recompiled anyway for each major new version. Aren't there bytecode changes in 1.6, 2.0, and 2.1, compared to 1.5.2? If so, this either means that each version of Python does need a separate copy (for the .pyc/.pyo file), or if all versions are compatible with 1.5.2 bytecodes (and I don't know that they are) then all packages would need to be bytecompiled with 1.5.2. For instance, it appears that between 1.5.2 and 2.1, the UNPACK_LIST and UNPACK_TUPLE bytecode instructions were removed and replaced with a single UNPACK_SEQUENCE opcode. Information gathered by executing:
python -c 'import dis
for name in dis.opname:
    if name[0] != "<": print name' | sort -u > opcodes-1.5.2
and similarly for python2.
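[Jeff's opcode survey can still be reproduced on a modern interpreter; a hedged sketch of the same check (the opcode names are the ones Jeff cites; the exact set of opcodes varies by release):]

```python
import dis

# dis.opname lists all 256 opcode slots; unused slots are "<NN>"
# placeholders, which Jeff's one-liner filters out with name[0] != "<".
opcodes = sorted(name for name in dis.opname if not name.startswith("<"))

# The change Jeff noticed has stuck: UNPACK_SEQUENCE replaced the old
# UNPACK_LIST/UNPACK_TUPLE pair and is still present today.
print("UNPACK_SEQUENCE" in opcodes)  # True
print("UNPACK_LIST" in opcodes)      # False
```

[Running this under two different interpreters and diffing the two sorted lists is the modern equivalent of Jeff's `sort -u` comparison.]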
Jeff From gregor@hoffleit.de Tue Jul 3 13:53:11 2001 From: gregor@hoffleit.de (Gregor Hoffleit) Date: Tue, 3 Jul 2001 14:53:11 +0200 Subject: [Python-Dev] PEP 250, site-python, site-packages In-Reply-To: <20010703073759.A4972@localhost.localdomain> References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> Message-ID: <20010703145311.A12350@mediasupervision.de> On Tue, Jul 03, 2001 at 07:38:00AM -0500, Jeff Epler wrote: > On Tue, Jul 03, 2001 at 02:09:51PM +0200, Gregor Hoffleit wrote: > > Due to Python's good tradition of compatibility, this is the vast > > majority of packages; only packages with binary modules necessarily need > > to be recompiled anyway for each major new version. > > Aren't there bytecode changes in 1.6, 2.0, and 2.1, compared to 1.5.2? If > so, this either means that each version of Python does need a separate copy > (for the .pyc/.pyo file), or if all versions are compatible with 1.5.2 > bytecodes (and I don't know that they are) then all packages would need to > be bytecompiled with 1.5.2. > > For instance, it appears that between 1.5.2 and 2.1, the UNPACK_LIST > and UNPACK_TUPLE bytecode instructions were removed and replaced with > a single UNPACK_SEQUENCE opcode. > > Information gathered by executing: > python -c 'import dis > for name in dis.opname: > if name[0] != "<": print name' | sort -u > opcodes-1.5.2 > and similarly for python2. Right, I forgot about that. It's not so bad for Debian though, since most of our packages byte-compile the stuff only when unpacking the package. Since installation of a new python-base package recompiles the complete site-packages tree (but not yet site-python, you got me ;-), we're not hurt by that problem. Any other arguments contra ?
;-) Gregor From thomas@xs4all.net Tue Jul 3 13:53:34 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Tue, 3 Jul 2001 14:53:34 +0200 Subject: [Python-Dev] PEP 250, site-python, site-packages In-Reply-To: <20010703073759.A4972@localhost.localdomain> References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> Message-ID: <20010703145334.Q8098@xs4all.nl> On Tue, Jul 03, 2001 at 07:38:00AM -0500, Jeff Epler wrote: > On Tue, Jul 03, 2001 at 02:09:51PM +0200, Gregor Hoffleit wrote: > > Due to Python's good tradition of compatibility, this is the vast > > majority of packages; only packages with binary modules necessarily need > > to be recompiled anyway for each major new version. > Aren't there bytecode changes in 1.6, 2.0, and 2.1, compared to 1.5.2? If > so, this either means that each version of Python does need a separate copy > (for the .pyc/.pyo file), or if all versions are compatible with 1.5.2 > bytecodes (and I don't know that they are) then all packages would need to > be bytecompiled with 1.5.2. None are compatible. This might change, but I don't think so -- I think the CVS tree already has a different bytecode magic than 2.1, though I haven't checked. Perhaps what Gregor wants is a set of symlinks in each python version's site-packages directory, to a system-wide one, and a 'register-python-version' script like the emacs/xemacs stuff has that adds those symlinks. That way, the .pyc/.pyo versions would remain in the version-specific directory. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
From thomas@xs4all.net Tue Jul 3 14:00:03 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Tue, 3 Jul 2001 15:00:03 +0200 Subject: [Python-Dev] PEP 250, site-python, site-packages In-Reply-To: <20010703145311.A12350@mediasupervision.de> References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> <20010703145311.A12350@mediasupervision.de> Message-ID: <20010703150003.R8098@xs4all.nl> On Tue, Jul 03, 2001 at 02:53:11PM +0200, Gregor Hoffleit wrote: > Right, I forgot about that. It's not so bad for Debian though, since > most of our packages byte-compile the stuff only when unpacking the > package. Since installation of a new python-base package recompiles the > complete site-packages tree (but not yet site-python, you got me ;-), > we're not hurt by that problem. What about when you want to have multiple python versions, like python 1.5.2, 2.0.1, 2.1.1 and 2.2-CVS-snapshot ? :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From gregor@hoffleit.de Tue Jul 3 15:02:50 2001 From: gregor@hoffleit.de (Gregor Hoffleit) Date: Tue, 3 Jul 2001 15:02:50 +0200 Subject: [Python-Dev] PEP 250, site-python, site-packages In-Reply-To: <20010703145334.Q8098@xs4all.nl> References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> <20010703145334.Q8098@xs4all.nl> Message-ID: <20010703150250.B12350@mediasupervision.de> On Tue, Jul 03, 2001 at 02:53:34PM +0200, Thomas Wouters wrote: > On Tue, Jul 03, 2001 at 07:38:00AM -0500, Jeff Epler wrote: > > On Tue, Jul 03, 2001 at 02:09:51PM +0200, Gregor Hoffleit wrote: > > > Due to Python's good tradition of compatibility, this is the vast > > > majority of packages; only packages with binary modules necessarily need > > > to be recompiled anyway for each major new version. > > Aren't there bytecode changes in 1.6, 2.0, and 2.1, compared to 1.5.2?
If > > so, this either means that each version of Python does need a separate copy > > (for the .pyc/.pyo file), or if all versions are compatible with 1.5.2 > > bytecodes (and I don't know that they are) then all packages would need to > > be bytecompiled with 1.5.2. > > None are compatible. This might change, but I don't think so -- I think the > CVS tree already has a different bytecode magic than 2.1, though I haven't > checked. Perhaps what Gregor wants is a set of symlinks in each python > version's site-packages directory, to a system-wide one, and a > 'register-python-version' script like the emacs/xemacs stuff has that adds > those symlinks. That way, the .pyc/.pyo versions would remain in the > version-specific directory. Sounds like a LOT of symlinks. To be honest, I would prefer to postulate that there's only one official Python version on a Debian system at a time. Then, the postinst and prerm scripts of python-base could take care of removing and recompiling .pyc and .pyo files at install time of a new Python version. Certainly, this won't work for packages that ship with precompiled .pyc/.pyo files, and we have to provide a method for registering .py files in non-standard places. If all of this was in place, I don't see a reason *not* to use site-python instead of site-packages... Gregor From gregor@hoffleit.de Tue Jul 3 14:05:35 2001 From: gregor@hoffleit.de (Gregor Hoffleit) Date: Tue, 3 Jul 2001 15:05:35 +0200 Subject: [Python-Dev] PEP 250, site-python, site-packages In-Reply-To: <20010703150003.R8098@xs4all.nl> References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> <20010703145311.A12350@mediasupervision.de> <20010703150003.R8098@xs4all.nl> Message-ID: <20010703150535.C12350@mediasupervision.de> On Tue, Jul 03, 2001 at 03:00:03PM +0200, Thomas Wouters wrote: > On Tue, Jul 03, 2001 at 02:53:11PM +0200, Gregor Hoffleit wrote: > > > Right, I forgot about that. 
It's not so bad for Debian though, since > > most of our packages byte-compile the stuff only when unpacking the > > package. Since installation of a new python-base package recompiles the > > complete site-packages tree (but not yet site-python, you got me ;-), > > we're not hurt by that problem. > > What about when you want to have multiple python versions, like python > 1.5.2, 2.0.1, 2.1.1 and 2.2-CVS-snapshot ? :-) You've hit the forbidden question ;-) Seriously, does anybody (besides the Python developers) feel a need to have multiple Python versions on the same system ? If there's a real world need for this, then, yes, we had to come up with a completely different setup. I guess this setup might involve symlink farms (urghh). Gregor From thomas@xs4all.net Tue Jul 3 14:16:08 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Tue, 3 Jul 2001 15:16:08 +0200 Subject: [Python-Dev] PEP 250, site-python, site-packages In-Reply-To: <20010703150535.C12350@mediasupervision.de> Message-ID: <20010703151608.S8098@xs4all.nl> On Tue, Jul 03, 2001 at 03:05:35PM +0200, Gregor Hoffleit wrote: > Seriously, does anybody (besides the Python developers) feel a need to > have multiple Python versions on the same system ? Well, currently anyone who wants to use python2.0+ does, yes. It's up to you, not me, whether that should be continued :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From gregor@hoffleit.de Tue Jul 3 14:28:09 2001 From: gregor@hoffleit.de (Gregor Hoffleit) Date: Tue, 3 Jul 2001 15:28:09 +0200 Subject: [Python-Dev] PEP 250, site-python, site-packages In-Reply-To: <20010703151608.S8098@xs4all.nl> References: <20010703151608.S8098@xs4all.nl> Message-ID: <20010703152809.E12350@mediasupervision.de> On Tue, Jul 03, 2001 at 03:16:08PM +0200, Thomas Wouters wrote: > On Tue, Jul 03, 2001 at 03:05:35PM +0200, Gregor Hoffleit wrote: > > > Seriously, does anybody (besides the Python developers) feel a need to > > have multiple Python versions on the same system ? > > Well, currently anyone who wants to use python2.0+ does, yes. It's up to > you, not me, whether that should be continued :-) Well, that's certainly quite OT since Debian-specific, but the need for an unofficial python2.0+ only arises due to the fact that a controlled and concurrent upgrade of the various Python packages is really, really awkward with the current setup. That's why I brought up this question in the first place. So let me paraphrase: Provided the maintainer of the Debian Python package would do a good job and keep the package always up-to-date, would you think there's a real world need for concurrent Python versions on the same system ? (Python developers still could use symlink farms to link the stuff from /usr/lib/site-python into /usr/local/lib/python3.1/site-packages...) Gregor From barry@digicool.com Tue Jul 3 14:31:32 2001 From: barry@digicool.com (Barry A. 
Warsaw) Date: Tue, 3 Jul 2001 09:31:32 -0400 Subject: [Python-Dev] PEP 250, site-python, site-packages References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> <20010703145311.A12350@mediasupervision.de> <20010703150003.R8098@xs4all.nl> <20010703150535.C12350@mediasupervision.de> Message-ID: <15169.51508.176575.33388@anthem.wooz.org> >>>>> "GH" == Gregor Hoffleit writes: GH> You've hit the forbidden question ;-) GH> Seriously, does anybody (besides the Python developers) feel a GH> need to have multiple Python versions on the same system ? Yes, definitely as both a Zope and Mailman developer I need multiple Python versions. But I suspect even normal users of the system will need multiple versions. Different Python-based apps are requiring their users to upgrade Python on their own schedule, so multiple versions will still be required. -Barry From gward@python.net Tue Jul 3 14:51:23 2001 From: gward@python.net (Greg Ward) Date: Tue, 3 Jul 2001 09:51:23 -0400 Subject: [Python-Dev] PEP 250, site-python, site-packages In-Reply-To: <20010703152809.E12350@mediasupervision.de>; from gregor@mediasupervision.de on Tue, Jul 03, 2001 at 03:28:09PM +0200 References: <20010703151608.S8098@xs4all.nl> <20010703152809.E12350@mediasupervision.de> Message-ID: <20010703095122.A558@gerg.ca> On 03 July 2001, Gregor Hoffleit said: > So let me paraphrase: Provided the maintainer of the Debian Python > package would do a good job and keep the package always up-to-date, > would you think there's a real world need for concurrent Python versions > on the same system ? Speaking as someone who uses Python day-to-day and occasionally worries about compatibility across Python versions: yes, it would be really nice if Python better supported multiple versions installed at the same time. lib/python1.5, lib/python2.0, and lib/python2.1 just don't cut it: I remember running 1.5.1, 1.5.2, and alpha/beta versions of 1.6 simultaneously. 
I had to install each to a separate prefix, which was ugly but workable. It would be nice if Python (and, yes, the Distutils now) had better native support for multiple simultaneous versions. Speaking as the main perpetrator of the Distutils: AAUUGGHGHHHH!!!! NOOOOO!!! Please, don't make me look at this stuff AGAIN!!!! Aiiieeee!! But seriously: I think I once attempted to convince Guido that a revamped organization of the library directories would be a good idea, and that the Distutils would be a good way to introduce that scheme. Obviously, I didn't convince him, so we still have the same system. The one glimmer of good news is that the Distutils "install" command is insanely flexible; if you can manage to wrap your head around the 17,000 levels of indirection, it should be a simple matter of changing a few hard-coded dictionaries (there are two for Unix, and one each for Windows and Mac OS) to introduce a completely new installation scheme. I probably had some expectation that someday this discussion would open up again. BTW, I'm skeptical about keeping .py and .pyc code in a non-Python-version-specific directory (ie. site-python). Debian's bytecode-recompilation at installation time scheme sounds cool, but the desire/need to have multiple Python versions available kind of nixes it. Bummer. Oh yeah, another thing I vaguely recall from the pre-Distutils-0.1 era: Guido doesn't (didn't?) like site-python and wanted to deprecate it. Perhaps the above paragraph explains why. Greg -- Greg Ward - Linux geek gward@python.net http://starship.python.net/~gward/ Drive defensively -- buy a tank. From fdrake@acm.org Tue Jul 3 15:02:33 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Tue, 3 Jul 2001 10:02:33 -0400 (EDT)
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703150535.C12350@mediasupervision.de>
References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> <20010703145311.A12350@mediasupervision.de> <20010703150003.R8098@xs4all.nl> <20010703150535.C12350@mediasupervision.de> <15169.51508.176575.33388@anthem.wooz.org>
Message-ID: <15169.53369.55827.570681@cj42289-a.reston1.va.home.com>

Gregor Hoffleit writes:
> Seriously, does anybody (besides the Python developers) feel a need to
> have multiple Python versions on the same system ?

Absolutely!  Anyone that wants to write cross-version Python code needs to be able to have multiple versions available.  I'd even like to be able to have both Python 2.0 and Python 2.0.1 available on the same $prefix/$exec_prefix -- that can't be done currently.  This kind of thing is pretty important when you want to take cross-version compatibility seriously.

Barry A. Warsaw writes:
> Yes, definitely as both a Zope and Mailman developer I need
> multiple Python versions.  But I suspect even normal users of the
> system will need multiple versions.  Different Python-based apps are
> requiring their users to upgrade Python on their own schedule, so
> multiple versions will still be required.

Another excellent reason to support multiple versions!  As more widely distributed applications are written using Python and don't want to include the interpreter, this becomes a more noticeable issue.

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Digital Creations

From fdrake@acm.org Tue Jul 3 15:09:37 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 3 Jul 2001 10:09:37 -0400 (EDT) Subject: [Python-Dev] PEP 250, site-python, site-packages In-Reply-To: <20010703095122.A558@gerg.ca> References: <20010703151608.S8098@xs4all.nl> <20010703152809.E12350@mediasupervision.de> <20010703095122.A558@gerg.ca> Message-ID: <15169.53793.422966.868795@cj42289-a.reston1.va.home.com> Greg Ward writes: > Oh yeah, another thing I vaguely recall from the pre-Distutils-0.1 era: > Guido doesn't (didn't?) like site-python and wanted to deprecate it. > Perhaps the above paragraph explains why. Another reason not to use site-python is that it is actually still hard to write cross-version Python code -- there are enough differences that any substantial volume of code (and in Python, you don't need many KLoC to get substantial code!) is bound to encounter a few, especially if you get used to using only Python 2.0+ -- it's easy to get used to features like string methods, list comprehensions, and augmented assignment! The site-packages directory was introduced to avoid the deficiencies of the site-python directory. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From guido@digicool.com Tue Jul 3 15:31:40 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 03 Jul 2001 10:31:40 -0400 Subject: [Python-Dev] CVS In-Reply-To: Your message of "Tue, 03 Jul 2001 13:41:51 +0200." <20010703134151.P8098@xs4all.nl> References: <20010703134151.P8098@xs4all.nl> Message-ID: <200107031431.f63EVem05167@odiug.digicool.com> > Slightly off-topic, but I've depleted all my other sources :) I'm trying to > get CVS to give me all logentries for all checkins in a specific branch (the > 2.1.1 branch) so I can pipe it through logmerge. It seems the one thing I'm > missing now is a branchpoint tag (which should translate to a revision with > an even number of dots, apparently) but 'release21' and 'release21-maint' > both don't qualify. 
Even the usage logmerge suggests (cvs log -rrelease21)
> doesn't work, gives me a bunch of "no revision `release21' in "
> warnings and just all logentries for those files.

But those files should be old, so logmerge should safely sort their messages last, right?

> Am I missing something simple, here, or should I hack logmerge to parse the
> symbolic names, figure out the even-dotted revision for each file from the
> uneven-dotted branch-tag, and filter out stuff outside that range ? :P

You're lucky: at least the fork point is tagged (release21).  For the descr-branch, if I want to do some kind of reasonable merge, I'll have to write a tool that figures out the fork point and tags it.  That's one "cvs tag" call for each file...

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@digicool.com Tue Jul 3 15:38:07 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 03 Jul 2001 10:38:07 -0400
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: Your message of "Tue, 03 Jul 2001 15:05:35 +0200." <20010703150535.C12350@mediasupervision.de>
References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> <20010703145311.A12350@mediasupervision.de> <20010703150003.R8098@xs4all.nl> <20010703150535.C12350@mediasupervision.de>
Message-ID: <200107031438.f63Ec7K05210@odiug.digicool.com>

> > What about when you want to have multiple python versions, like python
> > 1.5.2, 2.0.1, 2.1.1 and 2.2-CVS-snapshot ? :-)
>
> You've hit the forbidden question ;-)
>
> Seriously, does anybody (besides the Python developers) feel a need to
> have multiple Python versions on the same system ?

I've had enough requests over the years for this, so it is indeed supported, and I believe there is a need.  Quite often people have important programs that for some minor reason don't work on a newer version yet and they can't find the person or the time to fix it.  Python's standard installation makes this possible.
You can have only one "python" but you can request a specific version by appending the "major dot minor" part of the version number, e.g. python1.5, python2.0, python2.1, python2.2.  "python" is a hard link to one of these.  You can't (easily) have multiple versions with the same major.minor, but that should never be needed.  I've heard though that some Linux distributors break this versioning scheme in favor of their own.

> If there's a real world need for this, then, yes, we had to come up with
> a completely different setup. I guess this setup might involve symlink
> farms (urghh).

Ugh maybe, but it's the only thing that scales.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@digicool.com Tue Jul 3 15:45:34 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 03 Jul 2001 10:45:34 -0400
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: Your message of "Tue, 03 Jul 2001 09:51:23 EDT." <20010703095122.A558@gerg.ca>
References: <20010703151608.S8098@xs4all.nl> <20010703152809.E12350@mediasupervision.de> <20010703095122.A558@gerg.ca>
Message-ID: <200107031445.f63EjYF05265@odiug.digicool.com>

> Speaking as someone who uses Python day-to-day and occasionally worries
> about compatibility across Python versions: yes, it would be really nice
> if Python better supported multiple versions installed at the same time.
> lib/python1.5, lib/python2.0, and lib/python2.1 just don't cut it: I
> remember running 1.5.1, 1.5.2, and alpha/beta versions of 1.6
> simultaneously.  I had to install each to a separate prefix, which was
> ugly but workable.  It would be nice if Python (and, yes, the Distutils
> now) had better native support for multiple simultaneous versions.

That was mostly because we were abusing the version numbering scheme to roll out feature releases with a micro version number (1.5.1, 1.5.2).  We don't do that any more -- feature releases have a minor (middle) version number change.
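The scheme described above, one installation per major.minor with micro releases sharing it, can be checked programmatically. A minimal sketch (the helper name `versioned_name` is invented for illustration, and nothing guarantees a binary by that name is actually installed on a given system):

```python
import sys

def versioned_name(version_info=None):
    """Return the 'python<major>.<minor>' binary name for a version.

    Only major.minor appears in the name: micro releases such as
    2.0 and 2.0.1 deliberately share a single installation.
    """
    if version_info is None:
        version_info = sys.version_info
    return "python%d.%d" % (version_info[0], version_info[1])

# A 2.1.1 maintenance release installs under the same name as 2.1.0:
assert versioned_name((2, 1, 1)) == versioned_name((2, 1, 0)) == "python2.1"
```

With no argument the function names the running interpreter, which is how an installer can decide which version-specific lib directory to target.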
If you really need to distinguish Python 2.0 and 2.0.1 on the same system, you're a Python developer by definition. :-) > Speaking as the main perpetrator of the Distutils: AAUUGGHGHHHH!!!! > NOOOOO!!! Please, don't make me look at this stuff AGAIN!!!! > Aiiieeee!! BTW, Greg, there's this bug I've found in Distutils, but the margin of this email isn't wide enough to describe it. :-) > But seriously: I think I once attempted to convince Guido that a > revamped organization of the library directories would be a good idea, > and that the Distutils would be a good way to introduce that scheme. > Obviously, I didn't convince him, so we still have the same system. Which I think isn't so bad given that we now have a well-behaved versioning policy in place. > The one glimmer of good news is that the Distutils "install" command > is insanely flexible; if you can manage to wrap your head around the > 17,000 levels of indirection, it should be a simple matter of > changing a few hard-coded dictionaries (there are two for Unix, and > one each for Windows and Mac OS) to introduce a completely new > installation scheme. I probably had some expectation that someday > this discussion would open up again. > > BTW, I'm skeptical about keeping .py and .pyc code in a > non-Python-version-specific directory (ie. site-python). Debian's > bytecode-recompilation at installation time scheme sounds cool, but the > desire/need to have multiple Python versions available kind of nixes it. > Bummer. Yes, good point. Bytecode is generally not compatible between versions -- its specification is considered an internal detail of the implementation (again, it can't vary with a micro-version, but it can and usually does vary with the minor version number). > Oh yeah, another thing I vaguely recall from the pre-Distutils-0.1 era: > Guido doesn't (didn't?) like site-python and wanted to deprecate it. > Perhaps the above paragraph explains why. 
Indeed, /usr/local/lib/python./site-packages/ is where site packages should go.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From paul@pfdubois.com Tue Jul 3 15:44:48 2001
From: paul@pfdubois.com (Paul F. Dubois)
Date: Tue, 3 Jul 2001 07:44:48 -0700
Subject: [Python-Dev] site-python, multiple installations
Message-ID: 

I'm on vacation and haven't followed this discussion well but read with alarm some talk about how it would be expected that there would only be "one official Python" on a system.  This is categorically a false assumption for almost everyone at LLNL.  Please do not attempt to make any changes that assume there is one place into which everything should be put, or that there should be some system-wide registry of packages.  I thought this demon had been killed in distutils-sig long ago.

Paul

From thomas@xs4all.net Tue Jul 3 16:05:28 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 3 Jul 2001 17:05:28 +0200
Subject: [Python-Dev] CVS
In-Reply-To: <200107031431.f63EVem05167@odiug.digicool.com>
References: <20010703134151.P8098@xs4all.nl> <200107031431.f63EVem05167@odiug.digicool.com>
Message-ID: <20010703170528.U32419@xs4all.nl>

On Tue, Jul 03, 2001 at 10:31:40AM -0400, Guido van Rossum wrote:
> > Slightly off-topic, but I've depleted all my other sources :) I'm trying to
> > get CVS to give me all logentries for all checkins in a specific branch (the
> > 2.1.1 branch) so I can pipe it through logmerge. It seems the one thing I'm
> > missing now is a branchpoint tag (which should translate to a revision with
> > an even number of dots, apparently) but 'release21' and 'release21-maint'
> > both don't qualify. Even the usage logmerge suggests (cvs log -rrelease21)
> > doesn't work, gives me a bunch of "no revision `release21' in "
> > warnings and just all logentries for those files.
> But those files should be old, so logmerge should safely sort their
> messages last, right?
Yes, but it also lists all checkin messages for all branches, including the trunk, after release21... so I end up no smarter than without '-rrelease21'.

> > Am I missing something simple, here, or should I hack logmerge to parse the
> > symbolic names, figure out the even-dotted revision for each file from the
> > uneven-dotted branch-tag, and filter out stuff outside that range ? :P
> You're lucky: at least the fork point is tagged (release21).  For the
> descr-branch, if I want to do some kind of reasonable merge, I'll have
> to write a tool that figures out the fork point and tags it.  That's
> one "cvs tag" call for each file...

No, that's one 'cvs log' command; for each entry, it contains all symbolic names. All you need to do is to search for the descr-branch symbolic name in that list, grab the revision it lists (if any), chop off the last dot-and-digit, and you're done. You can almost do that in a shell oneliner:

centurion:~/python/python-CVS > cvs log | egrep "(RCS file:|descr-branch:)" | python -c "
import fileinput
lastline = ''
for line in fileinput.input():
    if lastline and line[0] == '\t':
        filename = lastline[33:-3]
        revision = line.split()[1]
        branchpoint = revision[:revision.rindex('.')]
        print filename, branchpoint
        lastline = ''
    else:
        lastline = line
"

(adjust quotes for (t)csh, I guess) Tadaaa! Hmm... I'll just use that myself, too .

But why not merge the trunk into your tree ? You can do that with

cvs update -j HEAD

inside your (sticky-tagged) working tree, IIRC. It doesn't change the repository either, just your working directory, so it's safe to try in a separate directory. Then, when you're satisfied it all works, you can commit the whole thing.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
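The pipeline above can also be written as a small standalone function. This is a sketch of the same idea (it assumes the usual `cvs log` layout of `RCS file:` headers followed by tab-indented `symbolic-name: revision` lines, and, like the original, it simply chops the last dotted component off the branch tag's revision to get the fork point):

```python
def branch_points(log_text, branch="descr-branch"):
    """Map each RCS file in `cvs log` output to the branch's fork revision."""
    points = {}
    current = None
    for line in log_text.splitlines():
        if line.startswith("RCS file:"):
            current = line.split(":", 1)[1].strip()
        elif current and line.startswith("\t"):
            # Symbolic names are listed as "\tname: revision".
            name, sep, rev = line.strip().partition(": ")
            if sep and name == branch:
                # Chop the last dot-and-digit to get the fork point.
                points[current] = rev[:rev.rindex(".")]
    return points
```

Feeding it `cvs log` output for a whole tree yields one fork revision per file, which is exactly what a branch-point tagging tool would need.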
From thomas@xs4all.net Tue Jul 3 16:26:16 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Tue, 3 Jul 2001 17:26:16 +0200 Subject: [Python-Dev] site-python, multiple installations In-Reply-To: Message-ID: <20010703172615.V8098@xs4all.nl> On Tue, Jul 03, 2001 at 07:44:48AM -0700, Paul F. Dubois wrote: > I'm on vacation and haven't followed this discussion well but read with > alarm some talk about how it would be expected that there would only be "one > official Python" on a system. This is categorically a false assumption for > almost everyone at LLNL. Please do not attempt to make any changes that > assume there is one place into which everything should be put, or that there > should be some system-wide registry of packages. That wasn't for Python, it was for Debian. You'll note that Gregor actually said "this is getting off-topic" in one of the mails :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From aahz@rahul.net Tue Jul 3 19:11:45 2001 From: aahz@rahul.net (Aahz Maruch) Date: Tue, 3 Jul 2001 11:11:45 -0700 (PDT) Subject: [Python-Dev] PEP 250, site-python, site-packages In-Reply-To: <20010703152809.E12350@mediasupervision.de> from "Gregor Hoffleit" at Jul 03, 2001 03:28:09 PM Message-ID: <20010703181145.8D05D99C8A@waltz.rahul.net> Gregor Hoffleit wrote: > > So let me paraphrase: Provided the maintainer of the Debian Python > package would do a good job and keep the package always up-to-date, > would you think there's a real world need for concurrent Python versions > on the same system ? Yes. Thing is, you're going to have Debian system scripts that will possibly rely on a specific version of Python. It's not fair to expect users to upgrade the OS every time they want a newer version of Python, yet you can't take a chance on the system scripts breaking. 
--
--- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine.

From guido@digicool.com Tue Jul 3 20:08:04 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 03 Jul 2001 15:08:04 -0400
Subject: [Python-Dev] CVS
In-Reply-To: Your message of "Tue, 03 Jul 2001 17:05:28 +0200." <20010703170528.U32419@xs4all.nl>
References: <20010703134151.P8098@xs4all.nl> <200107031431.f63EVem05167@odiug.digicool.com> <20010703170528.U32419@xs4all.nl>
Message-ID: <200107031908.f63J84008549@odiug.digicool.com>

> But why not merge the trunk into your tree ? You can do that with
>
> cvs update -j HEAD
>
> inside your (sticky-tagged) working tree, IIRC. It doesn't change the
> repository either, just your working directory, so it's safe to try in a
> separate directory. Then, when you're satisfied it all works, you can commit
> the whole thing.

I believe that's what I tried last time, and it suddenly revived a bunch of files that had been dead for years.  But you're right, I should probably try this.

But in the light of multiple merges, it's important to tag the tree three times: (1) tag the HEAD at the point where you want to do the merge; (2) tag the branch at the point where you want to merge into; (3) after resolving conflicts and making the resulting checkins in the branch, tag the branch again.

Well, maybe (2) is redundant.  But (1) is essential to be able to do another merge later.  And I think I recall that (3) was good for something, too.  (Maybe if you want to merge in the other direction.)

Sigh.  Not my day, it seems.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From barry@digicool.com Tue Jul 3 20:10:36 2001
From: barry@digicool.com (Barry A.
Warsaw) Date: Tue, 3 Jul 2001 15:10:36 -0400 Subject: [Python-Dev] CVS References: <20010703134151.P8098@xs4all.nl> Message-ID: <15170.6316.543729.545921@anthem.wooz.org> >>>>> "TW" == Thomas Wouters writes: TW> Slightly off-topic, but I've depleted all my other sources :) TW> I'm trying to get CVS to give me all logentries for all TW> checkins in a specific branch (the 2.1.1 branch) so I can pipe TW> it through logmerge. I had a lot of problems trying to do the same thing with the (slightly misnamed) Mailman Release_2_0_1-branch. I basically could not get CVS to give me just the log messages for all changes to that branch. It would either give me nothing or give me all changes in all branches and trunk. It may just be a CVS bug, I dunno. I eventually gave up. -Barry From thomas.heller@ion-tof.com Wed Jul 4 17:28:49 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 4 Jul 2001 18:28:49 +0200 Subject: [Python-Dev] Checkin problems (slightly off-topic) Message-ID: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook> After I tried to set up syncmail notification for the anygui project, I did not get it to work. Then I found out that exactly the same problem occurs if I try to checkin into other SF projects I have, also python. Here is the message I get: C:\sf\distutils\distutils\command>cvs commit -m "dummy checkin for testing, please ignore" cvs commit: Examining . Checking in bdist_wininst.py; /cvsroot/python/distutils/distutils/command/bdist_wininst.py,v <-- bdist_wininst.py new revision: 1.22; previous revision: 1.21 done Mailing python-checkins@python.org... Generating notification message... Generating notification message... done. 2001-07-04 09:52:14 Failed to get user name for uid 34174 The checkin succeeded, but no mail is sent :-( I have no clue what uid 34174 is, surely not my SF user id (which is 11105). Has anyone seen this problem before, or can offer other help? 
Thanks, Thomas From fdrake@acm.org Wed Jul 4 18:27:16 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 4 Jul 2001 13:27:16 -0400 (EDT) Subject: [Python-Dev] Checkin problems (slightly off-topic) In-Reply-To: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook> References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook> Message-ID: <15171.20980.397264.341559@cj42289-a.reston1.va.home.com> Thomas Heller writes: > Has anyone seen this problem before, or can > offer other help? I saw this last night, but don't know that we can deal with it. Have you filed a SourceForge support request? -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From thomas.heller@ion-tof.com Wed Jul 4 18:05:40 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 4 Jul 2001 19:05:40 +0200 Subject: [Python-Dev] Checkin problems (slightly off-topic) References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook> <15171.20980.397264.341559@cj42289-a.reston1.va.home.com> Message-ID: <038901c104ab$8e0deac0$e000a8c0@thomasnotebook> > > Thomas Heller writes: > > Has anyone seen this problem before, or can > > offer other help? > > I saw this last night, but don't know that we can deal with it. > Have you filed a SourceForge support request? Seems to be a problem with my account. I will file a support request. Thanks, Thomas From thomas.heller@ion-tof.com Wed Jul 4 18:06:48 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 4 Jul 2001 19:06:48 +0200 Subject: [Python-Dev] Checkin problems (slightly off-topic) References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook> <15171.20980.397264.341559@cj42289-a.reston1.va.home.com> Message-ID: <039301c104ab$b67898c0$e000a8c0@thomasnotebook> > > Thomas Heller writes: > > Has anyone seen this problem before, or can > > offer other help? > > I saw this last night, but don't know that we can deal with it. So you mean _you_ have seen the same problem for yourself? 
Thomas From fdrake@acm.org Wed Jul 4 18:43:52 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 4 Jul 2001 13:43:52 -0400 (EDT) Subject: [Python-Dev] Checkin problems (slightly off-topic) In-Reply-To: <039301c104ab$b67898c0$e000a8c0@thomasnotebook> References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook> <15171.20980.397264.341559@cj42289-a.reston1.va.home.com> <039301c104ab$b67898c0$e000a8c0@thomasnotebook> Message-ID: <15171.21976.338066.16853@cj42289-a.reston1.va.home.com> Thomas Heller writes: > > I saw this last night, but don't know that we can deal with it. > So you mean _you_ have seen the same problem for yourself? Yes. It started about 2:00am (east coast time); things had been fine before that. I think it affected both mail sent by syncmail and the trackers. I don't know about other systems at SourceForge. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From thomas.heller@ion-tof.com Wed Jul 4 18:20:42 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 4 Jul 2001 19:20:42 +0200 Subject: [Python-Dev] Checkin problems (slightly off-topic) References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook><15171.20980.397264.341559@cj42289-a.reston1.va.home.com><039301c104ab$b67898c0$e000a8c0@thomasnotebook> <15171.21976.338066.16853@cj42289-a.reston1.va.home.com> Message-ID: <03fb01c104ad$a8c5e780$e000a8c0@thomasnotebook> From: "Fred L. Drake, Jr." > > Thomas Heller writes: > > > I saw this last night, but don't know that we can deal with it. > > So you mean _you_ have seen the same problem for yourself? > > Yes. It started about 2:00am (east coast time); things had been > fine before that. I think it affected both mail sent by syncmail and > the trackers. I don't know about other systems at SourceForge. > I found at least two (open) support requests from other people reporting exactly the same problem. I don't think they need another one. 
Thomas From thomas@xs4all.net Wed Jul 4 21:46:50 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Wed, 4 Jul 2001 22:46:50 +0200 Subject: [Python-Dev] Checkin problems (slightly off-topic) In-Reply-To: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook> References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook> Message-ID: <20010704224650.Z8098@xs4all.nl> On Wed, Jul 04, 2001 at 06:28:49PM +0200, Thomas Heller wrote: > Generating notification message... done. > 2001-07-04 09:52:14 Failed to get user name for uid 34174 > The checkin succeeded, but no mail is sent :-( > I have no clue what uid 34174 is, surely not > my SF user id (which is 11105). Actually, it *is* your SF userid ;) twouters@usw-pr-shell2:~$ python Python 1.5.2 (#0, Dec 27 2000, 13:59:38) [GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2 Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam >>> import pwd >>> pwd.getpwuid(34174) ('theller', 'x', 34174, 100, 'Thomas Heller', '/home/users/t/th/theller', '/bin/bash') It's your unix user-id, not the SF websystem one. Probably the cvs machine's PAM setup is broken in some way... From a quick look on the shell machine it seems they don't quite run the typical setup ;-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From jack@oratrix.nl Wed Jul 4 22:46:05 2001 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 04 Jul 2001 23:46:05 +0200 Subject: [Python-Dev] Will the new type system allow automatic coercion? Message-ID: <20010704214610.6D99CDA740@oratrix.oratrix.nl> Something I've suddenly started to need is automatic coercion implemented by the source type. I have a type implemented in C, and while automatic coercion from any other type to my type is easy to implement (in your O& routine you simply check whether the passed object is of a type you can coerce) there is no way to do the reverse (at least, not that I'm aware of, please enlighten me). 
And now I have this CFString type (a wrapper around the MacOS CoreFoundation object. Nice things, by the way, these CoreFoundation objects, sort-of inherited from NextStep and they share a lot of design with Python objects, but I digress) that can show itself as a Unicode string or an 8 bit string or a number of other things. It would be nice if users could pass these CFString objects in places where a string or unicode is expected. Simply said, if PyArg_Parse s format would accept my objects. Will the new type system allow me to do this? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | ++++ see http://www.xs4all.nl/~tank/ ++++ From mwh@python.net Wed Jul 4 23:30:42 2001 From: mwh@python.net (Michael Hudson) Date: 04 Jul 2001 23:30:42 +0100 Subject: [Python-Dev] summer summaries Message-ID: My internet connection is going to get drastically worse tomorrow, and while I could still do the python-dev summaries over the summer, it would be significantly more tedious. Would someone else be able to do them for a bit? I can provide the scripts I use to generate the distributions and format into text and xhtml, and continue to archive them on my starship pages. It's not that much work; a few hours a fortnight. I could also do with a break from it for more general reasons. My internet connection should be back to full strength in October. Cheers, M. -- While preceding your entrance with a grenade is a good tactic in Quake, it can lead to problems if attempted at work. 
-- C Hacking -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html From tim.one@home.com Wed Jul 4 23:59:34 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 4 Jul 2001 18:59:34 -0400 Subject: [Python-Dev] Checkin problems (slightly off-topic) In-Reply-To: <039301c104ab$b67898c0$e000a8c0@thomasnotebook> Message-ID: [Thomas Heller, on 2001-07-04 09:52:14 Failed to get user name for uid 34174 at the end of a checkin ] > Has anyone seen this problem before, or can offer other help? [Fred] > I saw this last night, but don't know that we can deal with it. [Thomas Heller] > So you mean _you_ have seen the same problem for yourself? I saw the same thing today when I did a checkin, although with a different uid. Can't speak for Fred, but can't imagine what else he could have meant (unless he was running a spy monitor watching it happen to you ). From akuchlin@mems-exchange.org Thu Jul 5 01:34:03 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 4 Jul 2001 20:34:03 -0400 Subject: [Python-Dev] summer summaries In-Reply-To: ; from mwh@python.net on Wed, Jul 04, 2001 at 11:30:42PM +0100 References: Message-ID: <20010704203403.A11589@ute.cnri.reston.va.us> On Wed, Jul 04, 2001 at 11:30:42PM +0100, Michael Hudson wrote: >would be significantly more tedious. Would someone else be able to do >them for a bit? I can provide the scripts I use to generate the I'm willing to pick them up again for a bit. >I could also do with a break from it for more general reasons. Been there, done that. :) --amk (www.amk.ca) This is the moment when I get a real sense of job satisfaction. -- The Collector, at Leela's execution, in "The Sunmakers" From guido@digicool.com Thu Jul 5 02:06:43 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 04 Jul 2001 21:06:43 -0400 Subject: [Python-Dev] Will the new type system allow automatic coercion? In-Reply-To: Your message of "Wed, 04 Jul 2001 23:46:05 +0200." 
<20010704214610.6D99CDA740@oratrix.oratrix.nl> References: <20010704214610.6D99CDA740@oratrix.oratrix.nl> Message-ID: <200107050106.f6516hm10144@odiug.digicool.com> > Something I've suddenly started to need is automatic coercion implemented > by the source type. I have a type implemented in C, and while > automatic coercion from any other type to my type is easy to implement > (in your O& routine you simply check whether the passed object is of a > type you can coerce) there is no way to do the reverse (at least, not > that I'm aware of, please enlighten me). > > And now I have this CFString type (a wrapper around the MacOS > CoreFoundation object. Nice things, by the way, these CoreFoundation > objects, sort-of inherited from NextStep and they share a lot of > design with Python objects, but I digress) that can show itself as a > Unicode string or an 8 bit string or a number of other things. It > would be nice if users could pass these CFString objects in places > where a string or unicode is expected. Simply said, if PyArg_Parse s > format would accept my objects. > > Will the new type system allow me to do this? I don't know that the new type system (which isn't really a type system, just a generalized implementation of class construction through a formalization of the Don Beaudry hook :-) has anything to do with this, but can't the buffer interface come to the rescue? The s format accepts anything that conforms to the buffer interface, AFAIK. Does that help? (Alas, I don't think there's a similar API for Unicode.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Thu Jul 5 02:22:31 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 4 Jul 2001 21:22:31 -0400 Subject: [Python-Dev] "Becoming a Python Developer" Message-ID: <20010704212231.A11629@ute.cnri.reston.va.us> Inspired by some discussion on c.l.py, I've written a draft of a guide to working on Python: http://www.amk.ca/python/writing/python-dev.html Comments welcomed! --amk From greg@cosc.canterbury.ac.nz Thu Jul 5 06:23:58 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 05 Jul 2001 17:23:58 +1200 (NZST) Subject: [Python-Dev] Making a .pyd using Cygwin? Message-ID: <200107050523.RAA00926@s454.cosc.canterbury.ac.nz> Is it feasible to compile a Python extension module for Windows using Cygwin? I have tried this, and the linker tells me that it can't export '_bss_start__', '_bss_end__', '_data_start__' and '_data_end__' because they're not defined. I tried defining some symbols with those names in a c file and got it to link, but importing the resulting extension causes the interpreter to hang. I'm using Python 2.1, Windows 2000 Professional, and whatever version of Cygwin was the latest as of a couple of days ago. Thanks for any help, Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From mwh@python.net Thu Jul 5 09:22:30 2001 From: mwh@python.net (Michael Hudson) Date: 05 Jul 2001 09:22:30 +0100 Subject: [Python-Dev] summer summaries In-Reply-To: Andrew Kuchling's message of "Wed, 4 Jul 2001 20:34:03 -0400" References: <20010704203403.A11589@ute.cnri.reston.va.us> Message-ID: Andrew Kuchling writes: > On Wed, Jul 04, 2001 at 11:30:42PM +0100, Michael Hudson wrote: > >would be significantly more tedious. 
Would someone else be able to do > >them for a bit? I can provide the scripts I use to generate the > > I'm willing to pick them up again for a bit. Thanks! I've got one in the boiler for today, which I'll post soon. > >I could also do with a break from it for more general reasons. > > Been there, done that. :) I thought you might know what I was talking about here... Cheers, M. -- If your telephone company installs a system in the woods with no one around to see them, do they still get it wrong? -- Robert Moir, alt.sysadmin.recovery From thomas@xs4all.net Thu Jul 5 11:10:25 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 5 Jul 2001 12:10:25 +0200 Subject: [Python-Dev] While we're deprecating... Message-ID: <20010705121024.A8098@xs4all.nl> While we're in the deprecation mood (not that I changed my mind on xrage() ;P) how about we deprecate the alternate-tab-size-comment checks of the parser. That is, generate a deprecation warning for these comments: "tab-width:", /* Emacs */ ":tabstop=", /* vim, full form */ ":ts=", /* vim, abbreviated form */ "set tabsize=", /* will vi never die? */ with sizes other than '8', and rip out the code in 2.3 ? -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mwh@python.net Thu Jul 5 11:10:48 2001 From: mwh@python.net (Michael Hudson) Date: Thu, 5 Jul 2001 11:10:48 +0100 (BST) Subject: [Python-Dev] python-dev summary 2001-06-21 - 2001-07-05 Message-ID: This is a summary of traffic on the python-dev mailing list between June 21 and July 4 (inclusive) 2001. It is intended to inform the wider Python community of ongoing developments. To comment, just post to python-list@python.org or comp.lang.python in the usual way. Give your posting a meaningful subject line, and if it's about a PEP, include the PEP number (e.g. 
Subject: PEP 201 - Lockstep iteration) All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on a PEP if you have an opinion.

This is the eleventh summary written by Michael Hudson. Summaries are archived at:

Posting distribution (with apologies to mbm)

Number of articles in summary: 252

40 | [|] | [|] | [|] | [|] | [|] 30 | [|] | [|] | [|] [|] [|] | [|] [|] [|] [|] | [|] [|] [|] [|] 20 | [|] [|] [|] [|] | [|] [|] [|] [|] [|] | [|] [|] [|] [|] [|] [|] | [|] [|] [|] [|] [|] [|] [|] [|] [|] | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] 10 | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] 0 +-004-019-007-016-042-018-028-013-005-015-016-029-027-013 Thu 21| Sat 23| Mon 25| Wed 27| Fri 29| Sun 01| Tue 03| Fri 22 Sun 24 Tue 26 Thu 28 Sat 30 Mon 02 Wed 04

This will be my last python-dev summary for a while, as I'm going to be mostly away from the internet for the summer. However, Andrew Kuchling has agreed to take up writing them again, so there should be no interruption in the summaries.

* Support for "wide" Unicode characters *

Paul Prescod posted a draft of PEP 261 'Support for "wide" Unicode characters':

which proposes adding a compile time option to configure unicode objects to store "code points" (the integers that the unicode specification maps to "characters" -- though that word is dangerously overloaded in the Unicode arena) in 32 bit integers -- they're currently stored in 16 bits. This was (I believe) at least partially inspired by the Unicode Consortium assigning code points outside the "Basic Multilingual Plane" (i.e. the range of 16 bit integers).
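[As a present-day illustration of why the BMP boundary matters -- this aside uses a modern Python 3 interpreter, not the narrow/wide 2.x builds under discussion: a code point above 0xFFFF cannot fit in a single 16-bit unit, so UTF-16 spends a two-unit surrogate pair on it.]

```python
import struct

# Illustrative sketch on a modern Python 3 interpreter (not the 2001
# builds discussed above): the first code point outside the Basic
# Multilingual Plane needs two 16-bit code units in UTF-16.
ch = "\U00010000"
assert ord(ch) == 0x10000

# Encoded as UTF-16 (little-endian) it occupies 4 bytes = two units.
units = ch.encode("utf-16-le")
assert len(units) == 4

# Those two units are a high surrogate followed by a low surrogate.
hi, lo = struct.unpack("<HH", units)
assert 0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF
```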
No one is convinced that this is the best possible solution (a better solution might be to have unicode objects that could either store code points in 16 bits or 32 bits as necessary, and this solution could have binary compatibility problems), but it seems no one has the time to implement a better one (and a better one would probably have compatibility problems that couldn't be fixed by a simple recompile):

(I apologise for any abuse of terminology in the above - I know very little about the issues surrounding Unicode).

* Python specializing compiler *

Armin Rigo announced his "Python specializing compiler", psyco:

It works on the principle that you can compile a faster version of a function if you know stuff about the arguments it's likely to be called with. This is one of the more aesthetically pleasing of the possible ways to speed Python up (it's similar to some tactics used by the seemingly defunct Self compiler), but it's still a very large amount of work away from being useful...

* IPv6 *

*Very* preliminary support for IPv6 - the "next generation internet protocol" - was checked in. The support thus far doesn't actually support IPv6 at all, but rather emulates IPv6's new functions for IPv4 addresses, so that code for Python 2.2 will hopefully be portable between machines that do and do not support IPv6, whilst being able to use IPv6 where it is supported (I hope that makes sense). Unfortunately the checkin broke the build on some platforms (OSF1, Windows) but I believe these problems are now sorted out. IPv6 support has been muttered about for years now, so it's nice to finally see some movement, even if it is causing some x-platform pain.

* PEP 260: simplifying xrange *

Guido posted PEP 260, a proposal to remove some of the less useful aspects of the xrange type:

Support was muted; there's the usual concern about removing "little used" features -- what if someone (who maybe doesn't read comp.lang.python or these summaries) uses them?
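[A present-day aside for readers of this archive: the "less useful aspects" at issue were slicing, containment and repetition. In today's Python 3, the range type -- xrange's successor -- supports slicing and containment once more, while repetition stayed gone; a quick sketch:]

```python
# Sketch in modern Python 3; range() is the descendant of 2.x xrange().
r = range(10)

# Containment and slicing work (slicing even returns another range).
assert 3 in r
assert list(r[2:5]) == [2, 3, 4]

# Repetition -- the 4th 'repeat' argument mentioned in the checkin --
# is not supported.
try:
    r * 2
except TypeError:
    pass  # expected
```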
* site-python, site-packages *

Gregor Hoffleit posted a request that /lib/site-python be considered a standard install target:

as the current standard of /lib/pythonX.X/site-packages/ makes life awkward for packagers. It's possible Gregor asked the wrong bunch of people; a non-version dependent path makes life awkward for those who want to maintain more than one version of Python, and that includes most of the people on python-dev. OTOH, it probably also includes everyone who cares about the cross-version portability of the code they write, so it seems that movement is unlikely here (could be wrong, though).

Cheers,
M.

From guido@digicool.com Thu Jul 5 14:12:46 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 05 Jul 2001 09:12:46 -0400
Subject: [Python-Dev] While we're deprecating...
In-Reply-To: Your message of "Thu, 05 Jul 2001 12:10:25 +0200." <20010705121024.A8098@xs4all.nl>
References: <20010705121024.A8098@xs4all.nl>
Message-ID: <200107051312.f65DCkv10572@odiug.digicool.com>

> While we're in the deprecation mood (not that I changed my mind on xrage()
> ;P) how about we deprecate the alternate-tab-size-comment checks of the
> parser. That is, generate a deprecation warning for these comments:
>
> "tab-width:", /* Emacs */
> ":tabstop=", /* vim, full form */
> ":ts=", /* vim, abbreviated form */
> "set tabsize=", /* will vi never die? */
>
> with sizes other than '8', and rip out the code in 2.3 ?

Was this ever even documented? Is it worth being so careful? We could rip out the functionality now, replacing it with a warning, and lose the warning in 2.3. Or we could just rip it out now, and always enable the -t option. (Hm, that should be unified with the warnings framework, although I'm not sure how easy that will be.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From thomas@xs4all.net Thu Jul 5 14:35:22 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 5 Jul 2001 15:35:22 +0200
Subject: [Python-Dev] While we're deprecating...
In-Reply-To: <200107051312.f65DCkv10572@odiug.digicool.com>
References: <20010705121024.A8098@xs4all.nl> <200107051312.f65DCkv10572@odiug.digicool.com>
Message-ID: <20010705153522.B8098@xs4all.nl>

On Thu, Jul 05, 2001 at 09:12:46AM -0400, Guido van Rossum wrote:
> > While we're in the deprecation mood (not that I changed my mind on xrage()
> > ;P) how about we deprecate the alternate-tab-size-comment checks of the
> > parser. That is, generate a deprecation warning for these comments:
> >
> > "tab-width:", /* Emacs */
> > ":tabstop=", /* vim, full form */
> > ":ts=", /* vim, abbreviated form */
> > "set tabsize=", /* will vi never die? */
> >
> > with sizes other than '8', and rip out the code in 2.3 ?

> Was this ever even documented? Is it worth being so careful? We
> could rip out the functionality now, replacing it with a warning, and
> lose the warning in 2.3. Or we could just rip it out now, and always
> enable the -t option. (Hm, that should be unified with the warnings
> framework, although I'm not sure how easy that will be.)

Uhmm... if it wasn't documented, all the more reason to be careful. Imagine, say,

# tab-width:4 (or however it's done)
<4 spaces>for record in database:
< 8 spaces > process_record
< 1 tab > del database

Now, I completely agree that that is very fragile code (imagine some emacs loathing colleague removing the tab-width line) but that doesn't mean we should just break it for the hell of it... We have it, we want to lose it, we deprecate it. If we rip it out now (which I'd be -0 on) we should replace it with an *error*, not a warning, since code has a high chance of breaking.

-- Thomas Wouters
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
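[Editorial sketch: the check under discussion lives in Python's C tokenizer; the hypothetical Python rendering below only illustrates the idea of scanning a line for editor tab-size comments and flagging a size other than 8. The function name and regex are illustrative, not the real implementation.]

```python
import re

# Hypothetical Python rendering of the tokenizer's editor-comment check.
# The four markers are the ones listed in Thomas's message.
TAB_COMMENT = re.compile(r"(tab-width:|:tabstop=|:ts=|set tabsize=)\s*(\d+)")

def nonstandard_tab_size(line):
    """Return the declared tab size if it differs from 8, else None."""
    m = TAB_COMMENT.search(line)
    if m and m.group(2) != "8":
        return int(m.group(2))
    return None

assert nonstandard_tab_size("# vim:ts=4:") == 4
assert nonstandard_tab_size("# -*- tab-width: 8 -*-") is None
assert nonstandard_tab_size("x = 1") is None
```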
From thomas.heller@ion-tof.com Thu Jul 5 14:36:55 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 5 Jul 2001 15:36:55 +0200
Subject: [Python-Dev] Playing with descr-branch
Message-ID: <009001c10557$8eb82150$e000a8c0@thomasnotebook>

Guido, some feedback from first experiments with descr-branch:

The test-suite seems to work, as does the test_descr.py script run standalone.

Immediate crash (access violation) on executing:

C:\sf\desc-branch\python\dist\src\PCbuild>python
Python 2.2a0 (#16, Jul 5 2001, 12:26:08) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> class C:
...     def foo(*a): return a
...     goo = classmethod(foo)
...
>>> C.goo

The crash can be avoided by executing C = c() before calling C.goo. Just an observation... Currently this code does not seem stable enough to play with.

Thomas

From thomas@xs4all.net Thu Jul 5 14:54:47 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 5 Jul 2001 15:54:47 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To:
Message-ID: <20010705155447.C8098@xs4all.nl>

On Thu, Jul 05, 2001 at 06:24:46AM -0700, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Include
> In directory usw-pr-cvs1:/tmp/cvs-serv8011
>
> Modified Files:
> rangeobject.h
> Log Message:
> Rip out the fancy behaviors of xrange that nobody uses: repeat, slice,
> contains, tolist(), and the start/stop/step attributes. This includes
> removing the 4th ('repeat') argument to PyRange_New().

Eek... What do we have the fucking warning framework and deprecation warnings for, anyway ?! It may sound overly conservative, but I *really* don't like ripping things out just because you don't like them, without even as much as a release with a warning (and no, 2.1.1 can't have the warning; PEP 6 won't allow it.)
You're basically telling people "You didn't use it the way I thought people would use it but never documented anywhere, so if you used them the way they are documented, you're screwed." Defense offers exhibit A: the standard library reference:

http://www.python.org/doc/current/lib/typesseq-xrange.htm:
"XRange objects behave like tuples, and offer a single method"

(Not to mention http://www.python.org/doc/current/lib/typesseq.html which *strongly* suggests all the operations in the table apply to range-objects as well as to strings, unicode strings, lists, tuples and buffers.)

The API change also means binary and source level breakages... Is it really that much trouble to, just for *one* release, keep the functionality and just generate a warning when the 4th argument is something other than '1' ?

I can live with (though not agree with, sorry ;P) the removal of xrange advanced features... just not from supported to *gone* in a single step.

Wishing-I-hadn't-mentioned-xrange-in-the-other-thread-ly y'rs, ;)

-- Thomas Wouters
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From Jason.Tishler@dothill.com Thu Jul 5 14:58:16 2001
From: Jason.Tishler@dothill.com (Jason Tishler)
Date: Thu, 5 Jul 2001 09:58:16 -0400
Subject: [Python-Dev] Making a .pyd using Cygwin?
In-Reply-To: <200107050523.RAA00926@s454.cosc.canterbury.ac.nz>
Message-ID: <20010705095816.B6130@dothill.com>

Greg,

On Thu, Jul 05, 2001 at 05:23:58PM +1200, Greg Ewing wrote:
> Is it feasible to compile a Python extension module
> for Windows using Cygwin?

By the above, do you mean a Win32 or Cygwin extension module? The answer is yes in either case. However, using Cygwin Python to build a Cygwin extension module is more straightforward than a Win32 one. Actually, Cygwin Python behaves the same as other Unix platforms.

> I have tried this, and the linker tells me that it can't
> export '_bss_start__', '_bss_end__', '_data_start__'
> and '_data_end__' because they're not defined.
The above is due to the fact that your extension module (i.e., DLL) is not exporting any symbols. You can rectify this problem by adding a DL_EXPORT macro to the definition of the module's initialization function. See the following for an example of the solution: http://sources.redhat.com/ml/cygwin/2001-06/msg00442.html BTW, the cygwin@cygwin.com mailing list is a more appropriate forum for this type of question. Jason -- Jason Tishler Director, Software Engineering Phone: 732.264.8770 x235 Dot Hill Systems Corp. Fax: 732.264.8798 82 Bethany Road, Suite 7 Email: Jason.Tishler@dothill.com Hazlet, NJ 07730 USA WWW: http://www.dothill.com From guido@digicool.com Thu Jul 5 15:09:58 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 05 Jul 2001 10:09:58 -0400 Subject: [Python-Dev] "Becoming a Python Developer" In-Reply-To: Your message of "Wed, 04 Jul 2001 21:22:31 EDT." <20010704212231.A11629@ute.cnri.reston.va.us> References: <20010704212231.A11629@ute.cnri.reston.va.us> Message-ID: <200107051409.f65E9wS10645@odiug.digicool.com> Nice work, Andrew! I surely hope this will bring us some new contributors... Some answers to your XXX marks: > You like hacking on language interpreters. (XXX is that a bit > snarky? not sure I phrased that well...) Sounds fine to me, or you could extend to "large software packages". > Python is over 10 years old, and its development process is quite > (XXX elaborate? evolved? mature?) I'd say mature. > Python is developed by a group of about 30 people, True, but may sound off-putting to would-be contributors. Maybe you can add that lots of others contribute significantly? (E.g. ask Fred how many folks have contributed to the docs!) > XXX should something be written about CPython / JPython / Stackless > / Python.NET? I only know about CPython. Explaining the distinction would be helpful, and you could add that for Java programmers, participating in Jython would be a logical step. 
> Guido van Rossum has the title of Benevolent Dictator For Life, or
> BDFL.

Lest people who aren't familiar with Python culture (or those who are but lack a sense of humor) take this at face value, can you explain that this is a tongue-in-cheek title?

The section on CVS is redundant -- this information is already on the SF website, isn't it? (Or most of it.) I don't think detailed instructions need to be in a high-level motivational article -- a link to http://sourceforge.net/cvs/?group_id=5470 is all that's needed (like you do for other services).

> diff -C2. (XXX is that correct?)

I use "diff -c" which seems to have the same effect.

> Python's standard style, described at XXX.

Alas, there's no description. Let me try to summarize the rules here.

C dialect:

- Use ANSI/ISO standard C (the 1989 version of the standard).

- All function declarations and definitions must use full prototypes (i.e. specify the types of all arguments).

- Never use C++ style // one-line comments.

- No compiler warnings with major compilers (gcc, VC++, a few others).

Code lay-out:

- Use single-tab indents, where a tab is worth 8 spaces.

- No line should be longer than 79 characters. If this and the previous rule together don't give you enough room to code, your code is too complicated -- consider using subroutines.

- Function definition style: function name in column 1, outermost curly braces in column 1, blank line after local variable declarations.

  static int
  extra_ivars(PyTypeObject *type, PyTypeObject *base)
  {
          int t_size = PyType_BASICSIZE(type);
          int b_size = PyType_BASICSIZE(base);

          assert(t_size >= b_size); /* type smaller than base! */
          ...
          return 1;
  }

- Code structure: one space between keywords like 'if', 'for' and the following left paren; no spaces inside the paren; braces as shown:

  if (mro != NULL) {
          ...
  }
  else {
          ...
  }

- The return statement should *not* get redundant parentheses:

  return Py_None; /* correct */
  return(Py_None); /* incorrect */

- Function and macro call style: foo(a, b, c) -- no space before the open paren, no spaces inside the parens, no spaces before commas, one space after each comma.

- Always put spaces around assignment, Boolean and comparison operators. In expressions using a lot of operators, add spaces around the outermost (lowest-priority) operators.

- Breaking long lines: if you can, break after commas in the outermost argument list. Always indent continuation lines appropriately, e.g.:

  PyErr_Format(PyExc_TypeError,
               "cannot create '%.100s' instances",
               type->tp_name);

- When you break a long expression at a binary operator, the operator goes at the end of the previous line, e.g.:

  if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
      type->tp_dictoffset == b_size &&
      (size_t)t_size == b_size + sizeof(PyObject *))
          return 0; /* "Forgive" adding a __dict__ only */

- Put blank lines around functions, structure definitions, and major sections inside functions.

- Comments go before the code they describe.

- All functions and global variables should be declared static unless they are to be part of a published interface.

- For external functions and variables, we always have a declaration in an appropriate header file in the "Include" directory, which uses the DL_IMPORT() macro, like this:

  extern DL_IMPORT(PyObject *) PyObject_Repr(PyObject *);

Naming conventions:

- Use a Py prefix for public functions; never for static functions. The Py_ prefix is reserved for global service routines like Py_FatalError; specific groups of routines (e.g. specific object type APIs) use a longer prefix, e.g. PyString_ for string functions.

- Public functions and variables use MixedCase with underscores, like this: PyObject_GetAttr, Py_BuildValue, PyExc_TypeError.

- Occasionally an "internal" function has to be visible to the loader; we use the _Py prefix for this, e.g.: _PyObject_Dump.

- Macros should have a MixedCase prefix and then use upper case, for example: PyString_AS_STRING, Py_PRINT_RAW.

I'm sure there's more. I'll make this into a PEP.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake@acm.org Thu Jul 5 15:06:38 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 5 Jul 2001 10:06:38 -0400 (EDT)
Subject: [Python-Dev] While we're deprecating...
In-Reply-To: <200107051312.f65DCkv10572@odiug.digicool.com>
References: <20010705121024.A8098@xs4all.nl> <200107051312.f65DCkv10572@odiug.digicool.com>
Message-ID: <15172.29806.432822.805282@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
> Was this ever even documented? Is it worth being so careful? We

It's not in the LaTeX documentation.

> could rip out the functionality now, replacing it with a warning, and
> lose the warning in 2.3. Or we could just rip it out now, and always
> enable the -t option. (Hm, that should be unified with the warnings
> framework, although I'm not sure how easy that will be.)

I'd rather see the warning added one version before ripping it out. (And no, I don't see any real reason to change xrange either.)

-Fred

-- Fred L. Drake, Jr. PythonLabs at Digital Creations

From skip@pobox.com (Skip Montanaro) Thu Jul 5 15:46:59 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 5 Jul 2001 09:46:59 -0500
Subject: [Python-Dev] "Becoming a Python Developer"
In-Reply-To: <20010704212231.A11629@ute.cnri.reston.va.us>
References: <20010704212231.A11629@ute.cnri.reston.va.us>
Message-ID: <15172.32227.320917.531063@beluga.mojam.com>

Andrew> Inspired by some discussion on c.l.py, I've written a draft of a
Andrew> guide to working on Python: ...

Andrew,

Good work.
I would recommend that new people wanting to contribute to Python be urged to look first at the libraries (batteries) instead of the language (radio) itself. The language itself is growing new features on an occasional basis, but its interaction with the outside world (e.g. XML) is just as important (or perhaps more important).

Skip

From akuchlin@mems-exchange.org Thu Jul 5 16:03:35 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 5 Jul 2001 11:03:35 -0400
Subject: [Python-Dev] "Becoming a Python Developer"
In-Reply-To: <200107051409.f65E9wS10645@odiug.digicool.com>; from guido@digicool.com on Thu, Jul 05, 2001 at 10:09:58AM -0400
References: <20010704212231.A11629@ute.cnri.reston.va.us> <200107051409.f65E9wS10645@odiug.digicool.com>
Message-ID: <20010705110334.C12027@ute.cnri.reston.va.us>

On Thu, Jul 05, 2001 at 10:09:58AM -0400, Guido van Rossum wrote:
>> You like hacking on language interpreters. (XXX is that a bit
>> snarky? not sure I phrased that well...)

Whoops, that XXX is now redundant. The original text had something like "... and you want to do something more interesting than writing the 187th Scheme interpreter", but then I took the Scheme reference out.

>Explaining the distinction would be helpful, and you could add that
>for Java programmers, participating in Jython would be a logical step.

The thing is, I'm not sure how Jython is developed. Is Finn Bock the BDFL, do the developers vote, or what? (Same question for .NET.) If some Jython developer offered to contribute a description, I certainly wouldn't turn it down.

>I'm sure there's more. I'll make this into a PEP.

Should the Python style guide, currently at /doc/essays/styleguide.html, also become a PEP? It's often referenced...

I'll make the other suggested changes, and refer to PEP 7; thanks! Anyone have suggestions for other topics/issues that should be covered in this document?
--amk

From guido@digicool.com Thu Jul 5 16:07:32 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 05 Jul 2001 11:07:32 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: Your message of "Thu, 05 Jul 2001 15:54:47 +0200." <20010705155447.C8098@xs4all.nl>
References: <20010705155447.C8098@xs4all.nl>
Message-ID: <200107051507.f65F7Wf12155@odiug.digicool.com>

> > Rip out the fancy behaviors of xrange that nobody uses: repeat, slice,
> > contains, tolist(), and the start/stop/step attributes. This includes
> > removing the 4th ('repeat') argument to PyRange_New().
>
> Eek... What do we have the fucking warning framework and deprecation
> warnings for, anyway ?!

For more important things?! I posted the PEP about this, and I got mostly favorable or lukewarm responses. *One* person admitted they were using advanced xrange() features now, but he said he wouldn't miss them.

The warning framework and deprecation warnings are important for things that will change the semantics of things *without* causing an error message (like nested scopes). They are also important for things that will require lots of folks to change their code.

From the responses to the PEP posting it doesn't seem like there are many people using xrange() in non-idiomatic ways, so the latter risk seems very small to me. With the exception of the change to PyRange_New(), the changes here will give people a clear error message when they try to use the existing features. If you insist, I can change the signature for PyRange_New() back and add a warning if the 4th argument is not 1, but I'm reluctant there too.
> You're basically telling people "You didn't use it the way I thought people
> would use it but never documented anywhere, so if you used them the way they
> are documented, you're screwed." Defense offers exhibit A: the
> standard library reference:
>
> http://www.python.org/doc/current/lib/typesseq-xrange.htm:
> "XRange objects behave like tuples, and offer a single method"

Well, they never did behave like tuples (s+t never worked, and you couldn't slice a repeated xrange object). But more importantly, (almost) nobody has used them as such.

> (Not to mention http://www.python.org/doc/current/lib/typesseq.html which
> *strongly* suggests all the operations in the table apply to range-objects
> as well as to strings, unicode strings, lists, tuples and buffers.)

I'll update this to explain that concat, repeat and slice don't work for xrange() objects.

> The API change also means binary and source level breakages... Is it
> really that much trouble to, just for *one* release, keep the
> functionality and just generate a warning when the 4th argument is
> something other than '1' ?

No binary breakage -- the 4th argument is normally 1 anyway. The source breakage is easy to fix.

Again, the real point of the deprecation policy is not to *never* get an error in old code. It is to make sure that you don't get burned by *silent* changes in semantics, and to make sure that *common* usage that will stop working is caught. Advanced xrange() is not common. Calling PyRange_New() from C is not common.

> I can live (though not agree with, sorry ;P) the removal of xrange
> advanced features... just not from supported to *gone* in a single
> step.

Sorry, then you better commit suicide.
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Thu Jul 5 16:12:14 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 05 Jul 2001 11:12:14 -0400 Subject: [Python-Dev] "Becoming a Python Developer" In-Reply-To: Your message of "Thu, 05 Jul 2001 11:03:35 EDT." <20010705110334.C12027@ute.cnri.reston.va.us> References: <20010704212231.A11629@ute.cnri.reston.va.us> <200107051409.f65E9wS10645@odiug.digicool.com> <20010705110334.C12027@ute.cnri.reston.va.us> Message-ID: <200107051512.f65FCEI12208@odiug.digicool.com> > >Explaining the distinction would be helpful, and you could add that > >for Java programmers, participating in Jython would be a logical step. > > The thing is, I'm not sure how Jython is developed. Is Finn Bock the > BDFL, do the developers vote, or what? (Same question for .NET.) If > some Jython development offered to contribute a description, I > certainly wouldn't turn it down. I really don't know how Jython is developed, but Finn and Samuele are on this list. I expect it's more democratic. I think .NET is purely an ActiveState venture -- Paul Prescod may care to comment. > >I'm sure there's more. I'll make this into a PEP. > > Should the Python style guide, currently at > /doc/essays/styleguide.html, also become a PEP? It's often > referenced... Yes. I vaguely recall some group of volunteers was planning to rework it into a more current document, but I don't know what happened to that effort. Some of the doc string conventions are now immortalized as PEP 257. > I'll make the other suggested changes, and refer to PEP 7; thanks! You're welcome, and thank *you* for doing this. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Thu Jul 5 16:16:37 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 05 Jul 2001 11:16:37 -0400 Subject: [Python-Dev] Playing with descr-branch In-Reply-To: Your message of "Thu, 05 Jul 2001 15:36:55 +0200." 
<009001c10557$8eb82150$e000a8c0@thomasnotebook> References: <009001c10557$8eb82150$e000a8c0@thomasnotebook> Message-ID: <200107051516.f65FGbE12901@odiug.digicool.com> > some feedback from first experiments with descr-branch: > > The test-suite seems to work, as does the test_descr.py script > run standalone. > > Immediate crash (access vialoation) on executing: > > C:\sf\desc-branch\python\dist\src\PCbuild>python > Python 2.2a0 (#16, Jul 5 2001, 12:26:08) [MSC 32 bit (Intel)] on win32 > Type "copyright", "credits" or "license" for more information. > >>> class C: > ... def foo(*a): return a > ... goo = classmethod(foo) > ... > >>> C.goo > > The crash can be avoided by executing C = c() > before calling C.goo. Hm, I can't reproduce this; it works for me! Have you tried cvs update and rebuilding? > Just an observation... > Currently this code does not seem stable enough to > play with. What you observe sounds like a reference count error. If you can still reproduce it, can you try to dig a little deeper? Linking with -lefence and running it under gdb, then reporting the backtrace when it fails would help. --Guido van Rossum (home page: http://www.python.org/~guido/) From tismer@tismer.com Thu Jul 5 16:16:20 2001 From: tismer@tismer.com (Christian Tismer) Date: Thu, 05 Jul 2001 17:16:20 +0200 Subject: [Python-Dev] Psyco1 with Stackless Message-ID: <3B4484C4.8196A4A9@tismer.com> Hi Armin, developers, I had a closer look at your Python Specializing Compiler. This is a very promising approach, going into directions which I have been trying a little myself. In its current state, Psyco introduces a nice little extra engine, which mostly deals with efficient integer operations. There are a lot of other optimizations possible, and lots of opcodes need to be implemented in order to make it usable for real world applications. 
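[A toy caricature of the principle at work here -- building, or in this sketch merely caching, one variant of a function per argument-type signature. This is an illustrative editorial sketch only; it is not how psyco is actually implemented.]

```python
def specializing(generic):
    """Toy specializer: cache one variant per argument-type tuple."""
    variants = {}
    def wrapper(*args):
        key = tuple(type(a) for a in args)
        if key not in variants:
            # A real specializing compiler would generate code tuned to
            # this type combination; the sketch just records the variant.
            variants[key] = lambda *a: generic(*a)
        return variants[key](*args)
    wrapper.variants = variants
    return wrapper

@specializing
def mul(a, b):
    return a * b

assert mul(6, 7) == 42            # (int, int) variant
assert mul("ab", 2) == "abab"     # (str, int) variant
assert len(mul.variants) == 2     # one variant per type signature
```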
Anyway, I found this proof of concept very interesting, and so I built the extensions for Win32 (with very small changes) and did some testing with mctest.py. Here are my results with stock Python 2.0: result 1952145856 in 2.43729775712 seconds result 1952145856 in 2.18692571263 seconds result 1952145856 in 5.60894890272 seconds (run1, run2, original func) But before, I tested with Stackless Python by chance, and I got this: result 1952145856 in 2.42536300005 seconds result 1952145856 in 2.18817615088 seconds result 1952145856 in 3.51236064919 seconds While your result outperforms standard Python by 2.56, it performs only 1.605 times better than Stackless! This doesn't say anything against your implementation; instead, it tells me that Stackless' code optimization is much better than Standard Python's, especially for integer operations on Win32. For sure, your version could be much faster when it is generating machine code, or if even more optimizations of data flow are done. Your little VM already looks very efficient. There is of course some room for optimizations, like this: the SGET macro is used all around, and it always uses explicit stack addressing. A function like CODE_INT_BINARY(IntSub, -) expands into 16 machine instructions with Visual Studio 6. For common cases, like [TOS-1] = [TOS] - [TOS-1], special accessors might save about half of these opcodes, again. In other words, I assume that you can get three times as fast as Python on integer operations with just a VM. Congratulations, and keep up this work! - chris p.s.: Note that at the moment, you don't do any overflow checks on integers. This is not compatible with Python, though I would love to have an option to switch off overflow checks in Python, of course! -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr.
26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/ From Samuele Pedroni Thu Jul 5 16:27:59 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Thu, 5 Jul 2001 17:27:59 +0200 (MET DST) Subject: [Python-Dev] "Becoming a Python Developer" Message-ID: <200107051528.RAA11938@core.inf.ethz.ch> Hi. [Andrew Kuchling] > > >Explaining the distinction would be helpful, and you could add that > >for Java programmers, participating in Jython would be a logical step. > > The thing is, I'm not sure how Jython is developed. Is Finn Bock the > BDFL, do the developers vote, or what? (Same question for .NET.) If > some Jython development offered to contribute a description, I > certainly wouldn't turn it down. > The situation for Jython development is as follows (Finn could have a different opinion): There are 2 active core developers. Until now we never needed to vote, also because most of what we do is mimicking Python semantics. The Java/Jython specific stuff is already subtle enough for two minds and we simply try to converge to a decent solution. I don't think Finn considers himself a BDFL. Until now (Jython phase, not JPython) we have received only small contributions, and we have rejected a few patches; we are severe censors wrt. quality, at least we try. The matter of fact is that for the moment we have never been challenged by a patch or feature addition from a promising new contributor. That's a pity. We have (I imagine) lots of users, and we get praised but... Maybe we are bad at diplomacy or we have somehow closed the development process ... In any case we would definitely like to have some more contributors, also in order to better keep up with Python's quick pace ... Java is more productive than multi-platform portable C but ... regards, Samuele.
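Christian's timings above come from mctest.py, an integer micro-benchmark distributed with Psyco. A rough stand-in for that kind of test (the function name and constants here are hypothetical, not the actual mctest.py code) might look like this on a modern Python:

```python
import time

def intloop(n):
    # Tight integer arithmetic of the kind Psyco's specializing VM
    # targets: no attribute lookups or calls, just int ops in a loop.
    total = 0
    for i in range(n):
        total = (total * 3 + i) % 1000003
    return total

t0 = time.perf_counter()
result = intloop(200000)
elapsed = time.perf_counter() - t0
print("result %d in %.6f seconds" % (result, elapsed))
```

Running the same function under stock CPython, Stackless, and Psyco's specialized entry points is what produces the kind of three-way comparison Christian quotes.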
From thomas.heller@ion-tof.com Thu Jul 5 16:33:45 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 5 Jul 2001 17:33:45 +0200 Subject: [Python-Dev] Playing with descr-branch References: <009001c10557$8eb82150$e000a8c0@thomasnotebook> <200107051516.f65FGbE12901@odiug.digicool.com> Message-ID: <004901c10567$e0de25f0$e000a8c0@thomasnotebook> > > some feedback from first experiments with descr-branch: > > > > The test-suite seems to work, as does the test_descr.py script > > run standalone. > > > > Immediate crash (access violation) on executing: > > > > C:\sf\desc-branch\python\dist\src\PCbuild>python > > Python 2.2a0 (#16, Jul 5 2001, 12:26:08) [MSC 32 bit (Intel)] on win32 > > Type "copyright", "credits" or "license" for more information. > > >>> class C: > > ... def foo(*a): return a > > ... goo = classmethod(foo) > > ... > > >>> C.goo > > > > The crash can be avoided by executing C = c() > > before calling C.goo. > > Hm, I can't reproduce this; it works for me! Have you tried cvs > update and rebuilding? > I thought so. > > Just an observation... > > Currently this code does not seem stable enough to > > play with. > > What you observe sounds like a reference count error. If you can > still reproduce it, can you try to dig a little deeper? Linking with > -lefence and running it under gdb, then reporting the backtrace when > it fails would help. I have seen several variants of this and once tried it with the debug build in MSVC6. IIRC the code was calling ->tp_alloc(...) somewhere which was NULL. I will try again this night and report back. gdb? You use gdb under Windows? And what is -lefence? Thanks, Thomas From guido@digicool.com Thu Jul 5 16:40:57 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 05 Jul 2001 11:40:57 -0400 Subject: [Python-Dev] Playing with descr-branch In-Reply-To: Your message of "Thu, 05 Jul 2001 17:33:45 +0200."
<004901c10567$e0de25f0$e000a8c0@thomasnotebook> References: <009001c10557$8eb82150$e000a8c0@thomasnotebook> <200107051516.f65FGbE12901@odiug.digicool.com> <004901c10567$e0de25f0$e000a8c0@thomasnotebook> Message-ID: <200107051540.f65FewK14436@odiug.digicool.com> > I have seen several variants of this and once tried it with the debug > build in MSVC6. IIRC the code was calling ->tp_alloc(...) somewhere > which was NULL. > I will try again this night and report back. > gdb? You use gdb under Windows? > And what is -lefence? Sorry, I missed the subtle clue that you were reporting a Windows bug... :-( I can indeed reproduce this on Windows, and I'll look into it now (if Tim doesn't beat me to it). Under Linux, it *is* stable enough! :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Thu Jul 5 16:43:17 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 5 Jul 2001 17:43:17 +0200 Subject: [Python-Dev] Playing with descr-branch References: <009001c10557$8eb82150$e000a8c0@thomasnotebook> <200107051516.f65FGbE12901@odiug.digicool.com> <004901c10567$e0de25f0$e000a8c0@thomasnotebook> <200107051540.f65FewK14436@odiug.digicool.com> Message-ID: <00a901c10569$3653c890$e000a8c0@thomasnotebook> > I can indeed reproduce this on Windows, and I'll look into it now (if > Tim doesn't beat me to it). > > Under Linux, it *is* stable enough! :-) I know there _are_ reasons to switch :-) Thomas From guido@digicool.com Thu Jul 5 16:48:01 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 05 Jul 2001 11:48:01 -0400 Subject: [Python-Dev] Playing with descr-branch In-Reply-To: Your message of "Thu, 05 Jul 2001 17:43:17 +0200." 
<00a901c10569$3653c890$e000a8c0@thomasnotebook> References: <009001c10557$8eb82150$e000a8c0@thomasnotebook> <200107051516.f65FGbE12901@odiug.digicool.com> <004901c10567$e0de25f0$e000a8c0@thomasnotebook> <200107051540.f65FewK14436@odiug.digicool.com> <00a901c10569$3653c890$e000a8c0@thomasnotebook> Message-ID: <200107051548.f65Fm1B14472@odiug.digicool.com> > > I can indeed reproduce this on Windows, and I'll look into it now (if > > Tim doesn't beat me to it). > > > > Under Linux, it *is* stable enough! :-) > I know there _are_ reasons to switch :-) I looked at your problem with the debugger, and it seems that the classmethod and staticmethod types don't get initialized (the constructor would have been a good time to do this :-). But now I don't understand why this isn't a hard failure on Linux. Checking will follow ASAP. --Guido van Rossum (home page: http://www.python.org/~guido/) From gward@python.net Thu Jul 5 16:47:53 2001 From: gward@python.net (Greg Ward) Date: Thu, 5 Jul 2001 11:47:53 -0400 Subject: [Python-Dev] Making a .pyd using Cygwin? In-Reply-To: <200107050523.RAA00926@s454.cosc.canterbury.ac.nz>; from greg@cosc.canterbury.ac.nz on Thu, Jul 05, 2001 at 05:23:58PM +1200 References: <200107050523.RAA00926@s454.cosc.canterbury.ac.nz> Message-ID: <20010705114752.A954@gerg.ca> On 05 July 2001, Greg Ewing said: > Is it feasible to compile a Python extension module > for Windows using Cygwin? I was under the impression that the Distutils supported gcc under Cygwin. I know several people put in a lot of work to make this happen, and I eventually approved all the patches. Greg -- Greg Ward - Unix bigot gward@python.net http://starship.python.net/~gward/ "What do you mean -- a European or an African swallow?" 
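For reference, the snippet from the crash report above runs cleanly on any CPython where the descr-branch work landed. The sketch below (current-Python syntax, not the 2.2a0 build under discussion) shows what it is supposed to do: classmethod passes the class itself as the first argument.

```python
class C:
    def foo(*a):
        return a
    goo = classmethod(foo)

# Accessing goo on the class or on an instance binds the class,
# so foo receives a one-element tuple containing C itself.
print(C.goo() == (C,))    # True
print(C().goo() == (C,))  # True
```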
From akuchlin@mems-exchange.org Thu Jul 5 16:54:15 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Thu, 5 Jul 2001 11:54:15 -0400 Subject: [Python-Dev] "Becoming a Python Developer" In-Reply-To: <200107051528.RAA11938@core.inf.ethz.ch>; from pedroni@inf.ethz.ch on Thu, Jul 05, 2001 at 05:27:59PM +0200 References: <200107051528.RAA11938@core.inf.ethz.ch> Message-ID: <20010705115415.F12027@ute.cnri.reston.va.us> On Thu, Jul 05, 2001 at 05:27:59PM +0200, Samuele Pedroni wrote: >Maybe we are bad at diplomacy or we have somehow closed the >development process ... It might also be Java's development culture. Back when I occasionally looked for Java classes, it was amazing how little free Java software there was. People would write, say, a specialized AWT Layout class, but instead of putting it on a Web page with source code and an example, they'd want you to pay $25 or $300 for it! This probably isn't helped by Java support on Unixes (other than Solaris) having been so dodgy for so long, as Unix seems to have the strongest such culture. So people may not be accustomed to the idea that if they use Jython, they can also *improve* it. BTW, I assume the Jython source uses the standard Java indentation and formatting style? --amk From barry@digicool.com Thu Jul 5 17:03:21 2001 From: barry@digicool.com (Barry A. Warsaw) Date: Thu, 5 Jul 2001 12:03:21 -0400 Subject: [Python-Dev] "Becoming a Python Developer" References: <20010704212231.A11629@ute.cnri.reston.va.us> <200107051409.f65E9wS10645@odiug.digicool.com> <20010705110334.C12027@ute.cnri.reston.va.us> Message-ID: <15172.36809.897670.477227@anthem.wooz.org> >>>>> "AK" == Andrew Kuchling writes: AK> Should the Python style guide, currently at AK> /doc/essays/styleguide.html, also become a PEP? It's often AK> referenced... AK> I'll make the other suggested changes, and refer to PEP 7; AK> thanks! PEP 8 will be "Style Guide for Python Code". I'll adapt the contents of the styleguide to PEP form and check that in.
-Barry From aahz@rahul.net Thu Jul 5 17:40:01 2001 From: aahz@rahul.net (Aahz Maruch) Date: Thu, 5 Jul 2001 09:40:01 -0700 (PDT) Subject: [Python-Dev] "Becoming a Python Developer" In-Reply-To: <200107051409.f65E9wS10645@odiug.digicool.com> from "Guido van Rossum" at Jul 05, 2001 10:09:58 AM Message-ID: <20010705164002.E662599C81@waltz.rahul.net> Guido van Rossum wrote: > AMK: >> >> Guido van Rossum has the title of Benevolent Dictator For Life, or >> BDFL. > > Lest people who aren't familiar with Python culture (or those who are > but lack a sense of humor) take this at face value, can you explain > that this is a tongue-in-cheek title? It should be made clear, I think, that while the title is tongue-in-cheek, the semantics of the title are not. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. From DavidA@ActiveState.com Thu Jul 5 17:48:13 2001 From: DavidA@ActiveState.com (David Ascher) Date: Thu, 05 Jul 2001 09:48:13 -0700 Subject: [Python-Dev] "Becoming a Python Developer" References: <20010704212231.A11629@ute.cnri.reston.va.us> <200107051409.f65E9wS10645@odiug.digicool.com> <20010705110334.C12027@ute.cnri.reston.va.us> <200107051512.f65FCEI12208@odiug.digicool.com> Message-ID: <3B449A4C.4B541061@ActiveState.com> Guido van Rossum wrote: > > > >Explaining the distinction would be helpful, and you could add that > > >for Java programmers, participating in Jython would be a logical step. > > > > The thing is, I'm not sure how Jython is developed. Is Finn Bock the > > BDFL, do the developers vote, or what? (Same question for .NET.) If > > some Jython development offered to contribute a description, I > > certainly wouldn't turn it down. > > I really don't know how Jython is developed, but Finn and Samuele are > on this list. I expect it's more democratic. 
I think .NET is purely > an ActiveState venture -- Paul Prescod may care to comment. For the purposes of Andrew's document, Mark Hammond, of ActiveState, is the BDFL on the .NET research project, which shouldn't be considered at the same level of maturity as Jython by any means. In other words, it's fun, but it's not useful yet. -- David Ascher ActiveState New! ASPN - ActiveState Programmer Network Essential programming tools and information http://www.ActiveState.com/ASPN From barry@digicool.com Thu Jul 5 17:51:39 2001 From: barry@digicool.com (Barry A. Warsaw) Date: Thu, 5 Jul 2001 12:51:39 -0400 Subject: [Python-Dev] "Becoming a Python Developer" References: <200107051528.RAA11938@core.inf.ethz.ch> <20010705115415.F12027@ute.cnri.reston.va.us> Message-ID: <15172.39707.597180.746@anthem.wooz.org> >>>>> "AK" == Andrew Kuchling writes: AK> BTW, I assume the Jython source uses the standard Java AK> indentation and formatting style? Finn and Samuele are better arbiters of this, but when I was hacking JPython, the answer was yes, with one exception: the opening brace for a class should be on a line by itself, in column zero (not, as is the Java convention, hanging at the right end of the first line of code). E.g. the convention was the same as Guido has for C code. This was done primarily for Emacs' sake, but I don't think Finn or Samuele use Emacs for their development. -Barry From tismer@tismer.com Thu Jul 5 17:52:51 2001 From: tismer@tismer.com (Christian Tismer) Date: Thu, 05 Jul 2001 18:52:51 +0200 Subject: [Python-Dev] Psyco1 with Stackless References: <3B4484C4.8196A4A9@tismer.com> Message-ID: <3B449B63.5DB59864@tismer.com> Christian Tismer wrote: ... > For common cases, like [TOS-1] = [TOS] - [TOS-1], special > accessors might save about half of these opcodes, again. The above suggestion was a little thoughtless. The code generator makes no attempt to keep the used slots together, therefore the slot addressing cannot be saved easily. 
ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/ From Samuele Pedroni Thu Jul 5 17:53:16 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Thu, 5 Jul 2001 18:53:16 +0200 (MET DST) Subject: [Python-Dev] "Becoming a Python Developer" Message-ID: <200107051653.SAA13631@core.inf.ethz.ch> [Andrew Kuchling] > It might also be Java's development culture. Back when I occasionally > looked for Java classes, it was amazing how little free Java software > there was. People would write, say, a specialized AWT Layout class, > but instead of putting it on a Web page with source code and an example, > they'd want you to pay $25 or $300 for it! This probably isn't helped > by Java support on Unixes (other than Solaris) having been so dodgy > for so long, as Unix seems to have the strongest such culture. So > people may not be accustomed to the idea that if they use Jython, they > can also *improve* it. Or maybe there aren't a lot of people with a core hacking attitude in the Java world. How many people know the classfile format and the JVM instruction set? ... Or their boss does not want them to spend office time on one of the keys to their company's success ;) Any conspiracy theory can do the job here. Nevertheless the situation is a bit sad ... > BTW, I assume the Jython source uses the standard Java indentation and > formatting style?
Yes, and indentation = 4 spaces. A note: the code is a bit messy sometimes ;) From Samuele Pedroni Thu Jul 5 18:08:25 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Thu, 5 Jul 2001 19:08:25 +0200 (MET DST) Subject: [Python-Dev] "Becoming a Python Developer" Message-ID: <200107051708.TAA13910@core.inf.ethz.ch> [BAW] > > >>>>> "AK" == Andrew Kuchling writes: > > AK> BTW, I assume the Jython source uses the standard Java > AK> indentation and formatting style? > > Finn and Samuele are better arbiters of this, but when I was hacking > JPython, the answer was yes, with one exception: the opening brace for > a class should be on a line by itself, in column zero (not, as is the > Java convention, hanging at the right end of the first line of code). > E.g. the convention was the same as Guido has for C code. > > This was done primarily for Emacs' sake, but I don't think Finn or > Samuele use Emacs for their development. > My bad, now both styles:

class A
{

and

class A {

appear in the code. I use Forte. >LOL! You should have seen it before it was imported into CVS! It >would have made you cry! I was unaware of that. In any case I have added my personal amount of entropy to the code, and it wasn't in any way a targeted critique. Simply put, CPython code mostly looks better; maybe it's just a virtue of C. Samuele.
Cheers, -Barry From thomas@xs4all.net Fri Jul 6 09:58:08 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 6 Jul 2001 10:58:08 +0200 Subject: [Python-Dev] "Becoming a Python Developer" In-Reply-To: <20010705110334.C12027@ute.cnri.reston.va.us> Message-ID: <20010706105808.D8098@xs4all.nl> On Thu, Jul 05, 2001 at 11:03:35AM -0400, Andrew Kuchling wrote: > The thing is, I'm not sure how Jython is developed. Is Finn Bock the > BDFL, do the developers vote, or what? (Same question for .NET.) If > some Jython development offered to contribute a description, I > certainly wouldn't turn it down. I always find myself thinking the BDFL job covers the *language* 'Python' more than the CPython implementation, and in that respect Guido is the BDFL of all implementations :) If you look at BDFL pronouncements, they are usually about semantics and syntax. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From akuchlin@mems-exchange.org Fri Jul 6 13:26:35 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Fri, 6 Jul 2001 08:26:35 -0400 Subject: [Python-Dev] "Becoming a Python Developer" In-Reply-To: <3B449A4C.4B541061@ActiveState.com>; from DavidA@activestate.com on Thu, Jul 05, 2001 at 09:48:13AM -0700 References: <20010704212231.A11629@ute.cnri.reston.va.us> <200107051409.f65E9wS10645@odiug.digicool.com> <20010705110334.C12027@ute.cnri.reston.va.us> <200107051512.f65FCEI12208@odiug.digicool.com> <3B449A4C.4B541061@ActiveState.com> Message-ID: <20010706082635.A14026@ute.cnri.reston.va.us> On Thu, Jul 05, 2001 at 09:48:13AM -0700, David Ascher wrote: >For the purposes of Andrew's document, Mark Hammond, of ActiveState, is >the BDFL on the .NET research project, which shouldn't be considered at >the same level of maturity as Jython by any means. In other words, it's >fun, but it's not useful yet. OK. Here are the descriptions I've written. 
If people want to rewrite for accuracy, please make suggestions (or just rewrite the text and send it to me): \item Stackless Python is a fork of CPython, but not one that diverges very far from the main tree. Its author, Christian Tismer, rewrote the main interpreter loop of CPython to minimize its use of the C stack; in particular, calling a Python function doesn't occupy any more room on the C stack. This means that, while CPython can only recurse a few thousand levels deep before filling up the C stack and crashing, Stackless can recurse to an unlimited depth. Stackless is also significantly faster than CPython (around 10\%), supports continuations and lightweight threads, and has found a community of highly skilled users, who use it to do things such as writing massively-multiplayer online games. The Stackless Python home page is at \url{http://www.stackless.com}. \item Jython is a reimplementation of Python, written in Java instead of C. (It was originally named JPython, but the name had to be changed for stupid trademark reasons.) Jython compiles Python code into Java bytecodes, and can seamlessly use any Java class directly from Python code, with no need to write an extension module first, as is necessary for CPython. The Jython home page is at \url{http://www.jython.org}. \item Python for .NET is an experimental implementation of Python for the .NET Framework. Currently this seems to be a research effort, because while compiling Python to .NET bytecodes has been implemented, and the resulting code works, making the resulting code \emph{fast} seems to be a difficult problem. See the Python.NET home page, at \url{http://www.activestate.com/Initiatives/NET/Research.html}, to get an overview of the current state of progress. --amk From Samuele Pedroni Fri Jul 6 14:35:06 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Fri, 6 Jul 2001 15:35:06 +0200 (MET DST) Subject: [Python-Dev] Python and e-art Message-ID: <200107061335.PAA17961@core.inf.ethz.ch> Hi.
For the curious I just discovered this (maybe someone knew that already). Isn't Python incredible. A group of e-artists has presented an e-art "virus" biennale.py written in Python: http://www.0100101110101101.org/home/biennale_py/ at the Biennale, the famous international contemporary art exposition and gathering in Venezia. It seems a t-shirt with the source code is available too. Samuele Pedroni. From Samuele Pedroni Fri Jul 6 16:22:58 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Fri, 6 Jul 2001 17:22:58 +0200 (MET DST) Subject: [Python-Dev] Q: import logic Message-ID: <200107061523.RAA28861@core.inf.ethz.ch> Hi. I have looked at CPython import logic (C code) ... is the following true (ignoring relative import issues and None markers): trying to import s.p.a.m the logic checks for: s.p.a.m s.p.a s.p s in sys.modules in this order until it finds an already present module and starts the effective loading from there. Is that an implementation detail, or should it be considered an important semantic aspect. Jython has a different logic but then some tricky Python code (substituting packages with classes) can incur in inf recursion. Thanks, Samuele Pedroni. From guido@digicool.com Fri Jul 6 16:48:03 2001 From: guido@digicool.com (Guido van Rossum) Date: Fri, 06 Jul 2001 11:48:03 -0400 Subject: [Python-Dev] Q: import logic In-Reply-To: Your message of "Fri, 06 Jul 2001 17:22:58 +0200." <200107061523.RAA28861@core.inf.ethz.ch> References: <200107061523.RAA28861@core.inf.ethz.ch> Message-ID: <200107061548.f66Fm3P18082@odiug.digicool.com> > Hi. I have looked at CPython import logic (C code) ... > > is the following true (ignoring relative import issues and None markers): > > trying to import s.p.a.m the logic checks for: > > s.p.a.m > s.p.a > s.p > s > > in sys.modules in this order until it finds an already present > module and starts the effective loading from there. > > Is that an implementation detail, or should it be considered an > important semantic aspect.
> > Jython has a different logic > (substituting packages with classes) can incur in inf recursion. > > Thanks, Samuele Pedroni. I'm not sure what alternative you had in mind, so I'm not sure how to answer this (fearing it is a trick question :-). This is supposed to look for s first, then s.p, then s.p.a, and then s.p.a.m. So exactly the opposite order of what you state! I hesitate to call this an implementation detail -- it really is intentional behavior that packages s, s.p, and s.p.a must be loaded and initialized before the import of s.p.a.m is attempted. Can you clarify the background of your question? --Guido van Rossum (home page: http://www.python.org/~guido/) From paulp@ActiveState.com Fri Jul 6 17:44:41 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Fri, 06 Jul 2001 09:44:41 -0700 Subject: [Python-Dev] IPv6 Message-ID: <3B45EAF9.5F91ABB4@ActiveState.com> I don't know if this is interesting to anyone but... -------- Original Message -------- Subject: Re: New python-dev summary Date: Fri, 06 Jul 2001 15:03:39 +0900 From: matz@ruby-lang.org (Yukihiro Matsumoto) To: language-dev@netthink.co.uk References: <15172.62152.816000.876075@gargle.gargle.HOWL> >.... Ruby's socket extension has been IPv6 aware for more than 2 years, by help from the BSD IPv6 stack developers. It may be useful for Python too. Unfortunately I myself have little knowledge about it. matz. Oops, I mailed to Nathan directly, sorry. From fdrake@acm.org Sat Jul 7 00:40:37 2001 From: fdrake@acm.org (Fred L. Drake) Date: Fri, 6 Jul 2001 19:40:37 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010706234037.432972892B@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Lots of updates! Mostly small style adjustments. Documentation for some new markup of the documentation has been added. There is a bunch of new content in the Python/C API manual.
I have started describing the new interface to support high-performance profiling and tracing. Some of the PyObject_*() functions which are used in creating objects have been described and some related reference count information has been added as well. Some small corrections have also been made in the C API manual. The updates to this manual have not yet been checked in. From akuchlin@mems-exchange.org Sat Jul 7 04:59:43 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Fri, 6 Jul 2001 23:59:43 -0400 Subject: [Python-Dev] "Becoming", rev. 2 Message-ID: <20010706235943.A15689@ute.cnri.reston.va.us> Another update: http://www.amk.ca/python/writing/python-dev.html Added a section on design principles, mostly so I can quote Tim's 19 theses, a conclusion and acks sections, and the previously posted descriptions of Stackless, Jython, and Python.NET. At this point I'm ready to go more public with it, and will send off notes to the usual places to announce it. Don't hesitate to send more comments, before or after any announcements go out. Now I have to go sleepy-dodos or I shall be all cross in the morning. --amk From thomas@xs4all.net Sat Jul 7 18:07:15 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sat, 7 Jul 2001 19:07:15 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: <200107051507.f65F7Wf12155@odiug.digicool.com> References: <20010705155447.C8098@xs4all.nl> <200107051507.f65F7Wf12155@odiug.digicool.com> Message-ID: <20010707190715.J8098@xs4all.nl> On Thu, Jul 05, 2001 at 11:07:32AM -0400, Guido van Rossum wrote: > > > Rip out the fancy behaviors of xrange that nobody uses: repeat, slice, > > > contains, tolist(), and the start/stop/step attributes. This includes > > > removing the 4th ('repeat') argument to PyRange_New(). > > Eek... What do we have the fucking warning framework and deprecation > > warnings for, anyway ?! > For more important things?! 
> I posted the PEP about this, and I got mostly favorable or lukewarm > responses. *One* person admitted they were using advanced xrange() > features now, but he said he wouldn't miss them. The warning > framework and deprecation warnings are important for things that will > change the semantics of things *without* causing an error message > (like nested scopes). They are also important for things that will > require lots of folks to change their code. I'm sorry, but I have to disagree with this, and vehemently disagree. Violently, even. Had I not taken the time to write this, it would have been riddled with cusswords ;) I'll tell you why we disagree, though: we look at Python from two entirely different angles. Let me try to explain mine, and why it is *bad* to change something, even something that should be rarely used, without warning. You seem to argue from the belief that everyone installs their own Python version, or upgrades by choice, being fully aware of all the changes it carries with them. This is (unfortunately) probably true for most of the Python users. I say unfortunately, because it means Python still hasn't hit the main stream :) XS4ALL is an ISP. We provide a bunch of services, like webhosting, machine hosting, shell access, etc. We have something like 100k shell users, and 8k webservers with CGI access, and all of them can use Python. Upgrading anything in that setup is a bitch. Upgrading something that *might* break 'broken' customer code is even worse. We had a client threaten to sue us for upgrading a Perl version where close ; was changed from a warning into a (compile time) error. Nevermind that it never did anything in the first place, suddenly their scripts generated an HTTP error 500, without them changing anything. 
And believe me, when you have 8k clueless companies hire wannabe-scriptkiddies to grab some Matt's Scripting Archive perl scripts from the 'net and get them working the way the company wants, you accumulate a *lot* of broken-but-barely-working code. Upgrading something that might break *perfectly valid code* is a lot, lot worse. The advanced xrange behaviour being gone in 2.2, as well as a 'yield' keyword added (which you hinted at in a c.l.py posting) without future statement, would make it practically impossible for me to upgrade Python from 2.0/2.1 to stock 2.2. I can't imagine it's any different for Gregor or any of the other package/distribution maintainers. How are they supposed to provide a smooth upgrade path if code breaks in silent and unobvious ways ? How can they decide for their millions of 'customers' whether or not they should have used xrange's advanced features ?? About the only thing I can think of that people like Gregor and/or people like me can do, is revert the xrange change and add warnings ourselves. I'm sorry, but "it shouldn't have been used this way" is simply not enough justification to rip something out without as much as a warning in advance. Range-objects aren't broken now. They aren't blocking the advancement of Python in any significant way like 'import *' and 'exec' were for nested scopes. Adding warnings should not be that hard, or the warnings framework is very broken. And I don't see why we bother with future statements and warnings at all if we still won't give the guarantee that code won't go from 'documentation-correct' to 'silently-broken' in a *single release*. It doesn't quite give the message that Python cares about backward compatibility or code-stability at all, so why bother trusting it at all ? Unfortunately, I can't decide *not* to upgrade Python either. One of our customers once threatened to sue us for not upgrading GCC. 
(And, of course, when we did, one of our other customers threatened to sue us for upgrading GCC, because of the damned C++ ABI/API changes.) We've had (and still have!) similar upgrade nightmares with F-secure SSH and OpenSSH, where you don't really have the option not to upgrade if you care about system security. I really don't need another package to worry about.

> Again, the real point of the deprecation policy is not to *never* get
> an error in old code. It is to make sure that you don't get burned by
> *silent* changes in semantics, and to make sure that *common* usage
> that will stop working is caught. Advanced xrange() is not common.
> Calling PyRange_New() from C is not common.

Not for you, probably not for most people. But I don't trust my customers, so I can't know what they do or what they rely on. But I do know that the removal of the advanced xrange() behaviour is very silent indeed, and it definitely warrants a warning in a release before it is ripped out. Especially because there seems to be no reason to remove it, other than "I don't like it". Guido, I trust your language instincts; I know you are probably right about the advanced features of xrange, and I would never try to persuade you to do what you think is wrong, just supply my own opinion. But in maintenance issues, both in a technical and in a PR sense, I trust my own instincts a lot more than yours, and my instincts are running around in bright red bodypaint, smacking themselves over the head with cluebricks, going "don't do it, don't do it".

> > I can live (though not agree with, sorry ;P) the removal of xrange
> > advanced features... just not from supported to *gone* in a single
> > step.
> Sorry, then you better commit suicide. :-)

And leave you to finish 2.1.1 as well as 2.0.1 ? Hmmm. But I'll tell you one thing: if you make me be 2.2 Patch Czar with xrange still lobotomized, I'll have to consider that a bug and fix it the same week 2.2 comes out. -- Thomas Wouters Hi!
I'm a .signature virus! copy me into your .signature file to help me spread! From skip@pobox.com (Skip Montanaro) Sat Jul 7 19:00:14 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Sat, 7 Jul 2001 13:00:14 -0500 Subject: [Python-Dev] Re: Comment on PEP-0238 In-Reply-To: <9ngektgqnrl17er73ukukqids95p5158dp@4ax.com> References: <9i7a7k$h9ks6$1@ID-11957.news.dfncis.de> <9ngektgqnrl17er73ukukqids95p5158dp@4ax.com> Message-ID: <15175.20014.876425.627044@beluga.mojam.com> Guido> (Hm. For various reasons I'm very tempted to introduce 'yield' Guido> as a new keyword without warnings or future statements in Python Guido> 2.2, so maybe I should bite the bullet and add 'div' as well...) C//> If one is going to add keywords to a language, I suggest that a C//> list of possible future keywords -- even ones that aren't planned C//> on being supported any time soon -- be reserved at the same time. And that warnings be issued for their use for at least one version. -- Skip Montanaro (skip@pobox.com) (847)971-7098 From guido@digicool.com Sat Jul 7 19:14:31 2001 From: guido@digicool.com (Guido van Rossum) Date: Sat, 07 Jul 2001 14:14:31 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: Your message of "Sat, 07 Jul 2001 19:07:15 +0200." <20010707190715.J8098@xs4all.nl> References: <20010705155447.C8098@xs4all.nl> <200107051507.f65F7Wf12155@odiug.digicool.com> <20010707190715.J8098@xs4all.nl> Message-ID: <200107071814.f67IEWT18834@odiug.digicool.com> Yes, the lawyers have a way of scaring us all, don't they. :-) I hear clearly that you want the advanced xrange() behavior to generate a warning before I take it out. I still think that's unnecessary, given that nobody in their right mind uses it. But since people who are out of their mind have access to lawyers too, you can go ahead and restore the old code and stuff it with warnings. Make sure to add a warning for every feature that I've taken out! 
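For reference, the machinery being invoked here is the warnings framework that shipped in Python 2.1. A restore-and-warn change of the kind being asked for could look roughly like this sketch (the helper name and message text are made up for illustration; this is not the actual checkin):

```python
import warnings

def warn_deprecated_xrange_feature(feature):
    # Hypothetical helper: complain that a soon-to-be-removed xrange()
    # feature is still being used, but keep working as before.
    warnings.warn(
        "xrange() %s is deprecated and will be removed in a "
        "future release" % feature,
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this helper
    )
```

The deprecated code paths would call the helper before doing their work, so existing programs keep running for one more release but see a warning on stderr instead of silently breaking later.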
(Do you think you'll need to add a warning to the __contains__ implementation? Taking that away doesn't change the functionality, but changes the *performance* from O(1) to O(n).) Regarding the yield statement: I'd love to require a future statement, but the current support for future statements doesn't support modifying the parser based on the presence of future statements, and I don't know how to resolve that, short of totally rewriting the parser or scanning ahead looking for a future statement with some regular expression. Sobering thought: It's possible, given all the other changes that I'm thinking about, that it just won't be possible to make Python 2.2 fully backwards compatible. Should we rename it to 3.0? Forget about the changes? Label it as experimental and encourage ISPs to install it as an "alternative" version, only available by using "python2.2"? PS: I am beginning to believe that the ThreadingTCPServer / SocketServer problems reported on SF are serious enough to warrant fixing in 2.1.1. I'll try to get to the bottom of it ASAP, but if someone else could look into this I'd be grateful too. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Sat Jul 7 19:37:01 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 7 Jul 2001 20:37:01 +0200 Subject: [Python-Dev] Re: CVS: python/dist/src/Include rangeobject.h,2.16,2.17 References: <20010705155447.C8098@xs4all.nl> <200107051507.f65F7Wf12155@odiug.digicool.com> <20010707190715.J8098@xs4all.nl> <200107071814.f67IEWT18834@odiug.digicool.com> Message-ID: <00c201c10713$d19a7be0$4ffa42d5@hagrid> guido wrote: > Sobering thought: It's possible, given all the other changes that I'm > thinking about, that it just won't be possible to make Python 2.2 > fully backwards compatible. Should we rename it to 3.0? Forget about > the changes? Label it as experimental and encourage ISPs to install > it as an "alternative" version, only available by using "python2.2"? 
every single Python release ever made has broken some of my code (often in rather esoteric ways). does that make them all "experimental"? imo, the only reasonable strategy for an ISP (or anyone offering a "standard python install" for a group of users) is of course to install new versions beside the old ones, notify users, and switch the default a couple of months after the new version has been installed. From loewis@informatik.hu-berlin.de Sat Jul 7 19:38:35 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Sat, 7 Jul 2001 20:38:35 +0200 (MEST) Subject: [Python-Dev] from future import yield Message-ID: <200107071838.UAA11256@pandora.informatik.hu-berlin.de>

> Regarding the yield statement: I'd love to require a future statement,
> but the current support for future statements doesn't support
> modifying the parser based on the presence of future statements, and I
> don't know how to resolve that, short of totally rewriting the parser
> or scanning ahead looking for a future statement with some regular
> expression.

The "directive" patch manages to conditionally introduce a new keyword, namely directive. The trick is to introduce it into the grammar, but only recognize it as a keyword if a flag is set. That approach could be used for future imports also, although I'd much prefer to spell it

directive transitional yield

Regards, Martin From skip@pobox.com (Skip Montanaro) Sat Jul 7 19:51:21 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Sat, 7 Jul 2001 13:51:21 -0500 Subject: [Python-Dev] Re: Comment on PEP-0238 In-Reply-To: References: <9i7a7k$h9ks6$1@ID-11957.news.dfncis.de> Message-ID: <15175.23081.888138.584693@beluga.mojam.com> Guido> "Emile van Sebille" writes:

>> If you're going to add keywords, why not add precision and allow
>> those who want non-integer division to set it to the level of
>> precision they require. That breaks no more code (presumably) than
>> adding div or yield does.
Guido> I'm not sure what you're asking about. If you're serious, please Guido> submit a PEP! This is the time to do it. Posting to the Guido> newsgroup is *not* sufficient to let an idea be heard by me -- Guido> you *have* to mail it to me directly or to python-dev. (While I Guido> like to read c.l.py sometimes, I cannot guarantee that I see Guido> every post.) Isn't this similar to Paul DuBois' floating point ideas? Skip From paulp@ActiveState.com Sat Jul 7 20:03:14 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sat, 07 Jul 2001 12:03:14 -0700 Subject: [Python-Dev] Mobius2 References: <20010705155447.C8098@xs4all.nl> <200107051507.f65F7Wf12155@odiug.digicool.com> <20010707190715.J8098@xs4all.nl> <200107071814.f67IEWT18834@odiug.digicool.com> Message-ID: <3B475CF2.8EDE5E5A@ActiveState.com> Guido van Rossum wrote: > >... > Regarding the yield statement: I'd love to require a future statement, > but the current support for future statements doesn't support > modifying the parser based on the presence of future statements, and I > don't know how to resolve that, short of totally rewriting the parser > or scanning ahead looking for a future statement with some regular > expression. Jeff Epler has an extension to Python that allows the grammar to be loaded at runtime. That might help: http://aspn.activestate.com/ASPN/Mail/Message/585636 -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From thomas@xs4all.net Sat Jul 7 20:31:58 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sat, 7 Jul 2001 21:31:58 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: <200107071814.f67IEWT18834@odiug.digicool.com> Message-ID: <20010707213158.K8098@xs4all.nl> On Sat, Jul 07, 2001 at 02:14:31PM -0400, Guido van Rossum wrote: > Yes, the lawyers have a way of scaring us all, don't they. 
:-) I don't care about the lawyers, we have lawyers to do that (and they're *good*, just look at the court cases we've won against all odds :) but I do care about angry customers badmouthing the company I work for, which I like to think isn't an ordinary one :) > I hear clearly that you want the advanced xrange() behavior to > generate a warning before I take it out. I still think that's > unnecessary, given that nobody in their right mind uses it. But since > people who are out of their mind have access to lawyers too, you can > go ahead and restore the old code and stuff it with warnings. Make > sure to add a warning for every feature that I've taken out! Great, thanx, I will, right after I cancel my lawyer's appointment. ;) > (Do you think you'll need to add a warning to the __contains__ > implementation? Taking that away doesn't change the functionality, > but changes the *performance* from O(1) to O(n).) I hadn't even noticed you took that one out... I can't say I see much point in removing it, but I don't see a reason to add a warning for it. > Regarding the yield statement: I'd love to require a future statement, > but the current support for future statements doesn't support > modifying the parser based on the presence of future statements, and I > don't know how to resolve that, short of totally rewriting the parser > or scanning ahead looking for a future statement with some regular > expression. Aha. Hrm... > Sobering thought: It's possible, given all the other changes that I'm > thinking about, that it just won't be possible to make Python 2.2 > fully backwards compatible. Should we rename it to 3.0? Forget about > the changes? Label it as experimental and encourage ISPs to install > it as an "alternative" version, only available by using "python2.2"? Well... hrm... Iterators, generators and the type/class unification strike me as more than enough reason to call it Python 3.0. 
Or we could ship 2.2 with iterators, but not the other features, warn against identifiers called 'yield' in that one, and ship 3.0 not long after. I have to admit I object less to adding 'yield' without warning than removing advanced xrange features, for two reasons: a new keyword breaks at compilation time, whereas missing xrange features appear at runtime, and secondly, I *like* generators :) A new parser that handles keywords more gracefully would also be an excellent reason for a 3.0 version number :-) > PS: I am beginning to believe that the ThreadingTCPServer / > SocketServer problems reported on SF are serious enough to warrant > fixing in 2.1.1. I'll try to get to the bottom of it ASAP, but if > someone else could look into this I'd be grateful too. Unsure which problems those are, but I'll keep an eye open for it (I'm going through the CVS logs, now that I figured out how to get them working, and the SF bug/patch database in the coming week.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From jack@oratrix.nl Sat Jul 7 22:43:37 2001 From: jack@oratrix.nl (Jack Jansen) Date: Sat, 07 Jul 2001 23:43:37 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: Message by Guido van Rossum , Sat, 07 Jul 2001 14:14:31 -0400 , <200107071814.f67IEWT18834@odiug.digicool.com> Message-ID: <20010707214342.DEFF0DA742@oratrix.oratrix.nl> Recently, Guido van Rossum said: > Sobering thought: It's possible, given all the other changes that I'm > thinking about, that it just won't be possible to make Python 2.2 > fully backwards compatible. Should we rename it to 3.0? Forget about > the changes? Label it as experimental and encourage ISPs to install > it as an "alternative" version, only available by using "python2.2"? In this respect you should also think of the people Embedding/extending Python. 
From the checkin messages I get the impression that all the new inheritance stuff could well break things there, and if you're going to break, say, pyapache or somesuch then a major version jump may well be called for... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From barry@digicool.com Sun Jul 8 01:19:08 2001 From: barry@digicool.com (Barry A. Warsaw) Date: Sat, 7 Jul 2001 20:19:08 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> Message-ID: <15175.42748.753199.958152@anthem.wooz.org> >>>>> "TW" == Thomas Wouters writes: TW> Well... hrm... Iterators, generators and the type/class TW> unification strike me as more than enough reason to call it TW> Python 3.0. I think this is something to seriously consider. Especially because I suspect that the types/class stuff may be rather green at first, and (as Guido implied) may not be able to be done in a backwards compatible way. Bumping the rev number to 3.0 also makes me a little more comfortable with adding stuff like the yield keyword with no future statement. /If/ we do that, then we shouldn't necessarily abandon the 2.x series immediately. We can do things like work on performance improvements, library enhancements, and bug fixes. This strategy might also calm the fears about Python-the-language moving too quickly. 
-Barry From akuchlin@mems-exchange.org Sun Jul 8 02:35:09 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Sat, 7 Jul 2001 21:35:09 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: <20010707213158.K8098@xs4all.nl>; from thomas@xs4all.net on Sat, Jul 07, 2001 at 09:31:58PM +0200 References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> Message-ID: <20010707213508.A16251@ute.cnri.reston.va.us> On Sat, Jul 07, 2001 at 09:31:58PM +0200, Thomas Wouters wrote: >Well... hrm... Iterators, generators and the type/class unification strike >me as more than enough reason to call it Python 3.0. Or we could ship 2.2 Agreed. The version number being a few decimal place shifts away from Python 3000 is cute, too. --amk From guido@digicool.com Sun Jul 8 12:45:14 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 08 Jul 2001 07:45:14 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: Your message of "Sat, 07 Jul 2001 21:35:09 EDT." <20010707213508.A16251@ute.cnri.reston.va.us> References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> Message-ID: <200107081145.f68BjE824353@odiug.digicool.com> > On Sat, Jul 07, 2001 at 09:31:58PM +0200, Thomas Wouters wrote: > >Well... hrm... Iterators, generators and the type/class unification strike > >me as more than enough reason to call it Python 3.0. Or we could ship 2.2 > > Agreed. The version number being a few decimal place shifts away from > Python 3000 is cute, too. > > --amk Well, that's one of the reasons why I *don't* want this to be the 3.0 release. Python 2.2 is *not* Python 3000, it is only a small step on the way. I also think that as soon as we announce something that smells like Py3k to the users, there will be a huge effort to keep Python 2.x alive. 
This could cause a split in the user community of gigantic proportions, and we'd run the risk that most of the users would stay at Python 2.x forever. This in turn would require us to maintain that, probably release 2.2, 2.3 and further versions. Despite what started this discussion, I think there will only be a very small number of real incompatibilities between 2.1 and 2.2: one or two new keywords (and we may have a way to reduce this to zero by using a future or directive statement), and the object introspection API will change. I'm not planning on breaking classic classes in any significant way -- that will be reserved for 2.3 or later (this is the domain of PEP 254 which is deliberately empty so far). Q. If an operation that failed with an AttributeError now fails with a TypeError (or the other way around), how important is that incompatibility? --Guido van Rossum (home page: http://www.python.org/~guido/) From esr@thyrsus.com Sat Jul 7 21:01:32 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Sat, 7 Jul 2001 16:01:32 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: <200107081145.f68BjE824353@odiug.digicool.com>; from guido@digicool.com on Sun, Jul 08, 2001 at 07:45:14AM -0400 References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com> Message-ID: <20010707160132.A8791@thyrsus.com> Guido van Rossum :

> Q. If an operation that failed with an AttributeError now fails with a
> TypeError (or the other way around), how important is that
> incompatibility?

Not very, in my opinion. I don't believe I've ever coded an except for either of them. -- Eric S. Raymond The abortion rights and gun control debates are twin aspects of a deeper question --- does an individual ever have the right to make decisions that are literally life-or-death? And if not the individual, who does?
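Not everybody can afford to be that relaxed about it, though. Code that probes an object for a capability can insulate itself from the AttributeError/TypeError shuffle by catching both; a sketch (the helper name is made up for illustration):

```python
class C:
    pass

def supports_len(obj):
    # Probe for len() support, catching both exception types, since
    # which one is raised has varied from release to release.
    try:
        len(obj)
    except (TypeError, AttributeError):
        return False
    return True
```

With this, `supports_len([1, 2, 3])` is True and `supports_len(C())` is False, whichever of the two exceptions the release of the day happens to raise.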
From fredrik@pythonware.com Sun Jul 8 13:41:23 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sun, 8 Jul 2001 14:41:23 +0200 Subject: [Python-Dev] Re: changing AttributeError to TypeError References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com> Message-ID: <007101c107ad$27fccc60$4ffa42d5@hagrid> guido wrote:

> Q. If an operation that failed with an AttributeError now fails with a
> TypeError (or the other way around), how important is that
> incompatibility?

what operations do you have in mind?

cd Lib
grep "except.*\(AttributeError\|TypeError\)" *.py */*.py */*/*.py

gives me about 75 hits in the 2.0 standard library; looks like all but one would break if you changed *all* attribute errors to type errors, and vice versa... if this change doesn't affect any code in the standard library, chances are that it'll only break a few of the ~1000 uses I found in my company's code repository... From gward@python.net Mon Jul 9 00:14:01 2001 From: gward@python.net (Greg Ward) Date: Sun, 8 Jul 2001 19:14:01 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: <200107081145.f68BjE824353@odiug.digicool.com>; from guido@digicool.com on Sun, Jul 08, 2001 at 07:45:14AM -0400 References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com> Message-ID: <20010708191401.B779@gerg.ca> On 08 July 2001, Guido van Rossum said:

> Q. If an operation that failed with an AttributeError now fails with a
> TypeError (or the other way around), how important is that
> incompatibility?
I generally think of those exceptions as meaning, "You've got a bug in your code, bozo" so I don't bother catching them (except in the main loop of GUIs and servers, to show a big scary traceback to the poor user or dump it in a logfile). However, I think that AttributeError is pretty aptly used for the most part, and I don't see a great benefit in changing an incorrect "thing.property" to raise TypeError. Greg -- Greg Ward - Linux geek gward@python.net http://starship.python.net/~gward/ God is real, unless declared integer. From tim.one@home.com Mon Jul 9 00:44:22 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 8 Jul 2001 19:44:22 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: <20010708191401.B779@gerg.ca> Message-ID:

[Greg Ward]
> However, I think that AttributeError is pretty aptly used for the most
> part, and I don't see a great benefit in changing an incorrect
> "thing.property" to raise TypeError.

"The problem" comes up again and again, and in every release (this isn't something new!), in the specific context of instance objects. Like what do you do for

class C:
    pass

c = C()
print len(c)

? Instance objects have *every* interesting tp_xxx slot filled in "just in case", so the len() implementation code that first checks for the existence of tp_as_sequence or tp_as_mapping says "NO!" to len(3) but "YES!" to len(c). The result is that len(3) produces

TypeError: len() of unsized object

but len(c) eventually gets around to raising the superficially different

AttributeError: C instance has no attribute '__len__'

instead. So in cases like this TypeError really means "and it's damned obvious", while AttributeError means "but it might have been otherwise had you defined your class differently, but you didn't, and I'm not exactly sure *why* someone is asking me for "__len__", so the safest thing to say is that I don't have such an attribute".
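For what it's worth, on a modern interpreter, where classic classes are gone, this particular asymmetry has been ironed out: both cases fail with TypeError. A small probe makes the point (the helper is made up for illustration; Python 3 syntax):

```python
class C:
    pass

def len_error(obj):
    # Report the name of the exception class that len() raises
    # for obj, or None if len() succeeds.
    try:
        len(obj)
    except Exception as exc:
        return type(exc).__name__
    return None
```

Here `len_error(3)` and `len_error(C())` both return 'TypeError'; on 2.1 the latter was an AttributeError, which is exactly the accident under discussion.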
Then the problem is that "just try it and see whether it works" code gets written based on trying an example in a shell, to see whether AttributeError or TypeError gets raised in the specific case the author is worried about, and in a later release it raises the other one instead. As an old-time Pythoneer, I took the almost total lack of "which exceptions get raised when, exactly" docs as a warning that this stuff was *expected* to change frequently, so I've always written "try it and see" code via

try:
    it._and(see)
except (TypeError, AttributeError):
    pass

That almost never "breaks" across releases. Here's a specific 2.1 vs 2.2 example:

>>> for i in C(): pass # 2.1
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: C instance has no attribute '__getitem__'
>>>
>>> for i in C(): pass # 2.2a0
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: iter() of non-sequence
>>>

Since for-loops no longer require __getitem__ at all, even if we *could* raise the same error in 2.2, it wouldn't make *sense* in 2.2. In every case I've seen, a switch from AttributeError to TypeError makes better sense in the end. Can break code, though! From guido@digicool.com Mon Jul 9 01:33:15 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 08 Jul 2001 20:33:15 -0400 Subject: [Python-Dev] Re: changing AttributeError to TypeError In-Reply-To: Your message of "Sun, 08 Jul 2001 14:41:23 +0200." <007101c107ad$27fccc60$4ffa42d5@hagrid> References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com> <007101c107ad$27fccc60$4ffa42d5@hagrid> Message-ID: <200107090033.f690XFg24570@odiug.digicool.com>

> guido wrote:
>
> > Q. If an operation that failed with an AttributeError now fails with a
> > TypeError (or the other way around), how important is that
> > incompatibility?
>
> what operations do you have in mind?
The specific example was this:

class C: pass
list(C())

The second line used to raise AttributeError: 'C' instance has no attribute '__len__'; now it raises TypeError: iter() of non-sequence. But I imagine there will be others, caused by the different (IMO better) way of implementing getattr for most built-in types. (Note that I'm hardly touching "classic" classes -- that's a post-2.2 job if there ever was one.)

> cd Lib
> grep "except.*\(AttributeError\|TypeError\)" *.py */*.py */*/*.py
>
> gives me about 75 hits in the 2.0 standard library; looks like all but
> one would break if you changed *all* attribute errors to type errors,
> and vice versa...
>
> if this change doesn't affect any code in the standard library,
> chances are that it'll only break a few of the ~1000 uses I found
> in my company's code repository...

Not clear what that means... I tend to fix the test suite when it tests for too specific an error. I don't think there are many cases in the library proper that are sensitive to the kind of thing that might change. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Mon Jul 9 01:40:47 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 08 Jul 2001 20:40:47 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: Your message of "Sun, 08 Jul 2001 19:14:01 EDT." <20010708191401.B779@gerg.ca> References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com> <20010708191401.B779@gerg.ca> Message-ID: <200107090040.f690elP24596@odiug.digicool.com>

> On 08 July 2001, Guido van Rossum said:
> > Q. If an operation that failed with an AttributeError now fails with a
> > TypeError (or the other way around), how important is that
> > incompatibility?
Greg Ward: > I generally think of those exceptions as meaning, "You've got a bug in > your code, bozo" so I don't bother catching them (except in the main > loop of GUIs and servers, to show a big scary traceback to the poor user > or dump it in a logfile). That's my view on them too. > However, I think that AttributeError is pretty aptly used for the most > part, and I don't see a great benefit in changing an incorrect > "thing.property" to raise TypeError. Fortunately, that wasn't what I attempted to propose. As I mentioned in my reply to Fredrik, there are/were some cases where you get a surprise AttributeError because a type inconsistency reveals itself when an object doesn't support a required operation. This can go either way: what used to be an AttributeError may become a TypeError, or vice versa. (Sorry, no concrete examples right now besides the previous list(C()) example.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gward@python.net Mon Jul 9 02:14:22 2001 From: gward@python.net (Greg Ward) Date: Sun, 8 Jul 2001 21:14:22 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: ; from tim.one@home.com on Sun, Jul 08, 2001 at 07:44:22PM -0400 References: <20010708191401.B779@gerg.ca> Message-ID: <20010708211422.A1546@gerg.ca> On 08 July 2001, Tim Peters said: > "The problem" comes up again and again, and in every release (this isn't > something new!), in the specific context of instance objects. Like what do > you do for > > class C: > pass > > c = C() > print len(c) Good point -- this is one place where AttributeError is misused and confusing. +1 on changing it to TypeError -- this sounds like a definite usability increase. (IOW, it won't break *my* code. 
;-) Greg -- Greg Ward - Unix nerd gward@python.net http://starship.python.net/~gward/ I repeat myself when under stress I repeat myself when under stress I repeat--- From tim.one@home.com Mon Jul 9 03:40:19 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 8 Jul 2001 22:40:19 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: <20010708211422.A1546@gerg.ca> Message-ID:

[Tim]
> class C:
>     pass
>
> c = C()
> print len(c)

[Greg Ward]
> Good point -- this is one place where AttributeError is misused and
> confusing. +1 on changing it to TypeError -- this sounds like a
> definite usability increase. (IOW, it won't break *my* code. ;-)

We're not actually *proposing* to change anything; in fact, that specific example works the same in 2.2a0 (even with the type/class changes) as in 2.1. The problem is that which of {TypeError, AttributeError} you get when a specific object doesn't support a specific operation is at least partly an accident, and changes from time to time whether or not intended. Since instance objects have always been the flakiest in this respect, and the instance/class machinery is undergoing radical surgery on descr-branch (in particular, classes are themselves becoming instances (of metaclasses)), I think Guido is trying to get a feel for how loudly people will howl if we don't add reams of obscure code seeking to reproduce old accidents exactly. it's-not-whether-they'll-howl-it's-the-volume-ly y'rs - tim From tim.one@home.com Mon Jul 9 04:51:30 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 8 Jul 2001 23:51:30 -0400 Subject: [Python-Dev] Python and e-art In-Reply-To: <200107061335.PAA17961@core.inf.ethz.ch> Message-ID:

[Samuele Pedroni]
> Hi. For the curious I just discovered this (maybe someone knew
> that already).
>
> Isn't python incredible .
> > A group of e-artists has presented an e-art "virus" biennale.py > written in python: > > http://www.0100101110101101.org/home/biennale_py/ > > at the Biennale, the famous international contemporary art > exposition and gathering in Venezia. > > It seems a t-shirt with the source code is available too. Ya, and last week Python-Help got its first question about how concerned Python users should be about this. It's a cute and silly "virus": it's just a bit of Python code that reads its own source code from disk (up to a "stop here!" marker), looks for some other Python files, and prepends itself to them. Thus the files it alters will (probably) do the same kind of thing when *they're* run; and so on. The infected files clearly say that they're infected (in comments), and the "stop here!" marker makes it easy to remove the mutation later. All in all, it's more an example of marketing savvy than virus technology. At the Biennale, their "exhibit" is simply a computer infected with this virus. An article in Wired said they managed to sucker 3 people so far into paying something like $1000.00 a pop for a CD containing the virus source code. all's-fair-in-war-and-art-ly y'rs - tim From barry@digicool.com Mon Jul 9 05:20:46 2001 From: barry@digicool.com (Barry A. Warsaw) Date: Mon, 9 Jul 2001 00:20:46 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 References: <20010708211422.A1546@gerg.ca> Message-ID: <15177.12574.96784.801833@anthem.wooz.org> >>>>> "TP" == Tim Peters writes: TP> Since instance objects have always been the flakiest in this TP> respect, and the instance/class machinery is undergoing TP> radical surgery on descr-branch (in particular, classes are TP> themselves becoming instances (of metaclasses)), I think Guido TP> is trying to get a feel for how loudly people will howl if we TP> don't add reams of obscure code seeking to reproduce old TP> accidents exactly. 
As you say, this has always been flaky, inconsistent, underspecified, and unpredictable, so IMO Guido's free to change this kind of thing as he sees fit. Builtins like list() or len() which implicitly do attribute access under the covers should be free to raise either exception, and good defensive programs have already probably been catching both. I know there's no danger in changing the behavior for an explicit instance.attr access. We all agree that that should always raise AttributeError, right? :) -Barry From tim.one@home.com Mon Jul 9 06:08:50 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 9 Jul 2001 01:08:50 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: <15177.12574.96784.801833@anthem.wooz.org> Message-ID: [Barry A. Warsaw] > ... > I know there's no danger in changing the behavior for an explicit > instance.attr access. We all agree that that should always raise > AttributeError, right? :) Please, let's be serious here. How an instance looks up attributes is obviously a policy of the instance's class, and class policies are obviously set by the class of which the instance's class is an instance, or, in other words, by the instances's class's metaclass. Unless you want to say that class policies are inherited from base classes, in which case an entirely different line of obvious argument obviously applies -- but that would be wrong. Now if the metaclass is of type type, then all you have to do is look at PyType_Type.tp_getattr == type_getattr, and we see that it raises AttributeError unless the attribute is one of "__name__" or "__doc__" or "__members__". So, yes, instance.attr will *always* raise AttributeError in this case, because PyType_Type doesn't allow for the existence of any attribute named "attr". From that we deduce that looking up instance attributes is probably not a class policy determined by the metaclass after all, so some other obvious argument must apply. 
I'll get back to you after rereading all the PEPs. But it would be better for all if you didn't ask such obvious questions to begin with . thinking-too-much-is-a-symptom-of-disease-ly y'rs - tim From fredrik@pythonware.com Mon Jul 9 09:11:02 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 9 Jul 2001 10:11:02 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 References: <20010708211422.A1546@gerg.ca> <15177.12574.96784.801833@anthem.wooz.org> Message-ID: <01df01c1084e$b3921090$4ffa42d5@hagrid> barry wrote: > As you say, this has always been flaky, inconsistent, underspecified, > and unpredictable

>>> class C: pass
...
>>> c = C()
>>> len(c)
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'

looks pretty predictable to me...
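[Editorial aside, not part of the archived thread: the defensive idiom Barry describes above, catching both exceptions because which one you get was historically an accident, can be sketched as follows. The `safe_len` helper is illustrative, not from the thread; in today's Python this particular case settled on TypeError.]

```python
class C:
    pass

def safe_len(obj):
    # Defensive idiom from the thread: which of TypeError/AttributeError
    # an unsupported operation raised was historically an accident of the
    # implementation, so good defensive code caught both.
    try:
        return len(obj)
    except (TypeError, AttributeError):
        return None

print(safe_len(C()))       # None: C defines no __len__
print(safe_len([1, 2]))    # 2
```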
From fredrik@pythonware.com Mon Jul 9 09:08:59 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 9 Jul 2001 10:08:59 +0200 Subject: [Python-Dev] Re: changing AttributeError to TypeError References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com> <007101c107ad$27fccc60$4ffa42d5@hagrid> <200107090033.f690XFg24570@odiug.digicool.com> Message-ID: <01dc01c1084e$b33aefe0$4ffa42d5@hagrid> guido wrote: > > > Q. If an operation that failed with an AttributeError now fails with a > > > TypeError (or the other way around), how important is that > > > incompatibility? > > > what operations do you have in mind? > > The specific example was this: > > class C: pass > list(C()) > > The second line used to raise AttributeError: 'C' instance has no > attribute '__len__'; now it raises TypeError: iter() of non-sequence. so "an operation" in your original question is limited to operations that may have resulted in an AttributeError or a TypeError depending on the type, and the change means that they will now be more consistent? doesn't sound too bad to me. > > gives me about 75 hits in the 2.0 standard library; looks like all but > > one would break if you changed *all* attribute errors to type errors, > > and vice versa... > > > > if this change doesn't affect any code in the standard library, > > changes are that it'll only break a few of the ~1000 uses I found > > in my company's code repository... > > Not clear what that means... the second sentence should have been: on the other hand, if this change DOESN'T affect any code in the standard library, chances are that it'll only break a few of the ~1000 uses I found in my company's code repository... > I tend to fix the test suite when it tests for too specific an error. > I don't think there are many cases in the library proper that are > sensitive to the kind of thing that might change. 
have you made ANY changes to the library this far? From thomas@xs4all.net Mon Jul 9 13:31:04 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 9 Jul 2001 14:31:04 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 In-Reply-To: <200107071814.f67IEWT18834@odiug.digicool.com> References: <20010705155447.C8098@xs4all.nl> <200107051507.f65F7Wf12155@odiug.digicool.com> <20010707190715.J8098@xs4all.nl> <200107071814.f67IEWT18834@odiug.digicool.com> Message-ID: <20010709143104.R8098@xs4all.nl> On Sat, Jul 07, 2001 at 02:14:31PM -0400, Guido van Rossum wrote: > I hear clearly that you want the advanced xrange() behavior to > generate a warning before I take it out. I still think that's > unnecessary, given that nobody in their right mind uses it. But since > people who are out of their mind have access to lawyers too, you can > go ahead and restore the old code and stuff it with warnings. Make > sure to add a warning for every feature that I've taken out! Done:

>>> (xrange(9)[8:7]*6 == xrange(5) or xrange(4).start or xrange(3)).tolist()
__main__:1: DeprecationWarning: xrange object slicing is deprecated; convert to list instead
__main__:1: DeprecationWarning: xrange object multiplication is deprecated; convert to list instead
__main__:1: DeprecationWarning: PyRange_New's 'repetitions' argument is deprecated
__main__:1: DeprecationWarning: xrange object comparision is deprecated; convert to list instead
__main__:1: DeprecationWarning: xrange object's 'start', 'stop' and 'step' attributes are deprecated
__main__:1: DeprecationWarning: xrange.tolist() is deprecated; use list(xrange) instead
[0, 1, 2]

Those are all the warnings I added: for PyRange_New's 'reps' argument, for slicing, multiplication, comparison (did you really mean to take it out?), the start/stop/step attributes, and tolist().
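[Editorial aside, not part of the archived message: for readers with a current interpreter, here is how the deprecated `xrange` idioms above map onto Python 3's `range`; some came back in supported form, others were dropped for good.]

```python
# Editorial sketch: modern range() equivalents of the deprecated
# xrange idioms listed in Thomas's warning session.
r = range(9)

# Slicing is supported again and returns another range.
assert list(r[8:7]) == []

# Multiplication never came back; convert to a list first.
assert list(range(3)) * 2 == [0, 1, 2, 0, 1, 2]

# The start/stop/step attributes returned in Python 3.
assert (r.start, r.stop, r.step) == (0, 9, 1)

# Comparison (since 3.3) is by the sequence represented.
assert range(0) == range(2, 2)

# tolist() is now spelled list(...).
assert list(range(3)) == [0, 1, 2]
```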
I did leave the range_concat function out, though, so the error

>>> xrange(1) + xrange(1)
Traceback (innermost last):
  File "", line 1, in ?
TypeError: cannot concatenate xrange objects

still changes into

>>> xrange(1) + xrange(1)
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: unsupported operand types for +

but I don't see a problem with that. I also left out the 'contains' implementation. Still-not-understanding-*why*--ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido@digicool.com Mon Jul 9 14:00:11 2001 From: guido@digicool.com (Guido van Rossum) Date: Mon, 09 Jul 2001 09:00:11 -0400 Subject: [Python-Dev] Re: changing AttributeError to TypeError In-Reply-To: Your message of "Mon, 09 Jul 2001 10:08:59 +0200." <01dc01c1084e$b33aefe0$4ffa42d5@hagrid> References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com> <007101c107ad$27fccc60$4ffa42d5@hagrid> <200107090033.f690XFg24570@odiug.digicool.com> <01dc01c1084e$b33aefe0$4ffa42d5@hagrid> Message-ID: <200107091300.f69D0Bo25004@odiug.digicool.com> > so "an operation" in your original question is limited to operations that > may have resulted in an AttributeError or a TypeError depending on the > type, and the change means that they will now be more consistent? Yes. > doesn't sound too bad to me. Me neither. :-) > > I tend to fix the test suite when it tests for too specific an error. > > I don't think there are many cases in the library proper that are > > sensitive to the kind of thing that might change. > > have you made ANY changes to the library this far? Can't recall.
--Guido van Rossum (home page: http://www.python.org/~guido/) From Samuele Pedroni Mon Jul 9 15:16:28 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Mon, 9 Jul 2001 16:16:28 +0200 (MET DST) Subject: [Python-Dev] Python and e-art Message-ID: <200107091416.QAA05547@core.inf.ethz.ch> [Tim Peters] > all's-fair-in-war-and-art-ly y'rs - tim and commerce] No, I was not much impressed by biennale.py; a program is not, as such, a work of art. I think something more along the lines of Perl (my bad) poetry would be better than a self-replicating juxtaposition of body, soul, fornicate, etc. (as identifiers). It's a poor representation of "sex" ... From gball@cfa.harvard.edu Mon Jul 9 15:50:16 2001 From: gball@cfa.harvard.edu (Greg Ball) Date: Mon, 9 Jul 2001 10:50:16 -0400 (EDT) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17 Message-ID: > but I don't see a problem with that. I also left out the 'contains' > implementation. > Still-not-understanding-*why*--ly y'rs, In the fine tradition of xrange, that 'contains' implementation is slightly broken. It doesn't have proper object equality semantics.

Python 2.1 (#1, Jul 4 2001, 14:48:37)
[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-81)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> r, xr = range(10), xrange(10)
>>> 1.1 in r
0
>>> 1.1 in xr
1
>>> 1+0j in r
1
>>> 1+0j in xr
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: can't convert complex to int; use e.g. int(abs(z))

--Greg Ball From guido@digicool.com Mon Jul 9 19:39:21 2001 From: guido@digicool.com (Guido van Rossum) Date: Mon, 09 Jul 2001 14:39:21 -0400 Subject: [Python-Dev] Re: changing AttributeError to TypeError In-Reply-To: Your message of "Mon, 09 Jul 2001 09:00:11 EDT."
<200107091300.f69D0Bo25004@odiug.digicool.com> References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com> <007101c107ad$27fccc60$4ffa42d5@hagrid> <200107090033.f690XFg24570@odiug.digicool.com> <01dc01c1084e$b33aefe0$4ffa42d5@hagrid> <200107091300.f69D0Bo25004@odiug.digicool.com> Message-ID: <200107091839.f69IdLT30186@odiug.digicool.com> I just noticed another place that swaps a TypeError for an AttributeError. In 2.1 and before, assigning to an attribute of an object that doesn't support attribute assignment (like a list) raises TypeError. Under the new scheme, this will raise AttributeError. (On the other hand, assigning to a read-only attribute of an object that *does* support attribute assignment raises TypeError in the old and new scheme.) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com (Skip Montanaro) Tue Jul 10 05:14:01 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 9 Jul 2001 23:14:01 -0500 Subject: [Python-Dev] Silly little benchmark Message-ID: <15178.33033.669776.824095@beluga.mojam.com> I don't know what motivated me to try this, but based on the print ``1`+`2`` thing that came up in c.l.py I came up with the following "benchmarks":

for i in xrange(100000): pass
for i in xrange(100000): x = 1
for i in xrange(100000): x = ``1`+`2``

user mode times on my computer (sys mode was always 0.0) were

                 Python 1.6    Python 2.1    change
pass                0.12          0.20        1.67x
x = 1               0.17          0.30        1.76x
x = ``1`+`2``       1.60          2.13        1.33x

Startup times (python -S -c 'pass') are 0.0 for both versions on my 'puter. It appears loop execution overhead has gotten substantially worse between 1.6 and 2.1. I know new stuff that will affect looping is going into 2.2 (the generator stuff), but it would seem a good time to reserve a minor version for mostly performance improvements. 2.3 perhaps?
Or do we wait for Armin's magic pixie dust to sprinkle down upon our heads so we can crush those Perl swine once and for all? ;-) at-least-on-linux-ly y'rs, Skip From tim.one@home.com Tue Jul 10 08:33:09 2001 From: tim.one@home.com (Tim Peters) Date: Tue, 10 Jul 2001 03:33:09 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15178.33033.669776.824095@beluga.mojam.com> Message-ID: [Skip Montanaro] > ... > I came up with the following "benchmarks":
>
> for i in xrange(100000): pass
> for i in xrange(100000): x = 1
> for i in xrange(100000): x = ``1`+`2``
>
> user mode times on my computer (sys mode was always 0.0) were
>
>                  Python 1.6    Python 2.1    change
> pass                0.12          0.20        1.67x
> x = 1               0.17          0.30        1.76x
> x = ``1`+`2``       1.60          2.13        1.33x

Please don't post stuff with hard tab characters (I took them out by hand so this wasn't an unreadable mess). > Startup times (python -S -c 'pass') are 0.0 for both versions on > my 'puter. It appears loop execution overhead has gotten substantially > worse between 1.6 and 2.1. AFAIK, nothing relevant changed between 1.6 and 2.1. Anyone else? Indeed, AFAIK, *nothing* plausibly relevant about for-loops or xrange has changed since 1.5 (when some general eval-loop speedups got done). > ... > but it would seem a good time to reserve a minor version for mostly > performance improvements. 2.3 perhaps? The loop speedup in 2.2 requires changes in the PVM as well as adopting the iterator protocol. If you've got some *easy* performance improvements, sure, but then I have to wonder why you've been holding them back . From fdrake@acm.org Tue Jul 10 17:27:12 2001 From: fdrake@acm.org (Fred L.
Drake) Date: Tue, 10 Jul 2001 12:27:12 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010710162712.D2A8F2892B@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Updated to reflect the recent checkins for the Python/C API manual, which cover a number of the object creation and initialization functions. From skip@pobox.com (Skip Montanaro) Tue Jul 10 18:23:15 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 10 Jul 2001 12:23:15 -0500 Subject: [Python-Dev] Silly little benchmark In-Reply-To: References: <15178.33033.669776.824095@beluga.mojam.com> Message-ID: <15179.14851.485783.763990@beluga.mojam.com>

>>                  Python 1.6    Python 2.1    change
>> pass                0.12          0.20        1.67x
>> x = 1               0.17          0.30        1.76x
>> x = ``1`+`2``       1.60          2.13        1.33x

Tim> Please don't post stuff with hard tab characters (I took them out Tim> by hand so this wasn't an unreadable mess). Damn! I meant to run untabify before posting (I knew you'd bitch about hard tabs ;-) but then I went and forgot. I just added an untab hook to my Emacs mail-send-hooks, so this shouldn't happen in the future. Tim> The loop speedup in 2.2 requires changes in the PVM as well as Tim> adopting the iterator protocol. If you've got some *easy* Tim> performance improvements, sure, but then I have to wonder why Tim> you've been holding them back . I've not been holding anything back. Like I said, I don't know what made me take the 10 minutes right then to whip up a couple trivial benchmarks. (Bored, I guess.) I'll try playing around to see what I can dig up. Skip From esr@snark.thyrsus.com Tue Jul 10 18:40:30 2001 From: esr@snark.thyrsus.com (Eric S. Raymond) Date: Tue, 10 Jul 2001 13:40:30 -0400 Subject: [Python-Dev] Leading with XML-RPC Message-ID: <200107101740.f6AHeUC21223@snark.thyrsus.com> I just got off the phone with Dave Winer, the designer of the Frontier scripting language.
Dave is concerned that the open-source community's response to Microsoft's .NET and and Hailstorm proposals isn't active enough; he views Miguel de Icaza's MONO proposal as good thing but essentially playing catch-up with a Microsoft-defined standard. Dave suggests that the open-source community can turn up the heat on Microsoft by visibly supporting and promoting open RPC standards that compete with .NET, such as XML-RPC and SOAP 1.1. He thinks that the implementors of scripting languages like Perl and Python are in a particularly good position to make this happen, by making XML-RPC and/or SOAP 1.1 fully documented parts of their standard libraries. I agree with both parts of Dave's assessment, and am willing to put my own effort into making it happen by doing some of the integration work. Therefore the concrete proposal: we should make XML-RPC support in the Python standard library a goal for 2.2. I'd like to see votes and/or a BDFL pronouncement on this goal. I've copied Fredrik Lundh and Eric Kidd, the implementors of two XML-RPC implementations that might serve. Dave (who designed XML-RPC) likes them both. I hope they'll report on which, if either, they consider production-ready for integration with Python. -- Eric S. Raymond Idealism is the noble toga that political gentlemen drape over their will to power. -- Aldous Huxley From Petra_Recter@prenhall.com Tue Jul 10 18:28:33 2001 From: Petra_Recter@prenhall.com (Petra_Recter@prenhall.com) Date: 10 Jul 2001 13:28:33 -0400 Subject: [Python-Dev] Publisher seeking technical reviewers for books on Python Message-ID: <"/GUID:Qzwpffjx11RGZOABgCI2PYQ*/G=Petra/S=Recter/OU=exchange/O=pearsontc/PRMD=pearson/ADMD=telemail/C=us/"@MHS> Prentice Hall, a leading college publisher, is seeking knowledgeable python programmers to review chapters from technical computer science books. The chapters are posted on an ftp site and reviewers are asked to download the chapters, print it out and make comments on the hard copy. 
The requested turnaround time is approximately 3 days per chapter. The token honorarium we are offering is $75 per chapter reviewed. If you are interested, please contact Petra Recter (petra_recter@prenhall.com) and include your resume. Thanks, Petra Petra Recter Senior Acquisitions Editor, Computer Science Prentice Hall One Lake Street - #3F66 Upper Saddle River, NJ 07458 Email: petra_recter@prenhall.com Tel: (201) 236-7186 Fax: (201) 236-7170 From skip@pobox.com (Skip Montanaro) Tue Jul 10 18:52:27 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 10 Jul 2001 12:52:27 -0500 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <200107101740.f6AHeUC21223@snark.thyrsus.com> References: <200107101740.f6AHeUC21223@snark.thyrsus.com> Message-ID: <15179.16603.865875.309079@beluga.mojam.com> Eric> Therefore the concrete proposal: we should make XML-RPC support in Eric> the Python standard library a goal for 2.2. I'd like to see votes Eric> and/or a BDFL pronouncement on this goal. +1 from me. I use a slightly doctored version of /F's 0.9.8 version of xmlrpclib (current version is, I think, 0.9.9). Perhaps inclusion in the Python core would be a good reason to bump the version number to 1.0. The only potential problem I see is that nagging "gotta be ASCII for interoperability" bug up Dave W's butt. It flies in the face of attempts to make Python more Unicode-friendly. Skip From guido@digicool.com Tue Jul 10 19:10:59 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 10 Jul 2001 14:10:59 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: Your message of "Tue, 10 Jul 2001 13:40:30 EDT." <200107101740.f6AHeUC21223@snark.thyrsus.com> References: <200107101740.f6AHeUC21223@snark.thyrsus.com> Message-ID: <200107101811.f6AIAx312199@odiug.digicool.com> > Therefore the concrete proposal: we should make XML-RPC support in the > Python standard library a goal for 2.2. I'd like to see votes and/or > a BDFL pronouncement on this goal. 
> > I've copied Fredrik Lundh and Eric Kidd, the implementors of two > XML-RPC implementations that might serve. Dave (who designed XML-RPC) > likes them both. I hope they'll report on which, if either, they > consider production-ready for integration with Python. Fredrik Lundh's xmlrpclib.py looks ready for the Python standard library, if Fredrik agrees. The license is right. I'm not sure but I believe that Eric Kidd's version is C or C++ code that *could* be linked into Python? This seems less attractive because there will always have to be a separate distribution (for non-Python targets). But maybe the motivation is wrong. We should decide to include (or not to include) xml-rpc based on a user need, not based on political motives. There may be a user need; Fredrik, do you know how popular your xmlrpc module is? Technical issues: should the server stubs also be included? It might benefit from also including the sgmlop.c extension. --Guido van Rossum (home page: http://www.python.org/~guido/) From eric.kidd@pobox.com Tue Jul 10 19:12:36 2001 From: eric.kidd@pobox.com (Eric Kidd) Date: Tue, 10 Jul 2001 14:12:36 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <15179.16603.865875.309079@beluga.mojam.com>; from skip@pobox.com on Tue, Jul 10, 2001 at 12:52:27PM -0500 References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> Message-ID: <20010710141236.C9416@h00104b370897.ne.mediaone.net> On Tue, Jul 10, 2001 at 12:52:27PM -0500, Skip Montanaro wrote: > > Eric> Therefore the concrete proposal: we should make XML-RPC support in > Eric> the Python standard library a goal for 2.2. I'd like to see votes > Eric> and/or a BDFL pronouncement on this goal. > > +1 from me. I use a slightly doctored version of /F's 0.9.8 version of > xmlrpclib (current version is, I think, 0.9.9). Perhaps inclusion in the > Python core would be a good reason to bump the version number to 1.0. 
I recommend using /F's library, too--it's just a tiny snippet of native Python code. My library, although nice, is intended for C programmers, and needlessly duplicates a lot of Python functionality. It has its own data model (based on Python's), UTF-8 processing (based on Python's), structure builder (based on Python's), and so on. You get the picture. > The only potential problem I see is that nagging "gotta be ASCII for > interoperability" bug up Dave W's butt. It flies in the face of attempts to > make Python more Unicode-friendly. Fredrik's library supports Unicode. My library supports Unicode. The Java libraries all support Unicode. And furthermore, we all appear to have interop. On a related note: XML-RPC is easy to implement, but a bit of a niche. SOAP, on the other hand, is hard to implement but widely used. But there's a third option--"SOAP BDG" ("Busy Developer's Guide"). Dave Winer and his employees prepared a short summary of the SOAP specification--leaving out many of the vaguer features--and convinced many people to support this feature set. So if you read the SOAP BDG paper and implement it, you can interoperate with many, many commercial SOAP stacks. So either XML-RPC or SOAP BDG would be good strategic options. Cheers, Eric From guido@digicool.com Tue Jul 10 19:16:51 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 10 Jul 2001 14:16:51 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: Your message of "Tue, 10 Jul 2001 14:12:36 EDT." <20010710141236.C9416@h00104b370897.ne.mediaone.net> References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> Message-ID: <200107101817.f6AIGtr12278@odiug.digicool.com> > So either XML-RPC or SOAP BDG would be good strategic options. Or both? And how does WebDAV fit in this picture? That's another open protocol that Python could easily support out of the box.
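[Editorial aside, not part of the archived thread: xmlrpclib did land in the standard library with Python 2.2, and its descendant lives in today's `xmlrpc` package. A minimal loopback round trip, sketched for modern readers; the `add` function and the ephemeral-port server are illustrative:]

```python
# Editorial sketch: the xmlrpclib discussed in this thread became the
# standard library's xmlrpc package. Register a function on a loopback
# server, call it through a proxy, and check the round trip.
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Port 0 asks the OS for a free ephemeral port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

port = server.server_address[1]
with ServerProxy("http://127.0.0.1:%d" % port) as proxy:
    result = proxy.add(2, 3)

server.shutdown()
server.server_close()
print(result)  # -> 5
```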
--Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Tue Jul 10 19:19:26 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Tue, 10 Jul 2001 14:19:26 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <15179.16603.865875.309079@beluga.mojam.com>; from skip@pobox.com on Tue, Jul 10, 2001 at 12:52:27PM -0500 References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> Message-ID: <20010710141926.E2528@ute.cnri.reston.va.us> On Tue, Jul 10, 2001 at 12:52:27PM -0500, Skip Montanaro wrote: > Eric> Therefore the concrete proposal: we should make XML-RPC support in > Eric> the Python standard library a goal for 2.2. I'd like to see votes > Eric> and/or a BDFL pronouncement on this goal. +0, I think. Having the module available might lead people to make more services available through XML-RPC. My misgiving is that XML-RPC is pretty limited, the lack of support for None being particularly painful to a Python programmer. Perhaps, if we can only have one, SOAP would be better, but I haven't used SOAP seriously for anything yet. --amk From esr@thyrsus.com Tue Jul 10 19:47:35 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Tue, 10 Jul 2001 14:47:35 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <20010710141236.C9416@h00104b370897.ne.mediaone.net>; from eric.kidd@pobox.com on Tue, Jul 10, 2001 at 02:12:36PM -0400 References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> Message-ID: <20010710144735.A22087@thyrsus.com> Eric Kidd : > On a related note: XML-RPC is easy to implement, but a bit of niche. SOAP, > on the other, is hard to implement but widely used. But there's a third > option--"SOAP BDG" ("Busy Developer's Guide"). 
> > Dave Winer and his employees prepared a short summary of the SOAP > specification--leaving out many of the vaguer features--and convinced many > people to support this feature set. So if you read the SOAP BDG paper and > implement it, you can interoperate with many, many commercial SOAP stacks. > > So either XML-RPC or SOAP BDG would be good strategic options. Are there, as yet, any SOAP-BDG implementations we could use? -- Eric S. Raymond He that would make his own liberty secure must guard even his enemy from oppression: for if he violates this duty, he establishes a precedent that will reach unto himself. -- Thomas Paine From esr@thyrsus.com Tue Jul 10 19:56:32 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Tue, 10 Jul 2001 14:56:32 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <200107101811.f6AIAx312199@odiug.digicool.com>; from guido@digicool.com on Tue, Jul 10, 2001 at 02:10:59PM -0400 References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <200107101811.f6AIAx312199@odiug.digicool.com> Message-ID: <20010710145632.C22087@thyrsus.com> Guido van Rossum : > Fredrik Lundh's xmlrpclib.py looks ready for the Python standard > library, if Fredrik agrees. The license is right. Eric Kidd agrees, but Fredrik has not checked in yet. > But maybe the motivation is wrong. We should decide to include (or > not to include) xml-rpc based on a user need, not based on political > motives. There may be a user need; Fredrik, do you know how popular > your xmlrpc module is? There are good political reasons and bad political reasons. I think helping promote an open and well-designed RPC standard is a good political reason. And XML-RPC is very good work; I wouldn't be pushing it if I hadn't evaluated it myself and liked it a lot. One other thing that make Python support particularly appropriate is that Zope objects are XML-RPC accessible (or so I'm told; I have not tried this myself yet). > Technical issues: should the server stubs also be included? 
It might > benefit from also including the sgmlop.c extension. I would say yes to both. The code is there and it's tested. I'm willing to merge in the documentation. -- Eric S. Raymond The danger (where there is any) from armed citizens, is only to the *government*, not to *society*; and as long as they have nothing to revenge in the government (which they cannot have while it is in their own hands) there are many advantages in their being accustomed to the use of arms, and no possible disadvantage. -- Joel Barlow, "Advice to the Privileged Orders", 1792-93 From eric.kidd@pobox.com Tue Jul 10 20:00:48 2001 From: eric.kidd@pobox.com (Eric Kidd) Date: Tue, 10 Jul 2001 15:00:48 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <200107101811.f6AIAx312199@odiug.digicool.com>; from guido@digicool.com on Tue, Jul 10, 2001 at 02:10:59PM -0400 References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <200107101811.f6AIAx312199@odiug.digicool.com> Message-ID: <20010710150048.D9416@h00104b370897.ne.mediaone.net> On Tue, Jul 10, 2001 at 02:10:59PM -0400, Guido van Rossum wrote: > Fredrik Lundh's xmlrpclib.py looks ready for the Python standard > library, if Fredrik agrees. The license is right. I'm not sure but I > believe that Eric Kidd's version is C or C++ code that *could* be > linked into Python? This seems less attractive because there will > always have to be a separate distribution (for non-Python targets). I recommend using /F's library. It's less than a thousand lines of nice, clean Python, and it doesn't duplicate any code in the Python core. My library is quite a bit faster, but it contains lots of C code which duplicates Python features. The right solution is use /F's code. And if his code isn't fast enough, small sections can be rewritten in C without breaking the API. > But maybe the motivation is wrong. We should decide to include (or > not to include) xml-rpc based on a user need, not based on political > motives. 
There may be a user need; Fredrik, do you know how popular > your xmlrpc module is? Moderately popular, AFAIK--it's currently bundled with Zope, and it's one of the nicest XML-RPC libraries out there. I've actually used Fredrik's library in more projects than my own. This is probably because I'd rather program in Python than C. :-) Cheers, Eric From skip@pobox.com (Skip Montanaro) Tue Jul 10 20:43:07 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 10 Jul 2001 14:43:07 -0500 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <20010710144735.A22087@thyrsus.com> References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> Message-ID: <15179.23243.253539.305609@beluga.mojam.com> Eric> Are there, as yet, any SOAP-BDG implementations we could use? There was a SOAP.py module announced for Python recently: http://groups.google.com/groups?q=SOAP.py&hl=en&safe=off&rnum=2&ic=1&selm=mailman.990088321.2387.clpa-moderators%40python.org http://www.actzero.com/soap/SOAPpy.html I can't get to the download page at the moment though, so I can't tell where it falls on the spectrum between SOAP-BDG and SOAP. In my opinion supporting both XML-RPC and SOAP in the core library would be a good thing. It's sort of like PIL supporting both GIF and JPEG image files. Both have their uses. Skip From esr@thyrsus.com Tue Jul 10 20:52:46 2001 From: esr@thyrsus.com (Eric S. 
Raymond) Date: Tue, 10 Jul 2001 15:52:46 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <15179.23243.253539.305609@beluga.mojam.com>; from skip@pobox.com on Tue, Jul 10, 2001 at 02:43:07PM -0500 References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com> Message-ID: <20010710155246.B23638@thyrsus.com> Skip Montanaro : > In my opinion supporting both XML-RPC and SOAP in the core library would be > a good thing. It's sort of like PIL supporting both GIF and JPEG image > files. Both have their uses. +1. I think supporting XML-RPC is close to being a no-brainer at this point. What to do about SOAP is a less trivial question. -- Eric S. Raymond "Among the many misdeeds of British rule in India, history will look upon the Act depriving a whole nation of arms as the blackest." -- Mohandas Ghandhi, An Autobiography, pg 446 From paulp@ActiveState.com Tue Jul 10 20:56:29 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Tue, 10 Jul 2001 12:56:29 -0700 Subject: [Python-Dev] Leading with XML-RPC References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com> <20010710155246.B23638@thyrsus.com> Message-ID: <3B4B5DED.9AC9E280@ActiveState.com> I agree that XML-RPC is a no-brainer. I think it is too early for SOAP. We need to wait for real interop to shake out before we commit to a SOAP library. I don't see why waiting for SOAP should in any way dissuade us from putting in XML-RPC. They are different protocols used by different people in different projects, like POP and IMAP. -- Take a recipe. Leave a recipe. Python Cookbook! 
http://www.ActiveState.com/pythoncookbook From DavidA@ActiveState.com Tue Jul 10 21:55:30 2001 From: DavidA@ActiveState.com (David Ascher) Date: Tue, 10 Jul 2001 13:55:30 -0700 Subject: [Python-Dev] Leading with XML-RPC References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com> <20010710155246.B23638@thyrsus.com> Message-ID: <3B4B6BC2.74F4984D@ActiveState.com> "Eric S. Raymond" wrote: > > Skip Montanaro : > > In my opinion supporting both XML-RPC and SOAP in the core library would be > > a good thing. It's sort of like PIL supporting both GIF and JPEG image > > files. Both have their uses. > > +1. I think supporting XML-RPC is close to being a no-brainer at this point. > What to do about SOAP is a less trivial question. FYI: We use a slightly doctored version of /F's xmlrpclib, IIRC, as well as a slightly doctored version of /F's SOAP library. I'm +1 on adding both of those to the library, so that we can get rid of these various 'slight doctorings' =). I'm +1 on adding good DAV support as well, although I think that that will have to be through the addition of Neon. Greg's davlib.py isn't really industrial-strength, from what Greg tells me (e.g. no support for authentication), and I don't think Greg is spending much time on it. Alas, Neon is C code, and still in flux. Greg will speak up whenever he resurfaces =). If we had 'stubs' like Tcl, we could ship a Neon wrapper w/o Neon, which would be good. But we don't. =) Documentation is probably the bigger problem, though, as usual. However, as much as I like XML-RPC and SOAP and WebDAV, I don't know that adding support for these protocols will have much impact on the folks that are being exposed to the .NET story. SOAP, especially, works well with .NET, rather than competing with it. I don't think anyone wants to set up an "XML-RPC vs.
SOAP" war, that'd be pretty pointless. -- David Ascher As a PS, I'm all for adding support for these protocols to Python, but I don't see the relationship to 'turning up the heat on Microsoft'. I'd think there would be better ways of doing so, should you be so inclined =). [After reading Dave's piece on Mono, I understand why he thinks that would 'work' -- but now I think he underestimates the scope and depth of .NET -- adding interop through SOAP does not make the alternatives competitive -- see the discussion on language-dev]. From esr@thyrsus.com Tue Jul 10 22:10:08 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Tue, 10 Jul 2001 17:10:08 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <3B4B6BC2.74F4984D@ActiveState.com>; from DavidA@ActiveState.com on Tue, Jul 10, 2001 at 01:55:30PM -0700 References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com> <20010710155246.B23638@thyrsus.com> <3B4B6BC2.74F4984D@ActiveState.com> Message-ID: <20010710171008.A31430@thyrsus.com> David Ascher : > Documentation is probably the bigger problem, though, as usual. I'm willing to put some personal elbow grease into solving that problem. > However, as much as I like XML-RPC and SOAP and WebDAV, I don't know > that adding support for these protocols will have much impact on the > folks that are being exposed to the .NET story. SOAP, especially, works > well with .NET, rather than competing with it. I don't think anyone > wants to setup an "XML-RPC vs. SOAP" war, that'd be pretty pointless. Dave's theory (which I agree with) is that the open-source community as a whole can get ahead of Microsoft in things like identity services -- *if* there is uniform support for XML-RPC or SOAP in our development tools. 
He approached me because he thought I'd be able to get something moving in the Python world. He'll be talking with other people about Perl. -- Eric S. Raymond "Those who make peaceful revolution impossible will make violent revolution inevitable." -- John F. Kennedy

From tim@digicool.com Tue Jul 10 22:26:13 2001 From: tim@digicool.com (Tim Peters) Date: Tue, 10 Jul 2001 17:26:13 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15179.14851.485783.763990@beluga.mojam.com> Message-ID:

Here are results on Win2K. In part it just confirms that xrange(large_number) is a poor way to drive benchmarks (the overhead of creating and destroying gagilliblobs of unused integers is no help; OTOH, 2.0 certainly appears to be speedier at creating and destroying gagilliblobs of useless integers! a common cause for slowdowns of that nature is ill-considered "special case" optimizations that turn out to cost more than they save, although I have no particular reason to suspect that here). Note that Windows Python has an excellent clock() function (it's real time, not user time, and has better than microsecond resolution).

File skip.py:

N = 100000
TRIPS = 3

if 0:
    indices = xrange(N)   # common but ill-advised
else:
    indices = [None] * N  # better

def t1():
    for i in indices:
        pass

def t2():
    for i in indices:
        x = 1

def t3():
    for i in indices:
        x = ``1`+`2``

def timeit(f):
    from time import clock
    start = clock()
    f()
    finish = clock()
    return finish - start

for f, tag in (t1, "pass"), (t2, "x=1"), (t3, "x=``1`+`2``"):
    print "%-12s" % tag,
    # Warm up.
    f(); f(); f()
    for i in range(TRIPS):
        elapsed = timeit(f)
        print "%6.3f" % elapsed,
    print

"""
Results:

With indices = xrange(N)

C:\Code>\Python20\python.exe skip.py
pass          0.038  0.038  0.039
x=1           0.049  0.049  0.049
x=``1`+`2``   0.421  0.420  0.421

C:\Code>\Python21\python.exe skip.py
pass          0.042  0.042  0.042
x=1           0.053  0.053  0.053
x=``1`+`2``   0.456  0.456  0.455

C:\Code>python\dist\src\PCbuild\python skip.py  # CVS
pass          0.040  0.039  0.039
x=1           0.050  0.051  0.050
x=``1`+`2``   0.449  0.452  0.452

With indices = [None] * N instead:

C:\Code>\Python20\python.exe skip.py
pass          0.035  0.034  0.034
x=1           0.046  0.046  0.046
x=``1`+`2``   0.414  0.413  0.413

C:\Code>\Python21\python.exe skip.py
pass          0.037  0.037  0.037
x=1           0.048  0.048  0.048
x=``1`+`2``   0.451  0.448  0.453

C:\Code>python\dist\src\PCbuild\python skip.py  # CVS
pass          0.031  0.030  0.031
x=1           0.041  0.042  0.041
x=``1`+`2``   0.438  0.447  0.444
"""

From skip@pobox.com (Skip Montanaro) Tue Jul 10 23:27:10 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 10 Jul 2001 17:27:10 -0500 Subject: [Python-Dev] Silly little benchmark In-Reply-To: References: <15179.14851.485783.763990@beluga.mojam.com> Message-ID: <15179.33086.641542.664190@beluga.mojam.com>

One thing that occurs to me as I rebuild 1.6 is that it would be real nice to be able to query the interpreter for the compilation flags at runtime so I could be more certain I was comparing apples and apples. In my case, the last time I built 1.6 was June 2000, so I have no idea what my compilation flags were. I can tell by the startup message that it was compiled with gcc 2.95.3, but not what optimization flags were used.
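Skip's wish — asking the running interpreter how it was built — is answered a few messages later via distutils.sysconfig; in present-day Pythons the same information is exposed by the top-level sysconfig module. A minimal sketch, with the caveat that which variables exist is platform-dependent (Windows builds report None for most of these):

```python
import sysconfig

# Ask the running interpreter how it was built; each value is a
# string, or None when this platform's build doesn't define it.
for name in ("CC", "OPT", "CFLAGS"):
    value = sysconfig.get_config_var(name)
    print("%-6s = %r" % (name, value))
```

On a typical Unix build this prints the compiler and the optimization flags Skip was after; sysconfig.get_config_vars() returns the whole Makefile-derived dictionary at once.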
Skip

From skip@pobox.com (Skip Montanaro) Tue Jul 10 23:58:04 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 10 Jul 2001 17:58:04 -0500 Subject: [Python-Dev] Silly little benchmark In-Reply-To: References: <15179.14851.485783.763990@beluga.mojam.com> Message-ID: <15179.34940.785035.23896@beluga.mojam.com>

Tim> Note that Windows Python has an excellent clock() function (it's
Tim> real time, not user time, and has better than microsecond
Tim> resolution).

Real time doesn't mean much on an operating system that can juggle multiple tasks, no matter how quiescent you try to make it. Okay, so here are some hopefully more comparable numbers. I cvs up'd both Python 1.6 (release16 tag) and Python 2.1 (release21-maint tag) directories, reconfigured, executed make clean, then ran make and make install. The optimization/debug flags were the default: "-g -O2". Both were compiled with gcc 2.96.0.48mdk, the version of gcc that comes with Linux Mandrake 8.0.

Using the xrange(N) version:

% python1.6 -S skip.py
pass         0.090  0.090  0.090
x=1          0.110  0.120  0.120
x=``1`+`2``  1.080  1.070  1.060

% python2.1 -S skip.py
pass         0.090  0.100  0.090
x=1          0.110  0.120  0.110
x=``1`+`2``  1.700  1.680  1.700

Using the [None]*N version:

% python1.6 -S skip.py
pass         0.070  0.070  0.080
x=1          0.100  0.110  0.100
x=``1`+`2``  1.040  1.030  1.040

% python2.1 -S skip.py
pass         0.070  0.080  0.070
x=1          0.110  0.100  0.100
x=``1`+`2``  1.680  1.690  1.690

So, my observations about loop overhead were almost certainly artifacts of differences in the way the two interpreters were compiled. My apologies for that flub. It still appears there's a big slowdown between 1.6 and 2.1 in the back tic operations though. Aside: Tim, can I assume by your return address that Digital Creations finally gave you an office and you're not computing from some seedy motel room on US 1?
;-) -- Skip Montanaro (skip@pobox.com) (847)971-7098 From thomas@xs4all.net Wed Jul 11 00:04:57 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Wed, 11 Jul 2001 01:04:57 +0200 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15179.34940.785035.23896@beluga.mojam.com> References: <15179.14851.485783.763990@beluga.mojam.com> <15179.34940.785035.23896@beluga.mojam.com> Message-ID: <20010711010456.D8098@xs4all.nl> On Tue, Jul 10, 2001 at 05:58:04PM -0500, Skip Montanaro wrote: > Okay, so here are some hopefully more comparable numbers. I cvs up'd both > Python 1.6 (release16 tag) and Python 2.1 (release21-maint tag) directories, Wrong tag; you're testing the 2.1.1 branch, not the 2.1 release. 2.1.1 contains at least one small performance optimization of which the impact has not been determined :) Feel-free-to-determine-though-ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From skip@pobox.com (Skip Montanaro) Wed Jul 11 00:28:44 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 10 Jul 2001 18:28:44 -0500 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <20010711010456.D8098@xs4all.nl> References: <15179.14851.485783.763990@beluga.mojam.com> <15179.34940.785035.23896@beluga.mojam.com> <20010711010456.D8098@xs4all.nl> Message-ID: <15179.36780.675079.11961@beluga.mojam.com> >> Okay, so here are some hopefully more comparable numbers. I cvs up'd >> both Python 1.6 (release16 tag) and Python 2.1 (release21-maint tag) >> directories, Thomas> Wrong tag; you're testing the 2.1.1 branch, not the 2.1 Thomas> release. 2.1.1 contains at least one small performance Thomas> optimization of which the impact has not been determined :) I thought the whole idea of the dot dot releases was that they were supposed to just be bug fixes. 
Skip From guido@digicool.com Wed Jul 11 00:41:56 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 10 Jul 2001 19:41:56 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: Your message of "Tue, 10 Jul 2001 17:27:10 CDT." <15179.33086.641542.664190@beluga.mojam.com> References: <15179.14851.485783.763990@beluga.mojam.com> <15179.33086.641542.664190@beluga.mojam.com> Message-ID: <200107102341.f6ANfvL12944@odiug.digicool.com> > One thing that occurs to me as I rebuild 1.6 is that it would be real nice > to be able to query the interpreter for the compilation flags at runtime so > I could be more certain I was comparing apples and apples. In my case, the > last time I built 1.6 was June 2000, so I have no idea what my compilation > flags were. I can tell by the startup message that it was compiled with gcc > 2.95.3, but not what optimization flags were used. If you did a full "make install" then and the results are still around, look in /lib/python1.6/config/Makefile . --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Wed Jul 11 02:44:26 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 10 Jul 2001 21:44:26 -0400 (EDT) Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15179.33086.641542.664190@beluga.mojam.com> References: <15179.14851.485783.763990@beluga.mojam.com> <15179.33086.641542.664190@beluga.mojam.com> <200107102341.f6ANfvL12944@odiug.digicool.com> Message-ID: <15179.44922.465917.740723@cj42289-a.reston1.va.home.com> Skip Montanaro writes: > One thing that occurs to me as I rebuild 1.6 is that it would be real nice > to be able to query the interpreter for the compilation flags at runtime so > I could be more certain I was comparing apples and apples. In my case, the Guido van Rossum writes: > If you did a full "make install" then and the results are still > around, look in /lib/python1.6/config/Makefile . 
And this can all be extracted from Python without having to delve into obscure installed files, as well. ;-)

For Python 1.6:

>>> import distutils.sysconfig
>>> distutils.sysconfig.OPT
'-g -O2'

For Python 2.0 and newer:

>>> import distutils.sysconfig
>>> distutils.sysconfig.get_config_var('OPT')
'-g -O2 -Wall -Wstrict-prototypes -fPIC'

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Digital Creations

From skip@pobox.com (Skip Montanaro) Wed Jul 11 03:30:39 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 10 Jul 2001 21:30:39 -0500 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <200107102341.f6ANfvL12944@odiug.digicool.com> References: <15179.14851.485783.763990@beluga.mojam.com> <15179.33086.641542.664190@beluga.mojam.com> <200107102341.f6ANfvL12944@odiug.digicool.com> Message-ID: <15179.47695.555421.793300@beluga.mojam.com>

>> I can tell by the startup message that it was compiled with gcc
>> 2.95.3, but not what optimization flags were used.

Guido> If you did a full "make install" then and the results are still
Guido> around, look in /lib/python1.6/config/Makefile .

Thanks for the tip. Since I just rebuilt 1.6 this evening I wiped out whatever was there from last June.
Still, I now have that information at my fingertips:

def getbuildinfo():
    import sys, re, string
    makefile = "%s/lib/python%d.%d/config/Makefile" % \
               (sys.prefix, sys.version_info[0], sys.version_info[1])
    f = open(makefile)
    pat = re.compile("^([A-Z_]+)\s*=\s*(.*)")
    lines = f.readlines()
    opts = {}
    for line in lines:
        mat = pat.match(line)
        if mat:
            name = mat.group(1)
            val = string.strip(mat.group(2))
            opts[name] = val
    return opts

Skip

From skip@pobox.com (Skip Montanaro) Wed Jul 11 03:35:13 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 10 Jul 2001 21:35:13 -0500 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15179.44922.465917.740723@cj42289-a.reston1.va.home.com> References: <15179.14851.485783.763990@beluga.mojam.com> <15179.33086.641542.664190@beluga.mojam.com> <200107102341.f6ANfvL12944@odiug.digicool.com> <15179.44922.465917.740723@cj42289-a.reston1.va.home.com> Message-ID: <15179.47969.50003.441171@beluga.mojam.com>

Fred> And this can all be extracted from Python without having to delve
Fred> into obscure installed files, as well. ;-)

Dang! Wasted another five minutes...

Skip

From tim@digicool.com Wed Jul 11 06:05:52 2001 From: tim@digicool.com (Tim Peters) Date: Wed, 11 Jul 2001 01:05:52 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15179.34940.785035.23896@beluga.mojam.com> Message-ID:

[Skip Montanaro] > Real time doesn't mean much on an operating system that can > juggle multiple tasks, no matter how quiescent you try to make it. If this made any difference in the results I reported, they wouldn't have been reproducible to nearly 3 significant digits. That's why I printed the times for 3 runs of each -- you can trust that I know what I'm doing here.
These things run for a fraction of a second each, the machine was as quiet as possible, and the output showed no cause for suspicion (indeed, I threw out a few runs where one of the three numbers was 10x larger than the other two -- *that's* how you know you got socked by a background task, provided you've got a sensitive timer to work with). > ... > It still appears there's a big slowdown between 1.6 and 2.1 in > the back tic operations though. This assumes too much. ``1`+`2`` triggers three reprs and a string concatenation. I suspect both slowed, but that the latter is the more important hit. First the repr(string) hit (first blob from the release20 rev of stringobject.c):

*** 374,383 ****
      c = op->ob_sval[i];
      if (c == quote || c == '\\')
          *p++ = '\\', *p++ = c;
!     else if (c < ' ' || c >= 0177) {
!         sprintf(p, "\\%03o", c & 0377);
!         while (*p != '\0')
!             p++;
      }
      else
          *p++ = c;
--- 442,456 ----
      c = op->ob_sval[i];
      if (c == quote || c == '\\')
          *p++ = '\\', *p++ = c;
!     else if (c == '\t')
!         *p++ = '\\', *p++ = 't';
!     else if (c == '\n')
!         *p++ = '\\', *p++ = 'n';
!     else if (c == '\r')
!         *p++ = '\\', *p++ = 'r';
!     else if (c < ' ' || c >= 0x7f) {
!         sprintf(p, "\\x%02x", c & 0xff);
!         p += 4;
      }
      else
          *p++ = c;

"The usual" string char endures twice as many tests+branches now. The other thing Jeremy has noted before: string+string is slower than it used to be, because BINARY_ADD now tries oodles of "sophisticated" ways to coerce the operands to numbers before considering it might be asking for a sequence catenation instead. Given that the benchmark pastes together two 1-character strings, this overhead is overwhelming compared to the concatenation work. > Aside: Tim, can I assume by your return address that Digital Creations > finally gave you an office and you're not computing from some seedy > motel room on US 1? ;-) I'm not entirely sure DC gave it to us, but there is indeed a Luxurious PythonLabs World Headquarters now, in Falls Church, VA.
Conveniently located atop an inaccessible hill, it overlooks the fabulous Leesburg Pike, a stunning continuous strip mall stretching from the Potomac to the Arctic Circle (or France, whichever is closer -- geography isn't my strong suit). join-us-for-lunch!-we-need-the-company-ly y'rs - tim From skip@pobox.com (Skip Montanaro) Wed Jul 11 07:24:32 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 11 Jul 2001 01:24:32 -0500 Subject: [Python-Dev] Silly little benchmark In-Reply-To: References: <15179.34940.785035.23896@beluga.mojam.com> Message-ID: <15179.61728.255760.814673@beluga.mojam.com> Tim> [Skip Montanaro] >> Real time doesn't mean much on an operating system that can >> juggle multiple tasks, no matter how quiescent you try to make it. Tim> If this made any difference in the results I reported, they Tim> wouldn't have been reproducible to nearly 3 significant digits. Tim> That's why I printed the times for 3 runs of each -- you can trust Tim> that I know what I'm doing here. I wasn't suggesting you weren't trustworthy. On a Linux system, wall clock time doesn't mean much when timing processes. I have no control over when sendmail or any of a number of other daemons might wake up to process something. Hence, for my purposes in my environment, user mode time (or user+sys when sys > 0) are more useful than elapsed time (does Windows even distinguish between user and system time?). That time.clock means different things on Windows and Unix-like systems bothers me a bit. (It would bother me more if I had to write timing code that was portable across both Unix and Windows.) But that, as they say, is a something left for another time. Tim> The other thing Jeremy has noted before: string+string is slower Tim> than it used to be, because BINARY_ADD now tries oodles of Tim> "sophisticated" ways to coerce the operands to numbers before Tim> considering it might be asking for a sequence catenation instead. 
Tim> Given that the benchmark pastes together two 1-character strings,
Tim> this overhead is overwhelming compared to the concatenation work.

I can buy that. Wasn't there some discussion about improving this situation? If so, I guess I should be using the head branch of the CVS tree instead of release21-maint.

Skip

From paulp@ActiveState.com Wed Jul 11 09:05:00 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Wed, 11 Jul 2001 01:05:00 -0700 Subject: [Python-Dev] Silly little benchmark References: Message-ID: <3B4C08AC.BB6014DB@ActiveState.com> Tim Peters wrote: > >... > > I'm not entirely sure DC gave it to us, but there is indeed a Luxurious > PythonLabs World Headquarters now, in Falls Church, VA. Conveniently > located atop an inaccessible hill, it overlooks the fabulous Leesburg Pike, > a stunning continuous strip mall stretching from the Potomac to the Arctic > Circle (or France, whichever is closer -- geography isn't my strong suit). When I lived in Ontario I constantly wondered how far south that mall went! I tried to walk to the end of it once. I gave up sometime after I crossed the line where Roots franchises were replaced with Gaps. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook

From fredrik@pythonware.com Wed Jul 11 09:09:06 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 11 Jul 2001 10:09:06 +0200 Subject: [Python-Dev] Leading with XML-RPC References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com> <20010710155246.B23638@thyrsus.com> <3B4B6BC2.74F4984D@ActiveState.com> Message-ID: <018501c109e0$c345a450$4ffa42d5@hagrid>

I'm in a hurry, so here's a short version of what I think:

+1 on xmlrpclib.py in 2.2
+1 on a pure-python davlib.py in 2.2 (greg, please?)
-0 on soap support in 2.2 (it's still a moving target; a new spec draft was released this weekend). if we want something now, it should be cayce ullman's SOAP.py, not my soaplib.py. but I don't think we need SOAP in the standard library for another year or two. (fwiw, my current thinking is that SOAP is a flawed idea, and that the need for SOAP will go away when people get better XML/Schema tools, but that's another story. and don't get me started on SOAP BDG...) Cheers /F From thomas@xs4all.net Wed Jul 11 09:23:20 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Wed, 11 Jul 2001 10:23:20 +0200 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15179.36780.675079.11961@beluga.mojam.com> References: <15179.14851.485783.763990@beluga.mojam.com> <15179.34940.785035.23896@beluga.mojam.com> <20010711010456.D8098@xs4all.nl> <15179.36780.675079.11961@beluga.mojam.com> Message-ID: <20010711102320.R32419@xs4all.nl> On Tue, Jul 10, 2001 at 06:28:44PM -0500, Skip Montanaro wrote: > >> Okay, so here are some hopefully more comparable numbers. I cvs up'd > >> both Python 1.6 (release16 tag) and Python 2.1 (release21-maint tag) > >> directories, > Thomas> Wrong tag; you're testing the 2.1.1 branch, not the 2.1 > Thomas> release. 2.1.1 contains at least one small performance > Thomas> optimization of which the impact has not been determined :) > I thought the whole idea of the dot dot releases was that they were supposed > to just be bug fixes. They do, it just depends on what you classify as a bug. This was a small bug in the implementation of function calls that made calling 'normal' C functions (without keyword arguments) from Python code a bit slower than necessary, too. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
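The xmlrpclib that Fredrik votes into 2.2 above did land there, and survives today under the Python 3 name xmlrpc.client. A minimal sketch of the marshalling it does under the hood — no server involved, and the method name sample.add is just a placeholder:

```python
from xmlrpc.client import dumps, loads

# Marshal a method call into XML-RPC's XML wire format...
request = dumps((2, 3), methodname="sample.add")

# ...and unmarshal it again, recovering the parameters and method name.
params, method = loads(request)
print(method, params)  # sample.add (2, 3)
```

In normal use ServerProxy does this round trip over HTTP for you; doing it by hand like this is mainly useful for seeing exactly what goes on the wire.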
From Paul.Moore@atosorigin.com Wed Jul 11 09:32:06 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Wed, 11 Jul 2001 09:32:06 +0100 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEDF@ukrux002.rundc.uk.origin-it.com>

From: Pete Shinners [mailto:pete@shinners.org] > for the love of all things good, can we please make a recommendation > in our PEP that the windows installation location be something other > than "C:\PYTHON21"? something like "C:\PYTHON21\SITE-PACKAGES" would > be a big improvement. i thought i heard that macpython recently made > this "fix", why is the windows version lagging on this?

PEP 250 covers this. I have sent in the final PEP for approval, plus a patch, but the process appears to be stalled. I guess I need to nag again. The PEP process doesn't seem to cover non-core Python developers well (eg, people like me who don't have a way of integrating with the Sourceforge mechanisms...)

Paul.

From martin@strakt.com Wed Jul 11 13:46:26 2001 From: martin@strakt.com (Martin Sjögren) Date: Wed, 11 Jul 2001 14:46:26 +0200 Subject: [Python-Dev] Python and SSL Message-ID: <20010711144626.A2998@strakt.com>

Hello

I'm currently in the process of developing a basic OpenSSL module for Python. Before you say anything, yes I know about M2Crypto and its SSL support, but for a number of reasons, it doesn't fulfill our needs.

We found the SSL support in Python to be insufficient (nonexistent :-)) for our needs. We thus decided to write our own module.

The module is faaaar from complete as an interface to the general cryptographic functionality of OpenSSL, but it does have basic SSL support, including authorization using certificates, PRNG seeding functions and an error handling system.
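Martin's module predates the stdlib ssl module that eventually filled this gap (it arrived in Python 2.6). A minimal sketch of the certificate-verifying client setup his description implies, using that later API; the certificate file names are hypothetical placeholders:

```python
import ssl

# A client-side context that verifies the server's certificate
# against the system CA store (the default for server authentication).
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True

# Authorization using certificates, as in Martin's module, would add a
# client certificate too (the file names here are hypothetical):
# ctx.load_cert_chain(certfile="client.pem", keyfile="client.key")
```

Wrapping a connected socket with ctx.wrap_socket(sock, server_hostname=...) then gives the encrypted, authenticated channel his server/client scheme calls for.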
Since we are using Python extensively and don't have to pay for it, we would like to reply in kind and offer the module back to the Python project.

(This is, in case you're missing it, a hint that now that security is the hot subject it is, it's silly for an otherwise so complete language to lack SSL support ;-))

The whole kit (including some documentation) can be found here: http://www.strakt.com/~martin/pyOpenSSL.tar.gz

My question is... What do I do now? Where to proceed?

Please CC me replies, since I'm (of course) not on the list.

Regards, Martin Sjögren AB Strakt

--
Martin Sjögren martin@strakt.com ICQ : 41245059 Phone: +46 (0)31 405242 Cell: +46 (0)739 169191 GPG key: http://www.strakt.com/~martin/gpg.html

From guido@digicool.com Wed Jul 11 14:02:42 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 11 Jul 2001 09:02:42 -0400 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) In-Reply-To: Your message of "Wed, 11 Jul 2001 09:32:06 BST." <714DFA46B9BBD0119CD000805FC1F53B01B5AEDF@ukrux002.rundc.uk.origin-it.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEDF@ukrux002.rundc.uk.origin-it.com> Message-ID: <200107111302.f6BD2gP13353@odiug.digicool.com>

> From: Pete Shinners [mailto:pete@shinners.org] > > for the love of all things good, can we please make a recommendation > > in our PEP that the windows installation location be something other > > than "C:\PYTHON21"? something like "C:\PYTHON21\SITE-PACKAGES" would > > be a big improvement. i thought i heard that macpython recently made > > this "fix", why is the windows version lagging on this?
[Paul Moore] > PEP 250 covers this. I have sent in the final PEP for approval, plus a > patch, but the process appears to be stalled. I guess I need to nag again. > The PEP process doesn't seem to cover non-core Python developers well (eg, > people like me who don't have a way of integrating with the Sourceforge > mechanisms...) I just read that PEP over, and I agree with it. I think it should be implemented. If anyone with sourceforge permission would like to champion this PEP further (by implementing the modest change it suggests so that it can be rolled out with Python 2.2a1 next week), that would really help! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Wed Jul 11 14:26:51 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 11 Jul 2001 09:26:51 -0400 Subject: [Python-Dev] Python and SSL In-Reply-To: Your message of "Wed, 11 Jul 2001 14:46:26 +0200." <20010711144626.A2998@strakt.com> References: <20010711144626.A2998@strakt.com> Message-ID: <200107111326.f6BDQpY13417@odiug.digicool.com> > Hello > > I'm currently in the process of developing a basic OpenSSL module for > Python. Before you say anything, yes I know about M2Crypto and its SSL > support, but for a number of reasons, it doesn't fulfill our needs. > > We found the SSL support in Python to be insufficient (nonexistent :-)) > for our needs. We thus decided to write our own module. > > The module is faaaar from complete as an interface to the general > cryptographic functionality of OpenSSL, but it does have basic SSL > support, including authorization using certificates, PRNG seeding > functions and an error handling system.
> > (This is, in case you're missing it, a hint that now that security is > the hot subject it is, it's silly for an otherwise so complete language to > lack SSL support ;-)) > > The whole kit (including some documentation) can be found here: > http://www.strakt.com/~martin/pyOpenSSL.tar.gz > > My question is... What do I do now? Where to proceed? > > Please CC me replies, since I'm (of course) not on the list. > > Regards, > Martin Sjögren > AB Strakt Hi Martin, You can actually subscribe to python-dev. Just go to http://mail.python.org/mailman/listinfo/python-dev and enter your email and password; you will magically be approved. The best thing you can do is try to find someone with Python SF commit privileges who is willing to review your code and check it in. (I would have recommended Jeremy Hylton, but he's still away on paternity leave, so you'll have to find someone outside PythonLabs.) --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Wed Jul 11 14:33:21 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 11 Jul 2001 15:33:21 +0200 Subject: [Python-Dev] Python and SSL References: <20010711144626.A2998@strakt.com> Message-ID: <3B4C55A1.73BB5A78@lemburg.com> "Martin Sjögren" wrote: > > Hello > > I'm currently in the process of developing a basic OpenSSL module for > Python. Before you say anything, yes I know about M2Crypto and its SSL > support, but for a number of reasons, it doesn't fulfill our needs. Note that there's also amkCrypto (the successor of mxCrypto which is a wrapper of the low-level blazing fast tools in OpenSSL): http://www.amk.ca/python/code/crypto.html > We found the SSL support in Python to be insufficient (nonexistent :-)) for our needs. We thus decided to write our own module.
> > The module is faaaar from complete as an interface to the general > cryptographic functionality of OpenSSL, but it does have basic SSL > support, including authorization using certificates, PRNG seeding > functions and an error handling system. There is some support in the socket module for dealing with HTTPS. Which level of OpenSSL are you focussing on (ciphers, certificates or protocol)? > Since we are using Python extensively and don't have to pay for it, we > would like to reply in kind and offer the module back to the Python > project. > > (This is, in case you're missing it, a hint that now that security is > the hot subject it is, it's silly for an otherwise so complete language to > lack SSL support ;-)) > > The whole kit (including some documentation) can be found here: > http://www.strakt.com/~martin/pyOpenSSL.tar.gz > > My question is... What do I do now? Where to proceed? Since the module is "far from complete", I'd suggest to put the project up on the web somewhere to let it mature. I am not sure whether it's a good idea to put crypto code into the standard Python distribution due to the issues involved in this (import/export restrictions, etc.), but perhaps we could open up the Python core a bit for these "extra" utilities and make them available as a separate download alongside the standard ones. > Please CC me replies, since I'm (of course) not on the list. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From martin@strakt.com Wed Jul 11 15:12:21 2001 From: martin@strakt.com (Martin Sjögren) Date: Wed, 11 Jul 2001 16:12:21 +0200 Subject: [Python-Dev] Re: Python and SSL In-Reply-To: <3B4C55A1.73BB5A78@lemburg.com> References: <20010711144626.A2998@strakt.com> <3B4C55A1.73BB5A78@lemburg.com> Message-ID: <20010711161221.A3684@strakt.com> On Wed, Jul 11, 2001 at 03:33:21PM +0200, M.-A.
Lemburg wrote:
> "Martin Sjögren" wrote:
> > I'm currently in the process of developing a basic OpenSSL module for
> > Python. Before you say anything, yes I know about M2Crypto and its SSL
> > support, but for a number of reasons, it doesn't fulfill our needs.
>
> Note that there's also amkCrypto (the successor of mxCrypto which
> is a wrapper of the low-level blazing fast tools in OpenSSL):
>
> http://www.amk.ca/python/code/crypto.html

Yeah I looked at this too, but it doesn't have the things I'm interested
in (SSL_write, read, ... etc). At first glance it looks like this module
and my module complement each other, but I may be wrong :-)

> > We found the SSL support in Python to be insufficient (nonexistent :-))
> > for our needs. We thus decided to write our own module.
> >
> > The module is faaaar from complete as an interface to the general
> > cryptographic functionality of OpenSSL, but it does have basic SSL
> > support, including authorization using certificates, PRNG seeding
> > functions and an error handling system.
>
> There is some support in the socket module for dealing with HTTPS.
> Which level of OpenSSL are you focusing on (ciphers, certificates
> or protocol)?

We're using SSL to secure the communication in a client/server situation,
using certificates for authentication. Basically, my module is what we
think we need right now, no more, no less. Given that, I may continue
work on it, as our need changes.

> > The whole kit (including some documentation) can be found here:
> > http://www.strakt.com/~martin/pyOpenSSL.tar.gz
> >
> > My question is... What do I do now? Where to proceed?
>
> Since the module is "far from complete", I'd suggest to put the project
> up on the web somewhere to let it mature.

"faaaar from complete" in that it doesn't do everything OpenSSL does! I'd
like to think that it's pretty well contained, and can be used for exactly
the kind of things we are going to use it for.
Nevertheless, letting it mature isn't a bad idea. What is badly needed
is getting it compiled and checked on Windows. We're doing all our
development under Linux, and while it's sufficient that the server (which
is written in C and Python) runs on *IX, the client most definitely must
run on Windows.

Any suggestion where to put it so that it's found? The Vaults of Parnassus
I guess, are there any other interesting spots?

> I am not sure whether it's a good idea to put
> crypto code into the standard Python distribution due to the issues
> involved in this (import/export restrictions, etc.), but
> perhaps we could open up the Python core a bit for these
> "extra" utilities and make them available as separate download
> alongside the standard ones.

I agree with that, but one can argue that since all cryptographic stuff is
actually done by the OpenSSL library, this module won't even get compiled
and installed unless you have OpenSSL on your machine already. As they say
on Slashdot, IANAL, and I'm not American so it's not that big a problem
for me personally.

> > Please CC me replies, since I'm (of course) not on the list.

This is still relevant ;) I haven't seen a reply to my subscribe-request
yet.

Martin

--
Martin Sjögren  martin@strakt.com
ICQ : 41245059  Phone: +46 (0)31 405242  Cell: +46 (0)739 169191
GPG key: http://www.strakt.com/~martin/gpg.html

From thomas@xs4all.net Wed Jul 11 15:36:22 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 11 Jul 2001 16:36:22 +0200
Subject: [Python-Dev] Re: Python and SSL
In-Reply-To: <20010711161221.A3684@strakt.com>
Message-ID: <20010711163622.A5396@xs4all.nl>

On Wed, Jul 11, 2001 at 04:12:21PM +0200, Martin Sjögren wrote:

> Any suggestion where to put it so that it's found? The Vaults of Parnassus
> I guess, are there any other interesting spots?

Posting to comp.lang.python.announce or python-announce@python.org always
works well ;)

--
Thomas Wouters

Hi! I'm a .signature virus!
copy me into your .signature file to help me spread!

From Greg.Wilson@baltimore.com Wed Jul 11 17:10:56 2001
From: Greg.Wilson@baltimore.com (Greg Wilson)
Date: Wed, 11 Jul 2001 12:10:56 -0400
Subject: [Python-Dev] RE: Python-Dev digest, Vol 1 #1470 - 11 msgs
Message-ID: <930BBCA4CEBBD411BE6500508BB3328F36831E@nsamcanms1.ca.baltimore.com>

> From: Martin Sjögren
> Any suggestion where to put it so that it's found?

SourceForge, please --- it'll make it easy for others to contribute, and
to get. I'm also finding that a lot of sys admins at places I teach have
figured SF out, so I can say, "Install this, this, and this," and it just
happens. Kind of cool...

Thanks,
Greg

-----------------------------------------------------------------------------------------------------------------
The information contained in this message is confidential and is intended
for the addressee(s) only. If you have received this message in error or
there are any problems please notify the originator immediately. The
unauthorized use, disclosure, copying or alteration of this message is
strictly forbidden. Baltimore Technologies plc will not be liable for
direct, special, indirect or consequential damages arising from alteration
of the contents of this message by a third party or as a result of any
virus being passed on.

In addition, certain Marketing collateral may be added from time to time
to promote Baltimore Technologies products, services, Global e-Security or
appearance at trade shows and conferences.

This footnote confirms that this email message has been swept by
Baltimore MIMEsweeper for Content Security threats, including
computer viruses.
From thomas.heller@ion-tof.com Wed Jul 11 20:12:23 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 11 Jul 2001 21:12:23 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEDF@ukrux002.rundc.uk.origin-it.com> <200107111302.f6BD2gP13353@odiug.digicool.com>
Message-ID: <0e2e01c10a3d$6a950b40$e000a8c0@thomasnotebook>

> [Paul Moore]
> > PEP 250 covers this. I have sent in the final PEP for approval, plus a
> > patch, but the process appears to be stalled. I guess I need to nag again.
> > The PEP process doesn't seem to cover non-core Python developers well (eg,
> > people like me who don't have a way of integrating with the Sourceforge
> > mechanisms...)
>
> I just read that PEP over, and I agree with it. I think it should be
> implemented. If anyone with sourceforge permission would like to
> champion this PEP further (by implementing the modest change it
> suggests so that it can be rolled out with Python 2.2a1 next week),
> that would really help!
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)

If no one else shows up, I'll take it (hoping I find the time for it).
Thomas From akuchlin@mems-exchange.org Wed Jul 11 20:16:22 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 11 Jul 2001 15:16:22 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <018501c109e0$c345a450$4ffa42d5@hagrid>; from fredrik@pythonware.com on Wed, Jul 11, 2001 at 10:09:06AM +0200 References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com> <20010710155246.B23638@thyrsus.com> <3B4B6BC2.74F4984D@ActiveState.com> <018501c109e0$c345a450$4ffa42d5@hagrid> Message-ID: <20010711151622.J6846@ute.cnri.reston.va.us> On Wed, Jul 11, 2001 at 10:09:06AM +0200, Fredrik Lundh wrote: >(fwiw, my current thinking is that SOAP is a flawed idea, and that the >need for SOAP will go away when people get better XML/Schema tools, >but that's another story. and don't get me started on SOAP BDG...) *blink* Really? XML/Schema is the canonical example I use of XML-related standards having grown overcomplicated beyond all reason; SOAP will have far to go before reaching that level. (Or maybe 1.2 would surprise me...?) --amk From James_Althoff@i2.com Wed Jul 11 23:02:23 2001 From: James_Althoff@i2.com (James_Althoff@i2.com) Date: Wed, 11 Jul 2001 15:02:23 -0700 Subject: [Python-Dev] TypeError and AttributeError Message-ID: Given that except (TypeError, AttributeError): is the "safest across releases" idiom (as suggested by Tim Peters), would it make sense to make AttributeError a subclass of TypeError so that except (TypeError): would become equally "safe" (and simpler)? Potentially, some exception handling code would behave differently (break?), but apparently this is already the case given that what is raised (AttributeError or TypeError) changes between releases. Jim From barry@digicool.com Wed Jul 11 23:05:20 2001 From: barry@digicool.com (Barry A. 
Warsaw)
Date: Wed, 11 Jul 2001 18:05:20 -0400
Subject: [Python-Dev] CVS build failures? posixmodule.c on Linux RH6.1
Message-ID: <15180.52640.666460.449616@anthem.wooz.org>

I just did a fresh cvs update, make distclean, configure, make and
posixmodule.c is now failing to compile. Looks like it's Thomas'
recent nice() patch that's failing because PRIO_PROCESS isn't defined
unless sys/resource.h is #included (at least on this RH6.1-ish Linux
box). The following patch seems to fix the problem for me. Comments?

-Barry

-------------------- snip snip --------------------

Index: posixmodule.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/posixmodule.c,v
retrieving revision 2.191
diff -u -r2.191 posixmodule.c
--- posixmodule.c	2001/07/11 14:45:34	2.191
+++ posixmodule.c	2001/07/11 22:04:29
@@ -32,6 +32,12 @@
 #include <sys/wait.h>		/* For WNOHANG */
 #endif
+#ifdef HAVE_GETPRIORITY
+#ifndef PRIO_PROCESS
+#include <sys/resource.h>
+#endif /* !PRIO_PROCESS */
+#endif /* HAVE_GETPRIORITY */
+
 
 #ifdef HAVE_SIGNAL_H
 #include <signal.h>
 #endif

From barry@digicool.com Wed Jul 11 23:08:05 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 11 Jul 2001 18:08:05 -0400
Subject: [Python-Dev] TypeError and AttributeError
References:
Message-ID: <15180.52805.511953.909752@anthem.wooz.org>

>>>>> "JA" == James Althoff writes:

  JA> Given that

  JA> except (TypeError, AttributeError):

  JA> is the "safest across releases" idiom (as suggested by Tim
  JA> Peters),

  JA> would it make sense to make AttributeError a subclass of
  JA> TypeError so that

  JA> except (TypeError):

  JA> would become equally "safe" (and simpler)?

No, but it /might/ make sense to give them a new common base class
between them and Exception. If so, called what?

-Barry

From barry@digicool.com Wed Jul 11 23:11:39 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 11 Jul 2001 18:11:39 -0400
Subject: [Python-Dev] CVS build failures?
posixmodule.c on Linux RH6.1
References: <15180.52640.666460.449616@anthem.wooz.org>
Message-ID: <15180.53019.440723.13978@anthem.wooz.org>

Patch uploaded to SF bug #440522.

-Barry

From thomas@xs4all.net Wed Jul 11 23:16:07 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 12 Jul 2001 00:16:07 +0200
Subject: [Python-Dev] CVS build failures? posixmodule.c on Linux RH6.1
In-Reply-To: <15180.52640.666460.449616@anthem.wooz.org>
References: <15180.52640.666460.449616@anthem.wooz.org>
Message-ID: <20010712001607.B5396@xs4all.nl>

On Wed, Jul 11, 2001 at 06:05:20PM -0400, Barry A. Warsaw wrote:

> I just did a fresh cvs update, make distclean, configure, make and
> posixmodule.c is now failing to compile. Looks like it's Thomas'
> recent nice() patch that's failing because PRIO_PROCESS isn't defined
> unless sys/resource.h is #included (at least on this RH6.1-ish Linux
> box).

The same problem exists on BSD-ish systems, and I changed the patch for
some other reasons as well. I'm double-checking it works on more systems
right now, and will commit in a few minutes :) Sorry it was delayed long
enough for most of you to run into this problem (or so it seems) but we
had a big network outage that had a bit higher priority ;P

FWIW, not just Linux has a broken nice()... BSDI and FreeBSD have it,
too, and they also note nice() has been obsoleted. I'll provide a patch
to use setpriority/getpriority for the 2.2 tree, falling back to nice()
only if they aren't available.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
From nas@python.ca Wed Jul 11 23:19:43 2001 From: nas@python.ca (Neil Schemenauer) Date: Wed, 11 Jul 2001 15:19:43 -0700 Subject: [Python-Dev] Method resolution order In-Reply-To: ; from gvanrossum@users.sourceforge.net on Wed, Jul 11, 2001 at 02:26:10PM -0700 References: Message-ID: <20010711151943.A15462@glacier.fnational.com> Guido van Rossum wrote: > Using the classic [depth-first, left-right] lookup rule, construct the > list of classes that would be searched, including duplicates. Now for > each class that occurs in the list multiple times, remove all > occurrences except for the last. The resulting list contains each > ancestor class exactly once Is this original or is it used by other languages as well? My books on Dylan and CLOS are at home but I think they do something similar. Neil From akuchlin@mems-exchange.org Wed Jul 11 23:19:36 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 11 Jul 2001 18:19:36 -0400 Subject: [Python-Dev] beopen.com e-mail addresses in CVS Message-ID: I noticed that Tools/compiler has jeremy@beopen.com as the author's address. A quick grep: ./Lib/test/test_gettext.py:# Barry Warsaw , 2000. ./Lib/test/test_gettext.py:"Last-Translator: Barry A. Warsaw \n" ./Misc/RPM/Tkinter/setup.cfg:packager = Jeremy Hylton ./Misc/RPM/Tkinter/setup.py: author_email="pythoneers@beopen.com", ./Misc/RPM/beopen-python.spec:Packager: Jeremy Hylton ./Misc/RPM/beopen-python.spec:* Mon Oct 9 2000 Jeremy Hylton ./Misc/RPM/beopen-python.spec:* Thu Oct 5 2000 Jeremy Hylton ./Misc/RPM/beopen-python.spec:* Tue Sep 26 2000 Jeremy Hylton ./Misc/RPM/beopen-python.spec:* Tue Sep 12 2000 Jeremy Hylton ./Tools/compiler/setup.py: author_email = "jeremy@beopen.com", ./changes:a suggestion from Bob Weiner . All but the last (and maybe the second) seem to be worth fixing. 
--amk

From bckfnn@worldonline.dk Wed Jul 11 23:44:54 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Wed, 11 Jul 2001 22:44:54 GMT
Subject: [Python-Dev] TypeError and AttributeError
In-Reply-To: <15180.52805.511953.909752@anthem.wooz.org>
References: <15180.52805.511953.909752@anthem.wooz.org>
Message-ID: <3b4cd444.59275613@mail.wanadoo.dk>

>>>>>> "JA" == James Althoff writes:
>
>  JA> would it make sense to make AttributeError a subclass of
>  JA> TypeError so that
>
>  JA> except (TypeError):
>
>  JA> would become equally "safe" (and simpler)?

[Barry]
> No, but it /might/ make sense to give them a new common base class
> between them and Exception. If so, called what?

ProtocolException.

I think it would have made sense, but it probably wouldn't have helped.
Users still see a specific exception thrown and write code against that.

regards,
finn

From guido@digicool.com Thu Jul 12 00:21:49 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 11 Jul 2001 19:21:49 -0400
Subject: [Python-Dev] Re: Method resolution order
In-Reply-To: Your message of "Wed, 11 Jul 2001 15:19:43 PDT." <20010711151943.A15462@glacier.fnational.com>
References: <20010711151943.A15462@glacier.fnational.com>
Message-ID: <200107112321.f6BNLn614044@odiug.digicool.com>

> Guido van Rossum wrote:
> > Using the classic [depth-first, left-right] lookup rule, construct the
> > list of classes that would be searched, including duplicates. Now for
> > each class that occurs in the list multiple times, remove all
> > occurrences except for the last. The resulting list contains each
> > ancestor class exactly once.
>
> Is this original or is it used by other languages as well? My books on
> Dylan and CLOS are at home but I think they do something similar.
>
> Neil

I didn't make it up! I got it from the reference [1] in the PEP. C++
seems to do something similar (with added conflict checking). It would
be good to mention that this is not a new invention.
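The two-step rule Guido quotes above is short enough to prototype directly. A from-the-sidelines sketch (the classes A-D are invented for illustration; under Python 3, `object` shows up as an implicit base of every class):

```python
def classic_lookup(cls):
    # Classic depth-first, left-to-right search order, duplicates kept.
    order = [cls]
    for base in cls.__bases__:
        order.extend(classic_lookup(base))
    return order

def new_mro(cls):
    # Keep only the *last* occurrence of each class in the classic order.
    order = classic_lookup(cls)
    return [c for i, c in enumerate(order) if c not in order[i + 1:]]

class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

# Classic order for D: D, B, A, object, C, A, object.
# Keeping only the last occurrence of each class gives: D, B, C, A, object.
print([c.__name__ for c in new_mro(D)])  # -> ['D', 'B', 'C', 'A', 'object']
```

For this diamond the result happens to agree with what `D.__mro__` reports in current CPython, although the interpreter later (2.3) switched from this rule to the C3 linearization.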
If you can confirm that Dylan and CLOS have this, I'll add that.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@digicool.com Thu Jul 12 00:26:33 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 11 Jul 2001 19:26:33 -0400
Subject: [Python-Dev] TypeError and AttributeError
In-Reply-To: Your message of "Wed, 11 Jul 2001 22:44:54 GMT." <3b4cd444.59275613@mail.wanadoo.dk>
References: <15180.52805.511953.909752@anthem.wooz.org> <3b4cd444.59275613@mail.wanadoo.dk>
Message-ID: <200107112326.f6BNQYG14059@odiug.digicool.com>

> >>>>>> "JA" == James Althoff writes:
> >
> >  JA> would it make sense to make AttributeError a subclass of
> >  JA> TypeError so that
> >
> >  JA> except (TypeError):
> >
> >  JA> would become equally "safe" (and simpler)?
>
> [Barry]
>
> > No, but it /might/ make sense to give them a new common base class
> > between them and Exception. If so, called what?
>
> ProtocolException.
>
> I think it would have made sense, but it probably wouldn't have helped.
> Users still see a specific exception thrown and write code against that.

Yeah, the problem with Jim's proposal is that users who write

    try:
        "try something"
    except TypeError:
        "one way of handling it"
    except AttributeError:
        "another way of handling it"

will still see a change in behavior, as will users who catch only
AttributeError in a situation that now raises TypeError...

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg@cosc.canterbury.ac.nz Thu Jul 12 01:54:08 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 12 Jul 2001 12:54:08 +1200 (NZST)
Subject: [Python-Dev] TypeError and AttributeError
In-Reply-To: <200107112326.f6BNQYG14059@odiug.digicool.com>
Message-ID: <200107120054.MAA01835@s454.cosc.canterbury.ac.nz>

Maybe TypeError and AttributeError should be merged, and made aliases
for the same exception?
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From esr@snark.thyrsus.com Thu Jul 12 04:01:57 2001 From: esr@snark.thyrsus.com (Eric S. Raymond) Date: Wed, 11 Jul 2001 23:01:57 -0400 Subject: [Python-Dev] XML-RPC docs are in Message-ID: <200107120301.f6C31vj16739@snark.thyrsus.com> I have spent the afternoon writing, and the first version of xmlrpclib docs is checked in. It probably has markup errors; it is not complete, and could probably stand to have some of the internal things like Marshaller documented. But I think it does a decent job on the entry points and externally visible things. We have more to do. Fred D. and Fredrik L. should proof this sucker for errors. Fredrik L. should add stuff on some of the internals and quasi-internals I haven't described, like the loads and dumps functions. Then, as Eric Kidd pointed out, we really ought to support an XML-RPC server class wrapped around Fredrik's stubs. There appears to be one in the xmlrpclib distribution. Fredrik, are you planning to document that and check it in? This is a feature set to make serious noise about in the publicity for 2.2. -- Eric S. Raymond "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -- Benjamin Franklin, Historical Review of Pennsylvania, 1759. From tim.one@home.com Thu Jul 12 05:51:15 2001 From: tim.one@home.com (Tim Peters) Date: Thu, 12 Jul 2001 00:51:15 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15179.61728.255760.814673@beluga.mojam.com> Message-ID: [Skip Montanaro] > I wasn't suggesting you weren't trustworthy. Neither was I . > On a Linux system, wall clock time doesn't mean much when timing > processes. Sure -- different OS. 
I'm not telling you how to time things on Linux; I just explained what I did on Windows because it was questioned. > ... (does Windows even distinguish between user and system time?). I've never seen anything in Win9x that does, and not surprised: they're at best <0.7 wink> single-user systems. On NT there's an elaborate performance monitoring subsystem tied to the HKEY_PERFORMANCE_DATA registry hive, from which-- given enough programming pain, none of which Python endures --you can find out almost anything. > That time.clock means different things on Windows and Unix-like systems > bothers me a bit. Blame X3J11 -- ANSI C is vague about what clock() is supposed to do. I've got no use for the *native* Windows implementation of clock() (Python maps time.clock() to the high-resolution Win32 QueryPerformanceCounter API instead); Unices in general don't have usable high-resolution timers; Windowses in general don't have usable notions of user process time; so we take what we can get. > (It would bother me more if I had to write timing code that was > portable across both Unix and Windows.) Hmm. Unless you're happy with wall-clock time, it may well drive you insane just to write timing code portable across Unices. At my last employer, we wrote all our base timing routines in assembler, because it's generally easy to suck what you need out of modern HW, but darned near impossible after seventeen warring stds committees finish taking turns hiding it . [about BINARY_ADD slowing string+string] > I can buy that. Wasn't there some discussion about improving this > situation? Yes. > If so, I guess I should be using the head branch of the CVS tree > instead of release21-maint. AFAIK, nobody did anything *except* discuss it so far. Insert an early special case for sequence cat, and you slow each numeric addition by the time it takes to fail that test, so there's no killer argument either way. 
int+int is special-cased by BINARY_ADD, but everything else goes thru the general machinery. From tim.one@home.com Thu Jul 12 06:23:17 2001 From: tim.one@home.com (Tim Peters) Date: Thu, 12 Jul 2001 01:23:17 -0400 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) In-Reply-To: <200107111302.f6BD2gP13353@odiug.digicool.com> Message-ID: [Guido] > I just read that PEP over, and I agree with it. I think it should be > implemented. If anyone with sourceforge permission would like to > champion this PEP further (by implementing the modest change it > suggests so that it can be rolled out with Python 2.2a1 next week), > that would really help! Umm, what am I missing? The change to site.py was so simple you could have committed it yourself quicker than it took to write the above. I committed it a few minutes ago. If something else is needed, someone else will have to do it (or explain it to me in detail so precise they could do it themself 10x quicker ). From thomas@xs4all.net Thu Jul 12 08:09:03 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 12 Jul 2001 09:09:03 +0200 Subject: [Python-Dev] Silly little benchmark In-Reply-To: References: Message-ID: <20010712090903.E5396@xs4all.nl> On Thu, Jul 12, 2001 at 12:51:15AM -0400, Tim Peters wrote: > > On a Linux system, wall clock time doesn't mean much when timing > > processes. > Sure -- different OS. I'm not telling you how to time things on Linux; I > just explained what I did on Windows because it was questioned. Actually, it wasn't . Skip just said realtime didn't make sense on a system that did multiple things at the same time. He obviously didn't mean MS Windows :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal@lemburg.com Thu Jul 12 08:57:49 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 12 Jul 2001 09:57:49 +0200 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) References: Message-ID: <3B4D587D.FDB733C0@lemburg.com> Tim Peters wrote: > > [Guido] > > I just read that PEP over, and I agree with it. I think it should be > > implemented. If anyone with sourceforge permission would like to > > champion this PEP further (by implementing the modest change it > > suggests so that it can be rolled out with Python 2.2a1 next week), > > that would really help! > > Umm, what am I missing? The change to site.py was so simple you could have > committed it yourself quicker than it took to write the above. I committed > it a few minutes ago. If something else is needed, someone else will have > to do it (or explain it to me in detail so precise they could do it themself > 10x quicker ). Cool, but what about the changes needed in distutils to actually utilize the new directory and the changes to the Windows installer to create the directory at installation time ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From Paul.Moore@atosorigin.com Thu Jul 12 09:32:51 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Thu, 12 Jul 2001 09:32:51 +0100 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distu tils] Package DB: strawman PEP) Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEE9@ukrux002.rundc.uk.origin-it.com> From: M.-A. Lemburg [mailto:mal@lemburg.com] Tim Peters wrote: > Umm, what am I missing? The change to site.py was so simple you could have > committed it yourself quicker than it took to write the above. I committed > it a few minutes ago. If something else is needed, someone else will have > to do it (or explain it to me in detail so precise they could do it themself > 10x quicker ). 
> Cool, but what about the changes needed in distutils to actually
> utilize the new directory and the changes to the Windows installer
> to create the directory at installation time ?

The patch I sent along with the final version of the PEP included the
distutils change (it's only one line, but it's on the PC at home, so I
can't quote it here). I assume that the Python install should ensure
that the site-packages exists (it does at the moment) so I don't see a
need for the wininst installer to check.

Paul.

PS [After a quick rummage...] I *think* the following patch is what is
needed for distutils: I haven't tested it, though, so it would be better
to check the original version (which I did test...)

--- sysconfig.py.orig	Thu Apr 19 10:24:24 2001
+++ sysconfig.py	Thu Jul 12 09:32:34 2001
@@ -87,7 +87,7 @@
 
     elif os.name == "nt":
         if standard_lib:
-            return os.path.join(PREFIX, "Lib")
+            return os.path.join(PREFIX, "Lib", "site-packages")
         else:
             return prefix
 

From mal@lemburg.com Thu Jul 12 10:14:35 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 12 Jul 2001 11:14:35 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEE9@ukrux002.rundc.uk.origin-it.com>
Message-ID: <3B4D6A7B.F493D611@lemburg.com>

"Moore, Paul" wrote:
>
> From: M.-A. Lemburg [mailto:mal@lemburg.com]
> Tim Peters wrote:
> > Umm, what am I missing? The change to site.py was so simple you could have
> > committed it yourself quicker than it took to write the above. I committed
> > it a few minutes ago. If something else is needed, someone else will have
> > to do it (or explain it to me in detail so precise they could do it themself
> > 10x quicker ).
>
> > Cool, but what about the changes needed in distutils to actually
> > utilize the new directory and the changes to the Windows installer
> > to create the directory at installation time ?
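As an aside, the from-memory patch quoted above appears to edit the `standard_lib` branch, while the behaviour PEP 250 describes (the standard library stays in `Lib`; third-party packages move to `Lib\site-packages`) corresponds to changing the `else` branch instead -- which matches Paul's own advice to check the original, tested version. A minimal sketch of the intended lookup (the prefix value is hypothetical):

```python
import os

PREFIX = r"C:\Python22"  # hypothetical install prefix, for illustration only

def get_python_lib(standard_lib=False, prefix=None):
    # Simplified Windows-only sketch of distutils.sysconfig.get_python_lib()
    # as PEP 250 intends it: the standard library keeps living in Lib,
    # while third-party packages go to Lib\site-packages.
    if prefix is None:
        prefix = PREFIX
    if standard_lib:
        return os.path.join(prefix, "Lib")
    return os.path.join(prefix, "Lib", "site-packages")

print(get_python_lib())  # C:\Python22\Lib\site-packages when run on Windows
```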
> > The patch I sent along with the final version of the PEP included the
> > distutils change (it's only one line, but it's on the PC at home, so I
> > can't quote it here). I assume that the Python install should ensure
> > that the site-packages exists (it does at the moment) so I don't see a
> > need for the wininst installer to check.

I don't have a site-packages dir in my installations. Could it be that
you installed some distutils package which automagically created one,
or is this a change in Python 2.1.1?

> Paul.
>
> PS [After a quick rummage...] I *think* the following patch is what is
> needed for distutils: I haven't tested it, though, so it would be better
> to check the original version (which I did test...)
>
> --- sysconfig.py.orig	Thu Apr 19 10:24:24 2001
> +++ sysconfig.py	Thu Jul 12 09:32:34 2001
> @@ -87,7 +87,7 @@
>
>     elif os.name == "nt":
>         if standard_lib:
> -            return os.path.join(PREFIX, "Lib")
> +            return os.path.join(PREFIX, "Lib", "site-packages")
>         else:
>             return prefix

This doesn't seem to do the trick: the Windows installer still installs
the packages directly to \Python21.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From Paul.Moore@atosorigin.com Thu Jul 12 11:02:28 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Thu, 12 Jul 2001 11:02:28 +0100
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP)
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEB@ukrux002.rundc.uk.origin-it.com>

From: M.-A. Lemburg [mailto:mal@lemburg.com]
> I don't have a site-packages dir in my installations. Could it be that
> you installed some distutils package which automagically created one,
> or is this a change in Python 2.1.1?

I don't believe so.
I have ActiveState Python - it's possible (although unlikely, I would think) that that version creates site-packages specially. It's vaguely possible (although unlikely) that I created the directory manually - it was missing in one of the 2.1 betas, IIRC, but I thought it reappeared in 2.1 final. In any case, the necessary changes to make sure that directory exists should be in the Windows Installer package(s) for Python. I guess that means somewhere in the Wise installer scripts - which I don't have access to, nor would I know how to change. It should just be a case of reinstating the behaviour in 2.0, if the directory really has been lost in 2.1. > This doesn't seem to do the trick: the Windows installer still installs > the packages directly to \Python21. This change should (as I said, it's untested) have ensured that "python setup.py install" puts the module into site-packages. I don't know what the installer code in bdist_wininst.py does, as it's a base64-encoded EXE, and I don't have the sources - surely it uses the distutils sysconfig stuff to get the value (it has no other way of knowing...)? Paul. From thomas.heller@ion-tof.com Thu Jul 12 11:57:18 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 12 Jul 2001 12:57:18 +0200 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEB@ukrux002.rundc.uk.origin-it.com> Message-ID: <100d01c10ac1$6b7fd470$e000a8c0@thomasnotebook> From: "Moore, Paul" > From: M.-A. Lemburg [mailto:mal@lemburg.com] > > This doesn't seem to do the trick: the Windows installer still installs > > the packages directly to \Python21. > > This change should (as I said, it's untested) have ensured that "python > setup.py install" puts the module into site-packages. 
I don't know what the
> installer code in bdist_wininst.py does, as it's a base64-encoded EXE, and I
> don't have the sources - surely it uses the distutils sysconfig stuff to get
> the value (it has no other way of knowing...)?

The sources are in CVS:

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/distutils/misc/

The bdist_wininst installer simply installs into prefix, this is what
the registry has under
HKEY_LOCAL_MACHINE\Software\Python\PythonCore\2.1\InstallPath.

Now what should it do? There are probably some issues here. Currently
it installs the package into prefix, and creates a prefix/Remove.exe
uninstaller, and *appends* info about uninstallation into the
prefix/-wininst.log file.

In the future (after PEP 250) it should install the package into
prefix/lib/site-packages. Also for older Python versions? Or only for
the newer ones? Depending on the distutils version used to create the
installer? Depending on the actual site.py file? Hardcoding a version
check ('version >= 2.2') into the installer doesn't seem so nice, but
would probably do the correct thing.

Note that 'python setup.py install' requires distutils to be present -
the bdist_wininst installer does not.

Thomas

From mal@lemburg.com Thu Jul 12 12:51:06 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 12 Jul 2001 13:51:06 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEB@ukrux002.rundc.uk.origin-it.com>
Message-ID: <3B4D8F2A.A1D54B9D@lemburg.com>

"Moore, Paul" wrote:
>
> From: M.-A. Lemburg [mailto:mal@lemburg.com]
> > I don't have a site-packages dir in my installations. Could it be that
> > you installed some distutils package which automagically created one,
> > or is this a change in Python 2.1.1?
>
> I don't believe so. I have ActiveState Python - it's possible (although
> unlikely, I would think) that that version creates site-packages specially.
> It's vaguely possible (although unlikely) that I created the directory > manually - it was missing in one of the 2.1 betas, IIRC, but I thought it > reappeared in 2.1 final. In any case, the necessary changes to make sure > that directory exists should be in the Windows Installer package(s) for > Python. I guess that means somewhere in the Wise installer scripts - which I > don't have access to, nor would I know how to change. They should be in the CVS tree of Python on SourceForge. > It should just be a case of reinstating the behaviour in 2.0, if the > directory really has been lost in 2.1. > > > This doesn't seem to do the trick: the Windows installer still installs > > the packages directly to \Python21. > > This change should (as I said, it's untested) have ensured that "python > setup.py install" puts the module into site-packages. About the change: I think distutils should lookup the path in Python's site.py file - that way you assure that distutils will work on all Python installations rather than only on those which have the site.py patch. Otherwise, Python won't find the packages installed in Lib/site-packages. > I don't know what the > installer code in bdist_wininst.py does, as it's a base64-encoded EXE, and I > don't have the sources - surely it uses the distutils sysconfig stuff to get > the value (it has no other way of knowing...)? The sources for the Windows installer are on SourceForge CVS too (under the distutils branch). I believe that Thomas Heller who wrote the installer will know best what to do about this... 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From Paul.Moore@atosorigin.com Thu Jul 12 12:57:55 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Thu, 12 Jul 2001 12:57:55 +0100 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEED@ukrux002.rundc.uk.origin-it.com> From: Thomas Heller [mailto:thomas.heller@ion-tof.com] > The sources are in CVS: > http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/distutils/misc/ Unfortunately, I don't have CVS access... > The bdist_wininst installer simply installs into prefix, > this is what the registry has under > HKEY_LOCAL_MACHINE\Software\Python\PythonCore\2.1\InstallPath. > > Now what should it do? What does that key *mean*? If it is the directory into which packages should get installed, then bdist_wininst should keep doing what it does now, and the Python installer should be changed to put site-packages into that key. If, on the other hand, this key has a meaning elsewhere in Python, and changing it would cause a problem, then I would say that this is a bug in the Windows Installer, which should use a key of its own. In that case, my recommendation would be to have the Python 2.2 installer create a new key, and wininst use that if it exists, otherwise fall back to the current key. That would provide the correct behaviour in the new release, but retain backward compatibility with earlier versions of Python. > There are probably some issues here. Agreed. I apologise if I didn't publicise the PEP in the right places for these to get picked up earlier - I thought I had.
I believe my suggestion above will do the right thing, but I am not an expert in the intricacies of Python's use of the registry, so I'd like someone more knowledgeable to comment, if possible. Paul. From Paul.Moore@atosorigin.com Thu Jul 12 13:05:52 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Thu, 12 Jul 2001 13:05:52 +0100 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEE@ukrux002.rundc.uk.origin-it.com> From: M.-A. Lemburg [mailto:mal@lemburg.com] > They should be in the CVS tree of Python on SourceForge. I don't have CVS access, so I can't get at these, unfortunately... > About the change: I think distutils should lookup the path in Python's > site.py file - that way you assure that distutils will work on all > Python installations rather than only on those which have the site.py > patch. Otherwise, Python won't find the packages installed in > Lib/site-packages. I'm not sure what you intend, here. site.py doesn't export this directory - it is just one of the directories which gets added to sys.path in site.py. On Unix, there is more than one such directory (both version-specific and version-independent), so there isn't, in general, just one such directory. I don't know how you could encapsulate this in a way which would not clash with other platforms' policies. The intention of this change was to be the smallest possible change which would work. I believe it (or at least, the patch I sent when I submitted the final version of the PEP) does that for everything except the Windows Installer. I'll have to defer judgement on how best to address that area to others better qualified to comment, but see my message to Thomas for my suggestion. Hope this helps, Paul. From guido@digicool.com Thu Jul 12 13:32:47 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 12 Jul 2001 08:32:47 -0400 Subject: [Python-Dev] Improving nested-scope warnings?
Message-ID: <200107121232.f6CCWlu14533@odiug.digicool.com> Someone on SF made a point: the warnings you get about nested scopes aren't always as helpful as they could be. > My "user experience" of porting my 2.0 code to 2.1 is > however fairly pitiful. Here some distilled suggestions: > * separate warnings for "potential" "import *" problems for > standard > modules (as in the examples) -- sure we know what math > exports > right now and "from math import *" is a common idiom. > * run-time warnings for shadowed constructs > * listing of the variables that are imported and one may > want to > import by name instead (or qualify) > > While I really like the new scoping rules and they support > my programming style their practical impact on existing code > is quite > large. A better support would be fairly important -- I have > 50.000 lines of code to port .... (From http://sourceforge.net/tracker/?func=detail&atid=105470&aid=440497&group_id=5470) Anybody interested in implementing some of these? I guess this would be in the 2.1.1 branch, as in the 2.2 branch we're about to enable the future... --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Thu Jul 12 13:51:05 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 12 Jul 2001 14:51:05 +0200 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEED@ukrux002.rundc.uk.origin-it.com> Message-ID: <115b01c10ad1$50ea9cc0$e000a8c0@thomasnotebook> From: "Moore, Paul" > > The bdist_wininst installer simply installs into prefix, > > this is what the registry has under > > HKEY_LOCAL_MACHINE\Software\Python\PythonCore\2.1\InstallPath. > > > > Now what should it do? > > What does that key *mean*?
If it is the directory into which packages should > get installed, then bdist_wininst should keep doing what it does now, and > the Python installer should be changed to put site-packages into that key. > > If, on the other hand, this key has a meaning elsewhere in Python, and > changing it would cause a problem, then I would say that this is a bug in > the Windows Installer, which should use a key of its own. In that case, my > recommendation would be to have the Python 2.2 installer create a new key, > and wininst use that if it exists, otherwise fall back to the current key. > That would provide the correct behaviour in the new release, but retain > backward compatibility with earlier versions of Python. Good idea. But remember that there is still Pythonware's distribution, which neither creates nor requires registry entries, and the entries are also unavailable if you compile from source. OTOH, bdist_wininst installers currently do not recognize these Python installations, which is probably the next bug. > > > There are probably some issues here. > > Agreed. I apologise if I didn't publicise the PEP in the right places for > these to get picked up earlier - I thought I had. I believe my suggestion > above will do the right thing, but I am not an expert in the intricacies of > Python's use of the registry, so I'd like someone more knowledgeable to > comment, if possible. It's my fault, I'm afraid. Didn't think enough about these things earlier. > > Paul. > Thomas BTW: We should narrow the TO: and CC: fields in this discussion. I'm receiving every message threefold. What would be appropriate?
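[Editorial aside: Paul's "create a new key, fall back to the current key" scheme above is a first-match lookup over an ordered list of keys. A minimal sketch of that logic, with the registry simulated by a dict so it can run anywhere — the key names here are hypothetical stand-ins, not the real registry paths:]

```python
def find_install_path(lookup, keys):
    """Return the value for the first key that lookup() can resolve.

    lookup is any callable that returns a path for a key or raises
    KeyError; in a real installer it would wrap a registry query.
    """
    for key in keys:
        try:
            return lookup(key)
        except KeyError:
            continue  # key absent: fall back to the next candidate
    raise KeyError("no install path found under any of %r" % (keys,))

# Hypothetical key names standing in for the real registry paths.
NEW_KEY = r"Software\Python\PythonCore\2.2\SitePackagesPath"  # future installer
OLD_KEY = r"Software\Python\PythonCore\2.2\InstallPath"       # what wininst uses today

# Simulated registry from an older installation: only the old key exists,
# so the lookup falls back exactly as Paul proposes.
registry = {OLD_KEY: r"C:\Python22"}
print(find_install_path(registry.__getitem__, [NEW_KEY, OLD_KEY]))  # → C:\Python22
```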
From skip@pobox.com (Skip Montanaro) Thu Jul 12 14:13:16 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 12 Jul 2001 08:13:16 -0500 Subject: [Python-Dev] Silly little benchmark In-Reply-To: References: <15179.61728.255760.814673@beluga.mojam.com> Message-ID: <15181.41580.258091.412915@beluga.mojam.com> Tim> AFAIK, nobody did anything *except* discuss it so far. Insert an Tim> early special case for sequence cat, and you slow each numeric Tim> addition by the time it takes to fail that test, so there's no Tim> killer argument either way. int+int is special-cased by Tim> BINARY_ADD, but everything else goes thru the general machinery. Hmmm... What file we talkin' about Willis? If we did test for int+int test for string+string general machinery we might speed up a couple very common cases enough to have an overall win. Skip From mal@lemburg.com Thu Jul 12 14:46:52 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 12 Jul 2001 15:46:52 +0200 Subject: [Distutils] RE: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEE@ukrux002.rundc.uk.origin-it.com> Message-ID: <3B4DAA4C.A6F0859D@lemburg.com> "Moore, Paul" wrote: > > From: M.-A. Lemburg [mailto:mal@lemburg.com] > > They should be in the CVS tree of Python on SourceForge. > I don't have CVS access, so I can't get at these, unfortunately... There should be a tarball of the CVS archive available somewhere on SF. > > About the change: I think distutils should lookup the path in Python's > > site.py file - that way you assure that distutils will work on all > > Python installations rather than only on those which have the site.py > > patch. Otherwise, Python won't find the packages installed in > > Lib/site-packages. > > I'm not sure what you intend, here. site.py doesn't export this directory - > it is just one of the directories which gets added to sys.path in site.py. 
> On Unix, there are more than one such directory (both version-specific and > version-independent), so there isn't, in general, just one such directory. I > don't know how you could encapsulate this in a way which would not clash > with other platforms' policies. > > The intention of this change was to be the smallest possible change which > would work. I believe it (or at least, the patch I sent when I submitted the > final version of the PEP) does that for everything except the Windows > Installer. I'll have to defer judgement on how best to address that area to > others better qualified to comment, but see my message to Thomas for my > suggestion. Well, site.py could be modified to set a symbol in the sys module which could then be queried by distutils, e.g. sys.extinstallprefix. Alternatively, distutils could be made to default to Lib\site-packages and then revert to Lib\ in case this directory is not available. BTW, I don't think that using Windows registry keys for determining the installation path is a good idea -- this information should be kept in the site.py or sitecustomize.py module for easy editing. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From nas@python.ca Thu Jul 12 14:51:05 2001 From: nas@python.ca (Neil Schemenauer) Date: Thu, 12 Jul 2001 06:51:05 -0700 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15181.41580.258091.412915@beluga.mojam.com>; from skip@pobox.com on Thu, Jul 12, 2001 at 08:13:16AM -0500 References: <15179.61728.255760.814673@beluga.mojam.com> <15181.41580.258091.412915@beluga.mojam.com> Message-ID: <20010712065105.A16964@glacier.fnational.com> Skip Montanaro wrote: > Hmmm... What file we talkin' about Willis? 
If we did > > test for int+int > test for string+string > general machinery > > we might speed up a couple very common cases enough to have an overall win. BINARY_ADD in ceval.c. I would guess that special casing strings would be an overall loss. Neil From thomas@xs4all.net Thu Jul 12 14:54:09 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 12 Jul 2001 15:54:09 +0200 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15181.41580.258091.412915@beluga.mojam.com> References: <15179.61728.255760.814673@beluga.mojam.com> <15181.41580.258091.412915@beluga.mojam.com> Message-ID: <20010712155409.K5396@xs4all.nl> On Thu, Jul 12, 2001 at 08:13:16AM -0500, Skip Montanaro wrote: > Tim> AFAIK, nobody did anything *except* discuss it so far. Insert an > Tim> early special case for sequence cat, and you slow each numeric > Tim> addition by the time it takes to fail that test, so there's no > Tim> killer argument either way. int+int is special-cased by > Tim> BINARY_ADD, but everything else goes thru the general machinery. > Hmmm... What file we talkin' about Willis? ceval.c, just look for BINARY_ADD. > If we did > test for int+int > test for string+string > general machinery > we might speed up a couple very common cases enough to have an overall win. Don't forget to do meaningful performance comparisons before and after ;P -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido@digicool.com Thu Jul 12 15:00:33 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 12 Jul 2001 10:00:33 -0400 Subject: [Python-Dev] "Brennan, Bernadette": Python 1.5.2 & Solaris 8 Message-ID: <200107121400.f6CE0XA14567@odiug.digicool.com> Does anyone remember what the problems with Solaris-8 were? Shallow, I hope? 
--Guido van Rossum (home page: http://www.python.org/~guido/) ------- Forwarded Message Date: Thu, 12 Jul 2001 09:11:08 -0400 From: "Brennan, Bernadette" To: "'guido@python.org'" Subject: Python 1.5.2 & Solaris 8 We are currently using Python 1.5.2 on Solaris 5. I have been tasked to upgrade to Solaris 8, and I am running into problems compiling with Python. Can you tell me if Python 1.5.2 is compatible with Solaris 8? If 1.5.2 is not compatible are any of the newer releases of Python? Thank you for your help. Bernadette Brennan ------- End of Forwarded Message From thomas@xs4all.net Thu Jul 12 15:33:14 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 12 Jul 2001 16:33:14 +0200 Subject: [Python-Dev] Python 2.1.1 Message-ID: <20010712163314.L5396@xs4all.nl> I'm done checking in bugfixes into 2.1.1. I went through all checkins since release21 (using some evil python & shell scriptery to get it in the first place) and caught a few more, today. However, I see two bugs on SF that still bug me, and I'd like to see fixed: [ #425007 ] Python 2.1 installs shared libs with mode 0700 https://sourceforge.net/tracker/index.php?func=detail&aid=425007&group_id=5470&atid=105470 [ #230075 ] dbmmodule build fails on Debian GNU/Linux unstable (Sid) https://sourceforge.net/tracker/index.php?func=detail&aid=230075&group_id=5470&atid=105470 Both of these are distutils-build related, and I'm not sure on the 'right' fix on either. The latter also applies to 'bsddb', by the way, and is especially annoying to me, because I'm running Debian on more and more machines :) Does anyone who understands setup.py have time to look at these before a week from friday, when 2.1.1-final is scheduled ? -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From fdrake@acm.org Thu Jul 12 15:48:03 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Thu, 12 Jul 2001 10:48:03 -0400 (EDT) Subject: [Python-Dev] Docs for 2.1.1c1 frozen Message-ID: <15181.47267.392908.120928@cj42289-a.reston1.va.home.com> I'm freezing the Doc/ tree on the release21-maint branch until the 2.1.1c1 release is out. If you find a bug in that version of the docs, please report it via the SourceForge bug tracker, even if you have checkin permission, at least until the freeze is lifted. Thanks! -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From thomas@xs4all.net Thu Jul 12 15:55:01 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 12 Jul 2001 16:55:01 +0200 Subject: [Python-Dev] "Brennan, Bernadette": Python 1.5.2 & Solaris 8 In-Reply-To: <200107121400.f6CE0XA14567@odiug.digicool.com> References: <200107121400.f6CE0XA14567@odiug.digicool.com> Message-ID: <20010712165501.M5396@xs4all.nl> On Thu, Jul 12, 2001 at 10:00:33AM -0400, Guido van Rossum wrote: > Does anyone remember what the problems with Solaris-8 were? Shallow, > I hope? No clue, sorry. I don't have Solaris 8, either, but I do have access to Solaris 7 (currently, but not for long) and will attempt to build a couple of releases on it. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tim@digicool.com Thu Jul 12 15:57:30 2001 From: tim@digicool.com (Tim Peters) Date: Thu, 12 Jul 2001 10:57:30 -0400 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) In-Reply-To: <3B4D587D.FDB733C0@lemburg.com> Message-ID: [MAL] > Cool, but what about the changes needed in distutils to actually > utilize the new directory Be my guest -- don't know anything about that, and no time to learn. > and the changes to the Windows installer to create the directory > at installation time ? OK, I'll look into that, although it doesn't seem necessary. 
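[Editorial aside: the directory MAL wants distutils to "utilize" can simply be asked of the interpreter. The `sysconfig` module shown here postdates this thread, so this is a modern equivalent of the lookup, not what distutils did in 2001:]

```python
import sysconfig

# "purelib" is the install location for pure-Python packages; on Windows
# this becomes <prefix>\Lib\site-packages once PEP 250 is in effect.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```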
From skip@pobox.com (Skip Montanaro) Thu Jul 12 16:09:36 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 12 Jul 2001 10:09:36 -0500 Subject: [Python-Dev] What are these defines doing in OPT? Message-ID: <15181.48560.980298.911962@beluga.mojam.com> I just tried Fred's distutils.sysconfig.OPT thing, which failed. I then poked around and found distutils.sysconfig.get_config_var. That wouldn't work because I hadn't installed the 2.2 interpreter (there is as yet no /usr/local/lib/python2.2). So, I finally just grepped my Makefile for OPT and found this definition: OPT= -g -O2 -Wall -Wstrict-prototypes -Dss_family=__ss_family -Dss_len=__ss_len What are those -D flags doing in OPT? Shouldn't they be in CPPFLAGS or CFLAGS? Skip From thomas.heller@ion-tof.com Thu Jul 12 16:11:19 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 12 Jul 2001 17:11:19 +0200 Subject: [Distutils] RE: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEE@ukrux002.rundc.uk.origin-it.com> <3B4DAA4C.A6F0859D@lemburg.com> Message-ID: <12c801c10ae4$e79d1fe0$e000a8c0@thomasnotebook> [I've cut down the To: and CC: headers to only include python-dev and distutils] > Well, site.py could be modified to set a symbol in the sys module > which could then be queried by distutils, e.g. sys.extinstallprefix. > > Alternatively, distutils could be made to default to > Lib\site-packages and then revert to Lib\ in case this directory > is not available. > > BTW, I don't think that using Windows registry keys for determining the > installation path is a good idea -- this information should be kept > in the site.py or sitecustomize.py module for easy editing. The problem is that the 'installation path' information must be loaded at run time by the windows installer, and it may not always be successful to embed python at run time and let python code retrieve it.
Remember the problems we had with Python2.0 on win95/98, when win32all was not installed? The installer was not able to compile the installed files to pyc/pyo because of this path bug. Anyway, how does bdist-rpm do it? Should be the same problem there... Thomas From nas@python.ca Thu Jul 12 16:18:09 2001 From: nas@python.ca (Neil Schemenauer) Date: Thu, 12 Jul 2001 08:18:09 -0700 Subject: [Python-Dev] What are these defines doing in OPT? In-Reply-To: <15181.48560.980298.911962@beluga.mojam.com>; from skip@pobox.com on Thu, Jul 12, 2001 at 10:09:36AM -0500 References: <15181.48560.980298.911962@beluga.mojam.com> Message-ID: <20010712081809.A17168@glacier.fnational.com> Skip Montanaro wrote: > OPT= -g -O2 -Wall -Wstrict-prototypes -Dss_family=__ss_family -Dss_len=__ss_len > > What are those -D flags doing in OPT? Shouldn't they be in CPPFLAGS or > CFLAGS? IMHO, they should be in DEFS. Any objections to moving them there? Neil From skip@pobox.com (Skip Montanaro) Thu Jul 12 16:54:26 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 12 Jul 2001 10:54:26 -0500 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <20010712155409.K5396@xs4all.nl> References: <15179.61728.255760.814673@beluga.mojam.com> <15181.41580.258091.412915@beluga.mojam.com> <20010712155409.K5396@xs4all.nl> Message-ID: <15181.51250.50466.870207@beluga.mojam.com> Thomas> Don't forget to do meaningful performance comparisons before and Thomas> after ;P It's that adjective "meaningful" that makes it difficult... Obviously pystone wouldn't be meaningful since it doesn't do much string stuff. I tried timing the following: PYTHONPATH= time ./python -tt ../Lib/test/regrtest.py -l after making sure the .py[co] files were deleted. I got "102.96user 1.47system" before and "103.24user 1.57system" after. I then removed the .py[co] files again and ran the same test under gdb, with breakpoints in each of the three branches whose break commands incremented counters.
After letting it run for *a while*, I got tired of waiting for it to complete (it was in the midst of test___all__). I broke into the debugger then examined the counters. The int/int branch had been taken 5432 times, the string/string branch had been taken 635 times and the else branch 673 times. It would appear that string/string add is perhaps the second-most executed type of add, but that it is executed infrequently enough (at least by the test suite) that special-casing it will have no effect. Still, if you are doing lots of string concatenation, perhaps looking at other methods (append to list, then join the result, for example) would be worthwhile. Skip From trentm@ActiveState.com Thu Jul 12 17:01:13 2001 From: trentm@ActiveState.com (Trent Mick) Date: Thu, 12 Jul 2001 09:01:13 -0700 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) In-Reply-To: ; from tim@digicool.com on Thu, Jul 12, 2001 at 10:57:30AM -0400 References: <3B4D587D.FDB733C0@lemburg.com> Message-ID: <20010712090113.E10387@ActiveState.com> On Thu, Jul 12, 2001 at 10:57:30AM -0400, Tim Peters wrote: > > and the changes to the Windows installer to create the directory > > at installation time ? > > OK, I'll look into that, although it doesn't seem necessary. I have to agree with Tim. If distutils is going to install a package to site-packages then it should create the directory itself if it does not exist. Certainly it should not fail if the directory does not exist. 
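[Editorial aside: a defensive version of what Trent is asking for — create the target directory, never fail if it already exists — is a one-liner in modern Python. `exist_ok` is a later addition; at the time of this thread, distutils' own path-creation helper played this role:]

```python
import os
import tempfile

def ensure_site_packages(prefix):
    """Create <prefix>/Lib/site-packages if missing; succeed if it exists."""
    target = os.path.join(prefix, "Lib", "site-packages")
    os.makedirs(target, exist_ok=True)  # no error when the directory is already there
    return target

# Demonstrate against a throwaway prefix rather than a real installation.
with tempfile.TemporaryDirectory() as prefix:
    created = ensure_site_packages(prefix)
    ensure_site_packages(prefix)  # second call is a harmless no-op
    print(os.path.isdir(created))  # → True
```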
Trent -- Trent Mick TrentM@ActiveState.com From trentm@ActiveState.com Thu Jul 12 17:08:58 2001 From: trentm@ActiveState.com (Trent Mick) Date: Thu, 12 Jul 2001 09:08:58 -0700 Subject: [Distutils] RE: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEB@ukrux002.rundc.uk.origin-it.com>; from Paul.Moore@atosorigin.com on Thu, Jul 12, 2001 at 11:02:28AM +0100 References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEB@ukrux002.rundc.uk.origin-it.com> Message-ID: <20010712090858.F10387@ActiveState.com> On Thu, Jul 12, 2001 at 11:02:28AM +0100, Moore, Paul wrote: > From: M.-A. Lemburg [mailto:mal@lemburg.com] > > I don't have a site-packages dir in my installations. Could it be that > > you installed some distutils package which automagically created one > > or that this change in Python 2.1.1 ? > > I don't believe so. I have ActiveState Python - it's possible (although > unlikely, I would think) that that version creates site-packages specially. The ActivePython 2.1 installer *does* create \Lib\site-packages on Windows. Trent -- Trent Mick TrentM@ActiveState.com From jeremy@alum.mit.edu Thu Jul 12 17:26:50 2001 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 12 Jul 2001 12:26:50 -0400 Subject: [Python-Dev] Python 2.1.1 In-Reply-To: <20010712163314.L5396@xs4all.nl> Message-ID: There's also a nested scopes bug, related to classes that use the same free variable in several classes. Evan Simpson said he posted a SF report about it, but I can't find it. I may be able to look into it. I'd rather not be on the hook for it, but I'm not sure anyone else understands the code :-(.
Jeremy From jeremy@alum.mit.edu Thu Jul 12 17:26:51 2001 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 12 Jul 2001 12:26:51 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15181.51250.50466.870207@beluga.mojam.com> Message-ID: Thomas> Don't forget to do meaningful performance comparisons before and Thomas> after ;P It's that adjective "meaningful" that makes it difficult... >From http://mail.python.org/pipermail/python-dev/2001-May/014911.html: """It looks like the new coercion rules have optimized number ops at the expense of string ops. If you're writing programs with lots of numbers, you probably think that's peachy. If you're parsing HTML, perhaps you don't :-). I looked at the test suite to see how often it is called with non-number arguments. The answer is 77% of the time, but almost all of those calls are from test_unicodedata. If that one test is excluded, the majority of the calls (~90%) are with numbers. But the majority of those calls just come from a few tests -- test_pow, test_long, test_mutants, test_strftime. If I were to do something about the coercions, I would see if there was a way to quickly determine that PyNumber_Add() ain't gonna have any luck. Then we could bail to things like string_concat more quickly.""" Jeremy From thomas@xs4all.net Thu Jul 12 17:33:06 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 12 Jul 2001 18:33:06 +0200 Subject: [Python-Dev] Python 2.1.1 In-Reply-To: References: Message-ID: <20010712183305.S5391@xs4all.nl> On Thu, Jul 12, 2001 at 12:26:50PM -0400, Jeremy Hylton wrote: > There's also a nested scopes bug, related to classes that use the same free > variable in several classes. Evan Simpson said he posted a SF report about > it, but I can't find it. I may be able to look into it. I'd rather not be > on the hook for it, but I'm not sure anyone else understands the code :-(. 
As far as I could determine, that bug is the one you fixed shortly after 2.1-release: ---------------------------- compile.c revision 2.198 date: 2001/04/27 02:29:40; author: jhylton; state: Exp; lines: +20 -6 Fix 2.1 nested scopes crash reported by Evan Simpson The new test case demonstrates the bug. Be more careful in symtable_resolve_free() to add a var to cells or frees only if it won't be added under some other rule. XXX Add new assertion that will catch this bug. ----------------------------- I couldn't reproduce his bugreport using 2.2/2.1.1-with-this-fix, but I could with 2.1-final, so I mentioned that, marked it fixed and closed it. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From skip@pobox.com (Skip Montanaro) Thu Jul 12 17:33:13 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 12 Jul 2001 11:33:13 -0500 Subject: [Python-Dev] Python 2.1.1 In-Reply-To: <20010712163314.L5396@xs4all.nl> References: <20010712163314.L5396@xs4all.nl> Message-ID: <15181.53577.543414.970113@beluga.mojam.com> Thomas> Both of these are distutils-build related, and I'm not sure on Thomas> the 'right' fix on either. The latter also applies to 'bsddb', Thomas> by the way, and is especially annoying to me, because I'm Thomas> running Debian on more and more machines :) Does anyone who Thomas> understands setup.py have time to look at these before a week Thomas> from friday, when 2.1.1-final is scheduled ? I just added another variant (with a patch): bsddb build on Mandrake 8.0 is broken because it doesn't account for the libdb* shared library when creating bsddb.so: https://sourceforge.net/tracker/index.php?func=detail&aid=440725&group_id=5470&atid=105470 Thomas, I'm not sure if this applies to your Debian build woes, but perhaps it will help. -- Skip Montanaro (skip@pobox.com) (847)971-7098 From mal@lemburg.com Thu Jul 12 17:38:41 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 12 Jul 2001 18:38:41 +0200 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) References: <3B4D587D.FDB733C0@lemburg.com> <20010712090113.E10387@ActiveState.com> Message-ID: <3B4DD291.7FB22BC6@lemburg.com> Trent Mick wrote: > > On Thu, Jul 12, 2001 at 10:57:30AM -0400, Tim Peters wrote: > > > and the changes to the Windows installer to create the directory > > > at installation time ? > > > > OK, I'll look into that, although it doesn't seem necessary. > > I have to agree with Tim. If distutils is going to install a package to > site-packages then it should create the directory itself if it does not > exist. Certainly it should not fail if the directory does not exist. I believe that it creates the directory (distutils has a make_path() API for this), but having it there for testing would sure help in figuring out what to do. Please keep in mind that distutils has to work with Python versions 1.5.2, 2.0 and 2.1. Also, I think that it is cleaner to have existing directories on sys.path. Indeed, it may be worthwhile having Python eliminate non-existing dirs at startup time (i.e. in site.py). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Thu Jul 12 17:43:36 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 12 Jul 2001 18:43:36 +0200 Subject: [Distutils] RE: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEE@ukrux002.rundc.uk.origin-it.com> <3B4DAA4C.A6F0859D@lemburg.com> <12c801c10ae4$e79d1fe0$e000a8c0@thomasnotebook> Message-ID: <3B4DD3B8.8A176981@lemburg.com> Thomas Heller wrote: > > [I've cut down the To: and CC: headers to olny include python-dev > and distutils] > > Well, site.py could be modified to set a symbol in the sys module > > which could then be queried by distutils, e.g. sys.extinstallprefix. > > > > Alternatively, distutils could be made to default to > > Lib\site-packages and then revert to Lib\ in case this directory > > is not available. > > > > BTW, I don't think that using Windows registry keys for determining the > > installation path is a good idea -- this information should be kept > > in the site.py or sitecustomize.py module for easy editing. > > The problem is that the 'installation path' information must be > loaded at run time by the windows installer, and it may not always > sucessful to embed python at run time and let python code retrieve it. > Remember the problems we had with Python2.0 on win95/98, when win32all > was not installed? The installer was not able to compile the installed > files to pyc/pyo because of this path bug. Ok. Point taken (this time ;-). > Anyway, how does bdist-rpm does it? Should be the same problem > there... bdist_rpm runs the Python interpreter to figure out the install dirs, etc. at rpm build time. The paths are then hard-coded into the rpm file. 
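[Editorial aside: the build-time query MAL describes — run the interpreter, capture where packages belong, bake the answer into the package — can be sketched in a few lines. This uses the modern `sysconfig` module and the running interpreter as a stand-in for the target one; the real bdist_rpm machinery differs:]

```python
import subprocess
import sys

# Ask a (possibly different) target interpreter where packages belong,
# the way bdist_rpm does at build time, and capture the answer.
cmd = [sys.executable, "-c",
       "import sysconfig; print(sysconfig.get_paths()['purelib'])"]
site_dir = subprocess.check_output(cmd, text=True).strip()
print(site_dir)  # this value would then be hard-coded into the built package
```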
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From thomas@xs4all.net Thu Jul 12 17:43:19 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 12 Jul 2001 18:43:19 +0200 Subject: [Python-Dev] Python 2.1.1 In-Reply-To: <15181.53577.543414.970113@beluga.mojam.com> References: <20010712163314.L5396@xs4all.nl> <15181.53577.543414.970113@beluga.mojam.com> Message-ID: <20010712184319.T5391@xs4all.nl> On Thu, Jul 12, 2001 at 11:33:13AM -0500, Skip Montanaro wrote: > Thomas> Both of these are distutils-build related, and I'm not sure on > Thomas> the 'right' fix on either. The latter also applies to 'bsddb', > Thomas> by the way, and is especially annoying to me, because I'm > Thomas> running Debian on more and more machines :) Does anyone who > Thomas> understands setup.py have time to look at these before a week > Thomas> from friday, when 2.1.1-final is scheduled ? > I just added another variant (with a patch): bsddb build on Mandrake 8.0 is > broken because it doesn't account for the libdb* shared library when > creating bsddb.so: > > https://sourceforge.net/tracker/index.php?func=detail&aid=440725&group_id=5470&atid=105470 > Thomas, I'm not sure if this applies to your Debian build woes, but > perhaps it will help. Yes, it does! Now bsddb builds, but dbmmodule still doesn't. It seems that's because setup.py only checks for libndbm.so, and not for libdbX.so, which also have a DBM implementation (IIRC), or libgdbm.so, which has one too. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
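The multi-name search Thomas is asking for can be sketched in a few lines of Python. This is only an illustration: `ctypes.util.find_library` stands in for distutils' `compiler.find_library_file()`, and the candidate names are examples rather than the exact set setup.py should probe.

```python
from ctypes.util import find_library

def find_dbm_library(candidates=("ndbm", "gdbm", "db")):
    """Return the first DBM-style library found, or None.

    find_library() is a stand-in for distutils'
    compiler.find_library_file(); the candidate names here are
    illustrative, not the definitive list for setup.py.
    """
    for name in candidates:
        if find_library(name):
            return name
    return None

print(find_dbm_library())
```

The result depends on which shared libraries the build host actually has installed, which is exactly the portability problem the thread is wrestling with.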
From thomas@xs4all.net Thu Jul 12 17:50:29 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 12 Jul 2001 18:50:29 +0200 Subject: [Python-Dev] Python 2.1.1 In-Reply-To: <20010712184319.T5391@xs4all.nl> Message-ID: <20010712185029.N5396@xs4all.nl> On Thu, Jul 12, 2001 at 06:43:19PM +0200, Thomas Wouters wrote: > Now bsddb builds, but dbmmodule still doesn't. I should have said 'works'. They both build, dbmmodule just doesn't work: test dbm skipped -- /home/thomas/python/python-2.1.1/dist/src/build/lib.linux-i686-2.1/dbm.so: undefined symbol: dbm_firstkey -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From cgw@transamoeba.dyndns.org Thu Jul 12 18:10:33 2001 From: cgw@transamoeba.dyndns.org (charles g waldman) Date: Thu, 12 Jul 2001 12:10:33 -0500 Subject: [Python-Dev] Re: Python 1.5.2 and Solaris 8 In-Reply-To: References: Message-ID: <15181.55817.853438.785937@transamoeba.dyndns.org> I have access to a Solaris 8 machine, but it's only 32 bits wide. I did a quick build of Py 1.5.2 (which I haven't run for quite some time!) under both the native (SunPro) compiler and also using gcc 2.95.2 I configured --with-thread and ran the test suite. There were no compile-time errors or warnings, but one test failed: test test_popen2 crashed -- exceptions.AssertionError : Traceback (innermost last): File "./Lib/test/regrtest.py", line 204, in runtest __import__(test, globals(), locals(), []) File "./Lib/test/test_popen2.py", line 16, in ? main() File "./Lib/test/test_popen2.py", line 14, in main popen2._test() File "./Lib/popen2.py", line 95, in _test assert not _active AssertionError: This happened with both the gcc and SunPro builds. 
Everything else looks OK, but I did not do any extensive tests beyond "import test.testall" From thomas@xs4all.net Thu Jul 12 18:19:40 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 12 Jul 2001 19:19:40 +0200 Subject: [Python-Dev] Re: Python 1.5.2 and Solaris 8 In-Reply-To: <15181.55817.853438.785937@transamoeba.dyndns.org> References: <15181.55817.853438.785937@transamoeba.dyndns.org> Message-ID: <20010712191940.O5396@xs4all.nl> On Thu, Jul 12, 2001 at 12:10:33PM -0500, charles g waldman wrote: > I have access to a Solaris 8 machine, but it's only 32 bits wide. Weren't there a bunch of 64-bit-system fixes in 2.0/2.1 ? Or were they just for the Windows flavour, where pointers were bigger than longs ? > I did a quick build of Py 1.5.2 (which I haven't run for quite some time!) > under both the native (SunPro) compiler and also using gcc 2.95.2 Could I bug you to do the same thing with Python 2.1.1c1 ? My own attempts on Solaris 7 worked okay, but two things failed: readline, and socket (with SSL support.) The latter works okay without SSL. I suspect that's because both libreadline and libcrypto/libssl are static libraries, not shared ones, and the linker barfs on it, but that's just something I realized on the way home, so I haven't doublechecked it :) All the other modules seem to compile fine, and all tests pass, too. Still, it would be nice to test 2.1.1c1 on as many obscure systems as possible ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From jeremy@alum.mit.edu Thu Jul 12 18:39:25 2001 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 12 Jul 2001 13:39:25 -0400 Subject: [Python-Dev] Python 2.1.1 In-Reply-To: <20010712183305.S5391@xs4all.nl> Message-ID: There was a second bug reported recently. I'll try to dig up the email. Jeremy From fdrake@acm.org Thu Jul 12 21:10:16 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Thu, 12 Jul 2001 16:10:16 -0400 (EDT) Subject: [Distutils] Re: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) In-Reply-To: <3B4DD291.7FB22BC6@lemburg.com> References: <3B4D587D.FDB733C0@lemburg.com> <20010712090113.E10387@ActiveState.com> <3B4DD291.7FB22BC6@lemburg.com> Message-ID: <15182.1064.74187.737679@cj42289-a.reston1.va.home.com> M.-A. Lemburg writes: > I believe that it creates the directory (distutils has a make_path() > API for this), but having it there for testing would sure help > in figuring out what to do. Please keep in mind that distutils > has to work with Python versions 1.5.2, 2.0 and 2.1. Yes; the os.path.isdir(...) seems the right test for this. > Also, I think that it is cleaner to have existing directories > on sys.path. Indeed, it may be worthwhile having Python eliminate > non-existing dirs at startup time (i.e. in site.py). It should be doing that now. If not, please file a bug report and assign it to me. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From tim@digicool.com Thu Jul 12 21:23:13 2001 From: tim@digicool.com (Tim Peters) Date: Thu, 12 Jul 2001 16:23:13 -0400 Subject: [Distutils] Re: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) In-Reply-To: <3B4DD291.7FB22BC6@lemburg.com> Message-ID: FYI, I fiddled the Windows Wise install script to create Lib\site-packages\ Of course this only applies to PythonLabs Windows installers created at or after 2.2a1. From fdrake@acm.org Thu Jul 12 21:46:41 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Thu, 12 Jul 2001 16:46:41 -0400 (EDT) Subject: [Distutils] Re: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) In-Reply-To: <15182.1064.74187.737679@cj42289-a.reston1.va.home.com> References: <3B4D587D.FDB733C0@lemburg.com> <20010712090113.E10387@ActiveState.com> <3B4DD291.7FB22BC6@lemburg.com> <15182.1064.74187.737679@cj42289-a.reston1.va.home.com> Message-ID: <15182.3249.195312.99147@cj42289-a.reston1.va.home.com> Fred L. Drake, Jr. writes: > It should be doing that now. If not, please file a bug report and > assign it to me. Nevermind. It is a bug, and I'm about to check in the fix. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From fredrik@pythonware.com Thu Jul 12 21:54:02 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 12 Jul 2001 22:54:02 +0200 Subject: [Python-Dev] one more thing for 2.2? Message-ID: <005401c10b14$ca994ba0$4ffa42d5@hagrid> has anyone looked at Paul Svensson's "unreserved words" patch? http://mail.python.org/pipermail/python-list/2001-June/047996.html "The bottom line: apply this patch, and you can use all of Python's 'reserved words' as identifiers; in most cases right away, in all other cases by wrapping parens around them." From guido@digicool.com Thu Jul 12 22:02:30 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 12 Jul 2001 17:02:30 -0400 Subject: [Python-Dev] one more thing for 2.2? In-Reply-To: Your message of "Thu, 12 Jul 2001 22:54:02 +0200." <005401c10b14$ca994ba0$4ffa42d5@hagrid> References: <005401c10b14$ca994ba0$4ffa42d5@hagrid> Message-ID: <200107122102.f6CL2V215763@odiug.digicool.com> > has anyone looked at Paul Svensson's "unreserved words" patch? > > http://mail.python.org/pipermail/python-list/2001-June/047996.html > > "The bottom line: apply this patch, and you can use all of Python's > 'reserved words' as identifiers; in most cases right away, in all other > cases by wrapping parens around them." Wow, an impressive hack. But a hack! 
Lots of special casing, and breaks abstractions: the parser driver is supposed to know nothing about the actual grammar embodied in its tables. And it won't help with yield: things like yield (1) yield [1] are as valid in the old syntax as they are with the yield statement added. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas@xs4all.net Thu Jul 12 22:13:11 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 12 Jul 2001 23:13:11 +0200 Subject: [Python-Dev] one more thing for 2.2? In-Reply-To: <200107122102.f6CL2V215763@odiug.digicool.com> Message-ID: <20010712231311.P5396@xs4all.nl> On Thu, Jul 12, 2001 at 05:02:30PM -0400, Guido van Rossum wrote: > > has anyone looked at Paul Svensson's "unreserved words" patch? > > > > http://mail.python.org/pipermail/python-list/2001-June/047996.html > > > > "The bottom line: apply this patch, and you can use all of Python's > > 'reserved words' as identifiers; in most cases right away, in all other > > cases by wrapping parens around them." > Wow, an impressive hack. But a hack! Lots of special casing, and > breaks abstractions: the parser driver is supposed to know nothing > about the actual grammar embodied in its tables. But does it hurt if it does ? It's not like we use it as a general purpose parser right now, and would we really want to use the current parser as a general purpose one ? I have to agree that a nice, clean, powerful parser that can deal better with ambiguities (an LR parser, is that what it's called ? :P) is a much better solution, but in some cases, a hack is better than nothing. > And it won't help with yield: things like > yield (1) > yield [1] > are as valid in the old syntax as they are with the yield statement > added. No, but it will help with bindings to languages that require keywords. .NET comes to mind, again, as does Java. It would also be very cool if we could rename pprint.pprint to pprint.print ;P -- Thomas Wouters Hi! I'm a .signature virus! 
copy me into your .signature file to help me spread! From tim@digicool.com Thu Jul 12 22:22:53 2001 From: tim@digicool.com (Tim Peters) Date: Thu, 12 Jul 2001 17:22:53 -0400 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AEED@ukrux002.rundc.uk.origin-it.com> Message-ID: [Thomas Heller] > The bdist_wininst installer simply installs into prefix, > this is what the registry has under > HKEY_LOCAL_MACHINE\Software\Python\PythonCore\2.1\InstallPath. > > Now what should it do? [Moore, Paul] > What does that key *mean*? Mark Hammond documented it as being the directory into which Python is installed; python.exe lives here. > If it is the directory into which packages should get installed, No; it's much older than the package mechanism <0.1 wink>. > then bdist_wininst should keep doing what it does now, and the > Python installer should be changed to put site-packages into that key. Not its purpose. > If, on the other hand, this key has a meaning elsewhere in Python, Not in the PythonLabs distribution, but I expect Mark's Win32 extensions make use of it. > and changing it would cause a problem, IMO, any change to the registry settings requires Mark Hammond's blessing. > then I would say that this is a bug in the Windows Installer, which > should use a key of its own. Couldn't follow that one. > In that case, my recommendation would be to have the Python 2.2 > installer create a new key and wininst use that if it exists, > ... If you have to use the registry, why not paste Lib/site-packages on to the end of InstallPath and use that? From guido@digicool.com Thu Jul 12 22:28:42 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 12 Jul 2001 17:28:42 -0400 Subject: [Python-Dev] one more thing for 2.2? In-Reply-To: Your message of "Thu, 12 Jul 2001 23:13:11 +0200." 
<20010712231311.P5396@xs4all.nl> References: <20010712231311.P5396@xs4all.nl> Message-ID: <200107122128.f6CLSh615890@odiug.digicool.com> > > > http://mail.python.org/pipermail/python-list/2001-June/047996.html > > > > Wow, an impressive hack. But a hack! Lots of special casing, and > > breaks abstractions: the parser driver is supposed to know nothing > > about the actual grammar embodied in its tables. > > But does it hurt if it does ? It's not like we use it as a general purpose > parser right now, and would we really want to use the current parser as a > general purpose one ? Well, it *is* used to parse its own input. :-) > I have to agree that a nice, clean, powerful parser that can deal > better with ambiguities (an LR parser, is that what it's called ? > :P) Yes, why the :P)? LR parsers deal better with ambiguities at the grammar level -- actually, not so much with real ambiguities, but things that look ambiguous until you've seen more of the input. For example an LR grammar can correctly be told that f(a, b) = 12 is invalid; the current LL parser can't. Therefore this has to be rejected in a separate pass. Currently I believe that's the code generation pass but it could be a separate pass altogether. > is a much better solution, but in some cases, a hack is better than > nothing. Adopting this particular hack means you can never go back. It effectively "unreserves" most keywords most of the time, and that means that you can no longer use other parser technologies to parse Python. E.g. suppose someone has a Yacc-based parser for Python. It would be quite a feat to hack the Yacc driver to do the same retrying that his hack does. I bet it would also require a major effort to get tokenize.py to work correctly again. The hack it effectively makes it impossible to give a specification of the real grammar of the language -- you have to try and see if the parser accepts something or not. > No, but it will help with bindings to languages that require > keywords. 
.NET comes to mind, again, as does Java. It would also be > very cool if we could rename pprint.pprint to pprint.print ;P An approach that might work for this is to pick a FEW keywords (e.g. those that are not reserved words in C or Java or C++) and add those to a FEW places in the grammar. E.g. add a rule

    extended_name: NAME | 'print'   # plus a few others

and then use extended_name instead of NAME in the rules for attribute selection and function definition:

    funcdef: 'def' extended_name parameters ':' suite
    .
    .
    .
    trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' extended_name

This would be unambiguous. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@cj42289-a.reston1.va.home.com Thu Jul 12 22:37:23 2001 From: fdrake@cj42289-a.reston1.va.home.com (Fred Drake) Date: Thu, 12 Jul 2001 17:37:23 -0400 (EDT) Subject: [Python-Dev] [maintenance doc updates] Message-ID: <20010712213723.CE4202892B@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/maint-docs/ Final documentation build for Python 2.1.1 release candidate 1. This version is also available at the Python FTP site: ftp://ftp.python.org/pub/python/doc/2.1.1c1/ From skip@pobox.com (Skip Montanaro) Thu Jul 12 23:08:06 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 12 Jul 2001 17:08:06 -0500 Subject: [Python-Dev] one more thing for 2.2? In-Reply-To: <20010712231311.P5396@xs4all.nl> References: <200107122102.f6CL2V215763@odiug.digicool.com> <20010712231311.P5396@xs4all.nl> Message-ID: <15182.8134.100759.669643@beluga.mojam.com> Thomas> No, but it will help with bindings to languages that require Thomas> keywords. .NET comes to mind, again, as does Java. It would also Thomas> be very cool if we could rename pprint.pprint to pprint.print ;P Or with Python bindings to various external packages we want to wrap. 
They sometimes have function, variable or attribute names that are keywords in Python and must therefore be mangled in one fashion or another. James Henstridge has to add trailing underscores to a number of attribute names in his PyGtk wrappers: "in_", "del_" and "raise_". Another one that always grates on me is "class". "class_" or "klass" both look ugly. The biggest drawback I see is that in some situations people will have to enclose variable names in parens to sneak them by the parser. That seems inelegant to me. I'm not sure I want to explain this to new users. Skip From thomas@xs4all.net Thu Jul 12 23:42:27 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 13 Jul 2001 00:42:27 +0200 Subject: [Python-Dev] one more thing for 2.2? In-Reply-To: <200107122128.f6CLSh615890@odiug.digicool.com> References: <20010712231311.P5396@xs4all.nl> <200107122128.f6CLSh615890@odiug.digicool.com> Message-ID: <20010713004227.R5396@xs4all.nl> On Thu, Jul 12, 2001 at 05:28:42PM -0400, Guido van Rossum wrote: > > I have to agree that a nice, clean, powerful parser that can deal > > better with ambiguities (an LR parser, is that what it's called ? > > :P) > Yes, why the :P)? Because I was guessing, as I know practically naught about parsers and parsing techniques. For instance, I was not aware that a yacc-based parser would be LL(x) (for some small value of x). ':P)' was tongue-in-cheek, followed by a closing parenthesis. > > is a much better solution, but in some cases, a hack is better than > > nothing. > > Adopting this particular hack means you can never go back. It > effectively "unreserves" most keywords most of the time, and that > means that you can no longer use other parser technologies to parse > Python. E.g. suppose someone has a Yacc-based parser for Python. It > would be quite a feat to hack the Yacc driver to do the same retrying > that his hack does. I bet it would also require a major effort to get > tokenize.py to work correctly again. 
[ and ] > An approach that might work for this is to pick a FEW keywords > (e.g. those that are not reserved words in C or Java or C++) and add > those to a FEW places in the grammar. E.g. add a rule > extended_name: NAME | 'print' # plus a few others > > and then use extended_name instead of NAME in the rules for attribute > selection and function definition: > > funcdef: 'def' extended_name parameters ':' suite > . > . > . > trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' extended_name > This would be unambiguous. This has been discussed before. The main problem with this is that no one's done it :) I've done a quick test-hack, but ran into so many unguarded 'STR(node)' calls in compile.c that expected a NAME, not an extended_name, that I gave up. It also wouldn't really alleviate the tokenize.py problem -- if adding a few keywords-as-identifiers is doable, so is adding a lot of them :) And there's the maintenance problem on the Grammar... when adding a new keyword, you need to carefully consider where to allow it. However, it's not like adding a new keyword is done more than once a lustrum ;) But I don't have any real need for keywords as identifiers, so I don't mind if we keep the current limitations. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From greg@cosc.canterbury.ac.nz Fri Jul 13 00:49:04 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Jul 2001 11:49:04 +1200 (NZST) Subject: [Python-Dev] Silly little benchmark In-Reply-To: Message-ID: <200107122349.LAA02015@s454.cosc.canterbury.ac.nz> > """It looks like the new coercion rules have optimized number ops at the > expense of string ops. Is there still an intention to get rid of centralised coercion and move it all into the relevant methods? If that were done, wouldn't problems like this go away (or at least turn into a different set of problems)? 
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From fdrake@acm.org Fri Jul 13 00:50:43 2001 From: fdrake@acm.org (Fred L. Drake) Date: Thu, 12 Jul 2001 19:50:43 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010712235043.5A42D2892B@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Lots of small updates. Added Eric Raymond's documentation for the XML-RPC module added to the standard library. From esr@thyrsus.com Fri Jul 13 01:04:23 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Thu, 12 Jul 2001 20:04:23 -0400 Subject: [Python-Dev] [development doc updates] In-Reply-To: <20010712235043.5A42D2892B@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Thu, Jul 12, 2001 at 07:50:43PM -0400 References: <20010712235043.5A42D2892B@cj42289-a.reston1.va.home.com> Message-ID: <20010712200423.A13553@thyrsus.com> Fred L. Drake : > The development version of the documentation has been updated: > > http://python.sourceforge.net/devel-docs/ > > Lots of small updates. > > Added Eric Raymond's documentation for the XML-RPC module added to > the standard library. Calling the effbot! Calling the effbot! Fredrik, please proofread my stuff and fill in any important bits you think are missing. -- Eric S. Raymond "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -- Benjamin Franklin, Historical Review of Pennsylvania, 1759. 
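The module being documented shipped as xmlrpclib (today's xmlrpc.client). As a quick illustration of what it does — not taken from the docs under review — here is a marshalling round trip; the method name and arguments are made up:

```python
import xmlrpc.client as xmlrpclib  # shipped as "xmlrpclib" in 2001

# Marshal a method call into an XML-RPC payload, then parse it back.
payload = xmlrpclib.dumps((1, "two", [3.0]), methodname="example.echo")
params, method = xmlrpclib.loads(payload)
print(method, params)  # example.echo (1, 'two', [3.0])
```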
From tim.one@home.com Fri Jul 13 01:16:57 2001 From: tim.one@home.com (Tim Peters) Date: Thu, 12 Jul 2001 20:16:57 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <15181.51250.50466.870207@beluga.mojam.com> Message-ID: [Skip Montanaro] > It's that adjective "meaningful" that makes it difficult... Obviously > pystone wouldn't be meaningful since it doesn't do much string stuff. pystone is always meaningful, and *especially* when "it shouldn't" change but does anyway <0.4 wink>. For example, while staring at strings, you may miss that other kinds of + slow down. > I tried timing the following: > > PYTHONPATH= time ./python -tt ../Lib/test/regrtest.py -l > > after making sure the .py[co] files were deleted. I got "102.96user > 1.47system" before and "103.24user 1.57system" after. I'm afraid this is useless except to get the sense of highly significant changes: several of the tests do a varying amount of work depending on results from random.py (which initializes itself from system time when it's first imported). pystone is the only shared "speed benchmark" we have. From guido@digicool.com Fri Jul 13 02:16:16 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 12 Jul 2001 21:16:16 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: Your message of "Fri, 13 Jul 2001 11:49:04 +1200." <200107122349.LAA02015@s454.cosc.canterbury.ac.nz> References: <200107122349.LAA02015@s454.cosc.canterbury.ac.nz> Message-ID: <200107130116.f6D1GGq16019@odiug.digicool.com> > > """It looks like the new coercion rules have optimized number ops at the > > expense of string ops. > > Is there still an intention to get rid of centralised > coercion and move it all into the relevant methods? This has been done (except for complex). > If that were done, wouldn't problems like this go > away (or at least turn into a different set of > problems)? I'm not sure what that remark refers to, actually. 
BINARY_ADD and BINARY_SUBTRACT just test if both args are ints and then in-line the work; BINARY_SUBSCRIPT does the same thing for list[int]. I don't think it has anything to do with coercions. When the operands are strings, the costs are one pointer deref + compare to link-time constant, and one jump (over the inlined code). Small things add up, but I doubt that this is responsible for any particular slow-down. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@alum.mit.edu Fri Jul 13 04:19:57 2001 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 12 Jul 2001 23:19:57 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <200107130116.f6D1GGq16019@odiug.digicool.com> Message-ID: [Greg Ewing:] >> If that were done, wouldn't problems like this go >> away (or at least turn into a different set of >> problems)? [Guido:] >I'm not sure what that remark refers to, actually. > >BINARY_ADD and BINARY_SUBTRACT just test if both args are ints and >then in-line the work; BINARY_SUBSCRIPT does the same thing for >list[int]. I don't think it has anything to do with coercions. When >the operands are strings, the costs are one pointer deref + compare to >link-time constant, and one jump (over the inlined code). > >Small things add up, but I doubt that this is responsible for any >particular slow-down. The big change is the coercion work being done in binary_op1(), which tries to turn strings into numbers in a variety of ways. BINARY_ADD calls PyNumber_Add(), which calls binary_op1(). When the binary_op1() calls fails, it then tries sequence concatenation. If it were possible for binary_op1() to fail quickly for non-numeric sequences like strings, we would not see the slowdown for small string operations. (I believe that's what the silly little benchmark shows and what one of the pybench tests shows.) 
Jeremy From tim.one@home.com Fri Jul 13 04:27:32 2001 From: tim.one@home.com (Tim Peters) Date: Thu, 12 Jul 2001 23:27:32 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <200107130116.f6D1GGq16019@odiug.digicool.com> Message-ID: [Greg Ewing] >> Is there still an intention to get rid of centralised >> coercion and move it all into the relevant methods? [Guido] > This has been done (except for complex). >> If that were done, wouldn't problems like this go >> away (or at least turn into a different set of >> problems)? > I'm not sure what that remark refers to, actually. > > BINARY_ADD and BINARY_SUBTRACT just test if both args are ints and > then in-line the work; BINARY_SUBSCRIPT does the same thing for > list[int]. I don't think it has anything to do with coercions. When > the operands are strings, the costs are one pointer deref + compare to > link-time constant, and one jump (over the inlined code). It's not BINARY_ADD, it's the PyNumber_Add() called by BINARY_ADD, which, given two strings, calls binary_op1, which does a few failing tests, then calls PyNumber_CoerceEx, which fails quickly enough to coerce, and then pokes around a little looking for number methods, and finally says "hmm! maybe it's a sequence?". > Small things add up, but I doubt that this is responsible for any > particular slow-down. Jeremy earlier pinned the blame on this for one of the "dramatic" pybench slowdowns; Skip may or may not have bumped into it again with his "silly little benchmark" (read the Subject line ). I doubt it's responsible for significant real-life slowdowns. 
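The effect under discussion is easy to probe from Python itself. A minimal sketch with timeit — the absolute numbers are machine-dependent, and on a modern interpreter the 2.1-era gap will not reproduce, so no comparison between the two timings is asserted here:

```python
import timeit

# Time small-int addition vs. small-string concatenation. The thread's
# point is that '+' on strings first falls through the numeric path
# (PyNumber_Add -> binary_op1 -> coercion) before trying sequence concat.
n = 200_000
int_add = timeit.timeit("a + b", setup="a = 1; b = 2", number=n)
str_add = timeit.timeit("a + b", setup="a = 'x'; b = 'y'", number=n)
print(f"int add: {int_add:.4f}s  str concat: {str_add:.4f}s")
```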
From greg@cosc.canterbury.ac.nz Fri Jul 13 05:29:38 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Jul 2001 16:29:38 +1200 (NZST) Subject: [Python-Dev] Silly little benchmark In-Reply-To: Message-ID: <200107130429.QAA02078@s454.cosc.canterbury.ac.nz> Tim Peters : > It's not BINARY_ADD, it's the PyNumber_Add() called by BINARY_ADD, which, > given two strings, calls binary_op1, which does a few failing tests, then > calls PyNumber_CoerceEx, which fails quickly enough to coerce, and then > pokes around a little looking for number methods, and finally says "hmm! > maybe it's a sequence?". This seems to contradict what Guido just said about centralised coercion having been removed. Is one or the other of us talking nonsense, or do we misunderstand each other? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From thomas.heller@ion-tof.com Fri Jul 13 09:27:44 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 13 Jul 2001 10:27:44 +0200 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) References: Message-ID: <027201c10b75$b18224a0$e000a8c0@thomasnotebook> From: "Tim Peters" > [Thomas Heller] > > The bdist_wininst installer simply installs into prefix, > > this is what the registry has under > > HKEY_LOCAL_MACHINE\Software\Python\PythonCore\2.1\InstallPath. > > > > Now what should it do? > > [Moore, Paul] > > What does that key *mean*? > > Mark Hammond documented it as being the directory into which Python is > installed; python.exe lives here. > > > If it is the directory into which packages should get installed, > > No; it's much older than the package mechanism <0.1 wink>. > Per _accident_ it is also the location (pre PEP250 time), where packages should get installed. 
> > In that case, my recommendation would be to have the Python 2.2 > > installer create a new key and wininst use that if it exists, > > ... > > If you have to use the registry, why not paste Lib/site-packages on to the > end of InstallPath and use that? The problem is that the same wininst executable should behave differently depending on the policy Python has chosen for the installation directory: Python 2.1 and before: Use prefix, Python 2.2 (and higher) should use prefix/lib/site-packages. That's why I said a (very hacky) solution would be to simply check for the version number at install time. A better solution would be to somehow query site.py at install time? Thomas From thomas@xs4all.net Fri Jul 13 11:57:20 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 13 Jul 2001 12:57:20 +0200 Subject: [Python-Dev] Python 2.1.1 In-Reply-To: <20010712184319.T5391@xs4all.nl> Message-ID: <20010713125720.S5396@xs4all.nl> On Thu, Jul 12, 2001 at 06:43:19PM +0200, Thomas Wouters wrote: > On Thu, Jul 12, 2001 at 11:33:13AM -0500, Skip Montanaro wrote: > > Thomas> Both of these are distutils-build related, and I'm not sure on > > Thomas> the 'right' fix on either. The latter also applies to 'bsddb', > > Thomas> by the way, and is especially annoying to me, because I'm > > Thomas> running Debian on more and more machines :) Does anyone who > > Thomas> understands setup.py have time to look at these before a week > > Thomas> from friday, when 2.1.1-final is scheduled ? > > I just added another variant (with a patch): bsddb build on Mandrake 8.0 is > > broken because it doesn't account for the libdb* shared library when > > creating bsddb.so: > > > > https://sourceforge.net/tracker/index.php?func=detail&aid=440725&group_id=5470&atid=105470 > > Thomas, I'm not sure if this applies to your Debian build woes, but > > perhaps it will help. > Yes, it does! Now bsddb builds, but dbmmodule still doesn't. 
It seems that's > because setup.py only checks for libndbm.so, and not for libdbX.so, which > also have a DBM implementation (IIRC), or libgdbm.so, which has one too. This does fix my problem: Index: setup.py =================================================================== RCS file: /cvsroot/python/python/dist/src/setup.py,v retrieving revision 1.38 diff -c -r1.38 setup.py *** setup.py 2001/04/15 15:16:12 1.38 --- setup.py 2001/07/13 10:51:14 *************** *** 323,331 **** # The standard Unix dbm module: if platform not in ['cygwin']: ! if (self.compiler.find_library_file(lib_dirs, 'ndbm')): exts.append( Extension('dbm', ['dbmmodule.c'], ! libraries = ['ndbm'] ) ) else: exts.append( Extension('dbm', ['dbmmodule.c']) ) --- 323,337 ---- # The standard Unix dbm module: if platform not in ['cygwin']: ! for lib in ('ndbm', 'db', 'db1', 'db2', 'db3', 'dbm'): ! if self.compiler.find_library_file(lib_dirs, lib): ! break ! else: ! lib = None ! ! if lib: exts.append( Extension('dbm', ['dbmmodule.c'], ! libraries = [lib]) ) else: exts.append( Extension('dbm', ['dbmmodule.c']) ) The problem is very simple: distutils does not play well with autoconf. The problem is that I have at least two implementations of 'dbm' available: 'libdbm', which comes with GDBM, and 'libdb1', which comes with libc. Autoconf tries to figure out which include file to use, and it does a decent job, but then distutils goes ahead and just tries to link with 'libndbm', which I don't have. The search path I give above works because I need 'libdb1', but it would still barf if autoconf found a different header than distutils tries to link with. In other words: it's a mess. Distutils should do the include-file-finding *and* the library-file-finding, and pass the right arguments, *or* autoconf should find both the include file and the library file, and pass that info to distutils somehow. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
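The patch above leans on Python's for/else: the else suite runs only when the loop finishes without hitting `break`. A self-contained sketch of the same search pattern — `find_first` and the membership test are stand-ins invented for illustration, not distutils API:

```python
def find_first(candidates, available):
    """Mimic the patch's search: the first candidate present wins."""
    for lib in candidates:
        if lib in available:  # stand-in for compiler.find_library_file()
            break
    else:
        lib = None  # loop exhausted without a break
    return lib

print(find_first(("ndbm", "db", "db1"), {"db1", "gdbm"}))  # db1
print(find_first(("ndbm",), set()))                        # None
```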
From thomas@xs4all.net Fri Jul 13 12:59:31 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 13 Jul 2001 13:59:31 +0200
Subject: [Python-Dev] "Brennan, Bernadette": Python 1.5.2 & Solaris 8
In-Reply-To: <20010712165501.M5396@xs4all.nl>
References: <200107121400.f6CE0XA14567@odiug.digicool.com> <20010712165501.M5396@xs4all.nl>
Message-ID: <20010713135931.U5396@xs4all.nl>

On Thu, Jul 12, 2001 at 04:55:01PM +0200, Thomas Wouters wrote:
> On Thu, Jul 12, 2001 at 10:00:33AM -0400, Guido van Rossum wrote:
> > Does anyone remember what the problems with Solaris-8 were? Shallow, I hope?
> No clue, sorry. I don't have Solaris 8, either, but I do have access to Solaris 7 (currently, but not for long) and will attempt to build a couple of releases on it.

I managed to get it working on Solaris, though I had some problems with readline and socket-with-ssl -- presumably because distutils tried to link against static libraries, not shared ones. However, I just noticed SourceForge has a compilefarm that includes Solaris 8. I compiled Python 2.1.1 and it worked fine.

-- Thomas Wouters  Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From mal@lemburg.com Fri Jul 13 13:03:29 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 14:03:29 +0200
Subject: [Python-Dev] PEP: Defining Unicode Literal Encodings
Message-ID: <3B4EE391.19995171@lemburg.com>

Please comment...

--
PEP: 0263 (?)
Title: Defining Unicode Literal Encodings
Version: $Revision: 1.0 $
Author: mal@lemburg.com (Marc-André Lemburg)
Status: Draft
Type: Standards Track
Python-Version: 2.3
Created: 06-Jun-2001
Post-History:

Abstract

    This PEP proposes to use the PEP 244 statement "directive" to make the encoding used in Unicode string literals u"..." (and their raw counterparts ur"...") definable on a per source file basis.

Problem

    In Python 2.1, Unicode literals can only be written using the Latin-1 based encoding "unicode-escape".
    This makes the programming environment rather unfriendly to Python users who live and work in non-Latin-1 locales such as many of the eastern countries. Programmers can write their 8-bit strings using their favourite encoding, but are bound to the "unicode-escape" encoding for Unicode literals.

Proposed Solution

    I propose to make the Unicode literal encodings (both standard and raw) a per-source file option which can be set using the "directive" statement proposed in PEP 244.

Syntax

    The syntax for the directives is as follows:

    'directive' WS+ 'unicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL
    'directive' WS+ 'rawunicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL

    with the PYTHONSTRINGLITERAL representing the encoding name to be used as a standard Python 8-bit string literal and WS being the whitespace characters [ \t].

Semantics

    Whenever the Python compiler sees such an encoding directive during the compiling process, it updates an internal flag which holds the encoding name used for the specific literal form. The encoding name flags are initialized to "unicode-escape" for u"..." literals and "raw-unicode-escape" for ur"..." literals respectively.

    ISSUE: Maybe we should restrict the directive usage to once per file and additionally to a placement before the first Unicode literal in the source file.

    If the Python compiler has to convert a Unicode literal to a Unicode object, it will pass the 8-bit string data given by the literal to the Python codec registry and have it decode the data using the current setting of the encoding name flag for the requested type of Unicode literal. It then checks that the result of the decoding operation is a Unicode object and stores it in the byte code stream.

Scope

    This PEP only affects Python source code which makes use of the proposed directives. It does not affect the coercion handling of 8-bit strings and Unicode in the given module.

Copyright

    This document has been placed in the public domain.
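The decode-and-check step described under Semantics can be sketched in a few lines. This is a modern-Python illustration only, not the proposed implementation: at the time the literal data would have been an 8-bit string rather than a bytes object, and `str` below plays the role of the Unicode object type:

```python
import codecs

def compile_unicode_literal(raw, encoding="unicode-escape"):
    # The compiler would consult the per-file encoding name flag here;
    # "unicode-escape" is the default the PEP describes.
    decoder = codecs.lookup(encoding).decode
    result, consumed = decoder(raw)
    if not isinstance(result, str):   # "is the result a Unicode object?"
        raise TypeError("codec did not return a Unicode object")
    return result

print(compile_unicode_literal(b"caf\\xe9"))            # -> café
print(compile_unicode_literal(b"caf\xe9", "latin-1"))  # -> café
```

A `directive unicodeencoding = 'latin-1'` line would, per the PEP, simply change the default `encoding` argument for all following u"..." literals in that file.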
Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From mal@lemburg.com Fri Jul 13 13:04:16 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 14:04:16 +0200
Subject: [Python-Dev] PEP: Unicode Indexing Helper Module
Message-ID: <3B4EE3C0.9875AB3D@lemburg.com>

Please comment...

--
PEP: 0262 (?)
Title: Unicode Indexing Helper Module
Version: $Revision: 1.0 $
Author: mal@lemburg.com (Marc-André Lemburg)
Status: Draft
Type: Standards Track
Python-Version: 2.3
Created: 06-Jun-2001
Post-History:

Abstract

    This PEP proposes a new module "unicodeindex" which provides means to index Unicode objects in various higher level abstractions of "characters".

Problem and Terminology

    Unicode objects can be indexed just like string objects, using what in Unicode terms is called a code unit as index basis.

    Code units are the storage entities used by the Unicode implementation to store a single Unicode information unit and do not necessarily map 1-1 to code points, which are the smallest entities encoded by the Unicode standard. These code points can sometimes be composed to form graphemes, which are then displayed by the Unicode output device as one character. A word is then a sequence of characters separated by space characters or punctuation; a line is a sequence of code points separated by line breaking code point sequences.

    For addressing Unicode, there are basically five different methods by which you can reference the data:

    1. per code unit (codeunit)
    2. per code point (codepoint)
    3. per grapheme (grapheme)
    4. per word (word)
    5. per line (line)

    The indexing type name is given in parentheses and is used in the module interface.
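As a rough illustration of the "word" indexing type from the list above, here is a naive sketch that treats whitespace as the only word boundary (real word segmentation under the Unicode standard is far more involved); the function names are merely modeled on the interface the PEP proposes:

```python
# Naive sketch of two interfaces of the proposed "unicodeindex" module,
# for the "word" indexing type only.  Whitespace is treated as the only
# word boundary; real Unicode word segmentation is far more involved.
def word_start(u, index):
    # 1 if u[index] marks the start of a word, else 0
    if u[index].isspace():
        return 0
    if index == 0 or u[index - 1].isspace():
        return 1
    return 0

def next_word(u, index):
    # Unicode object index of the start of the next word after u[index],
    # or -1 in case no next word exists
    for i in range(index + 1, len(u)):
        if word_start(u, i):
            return i
    return -1

s = u"code unit and code point"
print(next_word(s, 0))    # -> 5  (start of "unit")
print(word_start(s, 5))   # -> 1
```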
Proposed Solution

    I propose to add a new module to the standard Python library which provides interfaces implementing the above indexing methods.

Module Interface

    The module should provide the following interfaces for all five indexing styles:

    next_<element>(u, index) -> integer

        Returns the Unicode object index for the start of the next <element> found after u[index] or -1 in case no next element of this type exists.

    prev_<element>(u, index) -> integer

        Returns the Unicode object index for the start of the previous <element> found before u[index] or -1 in case no previous element of this type exists.

    <element>_index(u, n) -> integer

        Returns the Unicode object index for the start of the n-th <element> in u. Raises an IndexError in case no n-th <element> can be found.

    <element>_count(u, index) -> integer

        Counts the number of complete <element>s found in u[:index] and returns the count as integer.

    <element>_start(u, index) -> integer

        Returns 1 or 0 depending on whether u[index] marks the start of an <element>.

    <element>_end(u, index) -> integer

        Returns 1 or 0 depending on whether u[index] marks the end of an <element>.

    Used symbols:

        <element> is one of: codeunit, codepoint, grapheme, word, line
        u is the Unicode object
        index is the Unicode object index
        n is an integer

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From mal@lemburg.com Fri Jul 13 12:39:54 2001
From: mal@lemburg.com (M.-A.
Lemburg) Date: Fri, 13 Jul 2001 13:39:54 +0200 Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP) References: <027201c10b75$b18224a0$e000a8c0@thomasnotebook> Message-ID: <3B4EDE0A.5D7391D6@lemburg.com> > [where to get the installation path from on Windows] > > That's why I said a (very hacky) solution would be to simply check > for the version number at install time. > A better solution would be to somehow query site.py at install time? Ideal would be looking at the sys module for e.g. sys.extinstallpath (which site.py could set). Is the problem of not being able to embed Python at install time really a problem ? After all, if it doesn't work for the installer, how should it work at all in a different setting... Alternatively, the installer could also simply query the install path from the user and suggest the sys.extinstallpath dir as default. The installer should also make sure that the sys.extinstallpath is on the sys.path (if not, the Python user won't be able to use the installed package and should be warned about this). A totally different problem is that of upgrading from the old installation (in Python21\) to a new one (in Python\Lib\site-packages)... but that one is on the extension writer, I guess. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From Samuele Pedroni Fri Jul 13 13:08:19 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Fri, 13 Jul 2001 14:08:19 +0200 (MET DST) Subject: [Python-Dev] descr-branch, ExtensionClasses Message-ID: <200107131208.OAA21211@core.inf.ethz.ch> Hi. Some questions: - What is the probability that descr-branch go in 2.2? - Will those changes obsolete ExtensionClasses on the long run? Why the questions: There is a guy on jython-dev that is trying to port some ExtensionClasses-like functionality to jython. 
Concretely he is fighting with the fact that jython internals are there to make things work, not to enable extensibility in any explicit way. At least their messy side makes me believe that. My plans were to try to mimic as long as possible the new descr logic in jython 2.2, and try to polish all the internals accordingly. So if the answer to both questions is yes, I can promise that to the guy, otherwise I have to be more helpful or diplomatic ...

It's a kind of "political" matter.

Samuele.

From nas@python.ca Fri Jul 13 13:23:49 2001
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 13 Jul 2001 05:23:49 -0700
Subject: [Python-Dev] Python 2.1.1
In-Reply-To: <20010713125720.S5396@xs4all.nl>; from thomas@xs4all.net on Fri, Jul 13, 2001 at 12:57:20PM +0200
References: <20010712184319.T5391@xs4all.nl> <20010713125720.S5396@xs4all.nl>
Message-ID: <20010713052349.A19240@glacier.fnational.com>

Thomas Wouters wrote:
> In other words: it's a mess.

It sure is. You don't want to change the DB implementation used if it worked in 2.1. I believe that different DBs use different storage formats. People would not be happy if they upgraded to a point release and all their DBs broke (i.e. with 2.1 dbm was actually gdbm but with 2.1.1 it is db1).

Neil

From nas@python.ca Fri Jul 13 13:38:40 2001
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 13 Jul 2001 05:38:40 -0700
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: ; from jeremy@alum.mit.edu on Thu, Jul 12, 2001 at 11:19:57PM -0400
References: <200107130116.f6D1GGq16019@odiug.digicool.com>
Message-ID: <20010713053840.B19240@glacier.fnational.com>

Jeremy Hylton wrote:
> The big change is the coercion work being done in binary_op1(), which
> tries to turn strings into numbers in a variety of ways. BINARY_ADD
> calls PyNumber_Add(), which calls binary_op1(). When the binary_op1()
> call fails, it then tries sequence concatenation.
> > If it were possible for binary_op1() to fail quickly for non-numeric > sequences like strings, we would not see the slowdown for small string > operations. (I believe that's what the silly little benchmark shows > and what one of the pybench tests shows.) I had a patch that did this: * Added an ordinal number to some builtin types. All other types had ordinal 0. * Built a 2-D table of binary methods. * Had operations like PyNumber_Add look into this table and use the method there. It turned out to not give much of a speedup but I think the idea is interesting. Neil From thomas@xs4all.net Fri Jul 13 13:48:14 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 13 Jul 2001 14:48:14 +0200 Subject: [Python-Dev] Python 2.1.1 In-Reply-To: <20010713052349.A19240@glacier.fnational.com> References: <20010712184319.T5391@xs4all.nl> <20010713125720.S5396@xs4all.nl> <20010713052349.A19240@glacier.fnational.com> Message-ID: <20010713144814.V5396@xs4all.nl> On Fri, Jul 13, 2001 at 05:23:49AM -0700, Neil Schemenauer wrote: > Thomas Wouters wrote: > > In other words: it's a mess. > It sure is. You don't want to change the DB implementation used if it > worked in 2.1. I believe that different DBs use different storage > formats. People would not be happy if they upgraded to a point release > and all their DBs broke (i.e. with 2.1 dbm was actually gdbm but with > 2.1.1 it is db1). I didn't touch the autoconf code that finds the include file, nor the #ifdef mess in dbmmodule.c that decides which to use, so it could only lead to unrunnable/uncompilable code, not to a new .db silently being used. But I agree that this is not a suitable fix for 2.1.1, I just wish we could fix it better :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
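Neil's ordinal/2-D-table idea from the benchmark thread above can be sketched in a few lines of Python. All names here are invented for illustration; a real implementation would live in C in the interpreter's abstract-object layer:

```python
# Invented-name sketch of Neil's idea: assign some builtin types an
# ordinal, build a 2-D table of binary methods, and dispatch through the
# table instead of probing coercions the way binary_op1() does.
OTHER, INT, STR = 0, 1, 2

def ordinal(obj):
    # a real implementation would store the ordinal on the type object
    if isinstance(obj, int):
        return INT
    if isinstance(obj, str):
        return STR
    return OTHER

# add_table[(left_ordinal, right_ordinal)] -> implementation of "+"
add_table = {
    (INT, INT): lambda a, b: a + b,
    (STR, STR): lambda a, b: a + b,
}

def number_add(a, b):
    impl = add_table.get((ordinal(a), ordinal(b)))
    if impl is None:
        # fast failure: no table entry, no coercion dance
        raise TypeError("unsupported operand types for +")
    return impl(a, b)

print(number_add(1, 2))      # -> 3
print(number_add("a", "b"))  # -> ab
```

The point of the table is the fast failure path: a mismatched pair falls out after two ordinal lookups instead of a series of coercion attempts.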
From nhodgson@bigpond.net.au Fri Jul 13 14:13:40 2001
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Fri, 13 Jul 2001 23:13:40 +1000
Subject: [Python-Dev] PEP: Unicode Indexing Helper Module
References: <3B4EE3C0.9875AB3D@lemburg.com>
Message-ID: <072101c10b9d$a28068e0$0acc8490@neil>

M.-A. Lemburg:
> next_<element>(u, index) -> integer
>
> Returns the Unicode object index for the start of the next
> <element> found after u[index] or -1 in case no next
> element of this type exists.
>
> prev_<element>(u, index) -> integer
> ...

It's not clear to me from the description whether the term "object index" is used for a code unit index or an <element> index. Code unit index seems to make the most sense but this should be explicit.

Neil

From mal@lemburg.com Fri Jul 13 14:44:55 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 15:44:55 +0200
Subject: [Python-Dev] Re: PEP 262: Unicode Indexing Helper Module
Message-ID: <3B4EFB57.1427EF35@lemburg.com>

This is a multi-part message in MIME format.
--------------4273B7E264E4649CF795A2CF
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

> Paul Moore (in private mail):
>
> You have methods for finding
> the start and end of various <element>s, but you don't have a method for
> finding the length of an <element>. In the case of words (which is the one
> I understand :-), the length of a word is not the same as the difference
> between the starts of consecutive words - the intervening whitespace should
> be excluded (at least for some applications). I would suggest
>
> length_<element>(u, index) -> integer
> Returns the length in Unicode objects of the <element> found at u[index]
> or -1 in case u[index] is not in an element of this type (for example, in
> the whitespace between words). [XXX Should this be the number of Unicode
> objects between index and the end of the element, or should it be the length
> from start to end even if you are in the middle?]
>
> or maybe better
>
> nextend_<element>(u, index) -> integer
> Returns the Unicode object index for the end of the next <element> found
> after u[index] or -1 in case no next element of this type exists.
>
> [But that runs into issues when you are in a word - If index is not the
> first Unicode object, nextend is the end of *this* element, whereas next is
> the start of the *next* element. I think I'm starting to show my
> ignorance...]
>
> Even though I suspect my suggested methods are too simplistic, I'd suggest
> at least a comment in the PEP on how to work out the length of the element
> you're in (or why it's hard, and you'd never want to do it :-)...

The two suggested APIs probe into the Unicode object. I think it would be more useful to return the slice (as slice object) which represents the element found at the given index in u, e.g.

    <element>_slice(u, index) -> slice object or None

        Returns the slice pointing to the element found in u at the given index or None in case no such element can be found at that position.

Hmm, I wonder whether slice objects can be "applied" to sequences somehow...
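An editorial note on the closing question: in today's Python, slice objects can indeed be "applied" to a sequence directly, since subscription accepts them, so a *_slice-style API would compose naturally:

```python
# A slice object is applied to a sequence simply by subscripting with it.
u = u"some words here"
s = slice(5, 10)
print(u[s])                              # -> words
print("abcdef"[slice(None, None, 2)])    # -> ace
```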
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

--------------4273B7E264E4649CF795A2CF
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEF5@ukrux002.rundc.uk.origin-it.com>
From: "Moore, Paul"
To: "'mal@lemburg.com'"
Subject: PEP 262: Unicode Indexing Helper Module
Date: Fri, 13 Jul 2001 13:26:52 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"

Excuse me for commenting on an area which I know virtually nothing about, but one point struck me when I saw this PEP.
--------------4273B7E264E4649CF795A2CF--

From mal@lemburg.com Fri Jul 13 14:49:42 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 15:49:42 +0200
Subject: [Python-Dev] PEP: Unicode Indexing Helper Module
References: <3B4EE3C0.9875AB3D@lemburg.com> <072101c10b9d$a28068e0$0acc8490@neil>
Message-ID: <3B4EFC76.606FBC83@lemburg.com>

Neil Hodgson wrote:
>
> M.-A. Lemburg:
> > next_<element>(u, index) -> integer
> >
> > Returns the Unicode object index for the start of the next
> > <element> found after u[index] or -1 in case no next
> > element of this type exists.
> >
> > prev_<element>(u, index) -> integer
> > ...
> > Its not clear to me from the description whether the term "object index" > is used for a code unit index or an index. Code unit index seems > to make the most sense but this should be explicit. Good point. The "Unicode object index" refers to the index you use for slicing or indexing Unicode objects, i.e. like in "u[10]" or "u[12:15]". As such it refers to the Unicode code unit as implemented by the Unicode implementation (and is application specific). I'll add a note to the PEP. Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From sjoerd.mullender@oratrix.com Fri Jul 13 15:27:37 2001 From: sjoerd.mullender@oratrix.com (Sjoerd Mullender) Date: Fri, 13 Jul 2001 16:27:37 +0200 Subject: [Python-Dev] re with Unicode broken? Message-ID: <20010713142737.EDBA8301CF7@bireme.oratrix.nl> This is not for the faint of heart. My validating XML parser doesn't work anymore, even though I didn't change a thing (except update Python from CVS). I use re extensively in this parser, and all my expressions use Unicode extensively. If I replace the Unicode stuff by ASCII, the expression works. The expression which now fails to match is: entity = re.compile(''+_Name+')'+_S+ '(?P'+_EntityVal+'|'+_ExternalId+')|(?P'+_Name+')'+_S+ '(?P'+_EntityVal+'|'+_ExternalId+'(?:'+_S+'NDATA'+_S+_Name+ ')?))'+_opS+'>') and the string it fails on is In order to actually use the above expression, you need some more variable definitions. To assemble this into a working program, first copy and paste the bottom part, then the middle part, and finally the top part. The resulting pattern is huge: (len(entity.pattern) == 15193). First the ones that don't use Unicode. 
_Letter = _BaseChar + _Ideographic _NameChar = '-' + _Letter + _Digit + '._:' + _CombiningChar + _Extender _S = '[ \t\r\n]+' # white space _opS = '[ \t\r\n]*' # optional white space _Name = '['+_Letter+'_:]['+_NameChar+']*' # XML Name ref = '&(?:(?P'+_Name+')|#(?P(?:[0-9]+|x[0-9a-fA-F]+)));' _QStr = "(?:'[^']*'|\"[^\"]*\")" # quoted XML string _EntityVal = '"(?:[^"&%]|'+ref+'|%'+_Name+';)*"|' \ "'(?:[^'&%]|"+ref+"|%"+_Name+";)*'" _SystemLiteral = '(?P'+_QStr+')' _PublicLiteral = '(?P"[-\'()+,./:=?;!*#@$_%% \n\ra-zA-Z0-9]*"|' \ "'[-()+,./:=?;!*#@$_%% \n\ra-zA-Z0-9]*')" _ExternalId = '(?:SYSTEM|PUBLIC'+_S+_PublicLiteral+')'+_S+_SystemLiteral The ASCII versions of the Unicode strings are (if you use these definitions the re matches): _BaseChar = 'A-Za-z' _Ideographic = '' _Digit = '0-9' _CombiningChar = '' _Extender = '' and the Unicode versions (the ones that I actually use and that now fail): # The character sets below are taken directly from the XML spec. _BaseChar = u'\u0041-\u005A\u0061-\u007A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF' \ u'\u0100-\u0131\u0134-\u013E\u0141-\u0148\u014A-\u017E' \ u'\u0180-\u01C3\u01CD-\u01F0\u01F4-\u01F5\u01FA-\u0217' \ u'\u0250-\u02A8\u02BB-\u02C1\u0386\u0388-\u038A\u038C' \ u'\u038E-\u03A1\u03A3-\u03CE\u03D0-\u03D6\u03DA\u03DC\u03DE' \ u'\u03E0\u03E2-\u03F3\u0401-\u040C\u040E-\u044F\u0451-\u045C' \ u'\u045E-\u0481\u0490-\u04C4\u04C7-\u04C8\u04CB-\u04CC' \ u'\u04D0-\u04EB\u04EE-\u04F5\u04F8-\u04F9\u0531-\u0556\u0559' \ u'\u0561-\u0586\u05D0-\u05EA\u05F0-\u05F2\u0621-\u063A' \ u'\u0641-\u064A\u0671-\u06B7\u06BA-\u06BE\u06C0-\u06CE' \ u'\u06D0-\u06D3\u06D5\u06E5-\u06E6\u0905-\u0939\u093D' \ u'\u0958-\u0961\u0985-\u098C\u098F-\u0990\u0993-\u09A8' \ u'\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09DC-\u09DD\u09DF-\u09E1' \ u'\u09F0-\u09F1\u0A05-\u0A0A\u0A0F-\u0A10\u0A13-\u0A28' \ u'\u0A2A-\u0A30\u0A32-\u0A33\u0A35-\u0A36\u0A38-\u0A39' \ u'\u0A59-\u0A5C\u0A5E\u0A72-\u0A74\u0A85-\u0A8B\u0A8D' \ u'\u0A8F-\u0A91\u0A93-\u0AA8\u0AAA-\u0AB0\u0AB2-\u0AB3' 
\ u'\u0AB5-\u0AB9\u0ABD\u0AE0\u0B05-\u0B0C\u0B0F-\u0B10' \ u'\u0B13-\u0B28\u0B2A-\u0B30\u0B32-\u0B33\u0B36-\u0B39\u0B3D' \ u'\u0B5C-\u0B5D\u0B5F-\u0B61\u0B85-\u0B8A\u0B8E-\u0B90' \ u'\u0B92-\u0B95\u0B99-\u0B9A\u0B9C\u0B9E-\u0B9F\u0BA3-\u0BA4' \ u'\u0BA8-\u0BAA\u0BAE-\u0BB5\u0BB7-\u0BB9\u0C05-\u0C0C' \ u'\u0C0E-\u0C10\u0C12-\u0C28\u0C2A-\u0C33\u0C35-\u0C39' \ u'\u0C60-\u0C61\u0C85-\u0C8C\u0C8E-\u0C90\u0C92-\u0CA8' \ u'\u0CAA-\u0CB3\u0CB5-\u0CB9\u0CDE\u0CE0-\u0CE1\u0D05-\u0D0C' \ u'\u0D0E-\u0D10\u0D12-\u0D28\u0D2A-\u0D39\u0D60-\u0D61' \ u'\u0E01-\u0E2E\u0E30\u0E32-\u0E33\u0E40-\u0E45\u0E81-\u0E82' \ u'\u0E84\u0E87-\u0E88\u0E8A\u0E8D\u0E94-\u0E97\u0E99-\u0E9F' \ u'\u0EA1-\u0EA3\u0EA5\u0EA7\u0EAA-\u0EAB\u0EAD-\u0EAE\u0EB0' \ u'\u0EB2-\u0EB3\u0EBD\u0EC0-\u0EC4\u0F40-\u0F47\u0F49-\u0F69' \ u'\u10A0-\u10C5\u10D0-\u10F6\u1100\u1102-\u1103\u1105-\u1107' \ u'\u1109\u110B-\u110C\u110E-\u1112\u113C\u113E\u1140\u114C' \ u'\u114E\u1150\u1154-\u1155\u1159\u115F-\u1161\u1163\u1165' \ u'\u1167\u1169\u116D-\u116E\u1172-\u1173\u1175\u119E\u11A8' \ u'\u11AB\u11AE-\u11AF\u11B7-\u11B8\u11BA\u11BC-\u11C2\u11EB' \ u'\u11F0\u11F9\u1E00-\u1E9B\u1EA0-\u1EF9\u1F00-\u1F15' \ u'\u1F18-\u1F1D\u1F20-\u1F45\u1F48-\u1F4D\u1F50-\u1F57\u1F59' \ u'\u1F5B\u1F5D\u1F5F-\u1F7D\u1F80-\u1FB4\u1FB6-\u1FBC\u1FBE' \ u'\u1FC2-\u1FC4\u1FC6-\u1FCC\u1FD0-\u1FD3\u1FD6-\u1FDB' \ u'\u1FE0-\u1FEC\u1FF2-\u1FF4\u1FF6-\u1FFC\u2126\u212A-\u212B' \ u'\u212E\u2180-\u2182\u3041-\u3094\u30A1-\u30FA\u3105-\u312C' \ u'\uAC00-\uD7A3' _Ideographic = u'\u4E00-\u9FA5\u3007\u3021-\u3029' _CombiningChar = u'\u0300-\u0345\u0360-\u0361\u0483-\u0486\u0591-\u05A1\u05A3-\u05B9' \ u'\u05BB-\u05BD\u05BF\u05C1-\u05C2\u05C4\u064B-\u0652\u0670' \ u'\u06D6-\u06DC\u06DD-\u06DF\u06E0-\u06E4\u06E7-\u06E8' \ u'\u06EA-\u06ED\u0901-\u0903\u093C\u093E-\u094C\u094D' \ u'\u0951-\u0954\u0962-\u0963\u0981-\u0983\u09BC\u09BE\u09BF' \ u'\u09C0-\u09C4\u09C7-\u09C8\u09CB-\u09CD\u09D7\u09E2-\u09E3' \ u'\u0A02\u0A3C\u0A3E\u0A3F\u0A40-\u0A42\u0A47-\u0A48' \ 
u'\u0A4B-\u0A4D\u0A70-\u0A71\u0A81-\u0A83\u0ABC\u0ABE-\u0AC5' \ u'\u0AC7-\u0AC9\u0ACB-\u0ACD\u0B01-\u0B03\u0B3C\u0B3E-\u0B43' \ u'\u0B47-\u0B48\u0B4B-\u0B4D\u0B56-\u0B57\u0B82-\u0B83' \ u'\u0BBE-\u0BC2\u0BC6-\u0BC8\u0BCA-\u0BCD\u0BD7\u0C01-\u0C03' \ u'\u0C3E-\u0C44\u0C46-\u0C48\u0C4A-\u0C4D\u0C55-\u0C56' \ u'\u0C82-\u0C83\u0CBE-\u0CC4\u0CC6-\u0CC8\u0CCA-\u0CCD' \ u'\u0CD5-\u0CD6\u0D02-\u0D03\u0D3E-\u0D43\u0D46-\u0D48' \ u'\u0D4A-\u0D4D\u0D57\u0E31\u0E34-\u0E3A\u0E47-\u0E4E\u0EB1' \ u'\u0EB4-\u0EB9\u0EBB-\u0EBC\u0EC8-\u0ECD\u0F18-\u0F19\u0F35' \ u'\u0F37\u0F39\u0F3E\u0F3F\u0F71-\u0F84\u0F86-\u0F8B' \ u'\u0F90-\u0F95\u0F97\u0F99-\u0FAD\u0FB1-\u0FB7\u0FB9' \ u'\u20D0-\u20DC\u20E1\u302A-\u302F\u3099\u309A' _Digit = u'\u0030-\u0039\u0660-\u0669\u06F0-\u06F9\u0966-\u096F\u09E6-\u09EF' \ u'\u0A66-\u0A6F\u0AE6-\u0AEF\u0B66-\u0B6F\u0BE7-\u0BEF' \ u'\u0C66-\u0C6F\u0CE6-\u0CEF\u0D66-\u0D6F\u0E50-\u0E59' \ u'\u0ED0-\u0ED9\u0F20-\u0F29' _Extender = u'\u00B7\u02D0\u02D1\u0387\u0640\u0E46\u0EC6\u3005\u3031-\u3035' \ u'\u309D-\u309E\u30FC-\u30FE' -- Sjoerd Mullender From mal@lemburg.com Fri Jul 13 15:41:18 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 13 Jul 2001 16:41:18 +0200 Subject: [Python-Dev] re with Unicode broken? References: <20010713142737.EDBA8301CF7@bireme.oratrix.nl> Message-ID: <3B4F088E.14D3B2D3@lemburg.com> [re failing with CVS sre and Unicode} I believe Fredrik checked in some changes to sre which affected the handling of Unicode character ranges. Could this be related to what you are seeing ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From fredrik@pythonware.com Fri Jul 13 15:44:22 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 13 Jul 2001 16:44:22 +0200 Subject: [Python-Dev] re with Unicode broken? 
References: <20010713142737.EDBA8301CF7@bireme.oratrix.nl>
Message-ID: <002501c10baa$4ea3fb80$0900a8c0@spiff>

sjoerd wrote:
> This is not for the faint of heart.
>
> My validating XML parser doesn't work anymore, even though I didn't
> change a thing (except update Python from CVS).

when did you last update without problems? the likely cause for this is MvL's "big char set" patch, which I checked in on July 6.

here's a workaround: tweak sre_compile.py so it doesn't generate BIGCHARSET op codes. in _optimize_charset, change this:

    except IndexError:
        # character set contains unicode characters
        return _optimize_unicode(charset, fixup)
    # compress character map

to

    except IndexError:
        # character set contains unicode characters
        return charset # WORKAROUND: no compression
    # compress character map

I'll look into this over the weekend.

Cheers /F

From guido@digicool.com Fri Jul 13 15:59:10 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 13 Jul 2001 10:59:10 -0400
Subject: [Python-Dev] descr-branch, ExtensionClasses
In-Reply-To: Your message of "Fri, 13 Jul 2001 14:08:19 +0200." <200107131208.OAA21211@core.inf.ethz.ch>
References: <200107131208.OAA21211@core.inf.ethz.ch>
Message-ID: <200107131459.f6DExAv16504@odiug.digicool.com>

> Hi. Some questions:
>
> - What is the probability that descr-branch go in 2.2?

At least 75%. I'm planning to release 2.2a1 next week from descr-branch. If this is well received, I'll merge the descr-branch into the main trunk.

> - Will those changes obsolete ExtensionClasses on the long run?

Yes, that's the whole point.

> Why the questions:
>
> There is a guy on jython-dev that is trying to port some
> ExtensionClasses-like functionality to jython.

Let him use the design from descr-branch instead (PEP 252 and 253 are much more up to date now).

> Concretely he is fighting with the fact that jython internals are
> there to make things work, not to enable extensibility in any
> explicit way. At least their messy side make me believe that.
I'll have to trust you there, I'm not familiar with Jython internals.

> My plans were to try to mimic as long as possible the new descr
> logic in jython 2.2, and try to polish all the internals
> accordingly.

Sounds like a good plan.

> So if the answer to both questions is yes, I can promise that to the
> guy, otherwise I have to be more helpful or diplomatic ...
>
> It's a kind of "political" matter. Samuele.

I'd say that even if descr-branch doesn't make it into 2.2, it will make it into the next release, so by all means study the design and tell me if it has any problems for Jython!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From thomas@xs4all.net Fri Jul 13 16:08:55 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 13 Jul 2001 17:08:55 +0200
Subject: [Python-Dev] Solaris 8
Message-ID: <20010713170855.W5396@xs4all.nl>

After much frobbing, I managed to compile Python using the SUNpro compiler as well as gcc. gcc was no problem, aside from the inability to link with static libraries as if they were shared, but the Sun compiler (which is an optional, fairly expensive piece of software, IIRC :) is a nasty little thing. It defaults to a half-ANSI, half-K&R mode with Sun extensions, that has broken thread support and refuses to compile most of the modules because of ANSI-style token-stringification and -concatenation ('#x' and 'x ## y'). Adding '-mt' to the flags, as suggested by the README, didn't help much.

Passing '-Xa' to the compiler switches it into an ANSI-compliant-with-Sun-extensions mode, and though threads were still broken, I managed to compile Python and most of the modules. And it remarkably passed all the tests it could find: all 6 of them. For some reason (I couldn't figure it out for the life of me) 'readdir' silently chopped off the first two characters of the entry name, causing the 'findtests' function in the regrtest to not find any tests besides the standard ones.
Sounds like a mismatch between include-files and structs actually used by the operating system, but none of the manual pages hinted at anything like it.

Finally, '-Xc' turns it into a strictly ANSI compiler, though apparently not as strict as 'gcc -ansi': it compiles with only a few warnings, passes all tests, and with '-mt' it even had working thread support! There seems to be only one oddness: audioop.so uses sqrt() without being linked to libm, though why this isn't an issue on other systems, I'm not sure.

I've added a blob to the 2.1.1 README to mention this all (but not in time for the 2.1.1c1 release), and will add it to the 2.2 one as soon as I've tested that tree on Solaris, too.

-- Thomas Wouters  Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From guido@digicool.com Fri Jul 13 16:33:54 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 13 Jul 2001 11:33:54 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: Your message of "Fri, 13 Jul 2001 16:29:38 +1200." <200107130429.QAA02078@s454.cosc.canterbury.ac.nz>
References: <200107130429.QAA02078@s454.cosc.canterbury.ac.nz>
Message-ID: <200107131533.f6DFXsV16565@odiug.digicool.com>

> Tim Peters :
>
> > It's not BINARY_ADD, it's the PyNumber_Add() called by BINARY_ADD, which,
> > given two strings, calls binary_op1, which does a few failing tests, then
> > calls PyNumber_CoerceEx, which fails quickly enough to coerce, and then
> > pokes around a little looking for number methods, and finally says "hmm!
> > maybe it's a sequence?".
>
> This seems to contradict what Guido just said about
> centralised coercion having been removed. Is one or
> the other of us talking nonsense, or do we misunderstand
> each other?

It's complicated. I didn't know everything that was going on when I wrote that before. Now I've seen a bit more.
PyNumber_CoerceEx() is called in order to accommodate old-style numbers for backwards compatibility (and for complex, which hasn't been converted to new-style yet). We could add a new-style numeric add operation to strings so that s1+s2 takes an earlier path in binary_op1(). I also note that binary_op1() tries PyNumber_CoerceEx() even when both arguments have a NULL tp_as_number pointer -- at the cost of extra tests the call to PyNumber_CoerceEx() could be avoided. (I guess binary_op1() could add such a test at the top and save itself some work.) --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Fri Jul 13 16:34:25 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 13 Jul 2001 17:34:25 +0200 Subject: [Python-Dev] Possible solution for PEP250 and bdist_wininst Message-ID: <025d01c10bb1$4c0f69c0$e000a8c0@thomasnotebook> I have a possible solution for this problem. (I'll use the name INSTALLPATH for installation directory stored in the registry under the key HKEY_LOCAL_MACHINE\Software\Python\PythonCore\\InstallPath). The bdist_wininst installer at _install_ time sets the PYTHONHOME environment variable to INSTALLPATH, then loads the python dll and retrieves the 'extinstallpath' attribute from the sys module: wsprintf(buffer, "PYTHONHOME=%s", INSTALLPATH); _putenv(buffer); Py_SetProgramName(modulename); Py_Initialize(); pextinstallpath = PySys_GetObject("extinstallpath"); Py_Finalize(); If this is successful, the (string contents of) pextinstallpath is appended to INSTALLPATH, and that will be the directory where the package will be installed. If unsuccessful, INSTALLPATH will be used as before.
I'm unsure about the change to site.py, but this should work: diff -c -r1.26 site.py *** site.py 2001/03/23 17:53:49 1.26 --- site.py 2001/07/13 15:32:27 *************** *** 140,153 **** "python" + sys.version[:3], "site-packages"), makepath(prefix, "lib", "site-python")] - elif os.sep == ':': - sitedirs = [makepath(prefix, "lib", "site-packages")] else: ! sitedirs = [prefix] for sitedir in sitedirs: if os.path.isdir(sitedir): addsitedir(sitedir) # Define new built-ins 'quit' and 'exit'. # These are simply strings that display a hint on how to exit. if os.sep == ':': --- 140,154 ---- "python" + sys.version[:3], "site-packages"), makepath(prefix, "lib", "site-python")] else: ! sitedirs = [prefix, os.path.join(prefix, "lib", "site-packages")] for sitedir in sitedirs: if os.path.isdir(sitedir): addsitedir(sitedir) + if os.sep == '\\': + sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages") + # Define new built-ins 'quit' and 'exit'. # These are simply strings that display a hint on how to exit. if os.sep == ':': If anyone cares, I can post the diffs for the bdist_wininst sources. Thomas From thomas.heller@ion-tof.com Fri Jul 13 16:45:04 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 13 Jul 2001 17:45:04 +0200 Subject: [Python-Dev] Possible solution for PEP250 and bdist_wininst References: <025d01c10bb1$4c0f69c0$e000a8c0@thomasnotebook> Message-ID: <02ee01c10bb2$c9460ce0$e000a8c0@thomasnotebook> > I'm unsure about the change to site.py, but this should work: This was wrong, of course. Sorry for the confusion, should simply be: diff -c -r1.30 site.py *** site.py 2001/07/12 21:08:33 1.30 --- site.py 2001/07/13 15:43:49 *************** *** 151,156 **** --- 151,159 ---- if os.path.isdir(sitedir): addsitedir(sitedir) + if os.sep == ':': + sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages") + del dirs_in_sys_path # Define new built-ins 'quit' and 'exit'. 
Thomas From Paul.Moore@atosorigin.com Fri Jul 13 16:46:09 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Fri, 13 Jul 2001 16:46:09 +0100 Subject: [Python-Dev] RE: Possible solution for PEP250 and bdist_wininst Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEF7@ukrux002.rundc.uk.origin-it.com> From: Thomas Heller [mailto:thomas.heller@ion-tof.com] > I have a possible solution for this problem. [Description cut] Sounds OK to me, but if I knew much about this area I'd have covered it in the PEP :-) One question: Should sys.extinstallpath be set for all platforms? Clearly, nothing but Windows will use it at present, but is there a meaningful value it could have on other platforms? If so, exposing it uniformly seems sensible. Paul. From sjoerd.mullender@oratrix.com Fri Jul 13 16:54:07 2001 From: sjoerd.mullender@oratrix.com (Sjoerd Mullender) Date: Fri, 13 Jul 2001 17:54:07 +0200 Subject: [Python-Dev] re with Unicode broken? In-Reply-To: Your message of Fri, 13 Jul 2001 16:44:22 +0200. <002501c10baa$4ea3fb80$0900a8c0@spiff> References: <20010713142737.EDBA8301CF7@bireme.oratrix.nl> <002501c10baa$4ea3fb80$0900a8c0@spiff> Message-ID: <20010713155407.CCCBE301CF7@bireme.oratrix.nl> On Fri, Jul 13 2001 "Fredrik Lundh" wrote: > sjoerd wrote: > > > This is not for the faint of heart. > > > > My validating XML parser doesn't work anymore, even though I didn't > > change a thing (except update Python from CVS). > > when did you last update without problems? I have no idea. I update regularly (only on the main branch), but I don't run the program very often. > the likely cause for this is MvL's "big char set" patch, which > I checked in on July 6. > > here's a workaround: tweak sre_compile.py so it doesn't generate > BIGCHARSET op codes.
in _optimize_charset, change this: > > except IndexError: > # character set contains unicode characters > return _optimize_unicode(charset, fixup) > # compress character map > > to > > except IndexError: > # character set contains unicode characters > return charset # WORKAROUND: no compression > # compress character map > > I'll look into this over the weekend. Yes, this works. While you're looking at this, maybe you can also look at speeding up stuff? :-) Importing the module with my XML parser takes an inordinate amount of time. This is entirely due to compiling all the regular expressions. There are a lot of them, and since many of them use the _Name pattern that I included in my previous message, they tend to be big. Unfortunately, I can't use any abbreviations that re might provide for Unicode character sets, since then I don't know for sure that my expressions are compatible with the XML definition. Maybe it's possible to add a way of saving precompiled expressions in the Python file? -- Sjoerd Mullender From thomas.heller@ion-tof.com Fri Jul 13 16:58:17 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 13 Jul 2001 17:58:17 +0200 Subject: [Python-Dev] Re: [Distutils] RE: Possible solution for PEP250 and bdist_wininst References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEF7@ukrux002.rundc.uk.origin-it.com> Message-ID: <03fc01c10bb4$a2754de0$e000a8c0@thomasnotebook> > From: Thomas Heller [mailto:thomas.heller@ion-tof.com] > > I have a possible solution for this problem. > [Description cut] > > Sounds OK to me, but if I knew much about this area I'd have covered it in > the PEP :-) > > One question: Should sys.extinstallpath be set for all platforms? Cleanrly, > nothing but Windows will use it at present, but is there a meaningful value > it could have on other platforms? If so, exposing it uniformly seems > sensible. This must be answered by other people, I only use windows. 
If it would be exposed uniformly, probably distutils itself should also use it. Thomas From guido@digicool.com Fri Jul 13 17:41:47 2001 From: guido@digicool.com (Guido van Rossum) Date: Fri, 13 Jul 2001 12:41:47 -0400 Subject: [Python-Dev] Python book reviewers wanted Message-ID: <200107131641.f6DGfmA16706@odiug.digicool.com> Prentice Hall needs reviewers... --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Forwarded Message Date: 13 Jul 2001 12:39:19 -0400 From: Kristen_Blanco@prenhall.com To: webmaster@python.org Subject: Python reviewers I am writing from Prentice Hall publishing and I am seeking reviewers for an upcoming publication. We are one of the largest college textbook publishers in the US. We are publishing a book entitled "Python How to Program" by Harvey and Paul Deitel. They are premier programming language authors, with the best-selling C++ and Java books in the college marketplace. More information on their suite of publications can be found here: http://www.prenhall.com/deitel We are presently seeking qualified technical reviewers to verify that the Deitels' coverage of Python in their forthcoming book is accurate. In return, we are offering a token honorarium. Might you be willing to participate? If not, could you perhaps suggest a colleague? If you are interested, or have any questions, please contact my colleague, Crissy Statuto, at Crissy_Statuto@prenhall.com Thank you in advance for your assistance and consideration. Sincerely, Crissy Statuto Crissy Statuto Project Manager, Computer Science Prentice Hall One Lake Street- #3F54 Upper Saddle River, NJ 07458 ------- End of Forwarded Message From mal@lemburg.com Fri Jul 13 18:20:54 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Fri, 13 Jul 2001 19:20:54 +0200 Subject: [Python-Dev] Re: Possible solution for PEP250 and bdist_wininst References: <025d01c10bb1$4c0f69c0$e000a8c0@thomasnotebook> Message-ID: <3B4F2DF6.85158478@lemburg.com> Thomas Heller wrote: > > I have a possible solution for this problem. > > (I'll use the name INSTALLPATH for installation directory stored > in the registry under the key > HKEY_LOCAL_MACHINE\Software\Python\PythonCore\\InstallPath). > > The bdist_wininst installer at _install_ time sets the PYTHONHOME > environment variable to INSTALLPATH, then loads the python dll > and retrieves the 'extinstallpath' attribute from the sys module: > > wwsprintf(buffer, "PYTHONHOME=%s", INSTALLPATH); > _putenv(buffer); > Py_SetProgramName(modulename); > Py_Initialize(); > pextinstallpath = PySys_GetObject("extinstallpath"); > Py_Finalize(); > > If this is successful, the (string contents of) pextinstallpath > is appended to INSTALLPATH, and that will be the directory where > the package will be installed. If unsuccessful, INSTALLPATH will > be used as before. Sounds OK. > I'm unsure about the change to site.py, but this should work: > > diff -c -r1.26 site.py > *** site.py 2001/03/23 17:53:49 1.26 > --- site.py 2001/07/13 15:32:27 > *************** > *** 140,153 **** > "python" + sys.version[:3], > "site-packages"), > makepath(prefix, "lib", "site-python")] > - elif os.sep == ':': > - sitedirs = [makepath(prefix, "lib", "site-packages")] > else: > ! sitedirs = [prefix] > for sitedir in sitedirs: > if os.path.isdir(sitedir): > addsitedir(sitedir) > > # Define new built-ins 'quit' and 'exit'. > # These are simply strings that display a hint on how to exit. > if os.sep == ':': > --- 140,154 ---- > "python" + sys.version[:3], > "site-packages"), > makepath(prefix, "lib", "site-python")] > else: > ! 
sitedirs = [prefix, os.path.join(prefix, "lib", "site-packages")] > for sitedir in sitedirs: > if os.path.isdir(sitedir): > addsitedir(sitedir) > > + if os.sep == '\\': > + sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages") > + Why not do this for all platforms (which support site-packages) ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Fri Jul 13 18:21:42 2001 From: guido@digicool.com (Guido van Rossum) Date: Fri, 13 Jul 2001 13:21:42 -0400 Subject: [Python-Dev] Python 2.1.1c1 released Message-ID: <200107131721.f6DHLgX16757@odiug.digicool.com> I'm happy to announce the release today of Python 2.1.1c1, a release candidate for Python 2.1.1: http://www.python.org/2.1.1/ This is a pure bugfix release; see the website for details. One fixed "bug" deserves special attention: this release is GPL-compatible. I hope it's in time for inclusion in the Debian release. Thanks to Thomas Wouters for all his work on making this a perfect candidate, despite today's date. :-) The final 2.1.1 release is expected a week from now. Enjoy! --Guido van Rossum (home page: http://www.python.org/~guido/) From esr@thyrsus.com Fri Jul 13 18:49:19 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Fri, 13 Jul 2001 13:49:19 -0400 Subject: [Python-Dev] Python 2.1.1c1 released In-Reply-To: <200107131721.f6DHLgX16757@odiug.digicool.com>; from guido@digicool.com on Fri, Jul 13, 2001 at 01:21:42PM -0400 References: <200107131721.f6DHLgX16757@odiug.digicool.com> Message-ID: <20010713134919.B7279@thyrsus.com> Guido van Rossum : > This is a pure bugfix release; see the website for details. One fixed > "bug" deserves special attention: this release is GPL-compatible. > I hope it's in time for inclusion in the Debian release. Should be. 
They haven't even settled their freeze policy yet, let alone declared a freeze. I've been tracking this because I'm coming up on a fetchmail-5.9.0 stable release and want to get that in, too. -- Eric S. Raymond Freedom begins between the ears. -- Edward Abbey From thomas.heller@ion-tof.com Fri Jul 13 19:03:06 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 13 Jul 2001 20:03:06 +0200 Subject: [Python-Dev] Re: Possible solution for PEP250 and bdist_wininst References: <025d01c10bb1$4c0f69c0$e000a8c0@thomasnotebook> <3B4F2DF6.85158478@lemburg.com> Message-ID: <050a01c10bc6$11e401b0$e000a8c0@thomasnotebook> From: "M.-A. Lemburg" [about setting sys.extinstallpath in site.py] > > Why not do this for all platforms (which support site-packages) ? Would probably make sense. But in this case, distutils should also use this setting. Thomas From paulp@ActiveState.com Fri Jul 13 20:03:36 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Fri, 13 Jul 2001 12:03:36 -0700 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings References: <3B4EE391.19995171@lemburg.com> Message-ID: <3B4F4608.207DBA7B@ActiveState.com> "M.-A. Lemburg" wrote: > > Please comment... I think that there should be a single directive for: * unicode strings * 8-bit strings * comments If a user uses UTF-8 for 8-bit strings and Shift-JIS for Unicode, there is basically no text editor in the world that is going to do the right thing. And it isn't possible for a web server to properly associate an encoding. In general, it isn't a useful configuration. Also, no matter what the directive says, I think that \uXXXX should continue to work. Just as in 8-bit strings, it should be possible to mix and match direct encoded input and backslash-escaped characters. Sometimes one is convenient (because of your keyboard setup) and sometimes the other is convenient. This proposal exists only to improve typing convenience so we should go all the way and allow both. 
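Paul's point above, that a backslash escape and a directly encoded character should denote the same text, is easy to demonstrate. The sketch below uses modern Python 3 syntax purely for illustration; the thread itself concerns Python 2's u"..." literals:

```python
# The same string spelled three ways: a \uXXXX escape, a \xHH escape,
# and the directly encoded character. All three literals are resolved
# at compile time to identical string contents.
escaped_u = "caf\u00e9"
escaped_x = "caf\xe9"
direct = "café"

print(escaped_u == escaped_x == direct)  # True
print(len(escaped_u))                    # 4: the escape is one character
```

Mixing the two spellings inside one literal, as Paul asks for, then only requires that the escape-resolution pass run after (and independently of) the encoding pass.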
I strongly think we should restrict the directive to one per file and in fact I would say it should be one of the first two lines. It should be immediately following the shebang line if there is one. This is to allow text editors to detect it as they detect XML encoding declarations. My opinions are influenced by the fact that I've helped implement Unicode support in a Python/XML editor. XML makes it easy to give the user a good experience. Python could too if we are careful. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From guido@digicool.com Fri Jul 13 20:16:21 2001 From: guido@digicool.com (Guido van Rossum) Date: Fri, 13 Jul 2001 15:16:21 -0400 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings In-Reply-To: Your message of "Fri, 13 Jul 2001 12:03:36 PDT." <3B4F4608.207DBA7B@ActiveState.com> References: <3B4EE391.19995171@lemburg.com> <3B4F4608.207DBA7B@ActiveState.com> Message-ID: <200107131916.f6DJGLU16857@odiug.digicool.com> > I strongly think we should restrict the directive to one per file and in > fact I would say it should be one of the first two lines. It should be > immediately following the shebang line if there is one. This is to allow > text editors to detect it as they detect XML encoding declarations. Hm, then the directive would syntactically have to *precede* the docstring. That currently doesn't work -- the docstring may only be preceded by blank lines and comments. Lots of tools for processing docstrings already have this built into them. Is it worth breaking them so that editors can remain stupid?
--Guido van Rossum (home page: http://www.python.org/~guido/) From paulp@ActiveState.com Fri Jul 13 20:38:49 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Fri, 13 Jul 2001 12:38:49 -0700 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings References: <3B4EE391.19995171@lemburg.com> <3B4F4608.207DBA7B@ActiveState.com> <200107131916.f6DJGLU16857@odiug.digicool.com> Message-ID: <3B4F4E49.C299C356@ActiveState.com> Guido van Rossum wrote: > >... > > Hm, then the directive would syntactically have to *precede* the > docstring. It makes sense for the directive to precede the docstring because the directive should be able to change the definition of the docstring! > That currently doesn't work -- the docstring may only be > preceded by blank lines and comments. Lots of tools for processing > docstrings already have this built into them. The directive statement is inherently a backwards incompatible extension. It is a grammar change. Many tools sniff out the docstring from the loaded module anyhow. > Is it worth breaking > them so that editors can remain stupid? I would say that the more important consideration is that it just makes sense to figure out what encoding you are using before you start processing strings! -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From skip@pobox.com (Skip Montanaro) Fri Jul 13 20:41:39 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 13 Jul 2001 14:41:39 -0500 Subject: [Python-Dev] my nomination for quote-of-the-week Message-ID: <15183.20211.386837.739757@beluga.mojam.com> This gets my vote for quote-of-the-week. Andrew, I seem to recall you are collecting this sort of stuff. From: quinn@yak.ugcs.caltech.edu (Quinn Dunkan) To: python-list@python.org Subject: Re: not safe at all Date: 13 Jul 2001 19:12:51 GMT ... The static people talk about rigorously enforced interfaces, correctness proofs, contracts, etc. 
The dynamic people talk about rigorously enforced testing and say that types only catch a small portion of possible errors. The static people retort that they don't trust tests to cover everything or not have bugs and why write tests for stuff the compiler should test for you, so you shouldn't rely on *only* tests, and besides static types don't catch a small portion, but a large portion of errors. The dynamic people say no program or test is perfect and static typing is not worth the cost in language complexity and design difficulty for the gain in eliminating a few tests that would have been easy to write anyway, since static types catch a small portion of errors, not a large portion. The static people say static types don't add that much language complexity, and it's not design "difficulty" but an essential part of the process, and they catch a large portion, not a small portion. The dynamic people say they add enormous complexity, and they catch a small portion, and point out that the static people have bad breath. The static people assert that the dynamic people must be too stupid to cope with a real language and rigorous requirements, and are ugly besides. This is when both sides start throwing rocks. ... Skip From guido@digicool.com Fri Jul 13 21:25:48 2001 From: guido@digicool.com (Guido van Rossum) Date: Fri, 13 Jul 2001 16:25:48 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: Your message of "Fri, 13 Jul 2001 11:33:54 EDT." <200107131533.f6DFXsV16565@odiug.digicool.com> References: <200107130429.QAA02078@s454.cosc.canterbury.ac.nz> <200107131533.f6DFXsV16565@odiug.digicool.com> Message-ID: <200107132025.f6DKPmo16935@odiug.digicool.com> Here's a patch to abstract.c that does to binary_op1() what I had in mind. My own attempts at timing this only serve to confuse me, but I'm sure the experts will be able to assess it. I think it may make pystone about 1% faster. 
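The slot probing that Guido describes in binary_op1() has a visible Python-level counterpart: each operand's method is tried in turn, and returning NotImplemented sends the interpreter on to the next candidate. A minimal, purely illustrative sketch (the Frac class here is a hypothetical toy type, not anything from the patch):

```python
class Frac:
    """Toy fraction type showing binary-operator dispatch."""

    def __init__(self, num, den):
        self.num, self.den = num, den

    def __add__(self, other):
        if isinstance(other, Frac):
            return Frac(self.num * other.den + other.num * self.den,
                        self.den * other.den)
        if isinstance(other, int):
            return Frac(self.num + other * self.den, self.den)
        # Like a slot returning Py_NotImplemented: tell the interpreter
        # to go on and try the other operand's method.
        return NotImplemented

    __radd__ = __add__  # addition is symmetric for this toy type

left = Frac(1, 2) + 1        # Frac.__add__ handles it directly
right = 1 + Frac(1, 2)       # int.__add__ gives up, Frac.__radd__ is tried
print(left.num, left.den)    # 3 2
print(right.num, right.den)  # 3 2
```

The fast path in the patch is essentially the C equivalent of noticing up front that neither operand defines any of these methods at all.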
Note that this assumes that a type object only sets the NEW_STYLE_NUMBER flag when it has a non-NULL tp_as_number structure pointer. This makes sense, but just to be sure I add an assert(). In a bizarre twist of benchmarking, if I comment the asserts out, pystone is 1% *slower* than without the patch.... I guess I'm going to ignore that. Enjoy. --Guido van Rossum (home page: http://www.python.org/~guido/) Index: abstract.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Objects/abstract.c,v retrieving revision 2.60.2.5 diff -c -r2.60.2.5 abstract.c *** abstract.c 2001/07/07 22:55:30 2.60.2.5 --- abstract.c 2001/07/13 20:14:01 *************** *** 318,324 **** { PyObject *x; binaryfunc *slot; ! if (v->ob_type->tp_as_number != NULL && NEW_STYLE_NUMBER(v)) { slot = NB_BINOP(v->ob_type->tp_as_number, op_slot); if (*slot) { x = (*slot)(v, w); --- 318,334 ---- { PyObject *x; binaryfunc *slot; ! ! /* Quick test if anything down here could work */ ! if (v->ob_type->tp_as_number == NULL && ! w->ob_type->tp_as_number == NULL) ! { ! Py_INCREF(Py_NotImplemented); ! return Py_NotImplemented; ! } ! ! if (NEW_STYLE_NUMBER(v)) { ! assert (v->ob_type->tp_as_number != NULL); slot = NB_BINOP(v->ob_type->tp_as_number, op_slot); if (*slot) { x = (*slot)(v, w); *************** *** 331,337 **** goto binop_error; } } ! if (w->ob_type->tp_as_number != NULL && NEW_STYLE_NUMBER(w)) { slot = NB_BINOP(w->ob_type->tp_as_number, op_slot); if (*slot) { x = (*slot)(v, w); --- 341,348 ---- goto binop_error; } } ! if (NEW_STYLE_NUMBER(w)) { ! 
assert (w->ob_type->tp_as_number != NULL); slot = NB_BINOP(w->ob_type->tp_as_number, op_slot); if (*slot) { x = (*slot)(v, w); From fredrik@pythonware.com Fri Jul 13 21:30:45 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 13 Jul 2001 22:30:45 +0200 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings References: <3B4EE391.19995171@lemburg.com> <3B4F4608.207DBA7B@ActiveState.com> Message-ID: <012e01c10bda$b3927320$4ffa42d5@hagrid> paul wrote: > I think that there should be a single directive for: > > * unicode strings > * 8-bit strings > * comments I'd say "the entire program". > If a user uses UTF-8 for 8-bit strings and Shift-JIS for Unicode, there > is basically no text editor in the world that is going to do the right > thing. And it isn't possible for a web server to properly associate an > encoding. In general, it isn't a useful configuration. exactly. any proposal that assumes that different parts of a text file is going to use different encodings is seriously flawed, and totally ignorant of reality. things just don't work that way. From tim.one@home.com Fri Jul 13 22:20:12 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 13 Jul 2001 17:20:12 -0400 Subject: [Python-Dev] RE: Defining Unicode Literal Encodings In-Reply-To: <3B4EE391.19995171@lemburg.com> Message-ID: [M.-A. Lemburg] > PEP: 0263 (?) > Title: Defining Unicode Literal Encodings > Version: $Revision: 1.0 $ > Author: mal@lemburg.com (Marc-André Lemburg) > Status: Draft > Type: Standards Track > Python-Version: 2.3 > Created: 06-Jun-2001 > Post-History: Since this depends on PEP 244, it should also have a Requires: 244 header line. > ... > ... can be set using the "directive" statement proposed in PEP 244. 
> > The syntax for the directives is as follows: > > 'directive' WS+ 'unicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL > 'directive' WS+ 'rawunicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL PEP 244 doesn't allow these spellings: at most one atom is allowed after the directive name, and = "whatever" isn't an atom. Remove the '=' and PEP 244 is happy, though. If you want to keep the "=", PEP 244 has to change. > ... [Guido] > Hm, then the directive would syntactically have to *precede* the > docstring. That currently doesn't work -- the docstring may only be > preceded by blank lines and comments. Lots of tools for processing > docstrings already have this built into them. Is it worth breaking > them so that editors can remain stupid? No. From fredrik@pythonware.com Fri Jul 13 22:44:36 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 13 Jul 2001 23:44:36 +0200 Subject: [Python-Dev] RE: Defining Unicode Literal Encodings References: Message-ID: <001b01c10be5$0479c4f0$4ffa42d5@hagrid> tim wrote: > [Guido] > > Hm, then the directive would syntactically have to *precede* the > > docstring. That currently doesn't work -- the docstring may only be > > preceded by blank lines and comments. Lots of tools for processing > > docstrings already have this built into them. Is it worth breaking > > them so that editors can remain stupid? > > No. that's why the "directive" statement shouldn't be used as an encoding directive. (and since I don't see any other use for it, that's also why the "directive" statement doesn't belong in Python at all ;-) From mal@lemburg.com Fri Jul 13 22:56:40 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 13 Jul 2001 23:56:40 +0200 Subject: [Python-Dev] RE: Defining Unicode Literal Encodings References: Message-ID: <3B4F6E98.733B90DC@lemburg.com> Tim Peters wrote: >=20 > [M.-A. Lemburg] > > PEP: 0263 (?) 
> > Title: Defining Unicode Literal Encodings > > Version: $Revision: 1.0 $ > > Author: mal@lemburg.com (Marc-André Lemburg) > > Status: Draft > > Type: Standards Track > > Python-Version: 2.3 > > Created: 06-Jun-2001 > > Post-History: > Since this depends on PEP 244, it should also have a > Requires: 244 > header line. Ok, I'll add that. > > ... > > ... can be set using the "directive" statement proposed in PEP 244. > > > > The syntax for the directives is as follows: > > > > 'directive' WS+ 'unicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL > > 'directive' WS+ 'rawunicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL > PEP 244 doesn't allow these spellings: at most one atom is allowed after > the directive name, and > = "whatever" > isn't an atom. Remove the '=' and PEP 244 is happy, though. If you want to > keep the "=", PEP 244 has to change. True... would that pose a problem ? [Paul] > I think that there should be a single directive for: > * unicode strings > * 8-bit strings > * comments > If a user uses UTF-8 for 8-bit strings and Shift-JIS for Unicode, there > is basically no text editor in the world that is going to do the right > thing. And it isn't possible for a web server to properly associate an > encoding. In general, it isn't a useful configuration. Please don't mix 8-bit strings with Unicode literals: 8-bit strings don't carry any encoding information, so providing encoding information cannot be stored anywhere. Comments, OTOH, are part of the program text, so they have to be ASCII just like the Python source itself. Note that it doesn't make sense to use a non-ASCII superset for the Unicode literal encoding (as you and others have noted). Since all builtin Python encodings are ASCII-supersets, this shouldn't pose much of a problem, though ;-) > Also, no matter what the directive says, I think that \uXXXX should > continue to work.
Just as in 8-bit strings, it should be possible to mix and match direct encoded input and backslash-escaped characters. > Sometimes one is convenient (because of your keyboard setup) and > sometimes the other is convenient. This proposal exists only to improve > typing convenience so we should go all the way and allow both. Hmm, good point, but hard to implement. We'd probably need a two phase decoding for this to work: 1. decode the given Unicode literal encoding 2. decode any Unicode escapes in the Unicode string > I strongly think we should restrict the directive to one per file and in > fact I would say it should be one of the first two lines. It should be > immediately following the shebang line if there is one. This is to allow > text editors to detect it as they detect XML encoding declarations. > > My opinions are influenced by the fact that I've helped implement > Unicode support in a Python/XML editor. XML makes it easy to give the > user a good experience. Python could too if we are careful. I think that allowing one directive per file is the way to go, but I'm not sure about the exact position. Basically, I think it should go "near" the top, but not necessarily before any doc-string in the file. > [Guido] > > Hm, then the directive would syntactically have to *precede* the > > docstring. That currently doesn't work -- the docstring may only be > > preceded by blank lines and comments. Lots of tools for processing > > docstrings already have this built into them. Is it worth breaking > > them so that editors can remain stupid? > No. Agreed. Note that the PEP doesn't require the directive to be placed before the doc-string. That point is still open. Technically, the compiler will only need to know about the encoding before the first Unicode literal in the source file.
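The two-phase decoding MAL describes above can be sketched in a few lines of modern Python; decode_literal below is a hypothetical helper for illustration, not a proposed API:

```python
def decode_literal(raw, source_encoding):
    """Hypothetical two-phase decode of a Unicode literal's bytes.

    Phase 1 interprets the raw bytes in the declared source encoding;
    phase 2 resolves \\uXXXX and similar backslash escapes.
    """
    text = raw.decode(source_encoding)  # phase 1: declared encoding
    # The unicode_escape codec consumes bytes, so round-trip through
    # raw_unicode_escape, which leaves backslash escapes untouched.
    return text.encode("raw_unicode_escape").decode("unicode_escape")

s = decode_literal(b"caf\\u00e9", "utf-8")
print(s == "caf\u00e9")  # True: the escape survived phase 1 intact
```

One wrinkle of this sketch: a backslash that is *not* meant as an escape would still be consumed by phase 2, which hints at why combining the two mechanisms is harder than it looks.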
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From fredrik@pythonware.com Fri Jul 13 23:10:42 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 14 Jul 2001 00:10:42 +0200 Subject: [Python-Dev] RE: Defining Unicode Literal Encodings References: <3B4F6E98.733B90DC@lemburg.com> Message-ID: <004c01c10be8$af2face0$4ffa42d5@hagrid> M.-A. Lemburg wrote: > Please don't mix 8-bit strings with Unicode literals: 8-bit > strings don't carry any encoding information, so providing > encoding information cannot be stored anywhere. doesn't change a thing: the SOURCE CODE still has an encoding. I'm strongly -1 on your proposal. it's not representing current best practices (xml, java), and it's not future proof. we can do better. From mal@lemburg.com Fri Jul 13 23:21:32 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 14 Jul 2001 00:21:32 +0200 Subject: [Python-Dev] PEP: Defining Unicode Literal Encodings (revision 1.1) References: <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid> Message-ID: <3B4F746C.827BD177@lemburg.com> Here's an updated version which clarifies some issues... -- PEP: 0263 (?) Title: Defining Unicode Literal Encodings Version: $Revision: 1.1 $ Author: mal@lemburg.com (Marc-André Lemburg) Status: Draft Type: Standards Track Python-Version: 2.3 Created: 06-Jun-2001 Post-History: Requires: 244 Abstract This PEP proposes to use the PEP 244 statement "directive" to make the encoding used in Unicode string literals u"..." (and their raw counterparts ur"...") definable on a per source file basis. Problem In Python 2.1, Unicode literals can only be written using the Latin-1 based encoding "unicode-escape". This makes the programming environment rather unfriendly to Python users who live and work in non-Latin-1 locales such as many of the Asian countries.
Programmers can write their 8-bit strings using their favourite encoding, but are bound to the "unicode-escape" encoding for Unicode literals. Proposed Solution I propose to make the Unicode literal encodings (both standard and raw) a per-source file option which can be set using the "directive" statement proposed in PEP 244 in a slightly extended form (by adding the '=' between the directive name and its value). Syntax The syntax for the directives is as follows: 'directive' WS+ 'unicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL 'directive' WS+ 'rawunicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL with the PYTHONSTRINGLITERAL representing the encoding name to be used as standard Python 8-bit string literal and WS being the whitespace characters [ \t]. Semantics Whenever the Python compiler sees such an encoding directive during the compiling process, it updates an internal flag which holds the encoding name used for the specific literal form. The encoding name flags are initialized to "unicode-escape" for u"..." literals and "raw-unicode-escape" for ur"..." respectively. ISSUE: Maybe we should restrict the directive usage to once per file and additionally to a placement before the first Unicode literal in the source file. (Comments suggest that this approach suits the goal best.) If the Python compiler has to convert a Unicode literal to a Unicode object, it will pass the 8-bit string data given by the literal to the Python codec registry and have it decode the data using the current setting of the encoding name flag for the requested type of Unicode literal. It then checks the result of the decoding operation for being an Unicode object and stores it in the byte code stream. Since Python source code is defined to be ASCII, the Unicode literal encodings (both standard and raw) should be supersets of ASCII and match the encoding used elsewhere in the program text, e.g.
in comments and maybe even 8-bit strings (even though their encoding is only implicit and completely under the programmer's control). It is the responsibility of the programmer to choose reasonable encodings. Scope This PEP only affects Python source code which makes use of the proposed directives. It does not affect the coercion handling of 8-bit strings and Unicode in the given module. Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil End: -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From paulp@ActiveState.com Fri Jul 13 23:46:02 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Fri, 13 Jul 2001 15:46:02 -0700 Subject: [Python-Dev] RE: Defining Unicode Literal Encodings References: <3B4F6E98.733B90DC@lemburg.com> Message-ID: <3B4F7A2A.5C29C909@ActiveState.com> "M.-A. Lemburg" wrote: > > .... > > Please don't mix 8-bit strings with Unicode literals: 8-bit > strings don't carry any encoding information, so any provided encoding > information cannot be stored anywhere. First, we could store the information if we want. Second, whether we choose to store the information or not, the point is that the source file should not mix encodings. > Comments, OTOH, are part of the program text, so they have to be ASCII > just like the Python source itself. The Python interpreter allows non-ASCII characters in comments. > Hmm, good point, but hard to implement. We'd probably need a two > phase decoding for this to work: > > 1. decode the given Unicode literal encoding > 2. decode any Unicode escapes in the Unicode string That doesn't sound so hard. :) > I think that allowing one directive per file is the way to go, > but I'm not sure about the exact position.
Basically, I think it > should go "near" the top, but not necessarily before any doc-string > in the file. If Guido is violently opposed to having it before the docstring then we could allow it either before or after the docstring to give tools time to catch up. I'm not sure what tools in particular have the problem, though. Any tool that uses introspection or inspect.py will be fine. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From skip@pobox.com (Skip Montanaro) Sat Jul 14 00:50:21 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 13 Jul 2001 18:50:21 -0500 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) In-Reply-To: <3B4F746C.827BD177@lemburg.com> References: <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid> <3B4F746C.827BD177@lemburg.com> Message-ID: <15183.35133.264728.399408@beluga.mojam.com> mal> Here's an updated version which clarifies some issues... ... mal> I propose to make the Unicode literal encodings (both standard mal> and raw) a per-source file option which can be set using the mal> "directive" statement proposed in PEP 244 in a slightly mal> extended form (by adding the '=' between the directive name and mal> its value). I think you need to motivate the need for a different syntax than is defined in PEP 244. I didn't see any obvious reason why the '=' is required. Also, how do you propose to address /F's objections, particularly that the directive can't syntactically appear before the module's docstring (where it makes sense that the module author would logically want to use a non-default encoding)?
Skip From paulp@ActiveState.com Sat Jul 14 01:23:43 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Fri, 13 Jul 2001 17:23:43 -0700 Subject: [Python-Dev] RE: Defining Unicode Literal Encodings References: <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid> Message-ID: <3B4F910F.9A1BEC73@ActiveState.com> Fredrik Lundh wrote: > >... > > doesn't change a thing: the SOURCE CODE still has an > encoding. > > I'm strongly -1 on your proposal. > > it's not representing current best practices (xml, java), > and it's not future proof. we can do better. I think that with minor tweaks, the PEP can be a real step forward from where we are. I was disappointed with Guido's quick dismissal because I do think we have a problem in that people can send around Python programs with a bunch of encoded text without any declaration. Neither text editors nor even the Python interpreter itself knows how to display that information on someone else's machine. Having a declaration would be a big step towards breaking the implicit dependence of those files on their "home" machines. For the declaration to have the effect I hope for, it would have to be file-scoped and apply to all binary data in the file. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From guido@digicool.com Sat Jul 14 02:25:55 2001 From: guido@digicool.com (Guido van Rossum) Date: Fri, 13 Jul 2001 21:25:55 -0400 Subject: [Python-Dev] RE: Defining Unicode Literal Encodings In-Reply-To: Your message of "Fri, 13 Jul 2001 17:23:43 PDT." <3B4F910F.9A1BEC73@ActiveState.com> References: <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid> <3B4F910F.9A1BEC73@ActiveState.com> Message-ID: <200107140125.f6E1PtG17067@odiug.digicool.com> > I was disappointed with Guido's quick dismissal Huh????!!! I haven't dismissed anything. I just said I saw a problem. Don't be so quick to jump to conclusions. :-) I'm still watching the discussion...
--Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Sat Jul 14 04:20:22 2001 From: fdrake@acm.org (Fred L. Drake) Date: Fri, 13 Jul 2001 23:20:22 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010714032022.74B0B28927@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ From mal@lemburg.com Sat Jul 14 12:32:10 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 14 Jul 2001 13:32:10 +0200 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) References: <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid> <3B4F746C.827BD177@lemburg.com> <15183.35133.264728.399408@beluga.mojam.com> Message-ID: <3B502DBA.761D4F69@lemburg.com> Skip Montanaro wrote: > > mal> Here's an updated version which clarifies some issues... > ... > mal> I propose to make the Unicode literal encodings (both standard > mal> and raw) a per-source file option which can be set using the > mal> "directive" statement proposed in PEP 244 in a slightly > mal> extended form (by adding the '=' between the directive name and > mal> its value). > > I think you need to motivate the need for a different syntax than is defined > in PEP 244. I didn't see any obvious reason why the '=' is required. I'm not picky about the '='; if people don't want it, I'll happily drop it from the PEP. The only reason I think it may be worthwhile adding it is because it simply looks right: directive unicodeencoding = 'latin-1' rather than directive unicodeencoding 'latin-1' (Note that internally this will set a flag to a value, so the assigning character of '=' seems to fit in nicely.)
> Also, how do you propose to address /F's objections, particularly that the > directive can't syntactically appear before the module's docstring (where it > makes sense that the module author would logically want to use a non-default > encoding)? Guido hinted at the problem of breaking code, Tim objected to requiring this. I don't see the need to use Unicode literals as module doc-strings, so I think the problem is not a real one (8-bit strings can be written using any encoding just like you can now). Still, if people would like to use Unicode literals for module doc-strings, then they should place the directive *before* the doc-string, accepting that this could break some tools (the PEP currently does not restrict the placement of the directive). Alternatively, we could allow placing the directive into a comment, e.g. #!/usr/local/python #directive unicodeencoding = 'utf-8' u""" This is a Unicode doc-string """ About Fredrik's idea that the source code should only use one encoding: Well, that's possible with the proposed directive, since only Unicode literals carry data for which Python is encoding-aware and all other parts are under the programmer's control, e.g. #!/usr/local/python """ Module Docs... """ directive unicodeencoding = 'latin-1' ... u = "Héllô Wörld !" ... will give you pretty much what Fredrik asked for. Note that since Python does not assign encoding information to 8-bit strings, comments etc., the only parts in a Python program for which the programmer must explicitly tell Python which encoding to assume are the Unicode literals. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Sat Jul 14 12:45:10 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Sat, 14 Jul 2001 13:45:10 +0200 Subject: [Python-Dev] RE: Defining Unicode Literal Encodings References: <3B4F6E98.733B90DC@lemburg.com> <3B4F7A2A.5C29C909@ActiveState.com> Message-ID: <3B5030C6.E244ADC2@lemburg.com> Paul Prescod wrote: > > "M.-A. Lemburg" wrote: > > > > .... > > > > Please don't mix 8-bit strings with Unicode literals: 8-bit > > strings don't carry any encoding information, so any provided encoding > > information cannot be stored anywhere. > > First, we could store the information if we want. > > Second, whether we choose to store the information or not, the point is > that the source file should not mix encodings. I have added a new paragraph to the PEP (see my rev. 1.1 posting) pointing out that it is the programmer's responsibility to choose reasonable encodings; in particular, the used encodings should be compatible so that a text editor can display the data correctly. > > Comments, OTOH, are part of the program text, so they have to be ASCII > > just like the Python source itself. > > The Python interpreter allows non-ASCII characters in comments. > > > Hmm, good point, but hard to implement. We'd probably need a two > > phase decoding for this to work: > > > > 1. decode the given Unicode literal encoding > > 2. decode any Unicode escapes in the Unicode string > > That doesn't sound so hard. :) True. The issue here is very similar to standard literals vs. raw ones. Perhaps step 2 should only be imposed on standard literals while raw ones stop after step 1. > > I think that allowing one directive per file is the way to go, > > but I'm not sure about the exact position. Basically, I think it > > should go "near" the top, but not necessarily before any doc-string > > in the file. > > If Guido is violently opposed to having it before the docstring then we > could allow it either before or after the docstring to give tools time > to catch up. > > I'm not sure what tools in particular have the problem, though.
Any tool > that uses introspection or inspect.py will be fine. See my other posting for ways to work around this problem. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From pedroni@inf.ethz.ch Sat Jul 14 17:50:54 2001 From: pedroni@inf.ethz.ch (Samuele Pedroni) Date: Sat, 14 Jul 2001 18:50:54 +0200 Subject: [Python-Dev] descr-branch, ExtensionClasses References: <200107131208.OAA21211@core.inf.ethz.ch> <200107131459.f6DExAv16504@odiug.digicool.com> Message-ID: <000b01c10c85$27a82840$8a73fea9@newmexico> Hi. First thanks for the answers. > I'd say that even if descr-branch doesn't make it into 2.2, it will > make it into the next release, so by all means study the design and > tell me if it has any problems for Jython! Yup. I will do that as soon as I have time to do it seriously. I imagine that you need this kind of feedback at least before 2.2 goes beta. Samuele Pedroni. From pedroni@inf.ethz.ch Sat Jul 14 19:45:19 2001 From: pedroni@inf.ethz.ch (Samuele Pedroni) Date: Sat, 14 Jul 2001 20:45:19 +0200 Subject: [Python-Dev] descr-branch, ExtensionClasses Message-ID: <000b01c10c95$22b60860$8a73fea9@newmexico> [GvR] >Yes, that would be good. Are you aware of the schedule in PEP 251? I was, thank you for reminding me of that. I will try to come out with some comments before a2 or at least before the middle of August. Things are a bit complicated with jython because of all the support for java integration playing with class and instance internals. As you might know, we are still working on jython 2.1, we have just an a1 for it out. We have not yet started working on 2.2. Samuele Pedroni. From mal@lemburg.com Sat Jul 14 19:52:21 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Sat, 14 Jul 2001 20:52:21 +0200 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) References: Message-ID: <3B5094E5.F239D0E9@lemburg.com> Roman Suzi wrote: > > On Sat, 14 Jul 2001, M.-A. Lemburg wrote: > > >> #!/usr/bin/python > >> # -*- coding=utf-8 -*- > >> ... > > > >I already mentioned allowing directives in comments to work around > >the problem of directive placement before the first doc-string. > > > >The above would then look like this: > > > >#!/usr/local/bin/python > ># directive unicodeencoding='utf-8' > >u""" UTF-8 doc-string """ > > > >The downside of this is that parsing comments breaks the current > >tokenizing scheme in Python: the tokenizer removes comments before > >passing the tokens to the compiler ...wouldn't be hard to > >fix though ;-) (note that tokenize.py does not) > > BTW, it is possible to write variable names in a national alphabet > if the locale is set. But I do not know if this is a side-effect > which will be corrected or behaviour one can rely on ;-) It is a side-effect of Python relying on the isalpha() C API. I wouldn't count on it though since it is not compliant with the Python reference and other Python implementations may very well not offer this possibility. > It would also be nice to be able to replace keywords with localised ones. > Python remains nice even after translating into Russian. Eek, no please ! VisualBasic went down that road and backed out again... it's simply a complete nightmare. > This + mending broken IDLE (which doesn't allow entering Cyrillic) will > allow beginners to think and write. Currently "writing while thinking" > works only for those who think in English ;-) > > And such a move opens Python to secondary schools. For example, Logo has > national variants without any losses. Why does Python, also targeted at > education, require the use of English? > > And unicoding (utf-8-ing) Python source could be the solution. > > What do you think?
I personally think that programs should always be written in ASCII and all national language string literals be moved out into gettext() (or similar) support files. Of course, for beginners and small projects this is overkill, so the proposed Unicode literal variant might help... which is why I wrote the PEP -- adding this support to Python is really simple and does not require a major rewrite of the tokenizer/compiler components in Python. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From tim.one@home.com Sat Jul 14 20:37:39 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 14 Jul 2001 15:37:39 -0400 Subject: [Python-Dev] RE: PEP: Defining Unicode Literal Encodings (revision 1.1) In-Reply-To: <3B502DBA.761D4F69@lemburg.com> Message-ID: [M.-A. Lemburg] > ... > I'm not picky about the '='; if people don't want it, I'll > happily drop it from the PEP. The only reason I think it may be > worthwhile adding it is because it simply looks right: > > directive unicodeencoding = 'latin-1' > > rather than > > directive unicodeencoding 'latin-1' The hangup is finding someone who cares enough <0.9 wink> to change the text and implementation of the directive PEP. There was no significant debate about the proposed directive syntax in that, and in years past similar crusades that did attract debate floundered on the inability to reach consensus on overall syntax; it's not a good sign that the first proposed use wanted syntax the PEP doesn't support. > ... > Still, if people would like to use Unicode literals for module > doc-strings, then they should place the directive *before* the > doc-string accepting that this could break some tools (the PEP > currently does not restrict the placement of the directive). > Alternatively, we could allow placing the directive into a > comment, e.g. 
> > #!/usr/local/python > #directive unicodeencoding = 'utf-8' > u""" > This is a Unicode doc-string > """ Another alternative: #!/usr/local/python directive unicodeencoding 'utf-8' __doc__ = u""" This is a Unicode doc-string """ That is, the module docstring is just the module's __doc__ attr, and that can be bound explicitly (a trick I've sometimes used for *computed* module docstrings). From paulp@ActiveState.com Sat Jul 14 23:04:47 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sat, 14 Jul 2001 15:04:47 -0700 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) References: Message-ID: <3B50C1FF.7E73739B@ActiveState.com> Tim Peters wrote: > >... > > That is, the module docstring is just the module's __doc__ attr, and that > can be bound explicitly (a trick I've sometimes used for *computed* module > docstrings). I must be missing something fundamental. Why wouldn't we just redefine the algorithm used to find the docstring to allow a directive and implement it in the interpreter? *What tools* in particular are we worried about breaking? -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From tim.one@home.com Sat Jul 14 23:39:25 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 14 Jul 2001 18:39:25 -0400 Subject: [Python-Dev] Silly little benchmark In-Reply-To: <200107132025.f6DKPmo16935@odiug.digicool.com> Message-ID: [Guido] > Here's a patch to abstract.c that does to binary_op1() what I had in > mind. My own attempts at timing this only serve to confuse me, but > I'm sure the experts will be able to assess it. I think it may make > pystone about 1% faster. I would have tried this in PyNumber_Add instead (which pystone never enters! it doesn't do any string cats, and all its adds are integer adds special-cased away by BINARY_ADD).
binary_op1() is entered often by pystone, but only for int * and /, and (just) a few times each for float subtract and the one-shot Array1Glob = [0]*51 Array2Glob = map(lambda x: x[:], [Array1Glob]*51) module initialization lines. So adding an early-out in binary_op1() "should" only harm pystone. Adding an early-out in PyNumber_Add instead should be neutral for pystone (but should slow, e.g., floating-point code a little). > Note that this assumes that a type object only sets the > NEW_STYLE_NUMBER flag when it has a non-NULL tp_as_number structure > pointer. This makes sense, but just to be sure I add an assert(). Good. > In a bizarre twist of benchmarking, if I comment the asserts out, > pystone is 1% *slower* than without the patch.... I guess I'm going > to ignore that. Is the Unix build such that release mode doesn't manage to disable asserts? I wouldn't ignore this, because the source code the C compiler sees in release builds *should* be the same as if the assert lines had been ((void)0); lines instead. I don't see anything in the non-Windows builds that's #define'ing NDEBUG in release builds, which is what they have to do to turn asserts off. Note that I understand that the effect of an assert "should be" to slow things down, but that you're seeing it slow down when they're commented out. That's not what I'm pursuing in this part: I'm wondering why you see *any* difference when commenting out asserts, regardless of direction. You shouldn't, and since I don't see anything that ever turns asserts off except in the Windows build, that makes me twice as suspicious. From tim.one@home.com Sun Jul 15 01:12:56 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 14 Jul 2001 20:12:56 -0400 Subject: asserts sure look broken to me (was RE: [Python-Dev] Silly little benchmark) In-Reply-To: Message-ID: [Tim] > ... > and since I don't see anything that ever turns asserts off except > in the Windows build, that makes me twice as suspicious. 
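[Ed.: the mechanism Tim goes on to describe is standard ANSI C — assert() compiles away only when NDEBUG is defined before assert.h is included. Python's own assert statement has an analogous switch, the -O command line option, which compiles assert statements away. A sketch of that analogue in modern Python, not from the thread:]

```python
# Python's analogue of C's NDEBUG: running with "python -O" compiles
# `assert` statements away, much as -DNDEBUG removes C's assert().
import subprocess
import sys

prog = "assert False, 'boom'; print('asserts compiled away')"

# Without -O the assert fires and the child process exits with an error...
normal = subprocess.run([sys.executable, "-c", prog],
                        capture_output=True, text=True)

# ...with -O the assert is gone and the print call runs.
optimized = subprocess.run([sys.executable, "-O", "-c", prog],
                           capture_output=True, text=True)

print(normal.returncode)        # non-zero: AssertionError was raised
print(optimized.stdout.strip())
```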
I built a release-mode Python under Cygwin after including a guaranteed-to-trigger assert, and sure enough it triggered. If that's generally true of non-MSVC builds, it may go quite a way toward explaining, e.g., why the Linux release-mode Python is significantly slower than the Windows release-mode Python on our otherwise-identical office boxes. Ubiquitous screwup or unique to Cygwin? Disabling asserts in release mode requires that NDEBUG be #define'd before including assert.h (this is all std ANSI C, so should work the same way across platforms). The MSVC project defines NDEBUG "on the command line" during release builds, which is a good way to accomplish this. From guido@digicool.com Sun Jul 15 02:26:43 2001 From: guido@digicool.com (Guido van Rossum) Date: Sat, 14 Jul 2001 21:26:43 -0400 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) In-Reply-To: Your message of "Sat, 14 Jul 2001 15:04:47 PDT." <3B50C1FF.7E73739B@ActiveState.com> References: <3B50C1FF.7E73739B@ActiveState.com> Message-ID: <200107150126.VAA23781@cj20424-a.reston1.va.home.com> Explain again why a directive is better than a specially marked comment, when your main goal seems to be to make it easy for non-parsing tools like editors to find it? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Sun Jul 15 02:35:07 2001 From: guido@digicool.com (Guido van Rossum) Date: Sat, 14 Jul 2001 21:35:07 -0400 Subject: asserts sure look broken to me (was RE: [Python-Dev] Silly little benchmark) In-Reply-To: Your message of "Sat, 14 Jul 2001 20:12:56 EDT." References: Message-ID: <200107150135.VAA23850@cj20424-a.reston1.va.home.com> Yup, I think assert is always on with the Unix build. I think I knew this, because I was uncomfortable for a long time with adding asserts to frequently run code. We should fix this. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From rnd@onego.ru Sun Jul 15 08:38:04 2001 From: rnd@onego.ru (Roman Suzi) Date: Sun, 15 Jul 2001 11:38:04 +0400 (MSD) Subject: [Python-Dev] This is spoiling Python image! Message-ID: The problem: it is impossible to use IDLE with non-latin1 encodings under Windows. IDLE is the standard IDE for Python and it is what beginner users of Python see in their Start->Programs. Unfortunately, IDLE can't work with non-latin1 characters any more. This could lead beginners to reconsider their choice of language because of unfriendly i18n issues. The problem is explained in detail below. Let's consider all errors one at a time. 1. Tcl can't find encodings (they are in \Python21\tcl\tcl8.3\encoding\). Without them it is impossible to enter Cyrillic and other kinds of letters in Text and Entry widgets under Windows. Tkinter tries to help Tcl by means of FixTk.py: import sys, os, _tkinter ver = str(_tkinter.TCL_VERSION) for t in "tcl", "tk": v = os.path.join(sys.prefix, "tcl", t+ver) if os.path.exists(os.path.join(v, "tclIndex")): os.environ[t.upper() + "_LIBRARY"] = v This sets env. variables TCL_LIBRARY and TK_LIBRARY to "C:\Python21\tcl\tcl8.3". The problem is that it imports _tkinter which initialises and calls Tcl_FindExecutable before TCL_LIBRARY is set. It is easy to fix this error in FixTk.py: import sys, os if not os.environ.has_key('TCL_LIBRARY'): tcl_library = os.path.join(sys.prefix, "tcl", "tclX.Y") os.environ['TCL_LIBRARY'] = tcl_library Tcl is smart enough to look into "C:\Python21\tcl\tclX.Y\..\tcl8.3" as well. 2. Now we are able to print in IDLE: >>> print "Привет" and we will see Russian letters... before we press Enter, after which: UnicodeError: ASCII decoding error: ordinal not in range(128) appears. Tcl recoded "Привет" into Unicode. Python tries to recode it back into a usual string, assuming usual strings have sys.getdefaultencoding().
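[Ed.: the literals mangled by the mail encoding above are KOI8-R-encoded Cyrillic. A sketch in modern Python of the failing decode step the authors describe, using the same bytes from the original mail:]

```python
# The literal from the original mail, as raw KOI8-R bytes.
raw = bytes([0xF0, 0xD2, 0xC9, 0xD7, 0xC5, 0xD4])

# Decoding with the right codec succeeds and yields the Cyrillic word...
text = raw.decode("koi8-r")
print(text)

# ...while the 'ascii' default codec raises the error quoted above.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print("ASCII decoding error:", exc.reason)
```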
Now we need to set the default encoding. Let's look into site.py: # Set the string encoding used by the Unicode implementation. The # default is 'ascii', but if you're willing to experiment, you can # change this. encoding = "ascii" # Default value set by _PyUnicode_Init() if 0: # Enable to support locale aware default string encodings. import locale loc = locale.getdefaultlocale() if loc[1]: encoding = loc[1] if 0: # Enable to switch off string to Unicode coercion and implicit # Unicode to string conversion. encoding = "undefined" if encoding != "ascii": sys.setdefaultencoding(encoding) The code for setting the default encoding is commented out (maybe, to allow faster startup?) Then goes: # # Run custom site specific code, if available. # try: import sitecustomize except ImportError: pass # # Remove sys.setdefaultencoding() so that users cannot change the # encoding after initialization. The test for presence is needed when # this module is run as a script, because this code is executed twice. # if hasattr(sys, "setdefaultencoding"): del sys.setdefaultencoding So, sys.setdefaultencoding is deleted after we used it in sitecustomize.py. It's too bad, because the program can't set the default encoding and implicit string<->unicode conversions are very common in Python and IDLE. The solution could be as follows. Let's put sitecustomize.py in C:\Python21\ with the following: import locale, sys encoding = locale.getdefaultlocale()[1] if encoding: sys.setdefaultencoding(encoding) * It would be wonderful if IDLE itself could set up the encoding based on the locale, or issue warnings and point to a solution somehow. 3. Now we can try it again in IDLE: >>> print "Привет" after hitting Enter we are getting... latin1. It's time to look at how _tkinter.c communicates with Tcl.
The cheap&dirty solution for IDLE is as follows: --- Percolator.py.orig Sat Jul 14 19:38:16 2001 +++ Percolator.py Sat Jul 14 19:38:16 2001 @@ -22,6 +22,8 @@ def insert(self, index, chars, tags=None): # Could go away if inheriting from Delegator + if index != 'insert': + chars = unicode(chars) self.top.insert(index, chars, tags) def delete(self, index1, index2=None): --- PyShell.py.orig Sat Jul 14 19:38:37 2001 +++ PyShell.py Sat Jul 14 19:38:37 2001 @@ -469,6 +469,8 @@ finally: self.reading = save line = self.text.get("iomark", "end-1c") + if type(line) == type(u""): + line = line.encode() self.resetoutput() if self.canceled: self.canceled = 0 But alas these patches only mask the problem. What is really needed? Starting from version 8.1 Tcl is totally unicoded. It is very simple: it wants utf-8 strings from us and also returns utf-8 strings. (As an exception, Tcl could assume latin1 if it is unable to decode a string). _tkinter.c just sends Python strings as is to Tcl, and does so correctly for Unicode strings. The receiving side is slightly more complicated: the Tkapp_Call function (aka root.tk.call) handles most of the Tkinter Tcl/Tk commands. If the result is 7bit clean, Tkapp_Call returns a usual string; if not -- it converts from utf-8 into Unicode and returns a Unicode string. Only Tkapp_Call does it. All others (Tkapp_Eval, GetVar, PythonCmd) return utf-8 strings! IDLE extensively uses Tkinter capabilities and all kinds of strings go back and forth between Python and Tcl. Of course, _tkinter.c works incorrectly. i) before sending a string to Tcl, it must recode it FROM the default encoding TO utf-8 ii) upon receiving a string from Tcl, it must recode it from utf-8 to the default encoding, if possible. [R.S.: Or return it as Unicode, if impossible] It is possible to optimize the conversions. Of course, this will have an impact on the speed of Tkinter. But in our opinion correct work is more important than speed. Solution checked under Win98.
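[Ed.: the two conversions proposed above amount to re-encoding through utf-8. A sketch in modern Python, with a hypothetical DEFAULT constant standing in for sys.getdefaultencoding() in the 2.1-era setup; the actual fix was C code inside _tkinter.c:]

```python
# Sketch of the proposed _tkinter conversions (illustration only).
DEFAULT = "koi8-r"  # hypothetical stand-in for the locale's default encoding

def to_tcl(s):
    """(i) recode FROM the default encoding TO utf-8 before sending."""
    return s.decode(DEFAULT).encode("utf-8")

def from_tcl(b):
    """(ii) recode utf-8 back to the default encoding if possible,
    otherwise return the decoded text itself ("return it as Unicode,
    if impossible")."""
    text = b.decode("utf-8")
    try:
        return text.encode(DEFAULT)
    except UnicodeEncodeError:
        return text

raw = bytes([0xF0, 0xD2, 0xC9, 0xD7, 0xC5, 0xD4])  # KOI8-R Cyrillic bytes
assert from_tcl(to_tcl(raw)) == raw  # the round trip is lossless
```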
From R.S.: yes, IDLE is not ideal and there are better IDEs (Emacs, for example) and "serious" programmers rarely use it. Also Tkinter is criticized a lot, etc. But the problem indicated above is very bad for Python's image as a user-friendly language. That is why it is very important to FIX the problem as soon as possible. We can prepare patches for _tkinter.c as well. Before we proceed to submitting bug-reports and patches, we will be glad to hear if somebody has a better solution to the indicated problem. (The big deal of the problem is the need to patch _tkinter.c and recompile it. Everything else even a beginner could fix if supplied with clues and files with fixes. But of course, Python's IDLE must run correctly out of the box). Author: Kirill Simonov Translator: Roman Suzi From mal@lemburg.com Sat Jul 14 20:57:29 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 14 Jul 2001 21:57:29 +0200 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) References: Message-ID: <3B50A429.59B8FB7B@lemburg.com> Tim Peters wrote: > > [M.-A. Lemburg] > > ... > > I'm not picky about the '='; if people don't want it, I'll > > happily drop it from the PEP. The only reason I think it may be > > worthwhile adding it is because it simply looks right: > > > > directive unicodeencoding = 'latin-1' > > > > rather than > > > > directive unicodeencoding 'latin-1' > > The hangup is finding someone who cares enough <0.9 wink> to change the text > and implementation of the directive PEP. There was no significant debate > about the proposed directive syntax in that, and in years past similar > crusades that did attract debate floundered on the inability to reach > consensus on overall syntax; it's not a good sign that the first proposed > use wanted syntax the PEP doesn't support. Well, I guess I would care enough :-) Martin has to change the PEP though, since he's the PEP author (and currently on vacation if I'm not mistaken).
I think that supporting the typical "key = value" format is quite reasonable for setting flags in the compiler. The PEP's original idea of replacing your "from __future__ import spam" does not require this format, since it only needs to support switches. > > ... > > Still, if people would like to use Unicode literals for module > > doc-strings, then they should place the directive *before* the > > doc-string accepting that this could break some tools (the PEP > > currently does not restrict the placement of the directive). > > Alternatively, we could allow placing the directive into a > > comment, e.g. > > > > #!/usr/local/python > > #directive unicodeencoding = 'utf-8' > > u""" > > This is a Unicode doc-string > > """ > > Another alternative: > > #!/usr/local/python > directive unicodeencoding 'utf-8' > > __doc__ = u""" > This is a Unicode doc-string > """ > > That is, the module docstring is just the module's __doc__ attr, and that > can be bound explicitly (a trick I've sometimes used for *computed* module > docstrings). Hmm, that looks a little cumbersome, but it would work (at least for doc string extraction tools which import the module rather than tokenize it). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From paulp@ActiveState.com Sun Jul 15 18:15:28 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 15 Jul 2001 10:15:28 -0700 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) References: <3B50C1FF.7E73739B@ActiveState.com> <200107150126.VAA23781@cj20424-a.reston1.va.home.com> Message-ID: <3B51CFB0.ACE9070D@ActiveState.com> Guido van Rossum wrote: > > Explain again why a directive is better than a specially marked > comment, when your main goal seems to be to make it easy for > non-parsing tools like editors to find it? >... Parsing tools do need it.
The directive changes the file's semantics. Both parsing and non-parsing tools need it. I could live with a comment but I think that that is actually harder to implement so I don't understand the benefit...I'm still trying to understand what tools we are protecting. compiler.py can be easily fixed. The real parser/compiler can be easily fixed. The other tools mostly take their cue from one of these two modules, right? -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From guido@digicool.com Sun Jul 15 18:29:24 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 15 Jul 2001 13:29:24 -0400 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) In-Reply-To: Your message of "Sun, 15 Jul 2001 10:15:28 PDT." <3B51CFB0.ACE9070D@ActiveState.com> References: <3B50C1FF.7E73739B@ActiveState.com> <200107150126.VAA23781@cj20424-a.reston1.va.home.com> <3B51CFB0.ACE9070D@ActiveState.com> Message-ID: <200107151729.NAA00455@cj20424-a.reston1.va.home.com> > > Explain again why a directive is better than a specially marked > > comment, when your main goal seems to be to make it easy for > > non-parsing tools like editors to find it? > >... > > Parsing tools do need it. The directive changes the file's semantics. > Both parsing and non-parsing tools need it. I understand that. > I could live with a comment but I think that that is actually harder to > implement so I don't understand the benefit...I'm still trying to > understand what tools we are protecting. compiler.py can be easily > fixed. The real parser/compiler can be easily fixed. The other tools > mostly take their cue from one of these two modules, right? I disagree with the first sentence -- I believe a comment is easier to implement. The directive statement is still problematic. Martin's hack falls short of doing the right thing in all cases: you can't have the first statement of your program be "directive = ..." or "directive(...)". 
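[The one-token-lookahead fix Martin suggests downthread — treat "directive" as a keyword only when the next token is a NAME — can be mimicked with a toy line classifier. This is purely an illustration of the disambiguation, not the real tokenizer:]

```python
import re

def is_directive(line):
    # Only 'directive' followed by a NAME token counts as a directive
    # statement; 'directive = ...' and 'directive(...)' remain ordinary
    # expressions -- exactly the ambiguous cases Guido points out.
    return re.match(r"directive\s+[A-Za-z_]\w*", line) is not None

assert is_directive("directive unicodeencoding = 'utf-8'")
assert not is_directive("directive = 42")    # assignment, not a directive
assert not is_directive("directive(x)")      # call, not a directive
```

[With one token of lookahead, the two problematic first statements parse as plain expressions again.]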
Another argument for a comment: I expect there could be situations where you want to declare an encoding that doesn't affect the Python parser, but that does affect the editor (e.g. when you use the encoding only in comments and/or 8-bit strings). A comment would back-port to older Python versions; a directive statement wouldn't. I don't know how important this is though. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Sun Jul 15 19:07:50 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 15 Jul 2001 20:07:50 +0200 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) References: <3B50C1FF.7E73739B@ActiveState.com> <200107150126.VAA23781@cj20424-a.reston1.va.home.com> <3B51CFB0.ACE9070D@ActiveState.com> <200107151729.NAA00455@cj20424-a.reston1.va.home.com> Message-ID: <3B51DBF6.6456A750@lemburg.com> Guido van Rossum wrote: > > > > Explain again why a directive is better than a specially marked > > > comment, when your main goal seems to be to make it easy for > > > non-parsing tools like editors to find it? > > >... > > > > Parsing tools do need it. The directive changes the file's semantics. > > Both parsing and non-parsing tools need it. > > I understand that. > > > I could live with a comment but I think that that is actually harder to > > implement so I don't understand the benefit...I'm still trying to > > understand what tools we are protecting. compiler.py can be easily > > fixed. The real parser/compiler can be easily fixed. The other tools > > mostly take their cue from one of these two modules, right? > > I disagree with the first sentence -- I believe a comment is easier to > implement. The directive statement is still problematic. Martin's > hack falls short of doing the right thing in all cases: you can't have > the first statement of your program be "directive = ..." or > "directive(...)". 
> > Another argument for a comment: I expect there could be situations > where you want to declare an encoding that doesn't affect the Python > parser, but that does affect the editor (e.g. when you use the > encoding only in comments and/or 8-bit strings). A comment would > back-port to older Python versions; a directive statement wouldn't. I > don't know how important this is though. Even though putting the information into a comment would indeed be easier to implement, I think that from a design point of view, it is a hack and not a clean design. Note that a programmer can always place the encoding information in the format needed for the editor into an additional comment in front of the doc-string if that's needed (the comment format needed for the editor will be editor-specific !). I think that apart from adding a new keyword to the language, the argument about breaking doc-string tools is not a valid one. Non-Unicode doc-strings will continue to work like they always have:

#!/usr/local/bin/python
# -*- encoding='utf-8' -*-
"""
Binary doc-string using UTF-8
"""
directive unicodeencoding = 'utf-8'
...
print u"Unicode encoded as UTF-8 rather than unicode-escape"
...

Or am I missing something ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Sun Jul 15 19:09:10 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 15 Jul 2001 20:09:10 +0200 Subject: [Python-Dev] Re: Possible solution for PEP250 and bdist_wininst References: <025d01c10bb1$4c0f69c0$e000a8c0@thomasnotebook> <3B4F2DF6.85158478@lemburg.com> <050a01c10bc6$11e401b0$e000a8c0@thomasnotebook> Message-ID: <3B51DC46.6DD434F2@lemburg.com> Thomas Heller wrote: > > From: "M.-A. Lemburg" > [about setting sys.extinstallpath in site.py] > > > > Why not do this for all platforms (which support site-packages) ? 
> Would probably make sense. But in this case, > distutils should also use this setting. Sure. (That was the point of inventing sys.extinstallpath ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From paulp@ActiveState.com Sun Jul 15 20:48:08 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 15 Jul 2001 12:48:08 -0700 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) References: <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid> <3B4F746C.827BD177@lemburg.com> Message-ID: <3B51F378.7DC06482@ActiveState.com> "M.-A. Lemburg" wrote: > >... > Since Python source code is defined to be ASCII, the Unicode literal > encodings (both standard and raw) should be supersets of ASCII and > match the encoding used elsewhere in the program text, e.g. in > comments and maybe even 8-bit strings (even though their encoding > is only implicit and completely under the programmer's control). Python programmers do not read PEPs to learn how to use new features. I think it makes the whole thing much simpler if we define it on the file level explicitly. To me, the feature is most helpful if it helps the interpreter and various code inspection tools to understand all of the non-ASCII information in the file. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From c_ullman@yahoo.com Mon Jul 16 06:08:08 2001 From: c_ullman@yahoo.com (Cayce Ullman) Date: Sun, 15 Jul 2001 22:08:08 -0700 (PDT) Subject: [Python-Dev] Leading with XML-RPC Message-ID: <20010716050808.79446.qmail@web11003.mail.yahoo.com> --0-1917799420-995260088=:79364 Content-Type: text/plain; charset=us-ascii /F wrote: >-0 on soap support in 2.2 (it's still a moving target; a new spec draft >was released this weekend). 
if we want something now, it should be >cayce ullman's SOAP.py, not my soaplib.py. but I don't think we need >SOAP in the standard library for another year or two. Agreed on it being too early for any existing SOAP library to be included in the standard library. SOAP.py for example is not very clean at the moment as it evolved (and does some things wrong) in an attempt to interop with most impls. I do think that if xmlrpclib.py is included in the std lib (which IMHO is a very good idea), any future included SOAP lib should be similar in structure and use (ie more like soaplib.py than SOAP.py). Hopefully, bringing soaplib.py up to speed or cleaning up SOAP.py should be an increasingly easier task as the interop process continues to settle down. I do think Python has an opportunity to become an excellent choice for doing "web services" or .NET type of work out of the box. If these concepts do take off I would hope to see some included SOAP functionality sooner rather than later in the std lib. Also, for the record SOAP.py has moved to http://pywebsvcs.sourceforge.net Cayce --------------------------------- Do You Yahoo!? Get personalized email addresses from Yahoo! Mail - only $35 a year! http://personal.mail.yahoo.com/ From thomas@xs4all.net Mon Jul 16 07:57:29 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 16 Jul 2001 08:57:29 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include parsetok.h,2.15,2.16 pythonrun.h,2.42,2.43 In-Reply-To: Message-ID: <20010716085729.E5396@xs4all.nl> On Sun, Jul 15, 2001 at 10:37:26PM -0700, Tim Peters wrote: > Modified Files: > parsetok.h pythonrun.h > Log Message: > Ugly. A pile of new xxxFlags() functions, to communicate to the parser > that 'yield' is a keyword. This doesn't help test_generators at all! I > don't know why not. These things do work now (and didn't before this > patch): What's the problem with this, anyway ? Why would "from __future__ import generators" or special flags be necessary to enable the existance of generators ? I'd have thought it's just a parser directive (okay, so that's tricky to implement) but to code that doesn't use 'yield' a generator is just another iterator, right ? -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal@lemburg.com Mon Jul 16 08:53:18 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 16 Jul 2001 09:53:18 +0200 Subject: [Python-Dev] Re: CVS: python/dist/src/Parser parsetok.c,2.25,2.26 References: Message-ID: <3B529D6E.A4BAD3FA@lemburg.com> [Tim] > A pile of new xxxFlags() functions, to communicate to the parser > that 'yield' is a keyword. Would those APIs also be usable for a new "directive" keyword ? 
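[Thomas's point upthread — that to consuming code a generator is just another iterator — is easy to demonstrate on any modern Python, where yield is unconditionally a keyword and no future statement is needed. A quick sketch, obviously not 2.2-era code:]

```python
def countdown(n):
    # A generator function: calling it returns an iterator object.
    while n > 0:
        yield n
        n -= 1

it = countdown(3)
assert iter(it) is it                    # it obeys the iterator protocol
assert list(countdown(3)) == [3, 2, 1]   # consumers just iterate over it
```

[Code that never says "yield" sees nothing special here; the future statement only gated the new keyword on the producing side.]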
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From atehwa@iki.fi Mon Jul 16 10:52:30 2001 From: atehwa@iki.fi (Panu A Kalliokoski) Date: Mon, 16 Jul 2001 12:52:30 +0300 (EET DST) Subject: [Python-Dev] A replacement for asyncore / asynchat Message-ID: Hello all, I've developed a Python module (in Python) that provides a somewhat higher abstraction over select.select(). The package is called "Selecting". The package is somewhat similar to asyncore, but has many advantages over it:
- It's made in OO fashion, allowing for greater flexibility in overriding default behaviour;
- Event queues, which allow you to schedule events that should happen sometime in the future (nicely synced with select()) (permanent / one-shot events);
- Cleaner API;
- Channel interfaces. It's possible to make many different channels as long as they have an fd to select() on; with this, you can implement, for example, inter-thread locking with pipes.
- Simpler buffering scheme, which makes it unnecessary to use non-blocking fds, and might even give some speed;
- No exception handling (I found exception packing of asyncore to be a real nuisance)
- Clearer (?) division of responsibility: the API of channel handlers, etc. (asyncore puts part of message handling into the socket wrapper)
For these reasons, I think that the asyncore package in the Python main distribution should be replaced with Selecting, or at least Selecting should be put in the main distribution. The package is available at http://sange.fi/~atehwa-u/selecting/ (for browsing) and http://sange.fi/~atehwa-u/selecting-0.89.tar.gz (for downloading). The package is quite well tested and has been used to build ircd-style daemons, but more testing and comments are always welcome. Panu Kalliokoski From mal@lemburg.com Mon Jul 16 13:29:11 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 16 Jul 2001 14:29:11 +0200 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) References: <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid> <3B4F746C.827BD177@lemburg.com> <3B51F378.7DC06482@ActiveState.com> Message-ID: <3B52DE17.E283521@lemburg.com> Paul Prescod wrote: > > "M.-A. Lemburg" wrote: > > > >... > > Since Python source code is defined to be ASCII, the Unicode literal > > encodings (both standard and raw) should be supersets of ASCII and > > match the encoding used elsewhere in the program text, e.g. in > > comments and maybe even 8-bit strings (even though their encoding > > is only implicit and completely under the programmer's control). > > Python programmers do not read PEPs to learn how to use new features. I > think it makes the whole thing much simpler if we define it on the file > level explicitly. To me, the feature is most helpful if it helps the > interpreter and various code inspection tools to understand all of the > non-ASCII information in the file. I don't think I understand your point... please clarify. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From martin@loewis.home.cs.tu-berlin.de Mon Jul 16 16:15:01 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 16 Jul 2001 17:15:01 +0200 Subject: [Python-Dev] Leading with XML-RPC Message-ID: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> > It might benefit from also including the sgmlop.c extension. +1 on including this one (after fixing the bugs, that is). People want a "good" XML parser in Python, regardless of XML-RPC; they complain that expat requires an external library. sgmlop should then go into xml.parsers.sgmlop; making sgmllib and xmllib use sgmlop is optional. 
Regards, Martin From guido@digicool.com Mon Jul 16 17:17:37 2001 From: guido@digicool.com (Guido van Rossum) Date: Mon, 16 Jul 2001 12:17:37 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: Your message of "Mon, 16 Jul 2001 17:15:01 +0200." <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> Message-ID: <200107161617.f6GGHeE31369@odiug.digicool.com> > > It might benefit from also including the sgmlop.c extension. > > +1 on including this one (after fixing the bugs, that is). People want > a "good" XML parser in Python, regardless of XML-RPC; they complain > that expat requires an external library. > > sgmlop should then go into xml.parsers.sgmlop; making sgmllib and > xmllib use sgmlop is optional. +0 I believe sgmlop can crash on grossly bad input (admittedly I looked at the source once over a year ago). If this were fixed I'd be +1. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@loewis.home.cs.tu-berlin.de Mon Jul 16 17:05:47 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 16 Jul 2001 18:05:47 +0200 Subject: [Python-Dev] guido@digicool.com Message-ID: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> > Martin's hack falls short of doing the right thing in all cases: you > can't have the first statement of your program be "directive = ..." > or "directive(...)". If that is considered as a serious problem, I'll try to solve it with an additional lookahead token: If the next token is a name, then it is a directive. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Mon Jul 16 16:59:08 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 16 Jul 2001 17:59:08 +0200 Subject: [Python-Dev] PEP 244 syntax Message-ID: <200107161559.f6GFx8V03899@mira.informatik.hu-berlin.de> > Well, I guess I would care enough :-) Martin has to change the PEP > though, since he's the PEP author. 
I don't like having an equal sign there, but I can add this as an alternative and leave it for BDFL pronouncement (and count votes in favour or against). In any case, I'd need to know what the exact proposed change to PEP 244 is. The syntax currently reads

directive_statement: 'directive' NAME [atom] [';'] NEWLINE

How do you want this to change? > I think that supporting the typical "key = value" format is > quite reasonable for setting flags in the compiler. The PEP's > original idea of replacing your "from __future__ import spam" > does not require this format, since it only needs to support > switches. Actually, based on Tim's objections, I need the syntax in a different way:

directive transitional generators

Here, "directive transitional" indicates that a transitional feature is being activated, followed by the name of the feature. This is in line with

directive transitional nested_scopes

Spelling them as

directive transitional = nested_scopes
# or
directive transitional = 'nested_scopes'

doesn't sound right, since I'm not assigning to "transitional". Of course, since this directive is spelled "from __future__ import" these days, the only remaining application for directives is the unicodeencoding directive. I'm just pointing out that adding an equal sign likely restricts the applicability of directives. Regards, Martin From mal@lemburg.com Mon Jul 16 17:49:40 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 16 Jul 2001 18:49:40 +0200 Subject: [Python-Dev] Re: PEP 244 syntax References: <200107161559.f6GFx8V03899@mira.informatik.hu-berlin.de> Message-ID: <3B531B24.C8CCC74F@lemburg.com> "Martin v. Loewis" wrote: > > > Well, I guess I would care enough :-) Martin has to change the PEP > > though, since he's the PEP author. > > I don't like having an equal sign there, but I can add this as an > alternative and leave it for BDFL pronouncement (and count votes in > favour or against). 
> > In any case, I'd need to know what the exact proposed change to PEP > 244 is. The syntax currently reads
> >
> > directive_statement: 'directive' NAME [atom] [';'] NEWLINE
> >
> > How do you want this to change?

To make the directive statement useful for setting compiler parameters, the syntax should be extended to allow for an (optional) '='. Whether or not this '=' sign must be there is up to the definition of the directive NAME. It may also be worthwhile using a testlist (see Grammar) instead of the fixed atom for cases where the compiler parameter needs to be, e.g., a list of options. I'd also suggest removing the optional ';' since this does not conform with the rest of Python....

directive_statement: 'directive' NAME ['='] [testlist] NEWLINE

> > I think that supporting the typical "key = value" format is > > quite reasonable for setting flags in the compiler. The PEP's > > original idea of replacing your "from __future__ import spam" > > does not require this format, since it only needs to support > > switches. > > Actually, based on Tim's objections, I need the syntax in a different > way:
> >
> > directive transitional generators
> >
> > Here, "directive transitional" indicates that a transitional feature > is being activated, followed by the name of the feature. This is in > line with
> >
> > directive transitional nested_scopes
> >
> > Spelling them as
> >
> > directive transitional = nested_scopes
> > # or
> > directive transitional = 'nested_scopes'
> >
> > doesn't sound right, since I'm not assigning to "transitional". True. > Of course, since this directive is spelled "from __future__ import" > these days, the only remaining application for directives is the > unicodeencoding directive. I'm just pointing out that adding an equal > sign likely restricts the applicability of directives. It doesn't need to: simply leave the requirement whether or not to use an equal sign to the definition of the directive. 
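[The extended grammar MAL sketches can be mimicked with a toy recognizer. Purely illustrative: the real change would live in Python's Grammar file, and the 'directive' statement never actually landed in the language.]

```python
import re

# Rough analogue of: directive_statement: 'directive' NAME ['='] [atom] NEWLINE
# The '=' is optional; whether it is required is left to each directive.
DIRECTIVE = re.compile(
    r"^directive\s+(?P<name>[A-Za-z_]\w*)"
    r"(?:\s*(?P<eq>=)?\s*(?P<value>'[^']*'|[A-Za-z_]\w*))?\s*$"
)

def parse_directive(line):
    m = DIRECTIVE.match(line)
    if m is None:
        return None
    return m.group("name"), m.group("value"), m.group("eq") is not None

# Both spellings from the thread are accepted:
assert parse_directive("directive unicodeencoding = 'latin-1'") == (
    "unicodeencoding", "'latin-1'", True)
assert parse_directive("directive transitional generators") == (
    "transitional", "generators", False)
```

[One grammar covers the "key = value" style and Martin's bare "directive transitional generators" style.]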
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From tim@digicool.com Mon Jul 16 17:52:42 2001 From: tim@digicool.com (Tim Peters) Date: Mon, 16 Jul 2001 12:52:42 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include parsetok.h,2.15,2.16 pythonrun.h,2.42,2.43 In-Reply-To: <20010716085729.E5396@xs4all.nl> Message-ID: [Tim] > parsetok.h pythonrun.h > Log Message: > Ugly. A pile of new xxxFlags() functions, to communicate to the parser > that 'yield' is a keyword. This doesn't help test_generators at all! > I don't know why not. These things do work now (and didn't before this > patch): [Thomas Wouters] > What's the problem with this, anyway ? Why would "from __future__ import > generators" or special flags be necessary to enable the existance of > generators ? Sorry, I'm lost. Guido introduced a generators future-statement, and now we're trying to get it to work the way PEP 236 says future statements work. A future statement is needed because yield *will* be a new keyword in 2.3, but is not in 2.2 (unless a module includes the generators future-statement). > I'd have thought it's just a parser directive (okay, so that's > tricky to implement) The new xxxFlags() functions allow passing in flags to the parser, and I guess that's what "a parser directive" means to you. > but to code that doesn't use 'yield' a generator > is just another iterator, right ? Right. Now what? I don't think I grasped what you were getting at. From tim@digicool.com Mon Jul 16 18:01:28 2001 From: tim@digicool.com (Tim Peters) Date: Mon, 16 Jul 2001 13:01:28 -0400 Subject: [Python-Dev] RE: CVS: python/dist/src/Parser parsetok.c,2.25,2.26 In-Reply-To: <3B529D6E.A4BAD3FA@lemburg.com> Message-ID: [Tim] > A pile of new xxxFlags() functions, to communicate to the parser > that 'yield' is a keyword. 
[MAL] > Would those APIs also be usable for a new "directive" keyword ? Sure, but there's no general machinery here, just the raw existence of a new int "flags" argument, and a ton of teensy special-casing in two dozen other files to support "from __future__ import generators" only and specifically. No new general "parser API" exists or should be inferred. From guido@digicool.com Mon Jul 16 18:05:37 2001 From: guido@digicool.com (Guido van Rossum) Date: Mon, 16 Jul 2001 13:05:37 -0400 Subject: [Python-Dev] guido@digicool.com In-Reply-To: Your message of "Mon, 16 Jul 2001 18:05:47 +0200." <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> Message-ID: <200107161705.f6GH5bA32228@odiug.digicool.com> (Where did this subject come from???) > > Martin's hack falls short of doing the right thing in all cases: you > > can't have the first statement of your program be "directive = ..." > > or "directive(...)". > > If that is considered as a serious problem, I'll try to solve it with > an additional lookahead token: If the next token is a name, then it is > a directive. Wait. MAL seems to want two other changes: directive should be allowed (required???) before the module docstring, and it should support the syntax from his proto-PEP (directive key = value). But MAL and PaulP don't seem to agree on the semantics of this directive, and I haven't gotten a good answer why we can't do that with a magic comment. In the mean time, I've decided to enable the yield keyword with a future statement. In general I now prefer using future statements for enabling future features over the directive statement. So it's still unclear if we want a directive... --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Jul 16 18:40:25 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 16 Jul 2001 19:40:25 +0200 Subject: [Python-Dev] directive statement (PEP 244) References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> Message-ID: <3B532709.56972A46@lemburg.com> Guido van Rossum wrote: > > > > Martin's hack falls short of doing the right thing in all cases: you > > > can't have the first statement of your program be "directive = ..." > > > or "directive(...)". > > > > If that is considered as a serious problem, I'll try to solve it with > > an additional lookahead token: If the next token is a name, then it is > > a directive. > > Wait. > > MAL seems to want two other changes: directive should be allowed > (required???) "allowed" not "required". > before the module docstring, and it should support the > syntax from his proto-PEP (directive key = value). > > But MAL and PaulP don't seem to agree on the semantics of this > directive, and I haven't gotten a good answer why we can't do that > with a magic comment. We don't ? Paul suggested adding encoding directives for 8-bit strings and comments, but these cannot be used by the Python compiler in any way and would only be for the benefit of an editor, so I don't really see the need for them. A programmer can still add some editor-specific comment to the source file to tell the editor in what encoding to display the file, but this information is really only useful for the editor, not the Python compiler. About the magic comment: Unicode literals are translated into Unicode objects at compile time. The encoding information is vital for the decoding to succeed. If you place this information into a comment of the Python source code and have the compiler depend on it, removing the comment would break your program. I don't think that's good language design (besides, we have enough Unicode magic in Python already...), but then people may feel differently about this. 
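[MAL's compile-time point can be made concrete: the same bytes in a source file decode to different strings depending on the declared encoding, so the compiler cannot build the Unicode object for a u"..." literal without knowing the encoding up front. A plain illustration, not the actual compiler machinery:]

```python
# 'café' encoded as UTF-8, as the literal's bytes would sit on disk.
literal_bytes = b"caf\xc3\xa9"

as_utf8 = literal_bytes.decode("utf-8")      # what the author meant
as_latin1 = literal_bytes.decode("latin-1")  # same bytes, wrong declaration

assert as_utf8 == "caf\u00e9"                # 'café'
assert as_latin1 == "caf\u00c3\u00a9"        # mojibake: 'cafÃ©'
assert as_utf8 != as_latin1
```

[Whatever carries the declaration — comment or directive — removing or ignoring it silently changes which of these two strings the program contains.]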
> In the mean time, I've decided to enable the yield keyword with a > future statement. In general I now prefer using future statements for > enabling future features over the directive statement. > > So it's still unclear if we want a directive... One way or another we need a way to specify compiler parameters and settings on a per-source file basis. Whether you call it directive, pragma or magic comment is really secondary and only a matter of language design. I've only chosen PEP 244 as basis for the PEP because it seemed to fit the need. If you decide to go down some other path, then I'll happily update the PEP to whatever becomes part of Python. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Mon Jul 16 19:24:21 2001 From: guido@digicool.com (Guido van Rossum) Date: Mon, 16 Jul 2001 14:24:21 -0400 Subject: [Python-Dev] directive statement (PEP 244) In-Reply-To: Your message of "Mon, 16 Jul 2001 19:40:25 +0200." <3B532709.56972A46@lemburg.com> References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> Message-ID: <200107161824.f6GIOL532466@odiug.digicool.com> > > MAL seems to want two other changes: directive should be allowed > > (required???) > > "allowed" not "required". but last I looked if there was a docstring before the directive you couldn't guarantee that the directive applied. > > before the module docstring, and it should support the > > syntax from his proto-PEP (directive key = value). > > > > But MAL and PaulP don't seem to agree on the semantics of this > > directive, and I haven't gotten a good answer why we can't do that > > with a magic comment. > > We don't ? It seems to me that each post from you gets a response from Paul with some kind of objection, and vice versa. 
Maybe you're converging, but I don't see where you are converging yet. Also, your arguments sometimes seem contradictory. For example, Paul has said that you may need a comment with an editor-specific encoding indicator, while you were expecting editors to look at the directive and made this a reason why the directive should precede the docstring. > Paul suggested adding encoding directives for 8-bit > strings and comments, but these cannot be used by the Python > compiler in any way and would only be for the benefit of an > editor, so I don't really see the need for them. Another indication you two aren't on the same page just yet. > A programmer > can still add some editor specific comment to the source file > to tell the editor in what encoding to display the file, but this > information is really only useful for the editor, not the > Python compiler. This redundancy worries me though. Are we going to encourage people to use an editor-specific comment for each editor out there that could be used to touch the file? > About the magic comment: Unicode literals are translated into > Unicode objects at compile time. The encoding information is > vital for the decoding to succeed. If you place this information > into a comment of the Python source code and have the compiler > depend on it, removing the comment would break your program. Yes, and so would removing a directive. I don't see the point at all. > I don't think that's good language design (besides, we already > have enough Unicode magic in Python already...), but then > people may feel different about this. Directives come with their own set of magic. > > In the mean time, I've decided to enable the yield keyword with a > > future statement. In general I now prefer using future statements for > > enabling future features over the directive statement. > > > > So it's still unclear if we want a directive... 
> > One way or another we need a way to specify compiler parameters > and settings on a per-source file basis. Whether you call it > directive, pragma or magic comment is really secondary and only > a matter of language design. I still haven't seen this need demonstrated. Most purported uses of these are better done with existing mechanisms. For example, in PEP 253 I propose an assignment to a global __metaclass__ to set the default class for a baseless class statement. > I've only chosen PEP 244 as basis for the PEP because it seemed > to fit the need. If you decide to go down some other path, > then I'll happily update the PEP to whatever becomes part of > Python. But you're implying without clearly specifying all sorts of amendments to PEP 244, which weakens your position. For example, PEP 244 allows a doc string before the directive, but you indicated that the directive can only affect strings that occur after it. I don't think this is true: the creation of actual string objects is done after the whole file has been parsed, so it wouldn't be hard to collect and interpret all directives before creating code objects. --Guido van Rossum (home page: http://www.python.org/~guido/) From paulp@ActiveState.com Mon Jul 16 19:36:58 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Mon, 16 Jul 2001 11:36:58 -0700 Subject: [Python-Dev] directive statement (PEP 244) References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> Message-ID: <3B53344A.25AE6EB4@ActiveState.com> "M.-A. Lemburg" wrote: > >.... > > We don't ? > > Paul suggested adding encoding directives for 8-bit > strings and comments, but these cannot be used by the Python > compiler in any way and would only be for the benefit of an > editor, so I don't really see the need for them. Sorry I wasn't clear. Like \F, I think that the best model is that of XML, Java and (I've learned recently) Perl. 
There should be a single encoding for the file. Logically speaking it should be decoded before tokenization or parsing. Practically speaking it may be simpler to fake this logical decoding in the implementation. I don't care how it is implemented. Logically the model should be that any encoding declaration affects the interpretation of the *file* not some particular construct in the file. If this is too difficult to implement today then maybe we should wait on the whole feature until someone has time to do it right. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From mal@lemburg.com Mon Jul 16 20:02:58 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 16 Jul 2001 21:02:58 +0200 Subject: [Python-Dev] directive statement (PEP 244) References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com> Message-ID: <3B533A62.73ECD605@lemburg.com> Paul Prescod wrote: > > "M.-A. Lemburg" wrote: > > Paul suggested adding encoding directives for 8-bit > > strings and comments, but these cannot be used by the Python > > compiler in any way and would only be for the benefit of an > > editor, so I don't really see the need for them. > > Sorry I wasn't clear. Like \F, I think that the best model is that of > XML, Java and (I've learned recently) Perl. There should be a single > encoding for the file. Logically speaking it should be decoded before > tokenization or parsing. Practically speaking it may be simpler to fake > this logical decoding in the implementation. I don't care how it is > implemented. Logically the model should be that any encoding declaration > affects the interpretation of the *file* not some particular construct > in the file. > > If this is too difficult to implement today then maybe we should wait on > the whole feature until someone has time to do it right. 
Hmm, I guess you have something like this in mind... 1. read the file 2. decode it into Unicode assuming some fixed per-file encoding 3. tokenize the Unicode content 4. compile it, creating Unicode objects from the given Unicode data and creating string objects from the Unicode literal data by first reencoding the Unicode data into 8-bit string data To make this backwards compatible, the implementation would have to assume Latin-1 as the original file encoding if not given (otherwise, binary data currently stored in 8-bit strings wouldn't make the roundtrip). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From thomas@xs4all.net Mon Jul 16 20:07:45 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 16 Jul 2001 21:07:45 +0200 Subject: [Python-Dev] Python 2.1.1 and distutils Message-ID: <20010716210744.H5396@xs4all.nl> I've got a few distutils fixes pending (the unixcompiler thing, and Just/Jack mentioned a few Mac/Metroworks fixes they wanted in) but I'm not sure how to handle this; distutils has a separate version number, and I seem to recall it is/was developed separately. Basically I'm distutils-ignorant, as I hardly have a need to distribute my scripts :) Anyway, should I apply the fixes and up the version number ? Apply the fixes but keep quiet about them ? Hand the fixes over to someone with distutils clue ? Scream and shout ? (Always my favorite, that ;P) (BTW, Jack, Just, I'm waiting for one of you to follow up on the metroworks thing; just mail me the patches, preferably written in blood, with a signed confession that they won't break any code whatsoever :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal@lemburg.com Mon Jul 16 20:14:43 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 16 Jul 2001 21:14:43 +0200 Subject: [Python-Dev] directive statement (PEP 244) References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <200107161824.f6GIOL532466@odiug.digicool.com> Message-ID: <3B533D23.E2099A20@lemburg.com> Guido van Rossum wrote: > > > > MAL seems to want two other changes: directive should be allowed > > > (required???) > > > > "allowed" not "required". > > but last I looked if there was a docstring before the directive you > couldn't guarantee that the directive applied. That was due to a misunderstanding of how the implementation could work... after reading your explanation below, here's a way which would work around this "requirement": If the tokenizer gets to do the directive processing (rather than the compiler), then the placement of the directive becomes irrelevant: it may only appear once per file and the tokenizer will see it before the compiler, so the encoding setting will already have been made before the compiler even starts to compile the first doc-string. > > > before the module docstring, and it should support the > > > syntax from his proto-PEP (directive key = value). > > > > > > But MAL and PaulP don't seem to agree on the semantics of this > > > directive, and I haven't gotten a good answer why we can't do that > > > with a magic comment. > > > > We don't ? > > It seems to me that each post from you gets a response from Paul with > some kind of objection, and vice versa. Maybe you're converging, but > I don't see where you are converging yet. Also, your arguments > sometimes seem contradictory. For example, Paul has said that you may > need a comment with an editor-specific encoding indicator, while you > were expecting editors to look at the directive and made this a reason > why the directive should precede the docstring. No, I was never talking about editors. Paul brought that up. 
I am only concerned about telling the Python interpreter which encoding to assume when converting Unicode literals into Unicode objects -- that's all. > > Paul suggested adding encoding directives for 8-bit > > strings and comments, but these cannot be used by the Python > > compiler in any way and would only be for the benefit of an > > editor, so I don't really see the need for them. > > Another indication you two aren't on the same page just yet. He posted a clarification of what he thinks is the way to go. I think this settles the argument. > > A programmer > > can still add some editor specific comment to the source file > > to tell the editor in what encoding to display the file, but this > > information is really only useful for the editor, not the > > Python compiler. > > This redundancy worries me though. Are we going to encourage people > to use an editor-specific comment for each editor out there that could > be used to touch the file? Let's put it this way: are you expecting that all editors out there will be able to parse the Python way of defining the encoding of Unicode literals ? My point is that I don't see editors as an issue in this discussion. > > About the magic comment: Unicode literals are translated into > > Unicode objects at compile time. The encoding information is > > vital for the decoding to succeed. If you place this information > > into a comment of the Python source code and have the compiler > > depend on it, removing the comment would break your program. > > Yes, and so would removing a directive. I don't see the point at > all. Sure, but a user would normally not expect his program to fail just because he removes a comment... > > I don't think that's good language design (besides, we already > > have enough Unicode magic in Python already...), but then > > people may feel different about this. > > Directives come with their own set of magic. 
> > > > In the mean time, I've decided to enable the yield keyword with a > > > future statement. In general I now prefer using future statements for > > > enabling future features over the directive statement. > > > > > > So it's still unclear if we want a directive... > > > > One way or another we need a way to specify compiler parameters > > and settings on a per-source file basis. Whether you call it > > directive, pragma or magic comment is really secondary and only > > a matter of language design. > > I still haven't seen this need demonstrated. Most purported uses of > these are better done with existing mechanisms. For example, in PEP > 253 I propose an assignment to a global __metaclass__ to set the > default class for a baseless class statement. Hmm, are you suggesting to use something like the following instead: __unicodeencoding__ = 'utf-8' > > I've only chosen PEP 244 as basis for the PEP because it seemed > > to fit the need. If you decide to go down some other path, > > then I'll happily update the PEP to whatever becomes part of > > Python. > > But you're implying without clearly specifying all sorts of amendments > to PEP 244, which weakens your position. > > For example, PEP 244 allows a doc string before the directive, but you > indicated that the directive can only affect strings that occur after > it. I don't think this is true: the creation of actual string objects > is done after the whole file has been parsed, is it wouldn't be hard > to collect and interpret all directives before creating code objects. Please see the correction I gave above and my reply to Martin which has the specification of my proposed amendment. 
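What is at stake in the encoding-declaration question can be seen in a small sketch: the very same source bytes become different Unicode objects depending on which encoding the compiler is told to assume. This uses today's bytes/str spelling for brevity, and the byte values are chosen purely for illustration:

```python
# The bytes of a u"..." literal as they sit in the source file.
literal_bytes = b"\xc3\xa9"

# The declared source encoding decides what Unicode object results:
assert literal_bytes.decode("utf-8") == "\u00e9"          # one character (e-acute)
assert literal_bytes.decode("latin-1") == "\u00c3\u00a9"  # two characters
```

This is why removing whatever carries the encoding information, comment or directive, changes the program's meaning.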
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Mon Jul 16 20:19:27 2001 From: guido@digicool.com (Guido van Rossum) Date: Mon, 16 Jul 2001 15:19:27 -0400 Subject: [Python-Dev] directive statement (PEP 244) In-Reply-To: Your message of "Mon, 16 Jul 2001 11:36:58 PDT." <3B53344A.25AE6EB4@ActiveState.com> References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com> Message-ID: <200107161919.f6GJJRE00537@odiug.digicool.com> > Sorry I wasn't clear. Like \F, I think that the best model is that of > XML, Java and (I've learned recently) Perl. There should be a single > encoding for the file. Logically speaking it should be decoded before > tokenization or parsing. Practically speaking it may be simpler to fake > this logical decoding in the implementation. I don't care how it is > implemented. Logically the model should be that any encoding declaration > affects the interpretation of the *file* not some particular construct > in the file. This is the *only* model that makes sense. > If this is too difficult to implement today then maybe we should wait on > the whole feature until someone has time to do it right. Right-o! --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Jul 16 20:21:00 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 16 Jul 2001 21:21:00 +0200 Subject: [Python-Dev] Python 2.1.1 and distutils References: <20010716210744.H5396@xs4all.nl> Message-ID: <3B533E9C.4307923A@lemburg.com> Thomas Wouters wrote: > > I've got a few distutils fixes pending (the unixcompiler thing, and > Just/Jack mentioned a few Mac/Metroworks fixes they wanted in) but I'm not > sure how to handle this; distutils has a separate version number, and I seem > to recall it is/was developed seperately. Basically I'm distutils-ignorant, > as I hardly have a need to distribute my scripts :) > > Anyway, should I apply the fixes and up the version number ? Apply the fixes > but keep quiet about them ? Hand the fixes over to someone with distutils > clue ? Scream and shout ? (Always my favorite, that ;P) > > (BTW, Jack, Just, I'm waiting for one of you to follow up on the metroworks > thing; just mail me the patches, preferably written in blood, with a signed > confession that they won't break any code what so ever :-) Why not simply include the latest stable distutils version in Python 2.1.1 and add the new patches/features to the next distutils release (which would then go into 2.1.2, etc.) ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From paulp@ActiveState.com Mon Jul 16 20:22:43 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Mon, 16 Jul 2001 12:22:43 -0700 Subject: [Python-Dev] directive statement (PEP 244) References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com> <3B533A62.73ECD605@lemburg.com> Message-ID: <3B533F03.A5FD37D8@ActiveState.com> "M.-A. Lemburg" wrote: > >... > > Hmm, I guess you have something like this in mind... > > 1. read the file > 2. 
decode it into Unicode assuming some fixed per-file encoding > 3. tokenize the Unicode content > 4. compile it, Right. This is how XML, Java, Perl etc. work. XML and Python would be the only languages to actually declare the encoding in use (in ASCII). I think that the declaration way is clearly superior to depending on command line arguments or BOMs. But this is just how it has to *look* to the user. If there is an implementation that behind the scenes only decodes Unicode literals, that would be fine. > ... creating Unicode objects from the given Unicode data > and creating string objects from the Unicode literal data > by first reencoding the Unicode data into 8-bit string data Or we could just disallow non-ASCII 8-bit string literals in files that use the declaration. That was never a feature Guido really intended to support (as I understand it!) and I don't see a need to carry it forward. If you are in the Unicode universe then the need to put binary data in 8-bit string literals is massively reduced. > To make this backwards compatible, the implementation would have to > assume Latin-1 as the original file encoding if not given (otherwise, > binary data currently stored in 8-bit strings wouldn't make the > roundtrip). Another way to think about it is that files without the declaration skip directly to the tokenize step and skip the decoding step. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From thomas@xs4all.net Mon Jul 16 20:31:25 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 16 Jul 2001 21:31:25 +0200 Subject: [Python-Dev] Python 2.1.1 and distutils In-Reply-To: <3B533E9C.4307923A@lemburg.com> Message-ID: <20010716213125.K5391@xs4all.nl> On Mon, Jul 16, 2001 at 09:21:00PM +0200, M.-A. Lemburg wrote: > > Anyway, should I apply the fixes and up the version number ? Apply the fixes > > but keep quiet about them ? Hand the fixes over to someone with distutils > > clue ? Scream and shout ? 
(Always my favorite, that ;P) > Why not simply include the latest stable distutils version in > Python 2.1.1 and add the new patches/features to the next > distutils release (which would then go into 2.1.2, etc.) ? Two reasons: 1) Like I said, I have *no* clue about distutils :) What is the 'latest stable distutils version' ? Where can I find it ? Who has an idea of what, exactly, changed, and whether all changes are appropriate in a bugfix release (I can be lenient in the case of distutils, but bugfix releases are supposed to keep *even broken code* working, up to a point.) 2) I'm not sure if the fixes I talked about are in the 'latest stable distutils version', since one of them was checked in mere hours ago. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido@digicool.com Mon Jul 16 20:46:07 2001 From: guido@digicool.com (Guido van Rossum) Date: Mon, 16 Jul 2001 15:46:07 -0400 Subject: [Python-Dev] directive statement (PEP 244) In-Reply-To: Your message of "Mon, 16 Jul 2001 21:02:58 +0200." <3B533A62.73ECD605@lemburg.com> References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com> <3B533A62.73ECD605@lemburg.com> Message-ID: <200107161946.f6GJk7Q00944@odiug.digicool.com> > Hmm, I guess you have something like this in mind... > > 1. read the file > 2. decode it into Unicode assuming some fixed per-file encoding > 3. tokenize the Unicode content > 4. compile it, creating Unicode objects from the given Unicode data > and creating string objects from the Unicode literal data > by first reencoding the Unicode data into 8-bit string data > > To make this backwards compatible, the implementation would have to > assume Latin-1 as the original file encoding if not given (otherwise, > binary data currently stored in 8-bit strings wouldn't make the > roundtrip). 
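The quoted four-step pipeline can be sketched in a few lines of modern Python; this is only an illustration of the model under discussion, not the eventual CPython implementation, and the built-in `compile()` stands in for steps 3 and 4:

```python
def compile_source(path, encoding="latin-1"):
    """Sketch of the proposed pipeline:
    1. read the file, 2. decode it assuming a fixed per-file encoding,
    3.+4. tokenize and compile the decoded text.

    Latin-1 as the fallback keeps binary data round-trippable: every
    byte 0x00-0xFF maps one-to-one to a Unicode code point.
    """
    with open(path, "rb") as f:
        raw = f.read()                  # 1. read the file
    text = raw.decode(encoding)         # 2. decode into Unicode
    return compile(text, path, "exec")  # 3.+4. tokenize and compile

# The round-trip property that motivates the Latin-1 default:
data = bytes(range(256))
assert data.decode("latin-1").encode("latin-1") == data
```

Files without a declaration would simply take `encoding="latin-1"` (or, per Guido's stricter suggestion, ASCII with an error on any non-ASCII byte).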
To be compatible with the current default encoding, I would use ASCII as the default encoding and issue an error if any non-ASCII characters are found. One should always use hex/oct escapes to enter binary data in literals! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Mon Jul 16 20:56:16 2001 From: guido@digicool.com (Guido van Rossum) Date: Mon, 16 Jul 2001 15:56:16 -0400 Subject: [Python-Dev] directive statement (PEP 244) In-Reply-To: Your message of "Mon, 16 Jul 2001 21:14:43 +0200." <3B533D23.E2099A20@lemburg.com> References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <200107161824.f6GIOL532466@odiug.digicool.com> <3B533D23.E2099A20@lemburg.com> Message-ID: <200107161956.f6GJuG600983@odiug.digicool.com> > > but last I looked if there was a docstring before the directive you > > couldn't guarantee that the directive applied. > > That was due to a misunderstanding of how the implementation could > work... after reading your explanation below, here's a way which > would work around this "requirement": > > If the tokenizer gets to do the directive processing > (rather than the compiler), then the placement of the directive > becomes irrelevant: it may only appear once per file and the tokenizer > will see it before the compiler, so the encoding setting will already > have been made before the compiler even starts to compile the > first doc-string. Sure. (Technically, it's not the tokenizer that interprets the directives, but a pass that runs before the code generator runs. The compiler has sprouted quite a few passes lately... :-) > No, I was never talking about editors. Paul brought that up. > I am only concerned about telling the Python interpreter which > encoding to assume when converting Unicode literals into > Unicode objects -- that's all. 
Well, I believe that for XML everybody (editors and other processors) looks in the same place, right? > He posted a clarification of what he think's is the way to go. > I think this settles the argument. I agree. > Let's put it this way: are you expecting that all editors out > there will be able to parse the Python way of defining the > encoding of Unicode literals ? Not right away, but this is what I would hope would happen eventually. > My point is that I don't see editors as an issue in this discussion. Well, anything we can do to make parsing the encoding indicator easier for editors helps. > > > About the magic comment: Unicode literals are translated into > > > Unicode objects at compile time. The encoding information is > > > vital for the decoding to succeed. If you place this information > > > into a comment of the Python source code and have the compiler > > > depend on it, removing the comment would break your program. > > > > Yes, and so would removing a directive. I don't see the point at > > all. > > Sure, but a user would normally not expect his program to > fail just because he removes a comment... Weak argument. A magic comment is specially marked as such, e.g. #*encoding utf-8 You might as well say that users are prone to remove the #! comment... > Hmm, are you suggesting to use something like the following > instead: > > __unicodeencoding__ = 'utf-8' Not in this particular case, but for other cases where directives have been suggested. In this case (encoding) I'd prefer a magic comment. I still haven't seen a good example of something for which directives are the best solution. Of course, it should be '__fileencoding__'. :-) > Please see the correction I gave above and my reply to Martin which has > the specification of my proposed amendment. I've seen them now. 
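A magic comment of the kind Guido sketches here is cheap for both the tokenizer and editors to recognize. The following is a minimal sketch only: the `#*encoding` spelling follows Guido's example above, but nothing in this thread fixes the exact syntax, and the two-line limit and ASCII default are assumptions for illustration:

```python
import re

# Hypothetical spelling after Guido's "#*encoding utf-8" example.
_MAGIC = re.compile(r"#\*encoding\s+([-\w.]+)")

def sniff_encoding(lines, default="ascii"):
    """Look for the magic comment in the first two lines only, so that
    neither the tokenizer nor an editor has to scan the whole file
    (leaving room for a leading "#!" line)."""
    for line in lines[:2]:
        m = _MAGIC.search(line)
        if m:
            return m.group(1)
    return default
```

For example, `sniff_encoding(["#!/usr/bin/python", "#*encoding utf-8"])` returns `"utf-8"`, while a file with no magic comment falls back to the default.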
--Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Mon Jul 16 20:55:11 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Mon, 16 Jul 2001 15:55:11 -0400 Subject: [Python-Dev] Python 2.1.1 and distutils In-Reply-To: <20010716210744.H5396@xs4all.nl>; from thomas@xs4all.net on Mon, Jul 16, 2001 at 09:07:45PM +0200 References: <20010716210744.H5396@xs4all.nl> Message-ID: <20010716155511.A13393@ute.cnri.reston.va.us> On Mon, Jul 16, 2001 at 09:07:45PM +0200, Thomas Wouters wrote: >Anyway, should I apply the fixes and up the version number ? Apply the fixes >but keep quiet about them ? Hand the fixes over to someone with distutils >clue ? Scream and shout ? (Always my favorite, that ;P) Apply the fixes and don't bother increasing the version number. The standalone Distutils releases happen in sync with Python releases, and are so that users of older Python versions, particularly 1.5.2, can get the current set of Distutils fixes. I don't think there have been enough changes at this point to make it worth issuing a new Distutils release; indeed, I don't know if it's worth the bother of issuing them any longer. --amk From fdrake@acm.org Mon Jul 16 21:09:24 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 16 Jul 2001 16:09:24 -0400 (EDT) Subject: [Python-Dev] directive statement (PEP 244) In-Reply-To: <200107161956.f6GJuG600983@odiug.digicool.com> References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <200107161824.f6GIOL532466@odiug.digicool.com> <3B533D23.E2099A20@lemburg.com> <200107161956.f6GJuG600983@odiug.digicool.com> Message-ID: <15187.18932.675886.239925@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > Well, I believe that for XML everybody (editors and other processors) > looks in the same place, right? 
It also assumes a pretty strict set of expected characters: If you don't have UTF-8, you have: [byte-order-mark] "<?xml version='1.0' encoding='...'?>" Basically, the encoding can be discovered very easily given an assumption about legal content. Once that assumption doesn't hold, the encoding can't be discovered reliably. We could probably make a pretty reasonable statement of how to auto-detect enough so that Python files could have an encoding declaration (however we spell it), but it's hard to beat the assumption of mandated structure. (Some assumption, huh? ;) -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From paulp@ActiveState.com Mon Jul 16 21:11:10 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Mon, 16 Jul 2001 13:11:10 -0700 Subject: [Python-Dev] directive statement (PEP 244) References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <200107161824.f6GIOL532466@odiug.digicool.com> <3B533D23.E2099A20@lemburg.com> Message-ID: <3B534A5E.D38DD416@ActiveState.com> "M.-A. Lemburg" wrote: > >... > > My point is that I don't see editors as an issue in this discussion. There are two points where this touches editors: * if we keep the encoding consistent throughout the file then at least a Unicode-aware text editor like Notepad or Visual Studio will be able to do something intelligent with the files. The user will choose "Shift-JIS" from their menu and go ahead. * if we make the declaration easy for an editor to find, we increase the likelihood of people writing editors that magically guess the right encoding instead of requiring the user to instruct them. The first is more important to me than the second. >... > > Sure, but a user would normally not expect his program to > fail just because he removes a comment... #!/usr/bin/python :) I am usually not a fan of putting semantic information in comments but the practical difficulties in doing so in this case seem small. 
And the benefit would be that we could require the declaration to precede the first non-comment line in the file. That means that we (both the tokenizer and editors) don't have to search the file for the declaration. We just read two lines and then give up. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From guido@digicool.com Mon Jul 16 22:09:43 2001 From: guido@digicool.com (Guido van Rossum) Date: Mon, 16 Jul 2001 17:09:43 -0400 Subject: [Python-Dev] Heads up: Python 2.2a1 to be released from descr-branch Message-ID: <200107162109.f6GL9iL05351@odiug.digicool.com> PEP 251 promises a 2.2a1 release on July 18 (coming Wednesday), and I have every intention to fulfill this promise. (That's why we added the future statement for generators.) The PEP also promises that the release will be done from a branch. Rather than forming a new branch, I intend to do the release, for once, from the descr-branch. This means that the release will contain the experimental code that implements most (but not all) of PEP 252 and 253. This is intended to be backwards compatible. One purpose of the release is to see *how* backwards compatible. If the descr-branch release turns out to be a disaster, I may decide to hold off on the descr-branch work and we'll release 2.2 without all the good stuff from the descr-branch. But I don't expect that this will happen. The worst that I really expect is that we'll have to do a bunch more backwards compatibility work. If 2.2a1 is a success, I'll merge the descr-branch into the trunk. I realize that the descr-branch work is not finished and not sufficiently documented (despite the 10K words in the two PEPs). That's OK, it's an alpha release. In preparation for this event, Tim is semi-continuously merging the trunk into the descr-branch, and I've added the branch tag to all files in the trunk (so the branch is now a complete set of files). 
If you have something that should go into the 2.2a1 release, please check it in on the trunk and add a note to the checkin message "please merge into 2.2a1". Backwards incompatibility ------------------------- 99% of the features on descr-branch are only invoked when you use a class statement with a built-in object as a base class (or when you use an explicit __metaclass__ assignment). Some descr-branch things that might affect old code: - Introspection works differently (see PEP 252). In particular, most objects now have a __class__ attribute, and the __methods__ and __members__ attributes no longer work. This means that dir([]) will return an empty list. Use dir(type([])) instead -- this is consistent with regular classes. See the example in PEP 252. - Several built-ins that can be seen as coercions or constructors are now type objects rather than factory functions; the type objects support the same behaviors as the old factory functions. Affected are: complex, float, long, int, str, tuple, list, unicode, and type. (There are also new ones: dictionary, object, classmethod, staticmethod, but since these are new built-ins I can't see how this would break old code.) - There's one very specific (and fortunately uncommon) bug that used to go undetected, but which is now reported as an error: class A: def foo(self): pass class B(A): pass class C(A): def foo(self): B.foo(self) Here, C.foo wants to call A.foo, but by mistake calls B.foo. In the old system, because B doesn't define foo, B.foo is identical to A.foo, so the call would succeed. In the new system, B.foo is marked as a method requiring a B instance, and a C is not a B, so the call fails. - Binary compatibility with old extensions is not guaranteed. We'll tighten this in future releases. I also very much doubt that extensions based on Jim Fulton's ExtensionClass will work -- although I encourage folks to try this to see how much breaks, so we can hopefully fix this for 2.2a2. 
While the ultimate goal of PEP 253 is to do away with ExtensionClass, I believe that ExtensionClass should still work in 2.2, breaking it in 2.3. I should also note that PEP 254 will probably remain unimplemented for now, since it would create way more incompatibilities. I promise to reopen it for Python 2.3. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Jul 16 22:38:50 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 16 Jul 2001 23:38:50 +0200 Subject: [Python-Dev] directive statement (PEP 244) References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com> <3B533A62.73ECD605@lemburg.com> <200107161946.f6GJk7Q00944@odiug.digicool.com> Message-ID: <3B535EEA.A629E61C@lemburg.com> Guido van Rossum wrote: > > > Hmm, I guess you have something like this in mind... > > > > 1. read the file > > 2. decode it into Unicode assuming some fixed per-file encoding > > 3. tokenize the Unicode content > > 4. compile it, creating Unicode objects from the given Unicode data > > and creating string objects from the Unicode literal data > > by first reencoding the Unicode data into 8-bit string data > > > > To make this backwards compatible, the implementation would have to > > assume Latin-1 as the original file encoding if not given (otherwise, > > binary data currently stored in 8-bit strings wouldn't make the > > roundtrip). > > To be compatible with the current default encoding, I would use ASCII > as the default encoding and issue an error if any non-ASCII characters > are found. One should always use hex/oct escapes to enter binary data > in literals! Hmm, Latin-1 and other locale-specific encodings are currently being used in 8-bit strings by far too many people in Europe and elsewhere... people won't feel good about it. 
Note that the reason for using Latin-1 is that Latin-1 decoded into Unicode and then reencoded into Latin-1 is a 1-1 mapping for all 8-bit values -- this gives us binary backward compatibility. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From martin@loewis.home.cs.tu-berlin.de Mon Jul 16 22:40:59 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 16 Jul 2001 23:40:59 +0200 Subject: [Python-Dev] Re: PEP 244 syntax In-Reply-To: <3B531B24.C8CCC74F@lemburg.com> (mal@lemburg.com) References: <200107161559.f6GFx8V03899@mira.informatik.hu-berlin.de> <3B531B24.C8CCC74F@lemburg.com> Message-ID: <200107162140.f6GLexo09608@mira.informatik.hu-berlin.de> > > directive_statement: 'directive' NAME [atom] [';'] NEWLINE > > > > How do you want this to change? > > To make the directive statment useful for setting compiler > parameters, the syntax should be extended to allow > for an (optional) '='. Whether or not this '=' sign must be there > is up to the definition of the directive NAME. Ok. > It may also be worthwhile using a testlist (see Grammar) > instead of the fixed atom for cases where the compiler > parameter needs to be a e.g. list of options. Ok. > > I'd also suggest to remove the optional ';' since this is > not confrom with the rest of Python.... Sure it is; disallowing the semicolon would be not conform: simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE You can have a semicolon after each small_stmt Regards, Martin From martin@loewis.home.cs.tu-berlin.de Mon Jul 16 22:48:21 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis) Date: Mon, 16 Jul 2001 23:48:21 +0200 Subject: [Python-Dev] directive statement (PEP 244) In-Reply-To: <200107161705.f6GH5bA32228@odiug.digicool.com> (message from Guido van Rossum on Mon, 16 Jul 2001 13:05:37 -0400) References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> Message-ID: <200107162148.f6GLmLl09639@mira.informatik.hu-berlin.de> > (Where did this subject come from???) I meant to put it into CC:, not into Subject: ... > So it's still unclear if we want a directive... It seems to me that to reasonably use non-ASCII *characters* in strings (as opposed to using mere byte sequences), we have to offer a declaration-type statement. A comment should not be used since comments should not change the outcome of a program, whereas this thing may change the program result. The question is whether a general-purpose syntax is needed. I think the answer is yes: I'd also like to say "all strings are Unicode" on a per-module basis, perhaps combined with providing an encoding. But then, this might be a future import: from __future__ import all_strings_are_unicode Regards, Martin From skip@pobox.com (Skip Montanaro) Mon Jul 16 22:56:19 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 16 Jul 2001 16:56:19 -0500 Subject: [Python-Dev] directive statement (PEP 244) In-Reply-To: <3B535EEA.A629E61C@lemburg.com> References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com> <3B533A62.73ECD605@lemburg.com> <200107161946.f6GJk7Q00944@odiug.digicool.com> <3B535EEA.A629E61C@lemburg.com> Message-ID: <15187.25347.383123.602253@beluga.mojam.com> Please excuse the naive interruption to this discussion. 
I'm a bit removed from this debate, being someone who is generally happy with ASCII (and who really doesn't understand all the fur that is flying), however I would imagine that programmers in Moscow writing code to be read by other Russian programmers would want to enter Cyrillic characters directly into their module doc strings and not have to insert hex escapes. Can they safely do that now if they set the encoding variable in site.py appropriately? If so, what is the need for the proposed directive to set encodings? Is it an attempt simply to allow different encodings on a per-module basis? On a related note, can the "Defining Unicode Literal Encodings" PEP be added to the PEP site/page so those of us who don't save every message that flows into their inboxes have it to refer to? Thx, Skip From skip@pobox.com (Skip Montanaro) Mon Jul 16 22:59:50 2001 From: skip@pobox.com (Skip Montanaro) Date: Mon, 16 Jul 2001 16:59:50 -0500 Subject: [Python-Dev] Re: PEP 244 syntax In-Reply-To: <200107162140.f6GLexo09608@mira.informatik.hu-berlin.de> References: <200107161559.f6GFx8V03899@mira.informatik.hu-berlin.de> <3B531B24.C8CCC74F@lemburg.com> <200107162140.f6GLexo09608@mira.informatik.hu-berlin.de> Message-ID: <15187.25558.243182.290606@beluga.mojam.com> >> I'd also suggest to remove the optional ';' since this is not conform >> with the rest of Python.... Martin> Sure it is; disallowing the semicolon would not conform: Martin> simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE Martin> You can have a semicolon after each small_stmt But it doesn't appear that PEP 244 allows multiple directives per line: A directive_statement is a statement of the form directive_statement: 'directive' NAME [atom] [';'] NEWLINE If you decide to allow it, then the semicolon makes sense, but not otherwise. Skip From mal@lemburg.com Sat Jul 14 17:04:04 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Sat, 14 Jul 2001 18:04:04 +0200 Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1) References: Message-ID: <3B506D74.634EA1F@lemburg.com> Roman Suzi wrote: > > On Sat, 14 Jul 2001, M.-A. Lemburg wrote: > > >directive unicodeencoding = 'latin-1' > > >#!/usr/local/python > >""" Module Docs... > >""" > >directive unicodeencoding = 'latin-1' > >... > >u = "Héllô Wörld !" > >... > > Is there any need for new directive like that? > Maybe it is possible to use Emacs-style "coding" directive > in the second line instead: > > #!/usr/bin/python > # -*- coding=utf-8 -*- > ... I already mentioned allowing directives in comments to work around the problem of directive placement before the first doc-string. The above would then look like this: #!/usr/local/bin/python # directive unicodeencoding='utf-8' u""" UTF-8 doc-string """ The downside of this is that parsing comments breaks the current tokenizing scheme in Python: the tokenizer removes comments before passing the tokens to the compiler ...wouldn't be hard to fix though ;-) (note that tokenize.py does not) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From martin@loewis.home.cs.tu-berlin.de Mon Jul 16 23:01:37 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v.
Loewis) Date: Tue, 17 Jul 2001 00:01:37 +0200 Subject: [Python-Dev] directive statement (PEP 244) In-Reply-To: <200107161824.f6GIOL532466@odiug.digicool.com> (message from Guido van Rossum on Mon, 16 Jul 2001 14:24:21 -0400) References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <200107161824.f6GIOL532466@odiug.digicool.com> Message-ID: <200107162201.f6GM1bX09702@mira.informatik.hu-berlin.de> > > A programmer > > can still add some editor specific comment to the source file > > to tell the editor in what encoding to display the file, but this > > information is really only useful for the editor, not the > > Python compiler. > > This redundancy worries me though. Are we going to encourage people > to use an editor-specific comment for each editor out there that could > be used to touch the file? For non-ASCII source code? Certainly, this is the only option (although many editors might choose a "display something, even as garbage" mode without being further instructed). We cannot expect all editors to correctly detect the encoding. So if some provide customization through comments, users will use that. A dedicated Python editor would look at the encoding directive, of course. > Yes, and so would removing a directive. I don't see the point at > all. It contradicts what most users expect from comments, and contradicts what the language reference says: # Comments are ignored by the syntax; they are not tokens. Comments are ignored; putting a meaning into them for program execution is a hack. > Directives come with their own set of magic. There is no magic to the directive statement. Instead, it does what all statements do: It has a certain meaning to the language. Python has few declarations; the directive statement would be one of them. Is that bothering you?
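The language-reference line Martin quotes ("Comments are ignored by the syntax; they are not tokens") is easy to verify: two sources that differ only in comments compile to identical bytecode. A quick check in present-day Python, offered only as an illustration of the point, not as anything from this thread:

```python
# Comments never reach the compiler: sources differing only in
# comments produce code objects with identical bytecode and constants.
src_commented = "x = 1  # not a token\ny = x + 1\n"
src_plain     = "x = 1\ny = x + 1\n"

code_a = compile(src_commented, "<a>", "exec")
code_b = compile(src_plain, "<b>", "exec")
assert code_a.co_code == code_b.co_code
assert code_a.co_consts == code_b.co_consts
```

This is what makes an encoding declaration in a comment a special case: it would be the one comment the implementation is not allowed to ignore.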
Regards, Martin From martin@loewis.home.cs.tu-berlin.de Mon Jul 16 23:12:12 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 17 Jul 2001 00:12:12 +0200 Subject: [Python-Dev] directive statement (PEP 244) In-Reply-To: <3B533F03.A5FD37D8@ActiveState.com> (message from Paul Prescod on Mon, 16 Jul 2001 12:22:43 -0700) References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com> <3B533A62.73ECD605@lemburg.com> <3B533F03.A5FD37D8@ActiveState.com> Message-ID: <200107162212.f6GMCCp09767@mira.informatik.hu-berlin.de> > But this is just how it has to *look* to the user. If there is an > implementation that behind the scenes only decodes Unicode literals, > that would be fine. Formally, you would have to decode everything just to make sure that everything follows the declared encoding (i.e. no invalid byte sequences). I'm also not sure what an "ASCII superset" exactly is. Is it an encoding where all ASCII strings just mean themselves? If so, and if you allow encodings that have a shift state, you need to keep track of the shift state for tokenization: char 39 might not always mean APOSTROPHE, e.g. if you are in a shift state. > Or we could just disallow non-ASCII 8-bit strings literals in files that > use the declaration. +1. > > To make this backwards compatible, the implementation would have to > > assume Latin-1 as the original file encoding if not given (otherwise, > > binary data currently stored in 8-bit strings wouldn't make the > > roundtrip). > > Another way to think about it is that files without the declaration skip > directly to the tokenize step and skip the decoding step. That's the way I would think of it also. You don't have Latin-1 values in such strings - they are just byte strings. 
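Martin's shift-state caveat can be made concrete with ISO-2022-JP, a stateful encoding Python ships a codec for. After the shift-in escape sequence (ESC $ B), plain ASCII byte values stand for halves of two-byte character codes, so a tokenizer scanning raw bytes would misread them. A present-day sketch (the example text is arbitrary; any kana would do):

```python
# ISO-2022-JP is a stateful encoding with shift sequences.  The two
# katakana here ("カナ", written as escapes to keep this file ASCII)
# encode entirely to 7-bit bytes, some of which look like ASCII
# punctuation but are really halves of two-byte character codes.
data = "\u30ab\u30ca".encode("iso2022_jp")

assert all(b < 0x80 for b in data)   # every byte is in the ASCII range
assert b"+" in data                  # byte 0x2B appears, but it is not '+'
assert data.decode("iso2022_jp") == "\u30ab\u30ca"
```

This is exactly the sense in which "char 39 might not always mean APOSTROPHE": between the shift-in and shift-out escapes, the meaning of any ASCII-range byte depends on codec state, so either the whole file must be decoded first or source encodings must be restricted to stateless ASCII supersets.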
Regards, Martin From PyChecker Tue Jul 17 03:23:26 2001 From: PyChecker (Neal Norwitz) Date: Mon, 16 Jul 2001 22:23:26 -0400 Subject: [Python-Dev] ANN: PyChecker version 0.7 Message-ID: <3B53A19E.899A5084@metaslash.com> A new version of PyChecker is available for your hacking pleasure. PyChecker is a tool for finding common bugs in python source code. It finds problems that are typically caught by a compiler for less dynamic languages, like C and C++. Comments, criticisms, new ideas, and other feedback is welcome. Change Log: * Improve import warning messages, add from ... import ... checks * checker.py -h prints defaults after processing .pycheckrc file * Add config option -k/--pkgimport to disable unused imports from __init__.py * Add warning for variable used before being set * Improve format string checks/warnings * Check arguments to constructors * Check that self is first arg to base constructor * Add -e/--errors option to only warn about likely errors * Make 'self' configurable as the first argument to methods * Add check that there is a c'tor when instantiating an object and passing arguments * Add config option (-N/--initreturn) to turn off warnings when returning None from __init__() * Fix internal error with python 2.1 which defines a new op: LOAD_DEREF * Check in lambda functions for module/variable use * Fix inability to evaluate { 1: 'a' } inline, led to incorrect __init__() not called warnings * Fix exception when class overrides __special__() methods & raise exception * Fix check in format strings when using '%*g %*.*g', etc * Add check for static class attributes * Fix checking of module attributes * Fix wrong filename in 'Base class (xxx) __init__() not called' when doing a from X import * * Fix 'No attribute found' for very dynamic classes (may also work for classes that use __getattr__) PyChecker is available on Source Forge: Web page: http://pychecker.sourceforge.net/ Project page: http://sourceforge.net/projects/pychecker/ Neal -- 
pychecker@metaslash.com From fredrik@pythonware.com Tue Jul 17 08:49:16 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 17 Jul 2001 09:49:16 +0200 Subject: [Python-Dev] guido@digicool.com References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> Message-ID: <007301c10e95$00a7d6c0$4ffa42d5@hagrid> guido wrote: > But MAL and PaulP don't seem to agree on the semantics of this > directive and I don't agree with either of them. the encoding should apply to the entire source file, and there are lots of tricky issues related to source code embedding (e.g. python code in XML) and encoding-aware transports (e.g. python code over HTTP) that need to be covered. > and I haven't gotten a good answer why we can't do that > with a magic comment. probably because we can ;-) I'm on vacation; assuming neither encodings nor directives will go into 2.2a1, I'll prepare a counter-PEP when I have some time to spare (not today, most likely). From fredrik@pythonware.com Tue Jul 17 09:07:58 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 17 Jul 2001 10:07:58 +0200 Subject: [Python-Dev] Leading with XML-RPC References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> Message-ID: <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> martin wrote: > > It might benefit from also including the sgmlop.c extension. > > +1 on including this one (after fixing the bugs, that is). People want > a "good" XML parser in Python, regardless of XML-RPC; they complain > that expat requires an external library. > > sgmlop should then go into xml.parsers.sgmlop; making sgmllib and > xmllib use sgmlop is optional. any reason we cannot ship a snapshot of the expat sources with Python? (just the necessary files, that is: three C files, and some header files) From mal@lemburg.com Tue Jul 17 11:08:58 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Tue, 17 Jul 2001 12:08:58 +0200 Subject: [Python-Dev] PEP: Defining Python Source Code Encodings Message-ID: <3B540EBA.EE5372BD@lemburg.com> After having been through two rounds of comments with the "Unicode Literal Encoding" pre-PEP, it has turned out that people actually prefer to go for the full Monty, meaning that the PEP should handle the complete Python source code encoding and not just the encoding of the Unicode literals (which are currently the only parts in a Python source code file for which Python assumes a fixed encoding). Here's a summary of what I've learned from the comments: 1. The complete Python source file should use a single encoding. 2. Handling of escape sequences should continue to work as it does now, but with all possible source code encodings, that is, standard string literals (both 8-bit and Unicode) are subject to escape sequence expansion while raw string literals only expand a very small subset of escape sequences. 3. Python's tokenizer/compiler combo will need to be updated to work as follows: 1. read the file 2. decode it into Unicode assuming a fixed per-file encoding 3. tokenize the Unicode content 4. compile it, creating Unicode objects from the given Unicode data and creating string objects from the Unicode literal data by first reencoding the Unicode data into 8-bit string data using the given file encoding To make this backwards compatible, the implementation would have to assume Latin-1 as the original file encoding if not given (otherwise, binary data currently stored in 8-bit strings wouldn't make the roundtrip). 4. The encoding used in a Python source file should be easily parseable for an editor; a magic comment at the top of the file seems to be what people want to see, so I'll drop the directive (PEP 244) requirement in the PEP. Issues that still need to be resolved: - how to enable embedding of differently encoded data in Python source code (e.g.
UTF-8 encoded XML data in a Latin-1 source file) - what to do with non-literal data in the source file, e.g. variable names and comments: * reencode them just as would be done for literals * only allow ASCII for certain elements like variable names etc. - which format to use for the magic comment, e.g. * Emacs style: #!/usr/bin/python # -*- encoding = 'utf-8' -*- * Via meta-option to the interpreter: #!/usr/bin/python --encoding=utf-8 * Using a special comment format: #!/usr/bin/python #!encoding = 'utf-8' Comments are welcome ! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Tue Jul 17 11:24:16 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 17 Jul 2001 12:24:16 +0200 Subject: [Python-Dev] directive statement (PEP 244) References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com> <3B533A62.73ECD605@lemburg.com> <200107161946.f6GJk7Q00944@odiug.digicool.com> <3B535EEA.A629E61C@lemburg.com> <15187.25347.383123.602253@beluga.mojam.com> Message-ID: <3B541250.6FD73F11@lemburg.com> Skip Montanaro wrote: > > Please excuse the naive interruption to this discussion. > > I'm a bit removed from this debate, being someone who is generally happy > with ASCII (and who really doesn't understand all the fur that is flying), > however I would imagine that programmers in Moscow writing code to be read > by other Russian programmers would want to enter Cyrillic characters > directly into their module doc strings and not have to insert hex escapes. > Can they safely do that now if they set the encoding variable in site.py > appropriately? If so, what is the need for the proposed directive to set > encodings? 
Is it an attempt simply to allow different encodings on a > per-module basis? You can only set the default encoding in site.py and this only affects magic conversions from strings to Unicode and back. Unicode literals must currently always use the unicode-escape encoding. The PEP tries to undo with the latter restriction by allowing flexible Python source code encodings on a per-file basis, e.g. Japanese programmer would be able to write source files which use Japanese characters in the Unicode literals which should enhance code readability and user acceptance. > On a related note, can the "Defining Unicode Literal Encodings" PEP be added > to the PEP site/page so those of us who don't save every message that flows > into their inboxes have it to refer to? I will update the pre-PEP according to the findings I've posted under the subject "PEP: Defining Python Source Code Encodings" and then ask Barry to assign a PEP number for the upload. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Tue Jul 17 13:11:21 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 17 Jul 2001 14:11:21 +0200 Subject: [Python-Dev] Re: PEP: Defining Python Source Code Encodings References: Message-ID: <3B542B69.8C092964@lemburg.com> Roman Suzi wrote: > > On Tue, 17 Jul 2001, M.-A. Lemburg wrote: > > > After having been through two rounds of comments with the "Unicode > > Literal Encoding" pre-PEP, it has turned out that people actually > > prefer to go for the full Monty meaning that the PEP should handle > > the complete Python source code encoding and not just the encoding > > of the Unicode literals (which are currently the only parts in a > > Python source code file for which Python assumes a fixed encoding). > > > > Here's a summary of what I've learned from the comments: > > > > 1. 
The complete Python source file should use a single encoding. > > Yes, certainly > > > 2. Handling of escape sequences should continue to work as it does > > now, but with all possible source code encodings, that is > > standard string literals (both 8-bit and Unicode) are subject to > > escape sequence expansion while raw string literals only expand > > a very small subset of escape sequences. > > > > 3. Python's tokenizer/compiler combo will need to be updated to > > work as follows: > > > > 1. read the file > > 2. decode it into Unicode assuming a fixed per-file encoding > > 3. tokenize the Unicode content > > 4. compile it, creating Unicode objects from the given Unicode data > > and creating string objects from the Unicode literal data > > by first reencoding the Unicode data into 8-bit string data > > using the given file encoding > > I think, that if encoding is not given, it must silently assume "UNKNOWN" > encoding and do nothing, that is be 8-bit clean (as it is now). To be 8-bit clean it will have to use Latin-1 as fallback encoding since this encoding assures the roundtrip safety (decode to Unicode, then reencode). > Otherwise, it will slow down the parser considerably. Yes, that could be an issue (I don't think it matters much though, since parsing is usually only done during byte-code compilation and the results are buffered in .pyc files). > I also think that if encoding is chosen, there is no need to reencode it > back to literal strings: let them be in Unicode. That would be nice, but is not feasible at the moment (just try to run Python with -U option and see what happens...). > Or the encoding must _always_ be ASCII+something, as utf-8 for example. > Eliminating the need to bother with the tokenizer (Because only docstrings, > comments and string-literals are entities which require encoding / > decoding). > > If I understood correctly, Python will soon switch to "unicode-only" > strings, as Java and Tcl did.
(This is of course disaster for some Python > usage areas such as fast text-processing, but...) > > Or am I missing something? It won't switch any time soon... there's still too much work ahead and I'm also pretty sure that the 8-bit string type won't go away for backward compatibility reasons. > > To make this backwards compatible, the implementation would have to > > assume Latin-1 as the original file encoding if not given (otherwise, > > binary data currently stored in 8-bit strings wouldn't make the > > roundtrip). > > ...as I said, there must be no assumed charset. Things must > be left as is now when no explicit encoding given. This is what the Latin-1 encoding assures. > > 4. The encoding used in a Python source file should be easily > > parseable for en editor; a magic comment at the top of the > > file seems to be what people want to see, so I'll drop the > > directive (PEP 244) requirement in the PEP. > > > > Issues that still need to be resolved: > > > > - how to enable embedding of differently encoded data in Python > > source code (e.g. UTF-8 encoded XML data in a Latin-1 > > source file) > > Probably, adding explicit conversions. Yes, but there are cases where the source file having the embedded data will not decode into Unicode (I got the example backwards: think of a UTF-8 encoded source file with a Latin-1 string literal). Perhaps we should simply rule out this case and have the programmer stick to the source file encoding + some escaping or a run-time recoding of the literal data into the preferred encoding. > > - what to do with non-literal data in the source file, e.g. > > variable names and comments: > > > > * reencode them just as would be done for literals > > * only allow ASCII for certain elements like variable names > > etc. > > I think non-literal data must be in ASCII. > But it could be too cheesy to have variable names in national > alphabet ;-) That's for Guido to decide... > > - which format to use for the magic comment, e.g. 
> > > > * Emacs style: > > > > #!/usr/bin/python > > # -*- encoding = 'utf-8' -*- > > > > * Via meta-option to the interpreter: > > > > #!/usr/bin/python --encoding=utf-8 > > > > * Using a special comment format: > > > > #!/usr/bin/python > > #!encoding = 'utf-8' > > No variant is ideal. The 2nd is worse/best than all > (it depends on how to look at it!) > > Python has no macro directives. In this situation > they could help greatly! We've been discussing these on python-dev, but Guido is not too keen on having them. > That "#!encoding" is special case of macro directive. > > May be just put something like ''# at the beginning... > > Or, even greater idea occured to me: allow some XML > with meta-information (not only encoding) somehow escaped. > > I think, GvR could come with some advice here... > > > Comments are welcome ! Thanks for your comments, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Tue Jul 17 15:21:54 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 17 Jul 2001 10:21:54 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: Your message of "Tue, 17 Jul 2001 10:07:58 +0200." <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> Message-ID: <200107171421.KAA20380@cj20424-a.reston1.va.home.com> > any reason we cannot ship a snapshot of the expat sources > with Python? (just the necessary files, that is: three C files, > and some header files) Well, there's maintenance (someone has to sync these files from the expat source tree into the Python tree regularly) and possibly licensing (I don't know about the expat license, but who knows if it's GPL compatible). 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Tue Jul 17 15:28:03 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 17 Jul 2001 10:28:03 -0400 Subject: [Python-Dev] Re: PEP: Defining Python Source Code Encodings In-Reply-To: Your message of "Tue, 17 Jul 2001 14:11:21 +0200." <3B542B69.8C092964@lemburg.com> References: <3B542B69.8C092964@lemburg.com> Message-ID: <200107171428.KAA20424@cj20424-a.reston1.va.home.com> > I think, GvR could come with some advice here... I have to bow out of this discussion for now. There are too many things requesting my attention, and I have to shed load. Ditto for the "directive" proposal. Sorry, --Guido van Rossum (home page: http://www.python.org/~guido/) From gregor@hoffleit.de Tue Jul 17 15:32:09 2001 From: gregor@hoffleit.de (Gregor Hoffleit) Date: Tue, 17 Jul 2001 16:32:09 +0200 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <200107171421.KAA20380@cj20424-a.reston1.va.home.com> References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> <200107171421.KAA20380@cj20424-a.reston1.va.home.com> Message-ID: <20010717163209.A30372@mediasupervision.de> On Tue, Jul 17, 2001 at 10:21:54AM -0400, Guido van Rossum wrote: > > any reason we cannot ship a snapshot of the expat sources > > with Python? (just the necessary files, that is: three C files, > > and some header files) > > Well, there's maintenance (someone has to sync these files from the > expat source tree into the Python tree regularly) and possibly > licensing (I don't know about the expat license, but who knows if it's > GPL compatible). Grumble, browse, murmur... The license should be fine for all practical uses: It's a MIT/X style license, i.e. very similar to the old Python license, and therefore compatible with the GPL without any doubt ;-) ! 
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/expat/expat/COPYING Gregor From fdrake@acm.org Tue Jul 17 15:34:06 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 17 Jul 2001 10:34:06 -0400 (EDT) Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <200107171421.KAA20380@cj20424-a.reston1.va.home.com> References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> <200107171421.KAA20380@cj20424-a.reston1.va.home.com> Message-ID: <15188.19678.344303.182227@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > Well, there's maintenance (someone has to sync these files from the > expat source tree into the Python tree regularly) and possibly We even know who that someone would be, and how annoyed he'd be at having one more place to check things in. ;-) Expat (not pyexpat) *really* needs to have some attention. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From mwh@python.net Tue Jul 17 16:14:11 2001 From: mwh@python.net (Michael Hudson) Date: 17 Jul 2001 11:14:11 -0400 Subject: [Python-Dev] PEP: Defining Python Source Code Encodings References: <3B540EBA.EE5372BD@lemburg.com> Message-ID: <2m1ynfr3jw.fsf@starship.python.net> "M.-A. Lemburg" writes: > - which format to use for the magic comment, e.g. > > * Emacs style: > > #!/usr/bin/python > # -*- encoding = 'utf-8' -*- Emacs already has a name for this; you'd write that # -*- coding: utf-8; -*- Seems reasonable to me. Cheers, M. -- We've had a lot of problems going from glibc 2.0 to glibc 2.1. People claim binary compatibility. Except for functions they don't like. -- Peter Van Eynde, comp.lang.lisp From mal@lemburg.com Tue Jul 17 17:06:12 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Tue, 17 Jul 2001 18:06:12 +0200 Subject: [Python-Dev] A replacement for asyncore / asynchat References: Message-ID: <3B546274.86B0FD7B@lemburg.com> Panu A Kalliokoski wrote: > > Hello all, I've developed a Python module (in Python) to make somewhat > higher abstraction over select.select(). The package is called > "Selecting". The package is somewhat similar to asyncore, but has many > advantages over it: > > [...] > > For these reasons, I think that the asyncore package in the Python main > distribution should be replaced with Selecting or at least Selecting > should be put in the main distribution. Is your package backwards compatible to asyncore ? If not, then it might be a better idea, to place it on the web (e.g. on SourceForge) and register the URLs with Parnassus so that Python users can easily find it. > The package is available at > http://sange.fi/~atehwa-u/selecting/ (for browsing) and > http://sange.fi/~atehwa-u/selecting-0.89.tar.gz (for downloading). > > The package is quite well tested and has been used to build ircd-style > daemons, but more testing and comments are always welcome. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From skip@pobox.com (Skip Montanaro) Tue Jul 17 17:32:26 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 17 Jul 2001 11:32:26 -0500 Subject: [Python-Dev] A replacement for asyncore / asynchat In-Reply-To: <3B546274.86B0FD7B@lemburg.com> References: <3B546274.86B0FD7B@lemburg.com> Message-ID: <15188.26778.547568.706846@beluga.mojam.com> >> The package is available at >> http://sange.fi/~atehwa-u/selecting/ (for browsing) and >> http://sange.fi/~atehwa-u/selecting-0.89.tar.gz (for downloading). 
>> >> The package is quite well tested and has been used to build ircd-style >> daemons, but more testing and comments are always welcome. mal> Is your package backwards compatible to asyncore ? If not, then it mal> might be a better idea, to place it on the web (e.g. on mal> SourceForge) and register the URLs with Parnassus so that Python mal> users can easily find it. You might also bring it to Sam Rushing's attention. I think he's been working on "async, the next generation". He will probably have some good ideas and could provide some useful critique of the selecting code. I don't recall if he is on python-dev or not. Skip From atehwa@iki.fi Tue Jul 17 18:02:21 2001 From: atehwa@iki.fi (Panu A Kalliokoski) Date: Tue, 17 Jul 2001 20:02:21 +0300 (EET DST) Subject: [Python-Dev] A replacement for asyncore / asynchat In-Reply-To: <3B546274.86B0FD7B@lemburg.com> Message-ID: On Tue, 17 Jul 2001, M.-A. Lemburg wrote: | > For these reasons, I think that the asyncore package in the Python main | > distribution should be replaced with Selecting or at least Selecting | > should be put in the main distribution. | | Is your package backwards compatible to asyncore ? If not, then | it might be a better idea, to place it on the web (e.g. on SourceForge) | and register the URLs with Parnassus so that Python users can | easily find it. I've registered the module in Parnassus. My point is mostly that because Selecting really is (at least in my opinion) much easier to work with than asyncore, it should be placed where people will look for standard solutions, and that is the standard library. Selecting is not backwards compatible with asyncore. It is not impossible to write such glue code that one could use Selecting directly in projects that have been written for asyncore, but I doubt whether it is worth it. Selecting does not offer great advantages in performance (at least not currently, but this might change), but in customisability and clean API. 
asyncore does its work well in projects that use it, and Selecting is mostly better because it is easier to make a new project that uses it. The sensible solution, as I see it, would be to deprecate asyncore/asynchat but leave them there for supporting old projects, and add Selecting as the primary way of abstracting over select() (and poll(), kqueue and rtsig, which I plan to transparently add to Selecting). Panu From fdrake@acm.org Tue Jul 17 19:21:29 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 17 Jul 2001 14:21:29 -0400 (EDT) Subject: [Python-Dev] Docs for 2.2a1 frozen Message-ID: <15188.33321.534672.664230@cj42289-a.reston1.va.home.com> Please do not make any documentation checkins on the trunk or descr-branch; we're getting things ready for the 2.2a1 release. Thanks! -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From bsass@freenet.edmonton.ab.ca Tue Jul 17 19:31:24 2001 From: bsass@freenet.edmonton.ab.ca (Bruce Sass) Date: Tue, 17 Jul 2001 12:31:24 -0600 (MDT) Subject: [Python-Dev] Re: PEP: Defining Python Source Code Encodings In-Reply-To: <3B540EBA.EE5372BD@lemburg.com> Message-ID: On Tue, 17 Jul 2001, M.-A. Lemburg wrote: <...> > - which format to use for the magic comment, e.g. > > * Emacs style: > > #!/usr/bin/python > # -*- encoding = 'utf-8' -*- This should work for everyone, but will it confuse emacs? I suppose, "# # ...", or "### ...", or almost any short sequence starting with "#" will work, eh. > * Via meta-option to the interpreter: > > #!/usr/bin/python --encoding=utf-8 This will require editing if python is not in /usr/bin, and can not be used to pass more than one argument to the command (python, in this case). > * Using a special comment format: > > #!/usr/bin/python > #!encoding = 'utf-8' This is confusing, and will only work on *nix (linux?) iff it is the second (or later) line; if it is the first line...
it will fail because there is probably no executable named "encoding" available, and if there is, "= 'utf8'" is unlikely to exist. Please avoid character sequences that have other meanings in this context. I think this should be done as a generic method for pre-processing Python source before the compiler/interpreter has a look at it. e.g., # ## encoding utf-8 triggers whatever you encoding fans want, # ## format noweb runs the source through a filter which can extract code from noweb marked-up code, and maybe even installs the weaved docs and tangled code (via distutils?) # ## MySpecialMarkup runs the source through a filter named MySpecialMarkup. MySpecialMarkup could be anything: extensions to docstrings, a proprietary binary format, an entire package-in-a-file! Generally: # <keyword> [<argument>] If Python does not know what the <keyword> is it should either look in a set location for a program of the same name then use its output as the source, or look into a table that maps <keyword> to a procedure which results in Python source. - Bruce From martin@loewis.home.cs.tu-berlin.de Tue Jul 17 23:50:27 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 18 Jul 2001 00:50:27 +0200 Subject: [Python-Dev] Re: PEP 244 syntax In-Reply-To: <15187.25558.243182.290606@beluga.mojam.com> (message from Skip Montanaro on Mon, 16 Jul 2001 16:59:50 -0500) References: <200107161559.f6GFx8V03899@mira.informatik.hu-berlin.de> <3B531B24.C8CCC74F@lemburg.com> <200107162140.f6GLexo09608@mira.informatik.hu-berlin.de> <15187.25558.243182.290606@beluga.mojam.com> Message-ID: <200107172250.f6HMoR501665@mira.informatik.hu-berlin.de> > But it doesn't appear that PEP 244 allows multiple directives per line: > > A directive_statement is a statement of the form > > directive_statement: 'directive' NAME [atom] [';'] NEWLINE > > If you decide to allow it, then the semicolon makes sense, but not > otherwise. 
Ok, then I guess I should put the directive statement into the small_stmt category, and allow multiple of those in a single line. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Tue Jul 17 23:45:42 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 18 Jul 2001 00:45:42 +0200 Subject: [Python-Dev] PEP: Defining Python Source Code Encodings Message-ID: <200107172245.f6HMjgp01656@mira.informatik.hu-berlin.de> > To be 8-bit clean it will have to use Latin-1 as fallback encoding > since this encoding assures the roundtrip safety (decode to Unicode, > then reencode). No, that is not true. Any other 8-bit encoding that has all code points assigned (e.g. Latin-2, or KOI8-R) would also give you full round-trip encoding. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Wed Jul 18 00:05:07 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 18 Jul 2001 01:05:07 +0200 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> (fredrik@pythonware.com) References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> Message-ID: <200107172305.f6HN57T01730@mira.informatik.hu-berlin.de> > > +1 on including this one (after fixing the bugs, that is). People want > > a "good" XML parser in Python, regardless of XML-RPC; they complain > > that expat requires an external library. > > > > sgmlop should then go into xml.parsers.sgmlop; making sgmllib and > > xmllib use sgmlop is optional. > > any reason we cannot ship a snapshot of the expat sources > with Python? (just the necessary files, that is: three C files, > and some header files) Would be fine with me, and I would contribute the necessary changes to the CVS - I just would need permission to do so (and an advise whether to stuff everything into Modules, or to create an expat subdirectory). Regards, Martin From fdrake@acm.org Wed Jul 18 00:31:07 2001 From: fdrake@acm.org (Fred L. 
Drake) Date: Tue, 17 Jul 2001 19:31:07 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010717233107.C0EAE2892B@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Final update of the 2.2a1 documentation. From guido@digicool.com Wed Jul 18 00:53:36 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 17 Jul 2001 19:53:36 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: Your message of "Wed, 18 Jul 2001 01:05:07 +0200." <200107172305.f6HN57T01730@mira.informatik.hu-berlin.de> References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> <200107172305.f6HN57T01730@mira.informatik.hu-berlin.de> Message-ID: <200107172353.TAA21022@cj20424-a.reston1.va.home.com> > > any reason we cannot ship a snapshot of the expat sources > > with Python? (just the necessary files, that is: three C files, > > and some header files) > > Would be fine with me, and I would contribute the necessary changes to > the CVS - I just would need permission to do so (and an advise whether > to stuff everything into Modules, or to create an expat subdirectory). If Fred (Drake) approves, that's fine with me. Negotiate the details with him. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Wed Jul 18 08:36:50 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 18 Jul 2001 09:36:50 +0200 Subject: [Python-Dev] Re: PEP: Defining Python Source Code Encodings References: <200107172245.f6HMjgp01656@mira.informatik.hu-berlin.de> Message-ID: <3B553C92.D030234F@lemburg.com> "Martin v. Loewis" wrote: > > > To be 8-bit clean it will have to use Latin-1 as fallback encoding > > since this encoding assures the roundtrip safety (decode to Unicode, > > then reencode). > > No, that is not true. Any other 8-bit encoding that has all code > points assigned (e.g. 
Latin-2, or KOI8-R) would also give you full > round-trip encoding. True. Still, Latin-1 gives you the best performance and is also compatible with the unicode-escape encoding which is currently in use. I'll add a note about this to the PEP. Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ 
From guido@digicool.com Wed Jul 18 14:23:04 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 18 Jul 2001 09:23:04 -0400 Subject: [Python-Dev] 2.2a1 released Message-ID: <200107181323.JAA27606@cj20424-a.reston1.va.home.com> The 2.2a1 release is on the website: http://www.python.org/2.2/ I'm waiting for two things before I send out a wider announcement: - SF has a 30 minute cron job delay before the .tgz file is visible; - I have to finish an introduction to the features added on behalf of PEP 252 and PEP 253 (http://www.python.org/2.2/descrintro.html). I'm feeling shitty today so I may not complete that intro right away, but I'll post the announcement anyway once SF has all three files. --Guido van Rossum (home page: http://www.python.org/~guido/) From Paul.Moore@atosorigin.com Wed Jul 18 14:57:06 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Wed, 18 Jul 2001 14:57:06 +0100 Subject: [Python-Dev] PEP 250: Summary of comments Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com> Having waited a few days to let the dust settle, I believe that the following is the current state of affairs: 1. The change to site.py, to include site-packages in sys.path, is in. 2. The change to distutils.sysconfig to change to site-packages, is in. 3. The Windows Installer still needs changes: a) site.py should change to export a "sys.extinstallpath" which points to site-packages b) the Windows Installer should use this, rather than the registry I see no great issue with 3a - it should be a pretty trivial change. Can someone with access to the sources make it? I attach a suggested patch. Item 3b is the key point - it's pretty critical that the Windows Installer change to use the new directory, otherwise, most of the work is a waste. I can't do much about this, as I haven't even seen the source yet. Can someone do something about this? 
On point 3a, sys.extinstallpath should be set for all platforms, but I have to admit that I don't know what to do for non-Windows platforms. The best I can suggest is that we do something like if os.sep == '/': sys.extinstallpath = os.path.join(sys.prefix, "lib", "python" + sys.version[:3], "site-packages") else: sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages") which matches the sys.path setting for Unix - but I couldn't really offer this as a patch, as I don't understand the issues around site-packages vs site-python on Unix. All I could say is that it's better than leaving sys.extinstallpath unset on some platforms. To summarise the summary: 1. The patch to site.py to expose sys.extinstallpath should be made, at least for Windows. 2. The Windows Installer needs to be updated to use sys.extinstallpath for Python 2.2 and greater. 3. If the Mac people want, the same can be done for Mac. 4. If the Unix people have a consensus, that should go in too (affects site.py, bdist_rpm, at least). As a side benefit, if this goes in, bdist_wininst will start working for Pythons which don't use the registry (such as the PythonLabs ones). Sadly, I note that I've just missed the 2.2a1 release. Is anyone likely to be able to do anything about this prior to 2.2a2? (If someone sends me a pointer to the wininst sources, I'll look into what's involved in a patch, assuming no-one else has adequate time). Paul. PS As documentation of sys.extinstallpath, I'd suggest something like: sys.extinstallpath: The directory into which Python extensions should be installed. This is merely a recommendation - Python will pick up extensions which are located anywhere along sys.path. However, extension installers should use this directory by default. The distutils package (and installers built with it) will use this directory (XXX - currently not true for bdist_rpm, I guess). 
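[Editor's note: to make the fallback behaviour above concrete, here is a small sketch of how an installer might consume the proposed sys.extinstallpath attribute while guarding for builds where it is unset. The get_install_target helper name is hypothetical; sys.extinstallpath itself is only the proposal under discussion, not an existing attribute.]

```python
import os
import sys

def get_install_target():
    # sys.extinstallpath is the attribute proposed in PEP 250; fall back
    # to a computed site-packages path on builds where it is absent.
    path = getattr(sys, "extinstallpath", None)
    if path is None:
        if os.sep == "/":
            # Unix layout: version-specific library directory
            path = os.path.join(
                sys.prefix, "lib",
                "python%d.%d" % sys.version_info[:2],
                "site-packages")
        else:
            # Windows layout proposed by PEP 250
            path = os.path.join(sys.prefix, "lib", "site-packages")
    return path
```

An installer would then copy extensions into get_install_target() rather than consulting the registry.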
Patch for site.py (point 3a - Windows only, from Thomas Heller) --- \Applications\Python\lib\site.py.orig Tue Jun 26 10:07:06 2001 +++ \Applications\Python\lib\site.py Wed Jul 18 14:43:54 2001 @@ -148,6 +148,9 @@ if os.path.isdir(sitedir): addsitedir(sitedir) +if os.sep == '\\': # != '/' if you want to do all except Unix like this... + sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages") + # Define new built-ins 'quit' and 'exit'. # These are simply strings that display a hint on how to exit. if os.sep == ':': From mal@lemburg.com Wed Jul 18 15:21:37 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 18 Jul 2001 16:21:37 +0200 Subject: [Python-Dev] PEP: Defining Python Source Code Encodings Message-ID: <3B559B71.C08C6145@lemburg.com> Here's an update of the pre-PEP. After this round of comments, the PEP will be checked into CVS (provided Barry assigns a PEP number, hi Barry ;-) -- PEP: 0263 (?) Title: Defining Python Source Code Encodings Version: $Revision: 1.2 $ Author: mal@lemburg.com (Marc-André Lemburg) Status: Draft Type: Standards Track Python-Version: 2.3 Created: 06-Jun-2001 Post-History: Requires: 244 Abstract This PEP proposes to introduce a syntax to declare the encoding of a Python source file. The encoding information is then used by the Python parser to interpret the file using the given encoding. Most notably this enhances the interpretation of Unicode literals in the source code and makes it possible to write Unicode literals using e.g. UTF-8 directly in a Unicode-aware editor. Problem In Python 2.1, Unicode literals can only be written using the Latin-1 based encoding "unicode-escape". This makes the programming environment rather unfriendly to Python users who live and work in non-Latin-1 locales such as many of the Asian countries. Programmers can write their 8-bit strings using their favourite encoding, but are bound to the "unicode-escape" encoding for Unicode literals. 
Proposed Solution I propose to make the Python source code encoding both visible and changeable on a per-source file basis by using a special comment at the top of the file to declare the encoding. To make Python aware of this encoding declaration a number of concept changes are necessary with respect to the handling of Python source code data. Concepts The PEP is based on the following concepts which would have to be implemented to enable usage of such a magic comment: 1. The complete Python source file should use a single encoding. Embedding of differently encoded data is not allowed and will result in a decoding error during compilation of the Python source code. 2. Handling of escape sequences should continue to work as it does now, but with all possible source code encodings, that is standard string literals (both 8-bit and Unicode) are subject to escape sequence expansion while raw string literals only expand a very small subset of escape sequences. 3. Python's tokenizer/compiler combo will need to be updated to work as follows: 1. read the file 2. decode it into Unicode assuming a fixed per-file encoding 3. tokenize the Unicode content 4. compile it, creating Unicode objects from the given Unicode data and creating string objects from the Unicode literal data by first reencoding the Unicode data into 8-bit string data using the given file encoding 5. variable names and other identifiers will be reencoded into 8-bit strings using the file encoding to assure backward compatibility with the existing implementation ISSUE: Should we restrict identifiers to ASCII ? To make this backwards compatible, the implementation would have to assume Latin-1 as the original file encoding if not given (otherwise, binary data currently stored in 8-bit strings wouldn't make the roundtrip). Comment Syntax The magic comment will use the following syntax. It will have to appear as first or second line in the Python source file. 
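[Editor's note: the tokenizer/compiler steps sketched in the Concepts section can be illustrated in a few lines. This is an editor's sketch, not part of the original PEP text; the regular expression and the detect_source_encoding helper are assumptions for illustration, using the Latin-1 fallback the PEP proposes.]

```python
import re

# Roughly matches an Emacs-style "coding" declaration in a source line.
CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")

def detect_source_encoding(source_bytes, default="latin-1"):
    # Step 1: look for a magic comment in the first two lines of the file.
    for line in source_bytes.splitlines()[:2]:
        match = CODING_RE.search(line)
        if match:
            return match.group(1).decode("ascii")
    return default

source = b"# -*- coding: latin-1 -*-\ns = 'caf\xe9'\n"
encoding = detect_source_encoding(source)
text = source.decode(encoding)            # step 2: decode to Unicode
code = compile(text, "<source>", "exec")  # steps 3/4: tokenize and compile
```

Modern CPython implements essentially this lookup (with the PEP's eventual UTF-8 default) in tokenize.detect_encoding.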
ISSUE: Possible choices for the format: 1. Emacs style: #!/usr/bin/python # -*- coding: utf-8; -*- 2. Via a pseudo-option to the interpreter (one which is not used by the interpreter): #!/usr/bin/python --encoding=utf-8 3. Using a special comment format: #!/usr/bin/python #!encoding = 'utf-8' 4. XML-style format: #!/usr/bin/python #?python encoding = 'utf-8' Usage of a new keyword "directive" (see PEP 244) for this purpose has been proposed, but was put aside due to PEP 244 not being widely accepted (yet). Scope This PEP only affects Python source code which makes use of the proposed magic comment. Without the magic comment in the proposed position, Python will treat the source file as it does currently to maintain backwards compatibility. Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil End: -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From Paul.Moore@atosorigin.com Wed Jul 18 15:21:03 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Wed, 18 Jul 2001 15:21:03 +0100 Subject: [Python-Dev] RE: [Distutils] PEP 250: Summary of comments Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AF16@UKRUX002.rundc.uk.origin-it.com> From: Moore, Paul [mailto:Paul.Moore@atosorigin.com] > Having waited a few days to let the dust settle, I believe that the > following is the current state of affairs: > > 1. The change to site.py, to include site-packages in sys.path, is in. > 2. The change to distutils.sysconfig to change to site-packages, is in. Urk. I just downloaded 2.2a1, and the sysconfig.py change isn't in :-( Attached are a couple of possible patches - one which just tweaks Windows (os.name is 'nt' for Win9x???) 
and another which tries to slim down on the platform-specific special casing, but which may have larger effects on non-Windows platforms. Can someone put one of these in? Thanks, Paul. Trivial version of the change: --- sysconfig.py.orig Sat Jul 07 23:55:28 2001 +++ sysconfig.py Wed Jul 18 15:13:45 2001 @@ -89,7 +89,7 @@ if standard_lib: return os.path.join(PREFIX, "Lib") else: - return prefix + return os.path.join(libpython, "site-packages") elif os.name == "mac": if plat_specific: Better version, which covers all platforms (but which removes a couple of "OK, where DO site-specific modules go on the Mac?" errors, which may be glossing over an issue...) is --- sysconfig.py.orig Sat Jul 07 23:55:28 2001 +++ sysconfig.py Wed Jul 18 15:18:55 2001 @@ -80,35 +80,23 @@ if os.name == "posix": libpython = os.path.join(prefix, "lib", "python" + sys.version[:3]) - if standard_lib: - return libpython - else: - return os.path.join(libpython, "site-packages") - elif os.name == "nt": - if standard_lib: - return os.path.join(PREFIX, "Lib") - else: - return prefix - + libpython = os.path.join(PREFIX, "Lib") elif os.name == "mac": if plat_specific: - if standard_lib: - return os.path.join(EXEC_PREFIX, "Mac", "Plugins") - else: - raise DistutilsPlatformError, \ - "OK, where DO site-specific extensions go on the Mac?" + libpython = os.path.join(EXEC_PREFIX, "Mac", "Plugins") else: - if standard_lib: - return os.path.join(PREFIX, "Lib") - else: - raise DistutilsPlatformError, \ - "OK, where DO site-specific modules go on the Mac?" 
+ libpython = os.path.join(PREFIX, "Lib") else: raise DistutilsPlatformError, \ ("I don't know where Python installs its library " + "on platform '%s'") % os.name + if standard_lib: + return libpython + else: + return os.path.join(libpython, "site-packages") + # get_python_lib() From akuchlin@mems-exchange.org Wed Jul 18 15:23:41 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 18 Jul 2001 10:23:41 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <200107172353.TAA21022@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jul 17, 2001 at 07:53:36PM -0400 References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> <200107172305.f6HN57T01730@mira.informatik.hu-berlin.de> <200107172353.TAA21022@cj20424-a.reston1.va.home.com> Message-ID: <20010718102341.B16348@ute.cnri.reston.va.us> On Tue, Jul 17, 2001 at 07:53:36PM -0400, Guido van Rossum wrote: >> the CVS - I just would need permission to do so (and an advise whether >> to stuff everything into Modules, or to create an expat subdirectory). > >If Fred (Drake) approves, that's fine with me. Negotiate the details >with him. Note that the last time this idea was brought up, the issue of version mismatches was raised. What if the platform has a newer version of Expat? What if you have an extension module for a library that also links with Expat internally? --amk From thomas@xs4all.net Wed Jul 18 15:24:40 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Wed, 18 Jul 2001 16:24:40 +0200 Subject: [Python-Dev] Python 2.1.1 & Mac/ Message-ID: <20010718162439.E2054@xs4all.nl> When I updated the 2.1.1 tree this morning, I noticed it checked out the entire Mac/ subtree... As it wasn't part of 2.1, I don't think it should be part of 2.1.1, and I don't remember seeing any adds for it (at least not with the release21-maint tag.) Jack/Just, did either of you add it explicitly ? 
If so, there's something wrong with the checkin messages for those adds. If not, it's likely something went wrong with Guido's attempt to add the date-snapshot/descr-branch tags to all files :P In that case, I guess we'll have to manually exclude the Mac/ tree from the source tarball of the final release, next Friday. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From fdrake@acm.org Wed Jul 18 15:26:26 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 18 Jul 2001 10:26:26 -0400 (EDT) Subject: [Python-Dev] PEP 250: Summary of comments In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com> Message-ID: <15189.40082.973330.760541@cj42289-a.reston1.va.home.com> Moore, Paul writes: > On point 3a, sys.extinstallpath should be set for all platforms, but I have > to admit that I don't know what to do for non-Windows platforms. The best I > can suggest is that we do something like > > if os.sep == '/': > sys.extinstallpath = os.path.join(sys.prefix, "lib", "python" + > sys.version[:3], "site-packages") > else: > sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages") There's one aspect that doesn't appear to have been addressed for Unix: there are two reasonable values for extinstallpath. In multi-architecture installations, where the Python portions of the library are shared among architectures, there are two site-packages directories: $prefix/lib/pythonX.Y/site-packages/ and $exec_prefix/lib/pythonX.Y/site-packages/ When $prefix and $exec_prefix are the same, this isn't an issue, but when they differ this is a problem for multi-platform installations. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Digital Creations From guido@digicool.com Wed Jul 18 15:36:07 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 18 Jul 2001 10:36:07 -0400 Subject: [Python-Dev] RE: [Distutils] PEP 250: Summary of comments In-Reply-To: Your message of "Wed, 18 Jul 2001 15:21:03 BST." <714DFA46B9BBD0119CD000805FC1F53B01B5AF16@UKRUX002.rundc.uk.origin-it.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF16@UKRUX002.rundc.uk.origin-it.com> Message-ID: <200107181436.KAA28132@cj20424-a.reston1.va.home.com> > Can someone put one of these in? Don't count on me -- I haven't followed this discussion at all. Sorry. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Wed Jul 18 15:41:55 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 18 Jul 2001 10:41:55 -0400 Subject: [Python-Dev] Python 2.1.1 & Mac/ In-Reply-To: Your message of "Wed, 18 Jul 2001 16:24:40 +0200." <20010718162439.E2054@xs4all.nl> References: <20010718162439.E2054@xs4all.nl> Message-ID: <200107181441.KAA28173@cj20424-a.reston1.va.home.com> > When I updated the 2.1.1 tree this morning, I noticed it checked out > the entire Mac/ subtree... As it wasn't part of 2.1, I don't think > it should be part of 2.1.1, and I don't remember seeing any add's > for it (at least not with the release21-maint tag.) Jack/Just, did > either of you add it explicitly ? If so, there's something wrong > with the checkin messages for those adds. If not, it's likely > something went wrong with Guido's attempt to add the > date-snapshot/descr-branch tags to all files :P In that case, I > guess we'll have to manually exclude the Mac/ tree from the source > tarball of the final release, next friday. I don't recall doing this, but it's possible that it happened this way. "cvs tag" operations don't create email notifications! E.g. on Mac/Relnotes, I see that the release21-maint tag is set. 
If Jack&Just agree, the best way to go about this (I think) is to execute the command "cvs tag -d release21-maint" in the Mac branch, to delete the tag. Make very sure to only do this in the Mac branch!!! On the other hand, it could be that Jack/Just intends to release a 2.1.1 version of MacPython, and then the tagging is correct. But I would still exclude the Mac tree from the 2.1.1 source distribution. --Guido van Rossum (home page: http://www.python.org/~guido/) From Paul.Moore@atosorigin.com Wed Jul 18 15:41:05 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Wed, 18 Jul 2001 15:41:05 +0100 Subject: [Python-Dev] PEP 250: Summary of comments Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AF17@UKRUX002.rundc.uk.origin-it.com> From: Fred L. Drake, Jr. [mailto:fdrake@acm.org] > There's one aspect that doesn't appear to have been addressed for > Unix: there are two reasonable values for extinstallpath. In > multi-architecture installations, where the Python portions of the > library are shared among architectures, there are two site-packages > directories: I agree entirely. I have no knowledge of the issues on Unix, and hence I can make no comment. As far as I am concerned, I feel very strongly that this should go in for Windows (where I believe that the current use of a "bare" sys.prefix is wrong), but I have no view at all on other platforms. My PEP was originally entitled "Using site-packages on Windows" - I am happy if the scope gets extended, but don't rely on me to do it - and please don't let the key point (for me) which is fixing Windows, get sidetracked by issues for Unix, where (as far as I know) the current status quo is perfectly acceptable to the majority of users. So I'd vote to leave sys.extinstallpath undefined except on Windows, and leave the other platforms for a new PEP. Paul. From mal@lemburg.com Wed Jul 18 16:00:41 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Wed, 18 Jul 2001 17:00:41 +0200 Subject: [Distutils] Re: [Python-Dev] PEP 250: Summary of comments References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com> <15189.40082.973330.760541@cj42289-a.reston1.va.home.com> Message-ID: <3B55A499.42F55B5A@lemburg.com> "Fred L. Drake, Jr." wrote: > > Moore, Paul writes: > > On point 3a, sys.extinstallpath should be set for all platforms, but I have > > to admit that I don't know what to do for non-Windows platforms. The best I > > can suggest is that we do something like > > > > if os.sep == '/': > > sys.extinstallpath = os.path.join(sys.prefix, "lib", "python" + > > sys.version[:3], "site-packages") > > else: > > sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages") > > There's one aspect that doesn't appear to have been addressed for > Unix: there are two reasonable values for extinstallpath. In > multi-architecture installations, where the Python portions of the > library are shared among architectures, there are two site-packages > directories: > > $prefix/lib/pythonX.Y/site-packages/ > > and > > $exec_prefix/lib/pythonX.Y/site-packages/ > > When $prefix and $exec_prefix are the same, this isn't an issue, but > for this is a problem for multi-platform installations. I don't think this is an issue since distutils already knows that extension package live in .../site-package on Unix. 
The Windows install and unix_home are the only ones which copy the files into non-standard dirs (Unix seems to be the only target which supports multi-platform installs out-of-the-box): [taken from distutils.commands.install] """ INSTALL_SCHEMES = { 'unix_prefix': { 'purelib': '$base/lib/python$py_version_short/site-packages', 'platlib': '$platbase/lib/python$py_version_short/site-packages', 'headers': '$base/include/python$py_version_short/$dist_name', 'scripts': '$base/bin', 'data' : '$base', }, 'unix_home': { 'purelib': '$base/lib/python', 'platlib': '$base/lib/python', 'headers': '$base/include/python/$dist_name', 'scripts': '$base/bin', 'data' : '$base', }, 'nt': { 'purelib': '$base', 'platlib': '$base', 'headers': '$base/Include/$dist_name', 'scripts': '$base/Scripts', 'data' : '$base', }, 'mac': { 'purelib': '$base/Lib/site-packages', 'platlib': '$base/Lib/site-packages', 'headers': '$base/Include/$dist_name', 'scripts': '$base/Scripts', 'data' : '$base', } } """ Paul, note that your patches don't even touch install.py -- are your sure that the patch to sysconfig.py suffices to have distutils install the extensions into site-packages on Windows ? (I believe that install.py would have to be told about sys.extinstallpath too and that it should fallback to the defaults given in the install schemes if it is not set.) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From fdrake@acm.org Wed Jul 18 16:06:00 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Wed, 18 Jul 2001 11:06:00 -0400 (EDT) Subject: [Distutils] Re: [Python-Dev] PEP 250: Summary of comments In-Reply-To: <3B55A499.42F55B5A@lemburg.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com> <15189.40082.973330.760541@cj42289-a.reston1.va.home.com> <3B55A499.42F55B5A@lemburg.com> Message-ID: <15189.42456.461317.848018@cj42289-a.reston1.va.home.com>

M.-A. Lemburg writes:
> I don't think this is an issue since distutils already knows
> that extension package live in .../site-package on Unix.

Frankly, I'm not convinced that there's a need for extinstallpath. Why not define INSTALL_SCHEMES like this:

if sys.version < "2.2":
    WINDOWS_SCHEME = {
        'purelib': '$base',
        'platlib': '$base',
        'headers': '$base/Include/$dist_name',
        'scripts': '$base/Scripts',
        'data'   : '$base',
    }
else:
    WINDOWS_SCHEME = {
        'purelib': '$base/Lib/site-packages',
        'platlib': '$base/Lib/site-packages',
        'headers': '$base/Include/$dist_name',
        'scripts': '$base/Scripts',
        'data'   : '$base',
    }

INSTALL_SCHEMES = {
    'nt': WINDOWS_SCHEME,
    ...
}

-Fred

-- Fred L. Drake, Jr. PythonLabs at Digital Creations

From loewis@informatik.hu-berlin.de Wed Jul 18 16:18:07 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 18 Jul 2001 17:18:07 +0200 (MEST) Subject: [Python-Dev] Freeze hacks Message-ID: <200107181518.RAA05425@pandora.informatik.hu-berlin.de>

A number of modules in the standard library make use of dynamic imports, or import modules through C code. In either case, no import statement can be found. Unfortunately, this means that tools like freeze or py2exe cannot detect that those modules are used, so the frozen applications will then fail at runtime. To make this work, I suggest to add explicit import statements, which are put into a conditional 'if 0:'.
In particular, I found that the following modules need to be referenced somewhere:

- xml.sax.expatreader, from xml.sax.__init__
- encodings.__init__, probably from codecs
- encodings.*, from encodings.__init__
- dbhash, gdbm, dbm, dumbdbm, from anydbm
- unixccompiler, msvccompiler, cygwinccompiler, bcppcompiler, mwerkscompiler, from distutils.ccompiler
- distutils.command.* from distutils.dist

What is the purpose of dumbdbm not importing os directly?

To give a specific example, I'd change xml.sax.__init__ to read

default_parser_list = ["xml.sax.expatreader"]
if 0:
    # freeze hack: the import relationship is not visible without this
    # statement
    import xml.sax.expatreader

Is that a desirable change? If so, I'll produce a patch.

The case of encodings is particularly troubling: I don't think there is a way to tell freeze/py2exe/installer that

print u"Hallo".encode("iso8859-2")

will require additional modules. As a convention, I'd still recommend to link all this to codecs, so that an application requiring any codecs can do

if 0:
    import codecs

explicitly, or just tell the freeze tool to use codecs, and then will get all codecs that are known statically.

Regards, Martin

From thomas.heller@ion-tof.com Wed Jul 18 16:32:37 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 18 Jul 2001 17:32:37 +0200 Subject: [Python-Dev] Freeze hacks References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> Message-ID: <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook>

From: "Martin von Loewis"
> A number of modules in the standard library make use of dynamic
> imports, or import modules through C code. In either case, no import
> statement can be found.
>
> Unfortunately, this means that tools like freeze or py2exe cannot
> detect that those modules are used, so the frozen applications will
> then fail at runtime. To make this work, I suggest to add explicit
> import statements, which are put into a conditional 'if 0:'.

Very good idea IMO, but 'if 0:' is optimized away.
This one works:

_FAKE=0
if _FAKE:
    import whatever

Thomas

From Paul.Moore@atosorigin.com Wed Jul 18 16:52:02 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Wed, 18 Jul 2001 16:52:02 +0100 Subject: [Distutils] Re: [Python-Dev] PEP 250: Summary of comments Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AF1B@UKRUX002.rundc.uk.origin-it.com>

From: M.-A. Lemburg [mailto:mal@lemburg.com]
> (I believe that install.py would have to be told about
> sys.extinstallpath too and that it should fallback to the
> defaults given in the install schemes if it is not set.)

Hmm, browsing this a bit more, I'm getting further confused. The cause of this is the INSTALL_SCHEMES stuff, which has a purelib/platlib distinction, which is only used on unix_prefix (all other cases use the same value for both of these). I can't see how sys.extinstallpath relates - I could use it as default for both purelib and platlib, but that somewhat defeats the point of having the two. Does this imply that sys.extinstallpath should be split into two parts (pure & plat)? I can't comment, as this is a Unix-only thing.

This is getting silly. I feel that the correct approach is to go back to my original stance, of *only* changing Windows behaviour - leave the Unix and Mac camps as they are. With that in mind, sys.extinstallpath seems like an overgeneralisation, and the attached patch does everything bar handle bdist_wininst. The Windows Installer should then do the same thing - load Python, and generate os.path.join(sys.prefix, "lib", "site-packages") as the destination directory. OK, so the same thing is hard-coded in four places, but this whole area is rife with duplicated code, and fixing that issue is way outside the scope of PEP 250.

For the limited purpose of making site-packages appear in sys.path, and making python setup.py install install to site-packages, the attached patch works. I've only tested it on a simple Python module, but that's all I have to hand.
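[The expression that, as noted above, ends up hard-coded in four places can be collected into a single helper. A hypothetical sketch only — the site_packages name is invented here for illustration and is not part of the patch or of distutils:]

```python
import os
import sys


def site_packages(prefix=None):
    # Mirrors the expression os.path.join(sys.prefix, "lib",
    # "site-packages") that the patch repeats in site.py,
    # distutils/sysconfig.py, distutils/command/install.py and
    # the installer.
    if prefix is None:
        prefix = sys.prefix
    return os.path.join(prefix, "lib", "site-packages")


print(site_packages("/usr/local"))
```

[Putting the expression in one place would avoid the duplication, but as the message says, that refactoring is outside the scope of PEP 250.]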
I can try some C modules tonight when I get home, but I see no reason why they wouldn't work as well. The patch is pretty much trivial, which (IMHO) is very much in its favour as Python 2.2a1 is already out... Unless someone comes up with a *very strong* argument as to why I should be going further than this, I would like to request that this goes into Python as it stands. If someone can supply the source of the bdist_wininst installer, I will make a corresponding change to that. I will NOT make any changes which affect Unix, or Mac platforms. I don't know the issues. If someone wants to supply a patch which does this, I'll be happy to see it go in, and I am quite comfortable with it going under the banner of PEP 250, but I will not get involved in the issues - I simply am not qualified to comment.

Paul.

------------------------------------------------------
diff -u site.py.orig site.py
--- site.py.orig Tue Jun 26 10:07:06 2001
+++ site.py Wed Jul 18 16:33:37 2001
@@ -143,7 +143,7 @@
     elif os.sep == ':':
         sitedirs = [makepath(prefix, "lib", "site-packages")]
     else:
-        sitedirs = [prefix]
+        sitedirs = [prefix, makepath(prefix, "lib", "site-packages")]
     for sitedir in sitedirs:
         if os.path.isdir(sitedir):
             addsitedir(sitedir)
diff -u distutils\sysconfig.py.orig distutils\sysconfig.py
--- distutils\sysconfig.py.orig Thu Apr 19 10:24:24 2001
+++ distutils\sysconfig.py Wed Jul 18 16:20:20 2001
@@ -87,7 +87,7 @@
     elif os.name == "nt":
         if standard_lib:
-            return os.path.join(PREFIX, "Lib")
+            return os.path.join(PREFIX, "Lib", "site-packages")
         else:
             return prefix
diff -u distutils\command\install.py.orig distutils\command\install.py
--- distutils\command\install.py.orig Thu Apr 19 10:24:24 2001
+++ distutils\command\install.py Wed Jul 18 16:29:29 2001
@@ -31,8 +31,8 @@
         'data'   : '$base',
     },
     'nt': {
-        'purelib': '$base',
-        'platlib': '$base',
+        'purelib': '$base/Lib/site-packages',
+        'platlib': '$base/Lib/site-packages',
         'headers': '$base/Include/$dist_name',
         'scripts': '$base/Scripts',
         'data'   : '$base',

From loewis@informatik.hu-berlin.de Wed Jul 18 16:53:41 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 18 Jul 2001 17:53:41 +0200 (MEST) Subject: [Python-Dev] Freeze hacks In-Reply-To: <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> (thomas.heller@ion-tof.com) References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> Message-ID: <200107181553.RAA19593@pandora.informatik.hu-berlin.de>

> Very good idea IMO, but 'if 0:' is optimized away.

I'm not sure I understand. freeze does not optimize away such a code block. Under which condition is that optimized away?

Regards, Martin

From loewis@informatik.hu-berlin.de Wed Jul 18 17:01:05 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 18 Jul 2001 18:01:05 +0200 (MEST) Subject: [Python-Dev] Bumping the API version Message-ID: <200107181601.SAA20887@pandora.informatik.hu-berlin.de>

I'm about to commit patch #412229, which will add an additional field at the end of PyInterpreterState if HAVE_DLOPEN is defined. Do I need to bump the API version for that?

Regards, Martin

From thomas.heller@ion-tof.com Wed Jul 18 17:10:24 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 18 Jul 2001 18:10:24 +0200 Subject: [Python-Dev] Freeze hacks References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> <200107181553.RAA19593@pandora.informatik.hu-berlin.de> Message-ID: <013e01c10fa4$27376700$e000a8c0@thomasnotebook>

From: "Martin von Loewis"
> > Very good idea IMO, but 'if 0:' is optimized away.
>
> I'm not sure I understand. freeze does not optimize away such a code
> block. Under which condition is that optimized away?
>
The Python compiler itself. 'if 0: import whatever' does not generate any byte code. Modulefinder (used by freeze, py2exe, and Gordon's installer) checks the compiled byte code for import statements.
Thomas

C:\Python21>c:\python21\python.exe
ActivePython 2.1, build 211 (ActiveState)
based on Python 2.1 (#15, Jun 18 2001, 21:42:28) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> _FAKE=0
>>> def f():
...     if 0:
...         import win32api
...
>>> def g():
...     if _FAKE:
...         import win32api
...
>>> dis.dis(f)
          0 SET_LINENO               1
          3 SET_LINENO               2
          6 LOAD_CONST               0 (None)
          9 RETURN_VALUE
>>> dis.dis(g)
          0 SET_LINENO               1
          3 SET_LINENO               2
          6 LOAD_GLOBAL              0 (_FAKE)
          9 JUMP_IF_FALSE           16 (to 28)
         12 POP_TOP
         13 SET_LINENO               3
         16 LOAD_CONST               0 (None)
         19 IMPORT_NAME              1 (win32api)
         22 STORE_FAST               0 (win32api)
         25 JUMP_FORWARD             1 (to 29)
    >>   28 POP_TOP
    >>   29 LOAD_CONST               0 (None)
         32 RETURN_VALUE
>>> ^Z

From dubois1@llnl.gov Wed Jul 18 17:19:59 2001 From: dubois1@llnl.gov (Paul F. Dubois) Date: Wed, 18 Jul 2001 09:19:59 -0700 Subject: [Python-Dev] 2.2a1 and Numerical Message-ID: <01071809231800.14475@almanac>

Numerical Python 20.1 compiles and passes all its tests with 2.2a1. It is about 10% slower than 2.1 both with and without optimization in executing the test suite. That test suite uses PyUnit and most of the operations involve small arrays so I suspect most of its time is in Python not the C code for Numeric.

From guido@digicool.com Wed Jul 18 17:32:03 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 18 Jul 2001 12:32:03 -0400 Subject: [Python-Dev] Freeze hacks In-Reply-To: Your message of "Wed, 18 Jul 2001 17:18:07 +0200." <200107181518.RAA05425@pandora.informatik.hu-berlin.de> References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> Message-ID: <200107181632.MAA28401@cj20424-a.reston1.va.home.com>

> - unixccompiler, msvccompiler, cygwinccompiler,
> bcppcompiler, mwerkscompiler, from distutils.ccompiler
> - distutils.command.* from distutils.dist

I don't expect frozen programs to use distutils.

> What is the purpose of dumbdbm not importing os directly?
I'm afraid that it's just an old way of spelling import os as _os With the intention of not exporting anything undesired on "from dumbdbm import *". Feel free to fix it. > To give a specific example, I'd change xml.sax.__init__ to read > > default_parser_list = ["xml.sax.expatreader"] > if 0: > # freeze hack: the import relationship is not visible without this > # statement > import xml.sax.expatreader > > Is that a desirable change? If so, I'll produce a patch. Sounds good to me. > The case of encodings is particularly troubling: I don't think there > is a way to tell freeze/py2exe/installer that > > print u"Hallo".encode("iso8859-2") > > will require additional modules. As a convention, I'd still recommend > to link all this to codecs, so that an application requiring any > codecs can do > > if 0: > import codecs > > explicitly, or just tell the freeze tool to use codecs, and then will > get all codecs that are known statically. Won't this create enormously bloated frozen binaries? --Guido van Rossum (home page: http://www.python.org/~guido/) From loewis@informatik.hu-berlin.de Wed Jul 18 17:29:27 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 18 Jul 2001 18:29:27 +0200 (MEST) Subject: [Python-Dev] Freeze hacks In-Reply-To: <013e01c10fa4$27376700$e000a8c0@thomasnotebook> (thomas.heller@ion-tof.com) References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> <200107181553.RAA19593@pandora.informatik.hu-berlin.de> <013e01c10fa4$27376700$e000a8c0@thomasnotebook> Message-ID: <200107181629.SAA29368@pandora.informatik.hu-berlin.de> > The Python compiler itself. 'if 0: import whatever' does > not generate any byte code. Modulefinder (used by freeze, > py2exe, and Gordon's installer) checks the compiled byte code > for import statements. Ah, thanks for the explanation. I'll consider it in my patch. 
Regards, Martin From guido@digicool.com Wed Jul 18 17:42:23 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 18 Jul 2001 12:42:23 -0400 Subject: [Python-Dev] Bumping the API version In-Reply-To: Your message of "Wed, 18 Jul 2001 18:01:05 +0200." <200107181601.SAA20887@pandora.informatik.hu-berlin.de> References: <200107181601.SAA20887@pandora.informatik.hu-berlin.de> Message-ID: <200107181642.MAA28484@cj20424-a.reston1.va.home.com> > I'm about to commit patch #412229, which will add an addition field at > the end of PyInterpreterState if HAVE_DLOPEN is defined. Do I need to > bump the API version for that? I don't think so, since PyInterpreterState objects are always allocated by Python, not by 3rd party code. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Wed Jul 18 17:44:40 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 18 Jul 2001 12:44:40 -0400 Subject: [Python-Dev] Freeze hacks In-Reply-To: Your message of "Wed, 18 Jul 2001 18:10:24 +0200." <013e01c10fa4$27376700$e000a8c0@thomasnotebook> References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> <200107181553.RAA19593@pandora.informatik.hu-berlin.de> <013e01c10fa4$27376700$e000a8c0@thomasnotebook> Message-ID: <200107181644.MAA28513@cj20424-a.reston1.va.home.com> > > > Very good idea IMO, but 'if 0:' is optimized away. > > > > I'm not sure I understand. freeze does not optimize away such a code > > block. Under which condition is that optimized away? > > > The Python compiler itself. 'if 0: import whatever' does > not generate any byte code. Modulefinder (used by freeze, > py2exe, and Gordon's installer) checks the compiled byte code > for import statements. Good catch, Thomas. I find defining a variable _FAKE a bit cumbersome as a work-around. I would suggest instead: if 1==0: import whatever since the optimizer only optimizes out "if 0:". 
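[The compiler behaviour under discussion is easy to check directly. A sketch against a modern CPython — so the bytecode differs from the 2.1 dis listing earlier in the thread — showing that no IMPORT_NAME is emitted for an 'if 0:' block, while an import inside an uncalled function body survives compilation:]

```python
import dis


def imports_in(code):
    # Collect IMPORT_NAME targets, recursing into nested code objects,
    # the same way modulefinder-style tools scan compiled bytecode.
    names = [i.argval for i in dis.get_instructions(code)
             if i.opname == "IMPORT_NAME"]
    for const in code.co_consts:
        if hasattr(const, "co_code"):
            names.extend(imports_in(const))
    return names


# The compiler drops the body of "if 0:" entirely, so a bytecode
# scanner never sees the import.
dead = compile("if 0:\n    import string\n", "<demo>", "exec")

# An import inside a function body survives compilation even if the
# function is never called.
live = compile("def _freeze_hints():\n    import string\n", "<demo>", "exec")

print(imports_in(dead))   # []
print(imports_in(live))   # ['string']
```

[Note that modern CPython also constant-folds comparisons, so the "if 1==0:" spelling is no longer a reliable workaround; the uncalled-function idiom is.]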
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Wed Jul 18 17:46:33 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 18 Jul 2001 12:46:33 -0400 Subject: [Python-Dev] 2.2a1 and Numerical In-Reply-To: Your message of "Wed, 18 Jul 2001 09:19:59 PDT." <01071809231800.14475@almanac> References: <01071809231800.14475@almanac> Message-ID: <200107181646.MAA28536@cj20424-a.reston1.va.home.com> > Numerical Python 20.1 compiles and passes all its tests with 2.2a1. > > It is about 10% slower than 2.1 both with and without optimization > in executing the test suite. That test suite uses PyUnit and most of > the operations involve small arrays so I suspect most of its time is > in Python not the C code for Numeric. Thanks for the report, Paul! We'll definitely put performance on the agenda for 2.2; it's already got our attention since Jim Fulton posted some Zope benchmarks in an internal list at Digital Creations... --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Wed Jul 18 17:50:34 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 18 Jul 2001 18:50:34 +0200 Subject: [Python-Dev] Freeze hacks References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> <200107181553.RAA19593@pandora.informatik.hu-berlin.de> <013e01c10fa4$27376700$e000a8c0@thomasnotebook> <200107181644.MAA28513@cj20424-a.reston1.va.home.com> Message-ID: <01fb01c10fa9$c3c4d170$e000a8c0@thomasnotebook> From: "Guido van Rossum" > > > > Very good idea IMO, but 'if 0:' is optimized away. > > > > > > I'm not sure I understand. freeze does not optimize away such a code > > > block. Under which condition is that optimized away? > > > > > The Python compiler itself. 'if 0: import whatever' does > > not generate any byte code. Modulefinder (used by freeze, > > py2exe, and Gordon's installer) checks the compiled byte code > > for import statements. 
>
> Good catch, Thomas.
>
> I find defining a variable _FAKE a bit cumbersome as a work-around. I
> would suggest instead:
>
> if 1==0:
>     import whatever
>
> since the optimizer only optimizes out "if 0:".

If the optimizer becomes more intelligent in the future, and also probably easier to document the purpose, would be to use an (uncalled) function:

def _freeze_hints():
    import spam

Just another idea,

Thomas

From fdrake@acm.org Wed Jul 18 17:54:05 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 18 Jul 2001 12:54:05 -0400 (EDT) Subject: [Python-Dev] Freeze hacks In-Reply-To: <01fb01c10fa9$c3c4d170$e000a8c0@thomasnotebook> References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> <200107181553.RAA19593@pandora.informatik.hu-berlin.de> <013e01c10fa4$27376700$e000a8c0@thomasnotebook> <200107181644.MAA28513@cj20424-a.reston1.va.home.com> <01fb01c10fa9$c3c4d170$e000a8c0@thomasnotebook> Message-ID: <15189.48941.975174.479751@cj42289-a.reston1.va.home.com>

Thomas Heller writes:
> def _freeze_hints():
>     import spam

Much nicer!

-Fred

-- Fred L. Drake, Jr. PythonLabs at Digital Creations

From guido@digicool.com Wed Jul 18 18:05:42 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 18 Jul 2001 13:05:42 -0400 Subject: [Python-Dev] Freeze hacks In-Reply-To: Your message of "Wed, 18 Jul 2001 18:50:34 +0200."
<01fb01c10fa9$c3c4d170$e000a8c0@thomasnotebook> References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> <200107181553.RAA19593@pandora.informatik.hu-berlin.de> <013e01c10fa4$27376700$e000a8c0@thomasnotebook> <200107181644.MAA28513@cj20424-a.reston1.va.home.com> <01fb01c10fa9$c3c4d170$e000a8c0@thomasnotebook> Message-ID: <200107181705.NAA30880@cj20424-a.reston1.va.home.com> > If the optimizer becomes more intelligent > in the future, and also probably easier to document > the purpose would be to use an (uncalled) function: > > def _freeze_hints(): > import spam > > Just another idea, Very nice one! An optimizer could never remove this, because it could be called from outside. --Guido van Rossum (home page: http://www.python.org/~guido/) From loewis@informatik.hu-berlin.de Wed Jul 18 19:27:52 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 18 Jul 2001 20:27:52 +0200 (MEST) Subject: [Python-Dev] re with Unicode broken? Message-ID: <200107181827.UAA00284@pandora.informatik.hu-berlin.de> > The expression which now fails to match is: Did you, by any chance, use a big-endian system for that? If so, could you please try the patch http://sourceforge.net/tracker/?func=detail&aid=442512&group_id=5470&atid=305470 With that patch, your example code matches fine on my SPARC box. Regards, Martin From guido@digicool.com Wed Jul 18 19:38:55 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 18 Jul 2001 14:38:55 -0400 Subject: [Python-Dev] re with Unicode broken? In-Reply-To: Your message of "Wed, 18 Jul 2001 20:27:52 +0200." <200107181827.UAA00284@pandora.informatik.hu-berlin.de> References: <200107181827.UAA00284@pandora.informatik.hu-berlin.de> Message-ID: <200107181838.OAA00994@cj20424-a.reston1.va.home.com> > > The expression which now fails to match is: > > Did you, by any chance, use a big-endian system for that? 
If so, could > you please try the patch > > http://sourceforge.net/tracker/?func=detail&aid=442512&group_id=5470&atid=305470 > > With that patch, your example code matches fine on my SPARC box. I'm guessing this is a showstopper fix for 2.1.1 too... --Guido van Rossum (home page: http://www.python.org/~guido/) From loewis@informatik.hu-berlin.de Wed Jul 18 19:44:51 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 18 Jul 2001 20:44:51 +0200 (MEST) Subject: [Python-Dev] re with Unicode broken? In-Reply-To: <200107181838.OAA00994@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Wed, 18 Jul 2001 14:38:55 -0400) References: <200107181827.UAA00284@pandora.informatik.hu-berlin.de> <200107181838.OAA00994@cj20424-a.reston1.va.home.com> Message-ID: <200107181844.UAA00379@pandora.informatik.hu-berlin.de> > > With that patch, your example code matches fine on my SPARC box. > > I'm guessing this is a showstopper fix for 2.1.1 too... No, the BIGCHARSET support was added to the CVS only recently; this bug is not in 2.1. In case it got merged to the 2.2a1 branch, it might be worthwhile applying the patch to that branch as well - provided /F approves the patch. Or, we could live with the bug until 2.2a2, since it only triggers if Unicode character classes are used on a big-endian machine. Regards, Martin From guido@digicool.com Wed Jul 18 19:49:36 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 18 Jul 2001 14:49:36 -0400 Subject: [Python-Dev] re with Unicode broken? In-Reply-To: Your message of "Wed, 18 Jul 2001 20:44:51 +0200." <200107181844.UAA00379@pandora.informatik.hu-berlin.de> References: <200107181827.UAA00284@pandora.informatik.hu-berlin.de> <200107181838.OAA00994@cj20424-a.reston1.va.home.com> <200107181844.UAA00379@pandora.informatik.hu-berlin.de> Message-ID: <200107181849.OAA01109@cj20424-a.reston1.va.home.com> > No, the BIGCHARSET support was added to the CVS only recently; this > bug is not in 2.1. 
> In case it got merged to the 2.2a1 branch, it might be worthwhile
> applying the patch to that branch as well - provided /F approves the
> patch. Or, we could live with the bug until 2.2a2, since it only
> triggers if Unicode character classes are used on a big-endian
> machine.

When you check it in to the trunk, it will be merged into the branch the next time we do a merge. We've adopted a policy of "merge early, merge often", by the way -- my procrastination was ill-guided. :-) If 2.2a1 receives good marks, we'll do a merge back to the trunk and then we'll be out of the merge business.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.one@home.com Wed Jul 18 20:22:37 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 18 Jul 2001 15:22:37 -0400 Subject: [Python-Dev] re with Unicode broken? In-Reply-To: <200107181849.OAA01109@cj20424-a.reston1.va.home.com> Message-ID:

Want to emphasize a point: NOBODY check anything into descr-branch! The only people who have a legitimate reason to check into descr-branch already know that scream wasn't directed at them -- and if you had to think even an instant, you're not one of them. The way we're merging the trunk back into the branch works much better if you let trunk changes show up in the branch by magic (which means Tim at 3 in the morning, but that's close enough to magic that Guido can't tell the difference ).

From mal@lemburg.com Wed Jul 18 20:24:01 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 18 Jul 2001 21:24:01 +0200 Subject: [Distutils] Re: [Python-Dev] PEP 250: Summary of comments References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com> <15189.40082.973330.760541@cj42289-a.reston1.va.home.com> <3B55A499.42F55B5A@lemburg.com> <15189.42456.461317.848018@cj42289-a.reston1.va.home.com> Message-ID: <3B55E251.7BEE00F0@lemburg.com> "Fred L. Drake, Jr." wrote: > > M.-A.
Lemburg writes: > > I don't think this is an issue since distutils already knows > > that extension package live in .../site-package on Unix. > > Frankly, I'm not convinced that there's a need for extinstallpath. Uhm... that's what I implied (or at least tried to imply) with my reply ;-) > Why not define INSTALL_SCHEMES like this: > > if sys.version < "2.2": > WINDOWS_SCHEME = { > 'purelib': '$base', > 'platlib': '$base', > 'headers': '$base/Include/$dist_name', > 'scripts': '$base/Scripts', > 'data' : '$base', > } > else: > WINDOWS_SCHEME = { > 'purelib': '$base/Lib/site-packages', > 'platlib': '$base/Lib/site-packages', > 'headers': '$base/Include/$dist_name', > 'scripts': '$base/Scripts', > 'data' : '$base', > } > > INSTALL_SCHEMES = { > 'nt': WINDOWS_SCHEME, > ... > } > > -Fred > > -- > Fred L. Drake, Jr. > PythonLabs at Digital Creations > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG@python.org > http://mail.python.org/mailman/listinfo/distutils-sig -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From simon@netthink.co.uk Wed Jul 18 20:21:28 2001 From: simon@netthink.co.uk (Simon Cozens) Date: Wed, 18 Jul 2001 15:21:28 -0400 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <20010718102341.B16348@ute.cnri.reston.va.us> Message-ID: <20010718152128.C27569@netthink.co.uk> On Wed, Jul 18, 2001 at 10:23:41AM -0400, Andrew Kuchling wrote: > Note that, the last time this idea was brought up, the issue of > version mismatches was brought up. What if the platform has a newer > version of Expat? What if you have an extension module for a library > that also links with Expat internally? For what it's worth, when I (very recently) raised the issue on perl5-porters, the concerns were: 1) Expat isn't *necessarily* the best tool for the job. 
2) No support for linking as shared library (this has been fixed, though, apparently)
3) Symbol conflicts between, eg, apache's expat and Perl's expat. Apparently this affects PHP and mod_python too.
4) Worries about portability
5) Version mismatches between Perl's expat and expat's expat

Simon

From fdrake@beowolf.digicool.com Wed Jul 18 21:10:31 2001 From: fdrake@beowolf.digicool.com (Fred Drake) Date: Wed, 18 Jul 2001 16:10:31 -0400 (EDT) Subject: [Python-Dev] [maintenance doc updates] Message-ID: <20010718201031.F34352892C@beowolf.digicool.com>

The development version of the documentation has been updated: http://python.sourceforge.net/maint-docs/ Current status of the 2.1.1 documentation -- very few changes since the 2.1.1c1 release.

From skip@pobox.com (Skip Montanaro) Wed Jul 18 21:30:11 2001 From: skip@pobox.com (Skip Montanaro) Date: Wed, 18 Jul 2001 15:30:11 -0500 Subject: [Python-Dev] Please have a look at proposed doc changes for time epoch Message-ID: <15189.61907.883300.127987@beluga.mojam.com>

I was just reminded by an update to another bug I had submitted that I was assigned bug #434143. Anything that uses time.mktime will fail if the time tuple it is passed is "too old". Unfortunately, by trying to be precise, the message that goes along with the ValueError that's raised, it's a little misleading:

ValueError: year out of range (00-99, 1900-*)

Obviously, on Unix systems the baseline date is more like 1970. I suspect this error message was written when Python was being developed mostly (or entirely) on Macs. The context in which this arose was a user trying to generate a calendar using the calendar module. https://sourceforge.net/tracker/?func=detail&aid=434143&group_id=5470&atid=105470 I think this is a difficult problem to solve properly without adding an alternative to time.mktime and making some changes to the various modules that use it (calendar, imaplib and rfc822 in the current CVS tree).
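[The platform dependence at issue here can be probed directly. A sketch using the time module as it exists today; which exception, if any, a pre-epoch tuple raises varies by platform and Python version:]

```python
import time

# Round-tripping a post-epoch time through localtime/mktime works on
# every platform: mktime is the inverse of localtime for valid input.
stamp = 86400 * 365  # one year after the Unix epoch
assert time.mktime(time.localtime(stamp)) == stamp

# A tuple before the platform's epoch may or may not be representable;
# C mktime signals failure by returning (time_t)-1, which Python
# surfaces as an exception.
try:
    time.mktime((1900, 1, 1, 0, 0, 0, 0, 1, -1))
    print("pre-1970 dates supported on this platform")
except (OverflowError, ValueError):
    print("pre-1970 dates rejected on this platform")
```

[On glibc-based systems the pre-1900 tuple typically succeeds with a negative result, while other C libraries reject anything before 1970 — which is exactly the inconsistency the proposed documentation change is meant to acknowledge.]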
I propose instead to make a few documentation changes:

* in Modules/timemodule.c, make the error message more vague ;-)
* in Doc/lib/lib{time,calendar}.tex indicate that the "epoch" is platform-dependent

I'm more than happy to add the necessary ifdefs to Modules/timemodule.c if we can settle on what the actual epochs are for the various platforms. For Unix it is 1970-01-01. For Macs I think it is 1900-01-01. Is it 1904-01-01 on Windows? If you have a moment, please have a look at the above sourceforge url. I'd like to get this off my plate in the next few days.

Skip

From mal@lemburg.com Wed Jul 18 21:31:14 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 18 Jul 2001 22:31:14 +0200 Subject: [Python-Dev] PEP: Defining Python Source Code Encodings References: <3B559B71.C08C6145@lemburg.com> Message-ID: <3B55F212.9B11346A@lemburg.com>

Barry has assigned the PEP number 0263 to this PEP. If you prefer to read the PEP online, here is the URL: http://python.sourceforge.net/peps/pep-0263.html

-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/

From skip@pobox.com (Skip Montanaro) Wed Jul 18 21:37:43 2001 From: skip@pobox.com (Skip Montanaro) Date: Wed, 18 Jul 2001 15:37:43 -0500 Subject: [Python-Dev] some unassigned bugs - where should they go? Message-ID: <15189.62359.164818.924865@beluga.mojam.com>

https://sourceforge.net/my/ is a very useful page, 'cuz it reminds you of all the bugs assigned to you as well as all the bugs you submitted that haven't been closed.
;-) In reviewing bugs that I submitted, I find these that have yet to be assigned to anybody: https://sourceforge.net/tracker/?func=detail&aid=440725&group_id=5470&atid=105470 https://sourceforge.net/tracker/?func=detail&aid=427073&group_id=5470&atid=105470 one bug and one patch that were assigned to Ping (probably by me, 'cuz I knew they were in his inspect and pydoc code) https://sourceforge.net/tracker/?func=detail&aid=426740&group_id=5470&atid=105470 https://sourceforge.net/tracker/?func=detail&aid=419419&group_id=5470&atid=305470 and one bug that got assigned to Guido: https://sourceforge.net/tracker/?func=detail&aid=424554&group_id=5470&atid=305470 I assume the last one should probably be reassigned. I can't believe it would actually need BDFL eyes to figure out or approve. Should I just randomly (or otherwise) assign the first two to someone? Ping seems to have disappeared from the face of the earth. Has anyone heard from him lately? In fact, there were seven bugs assigned to Ping between March and May, all related to pydoc or inspect. Skip From tim.one@home.com Thu Jul 19 00:20:20 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 18 Jul 2001 19:20:20 -0400 Subject: [Python-Dev] Please have a look at proposed doc changes for time epoch In-Reply-To: <15189.61907.883300.127987@beluga.mojam.com> Message-ID: [Skip Montanaro] > ... > Unfortunately, by trying to be precise, the message that goes along > with the ValueError that's raised is a little misleading: > > ValueError: year out of range (00-99, 1900-*) > > Obviously, on Unix systems the baseline date is more like 1970. I > suspect this error message was written when Python was being > developed mostly (or entirely) on Macs. C mktime is broken all over the place.
Since the tm_year member of a struct tm is defined as the number of years since 1900, reasonable implementers should have assumed the committee intended that years before 1970 weren't anything special -- but apparently only the Mac implementers were reasonable <0.9 wink>. C99 spells this out in severe detail, making clear that there's nothing special even about negative tm_year offsets (they're for years before 1900, of course). > ... > * in Modules/timemodule.c, make the error message more vague ;-) > > * in Doc/lib/lib{time,calendar}.tex indicate that the "epoch" is > platform-dependent The defn. of mktime makes no reference to epoch (indeed, the C std doesn't mention "the epoch" anywhere!), so that's mixing concepts that shouldn't get mixed even when making excuses for bad implementations. If we can't fix it ourselves, better to say that mktime simply isn't well-defined across platforms. > Is it 1904-01-01 on Windows? The MS docs say: mktime returns the specified calendar time encoded as a value of type time_t. If timeptr references a date before midnight, January 1, 1970, or if the calendar time cannot be represented, the function returns -1 cast to type time_t. From tim.one@home.com Thu Jul 19 00:47:23 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 18 Jul 2001 19:47:23 -0400 Subject: [Python-Dev] Python 2.1.1 & Mac/ In-Reply-To: <20010718162439.E2054@xs4all.nl> Message-ID: [Thomas Wouters] > When I updated the 2.1.1 tree this morning, I noticed it checked > out the entire Mac/ subtree... As it wasn't part of 2.1, I don't > think it should be part of 2.1.1, and I don't remember seeing any > adds for it (at least not with the release21-maint tag.) Jack/Just, > did either of you add it explicitly ? Ever get an answer? Looks like all of Jack's checkins today were made on the release21-maint branch (and not the trunk) too.
From tim.one@home.com Thu Jul 19 02:28:54 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 18 Jul 2001 21:28:54 -0400 Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1 Message-ID: A very recent patch to webbrowser.py broke this module on Windows; the patch also appears in the 2.1.1 maintenance branch.

C:\Code\2.1.1\dist\src\PCbuild>python
Python 2.1.1c1 (#19, Jul 13 2001, 00:25:06) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import webbrowser
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\CODE\2.1.1\DIST\SRC\lib\webbrowser.py", line 312, in ?
    if _iscommand(cmd.lower()):
NameError: name '_iscommand' is not defined
>>>

This also causes test___all__.py to fail on Windows, in 2.1.1 and CVS. Note that we intend to build 2.1.1 final tomorrow (Thursday) night, so please fix it or rip it out ASAP. From akuchlin@mems-exchange.org Thu Jul 19 02:55:46 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 18 Jul 2001 21:55:46 -0400 Subject: [Python-Dev] 2.2 Unicode questions Message-ID: <20010718215546.A16539@ludwig.cnri.reston.va.us> I've written some text on Unicode for the 2.2 article, but it's doubtful I actually understand what's going on. Can people who actually understand where Unicode has been please take a look at the following? First, a short one, Mark Hammond's patch for supporting MBCS on Windows. I trust everyone can handle a little bit of TeX markup? % XXX is this explanation correct? \item When presented with a Unicode filename on Windows, Python will now correctly convert it to a string using the MBCS encoding. Filenames on Windows are a case where Python's choice of ASCII as the default encoding turns out to be an annoyance.
This patch also adds \samp{et} as a format sequence to \cfunction{PyArg_ParseTuple}; \samp{et} takes both a parameter and an encoding name, and converts it to the given encoding if the parameter turns out to be a Unicode string, or leaves it alone if it's an 8-bit string, assuming it to already be in the desired encoding. (This differs from the \samp{es} format character, which assumes that 8-bit strings are in Python's default ASCII encoding and converts them to the specified new encoding.) (Contributed by Mark Hammond with assistance from Marc-Andr\'e Lemburg.) Second, the --enable-unicode changes: %====================================================================== \section{Unicode Changes} Python's Unicode support has been enhanced a bit in 2.2. Unicode strings are usually stored as UCS-2, as 16-bit unsigned integers. Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned integers, as its internal encoding by supplying \longprogramopt{enable-unicode=ucs4} to the configure script. When built to use UCS-4, in theory Python could handle Unicode characters from U-00000000 to U-7FFFFFFF. Being able to use UCS-4 internally is a necessary step to do that, but it's not the only step, and in Python 2.2alpha1 the work isn't complete yet. For example, the \function{unichr()} function still only accepts values from 0 to 65535, and there's no \code{\e U} notation for embedding characters greater than 65535 in a Unicode string literal. All this is the province of the still-unimplemented PEP 261, ``Support for `wide' Unicode characters''; consult it for further details, and please offer comments and suggestions on the proposal it describes. % ... section on decode() deleted; on firmer ground there... \method{encode()} and \method{decode()} were implemented by Marc-Andr\'e Lemburg. The changes to support using UCS-4 internally were implemented by Fredrik Lundh and Martin von L\"owis. 
\begin{seealso} \seepep{261}{Support for `wide' Unicode characters}{PEP written by Paul Prescod. Not yet accepted or fully implemented.} \end{seealso} Corrections? Thanks in advance... --amk From fdrake@acm.org Thu Jul 19 04:49:06 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 18 Jul 2001 23:49:06 -0400 (EDT) Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1 In-Reply-To: References: Message-ID: <15190.22706.479809.871806@cj42289-a.reston1.va.home.com> Tim Peters writes: > A very recent patch to webbrowser.py broke this module on Windows; the patch > also appears in the 2.1.1 maintenance branch. Please try again; I think this is fixed in the patch I just checked in, but I don't have a convenient Windows box to try it on right now. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From skip@pobox.com (Skip Montanaro) Thu Jul 19 04:57:22 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 18 Jul 2001 22:57:22 -0500 Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1 In-Reply-To: References: Message-ID: <15190.23202.463412.71847@beluga.mojam.com> Tim> A very recent patch to webbrowser.py broke this module on Windows; Tim> the patch also appears in the 2.1.1 maintenance branch. Ah shit. This whole branching thing has me very confused, so I don't dare check anything in. To get things to work, I think all you need to do at the end of webbrowser.py is replace the for loop with

try:
    _iscommand
    for cmd in _tryorder:
        if not _browsers.has_key(cmd.lower()):
            if _iscommand(cmd.lower()):
                register(cmd.lower(), None,
                         GenericBrowser("%s %%s" % cmd.lower()))
except NameError:
    pass

My version suddenly looks a hell of a lot different than what I checked in earlier today. I suspect someone may have backed stuff out and gone too far back in time. Skip From fdrake@acm.org Thu Jul 19 04:56:43 2001 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 18 Jul 2001 23:56:43 -0400 (EDT) Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1 In-Reply-To: <15190.23202.463412.71847@beluga.mojam.com> References: <15190.23202.463412.71847@beluga.mojam.com> Message-ID: <15190.23163.302172.511620@cj42289-a.reston1.va.home.com> Skip Montanaro writes: > My version suddenly looks a hell of a lot different than what I checked in > earlier today. I suspect someone may have backed stuff out and gone too far > back in time. No; this was strictly forward motion, at least in my book. The patch you submitted was *not* reverted. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From tim.one@home.com Thu Jul 19 05:30:06 2001 From: tim.one@home.com (Tim Peters) Date: Thu, 19 Jul 2001 00:30:06 -0400 Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1 In-Reply-To: <15190.23163.302172.511620@cj42289-a.reston1.va.home.com> Message-ID: Fred fixed webbrowser.py on Windows, in both the trunk and 2.1.1 (thank you, Fred!). [Skip] > This whole branching thing has me very confused, so I don't dare > check anything in. ... Feel free to check things into the trunk. This close to the release, though, I strongly advise checking anything into the maintenance branch, unless you're Thomas Wouters, or one of the PythonLabs guys (Guido is able to make us stay up until Friday to fix anything we screw up <0.9 wink>). From thomas@xs4all.net Thu Jul 19 08:30:01 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 19 Jul 2001 09:30:01 +0200 Subject: [Python-Dev] Python 2.1.1 & Mac/ In-Reply-To: References: Message-ID: <20010719093001.G2054@xs4all.nl> On Wed, Jul 18, 2001 at 07:47:23PM -0400, Tim Peters wrote: > [Thomas Wouters] > > When I updated the 2.1.1 tree this morning, I noticed it checked > > out the entire Mac/ subtree... As it wasn't part of 2.1, I don't > > think it should be part of 2.1.1, and I don't remember seeing any > > adds for it (at least not with the release21-maint tag.)
Jack/Just, > > did either of you add it explicitly ? > Ever get an answer? Looks like all of Jack's checkins today were made on > the release21-maint branch (and not the trunk) too. Yeah, the issue was resolved. Jack added it to make a MacPython 2.1.1, hadn't realized it would make it trickier for the rest of us, and apologized. Guido (and you) agreed to remember to keep the Mac/ subdirectory out of the release 'by hand'. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From sjoerd.mullender@oratrix.com Thu Jul 19 09:57:47 2001 From: sjoerd.mullender@oratrix.com (Sjoerd Mullender) Date: Thu, 19 Jul 2001 10:57:47 +0200 Subject: [Python-Dev] Re: re with Unicode broken? In-Reply-To: Your message of Wed, 18 Jul 2001 20:27:52 +0200. <200107181827.UAA00284@pandora.informatik.hu-berlin.de> References: <200107181827.UAA00284@pandora.informatik.hu-berlin.de> Message-ID: <20010719085747.CC051301CF7@bireme.oratrix.nl> Yes, I was using a big-endian system (SGI), and yes, the patch worked. On Wed, Jul 18 2001 Martin von Loewis wrote: > > The expression which now fails to match is: > > Did you, by any chance, use a big-endian system for that? If so, could > you please try the patch > > http://sourceforge.net/tracker/?func=detail&aid=442512&group_id=5470&atid=305470 > > With that patch, your example code matches fine on my SPARC box. > > Regards, > Martin > -- Sjoerd Mullender From thomas@xs4all.net Thu Jul 19 10:18:06 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 19 Jul 2001 11:18:06 +0200 Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1 In-Reply-To: Message-ID: <20010719111806.H2054@xs4all.nl> On Thu, Jul 19, 2001 at 12:30:06AM -0400, Tim Peters wrote: > Feel free to check things into the trunk.
This close to the release, > though, I strongly advise checking anything into the maintenance branch, > unless you're Thomas Wouters, or one of the PythonLabs guys (Guido is able > to make us stay up until Friday to fix anything we screw up <0.9 wink>). And there's another reason not to check things into the maint branch unless you know I won't (which is true only for Jack and Just :): I have a hard enough time keeping track of all the checkins and the stuff people *want* me to check in, without people actually checking things in themselves ;P No harm done, this time, but it's definitely something to remember for the next branch :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal@lemburg.com Thu Jul 19 10:03:30 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 19 Jul 2001 11:03:30 +0200 Subject: [Python-Dev] Please have a look at proposed doc changes for time epoch References: Message-ID: <3B56A262.C461CB@lemburg.com> Tim Peters wrote: > > [Skip Montanaro about deficiencies in the time module] Why don't you use mxDateTime ? It provides a platform independent layer on top of all the C lib confusion underneath. Also, the representable time range is -5851455-01-01 00:00:00.00 - 5867440-12-31 00:00:00.00 ... should cover most people's needs ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Thu Jul 19 12:04:02 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Thu, 19 Jul 2001 13:04:02 +0200 Subject: [Python-Dev] 2.2 Unicode questions References: <20010718215546.A16539@ludwig.cnri.reston.va.us> Message-ID: <3B56BEA2.3472A44F@lemburg.com> After looking at the web-page I found: """ Since their introduction, Unicode strings have supported an encode() method to convert the string to a selected encoding such as UTF-8 or Latin-1. A symmetric decode([encoding]) method has been added to both 8-bit and Unicode strings in 2.2, which assumes that the string is in the specified encoding and decodes it. This means that encode() and decode() can be called on both types of strings, and can be used for tasks not directly related to Unicode. """ I did want to add unicode_string.decode(), but there was unexpected opposition to this small addition, so I decided to postpone the change. As a result, things are not as symmetric as they could be in 2.2. I hope that Walter Dörwald finishes the codec callback error handling patch before 2.2a2... it would make a great difference to the XML crowd. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Thu Jul 19 13:10:08 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 19 Jul 2001 08:10:08 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: Your message of "Wed, 18 Jul 2001 21:55:46 EDT." <20010718215546.A16539@ludwig.cnri.reston.va.us> References: <20010718215546.A16539@ludwig.cnri.reston.va.us> Message-ID: <200107191210.IAA07020@cj20424-a.reston1.va.home.com> > First, a short one, Mark Hammond's patch for supporting MBCS on > Windows. I trust everyone can handle a little bit of TeX markup? > > % XXX is this explanation correct?
> \item When presented with a Unicode filename on Windows, Python will > now correctly convert it to a string using the MBCS encoding. > Filenames on Windows are a case where Python's choice of ASCII as > the default encoding turns out to be an annoyance. > > This patch also adds \samp{et} as a format sequence to > \cfunction{PyArg_ParseTuple}; \samp{et} takes both a parameter and > an encoding name, and converts it to the given encoding if the > parameter turns out to be a Unicode string, or leaves it alone if > it's an 8-bit string, assuming it to already be in the desired > encoding. (This differs from the \samp{es} format character, which > assumes that 8-bit strings are in Python's default ASCII encoding > and converts them to the specified new encoding.) > > (Contributed by Mark Hammond with assistance from Marc-Andr\'e > Lemburg.) I learned something here, so I hope this is correct. :-) > Second, the --enable-unicode changes: > > %====================================================================== > \section{Unicode Changes} > > Python's Unicode support has been enhanced a bit in 2.2. Unicode > strings are usually stored as UCS-2, as 16-bit unsigned integers. > Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned > integers, as its internal encoding by supplying > \longprogramopt{enable-unicode=ucs4} to the configure script. When > built to use UCS-4, in theory Python could handle Unicode characters > from U-00000000 to U-7FFFFFFF. I think the Unicode folks use U+, not U-, and the largest Unicode character is "only" U+10FFFF. (Never mind that the data type can handle larger values.) > Being able to use UCS-4 internally is > a necessary step to do that, but it's not the only step, and in Python > 2.2alpha1 the work isn't complete yet. For example, the > \function{unichr()} function still only accepts values from 0 to > 65535, Untrue: it supports range(0x110000) (in UCS-2 mode this returns a surrogate pair).
Now, maybe that's not what it *should* do... > and there's no \code{\e U} notation for embedding characters > greater than 65535 in a Unicode string literal. Not true either -- correct \U has been part of Python since 2.0. It does the same thing as unichr() described above. > All this is the > province of the still-unimplemented PEP 261, ``Support for `wide' > Unicode characters''; consult it for further details, and please offer > comments and suggestions on the proposal it describes. > > % ... section on decode() deleted; on firmer ground there... > > \method{encode()} and \method{decode()} were implemented by > Marc-Andr\'e Lemburg. The changes to support using UCS-4 internally > were implemented by Fredrik Lundh and Martin von L\"owis. > > \begin{seealso} > > \seepep{261}{Support for `wide' Unicode characters}{PEP written by > Paul Prescod. Not yet accepted or fully implemented.} > > \end{seealso} > > Corrections? Thanks in advance... If I were you, I would make sure that Marc-Andre and Martin agree with me before adopting my comments above... And thank *you* for doing this very useful write-up again! (I'm doing my part by writing up the types/class unification thing -- now mostly complete at http://www.python.org/2.2/descrintro.html.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Thu Jul 19 13:29:25 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 19 Jul 2001 08:29:25 -0400 Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1 In-Reply-To: Your message of "Thu, 19 Jul 2001 00:30:06 EDT." References: Message-ID: <200107191229.IAA07241@cj20424-a.reston1.va.home.com> [Tim] > Feel free to check things into the trunk. This close to the release, > though, I strongly advise checking anything into the maintenance branch, ^AGAINST! > unless you're Thomas Wouters, or one of the PythonLabs guys (Guido is able > to make us stay up until Friday to fix anything we screw up <0.9 wink>). 
--Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Jul 19 14:05:55 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 19 Jul 2001 15:05:55 +0200 Subject: [Python-Dev] 2.2 Unicode questions References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com> Message-ID: <3B56DB33.71C9161B@lemburg.com> Guido van Rossum wrote: > > > First, a short one, Mark Hammond's patch for supporting MBCS on > > Windows. I trust everyone can handle a little bit of TeX markup? > > > > % XXX is this explanation correct? > > \item When presented with a Unicode filename on Windows, Python will > > now correctly convert it to a string using the MBCS encoding. > > Filenames on Windows are a case where Python's choice of ASCII as > > the default encoding turns out to be an annoyance. > > > > This patch also adds \samp{et} as a format sequence to > > \cfunction{PyArg_ParseTuple}; \samp{et} takes both a parameter and > > an encoding name, and converts it to the given encoding if the > > parameter turns out to be a Unicode string, or leaves it alone if > > it's an 8-bit string, assuming it to already be in the desired > > encoding. (This differs from the \samp{es} format character, which > > assumes that 8-bit strings are in Python's default ASCII encoding > > and converts them to the specified new encoding.) > > > > (Contributed by Mark Hammond with assistance from Marc-Andr\'e > > Lemburg.) > > I learned something here, so I hope this is correct. :-) The last part is... the rest is for Mark to comment on. > > Second, the --enable-unicode changes: > > > > %====================================================================== > > \section{Unicode Changes} > > > > Python's Unicode support has been enhanced a bit in 2.2. Unicode > > strings are usually stored as UCS-2, as 16-bit unsigned integers. 
> > Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned > > integers, as its internal encoding by supplying > > \longprogramopt{enable-unicode=ucs4} to the configure script. When > > built to use UCS-4, in theory Python could handle Unicode characters > > from U-00000000 to U-7FFFFFFF. > > I think the Unicode folks use U+, not U-, True. > and the largest Unicode > character is "only" U+10FFFF. (Never mind that the data type can > handle larger values.) I wouldn't count on that... (note that Andrew wrote "could" ;-) > > Being able to use UCS-4 internally is > > a necessary step to do that, but it's not the only step, and in Python > > 2.2alpha1 the work isn't complete yet. For example, the > > \function{unichr()} function still only accepts values from 0 to > > 65535, > > Untrue: it supports range(0x110000) (in UCS-2 mode this returns a > surrogate pair). Now, maybe that's not what it *should* do... It should definitely not, unless you want to break code which assumes that chr() and unichr() always return a single byte/code unit ! This was part of the UCS-4 checkins which I hadn't had time yet to review. Should I remove the surrogate part for narrow builds ? > > > and there's no \code{\e U} notation for embedding characters > > > greater than 65535 in a Unicode string literal. > > > > Not true either -- correct \U has been part of Python since 2.0. It > > does the same thing as unichr() described above. > > Right. Note that in this case, the handling of surrogates is needed to make the unicode-escape encoding roundtrip safe.
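The surrogate handling being discussed is compact enough to sketch. On a narrow (UCS-2/UTF-16) build a code point above U+FFFF occupies two code units, and unichr() and the \U escape must agree on that mapping for the unicode-escape codec to round-trip; the helper names below are illustrative, not anything from the Python source:

```python
def to_surrogate_pair(cp):
    # Split a supplementary code point into a high/low surrogate pair.
    assert 0x10000 <= cp <= 0x10FFFF
    cp -= 0x10000
    return 0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)

def from_surrogate_pair(hi, lo):
    # Recombine a surrogate pair into the original code point.
    assert 0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)

hi, lo = to_surrogate_pair(0x10000)
print(hex(hi), hex(lo))  # 0xd800 0xdc00
assert from_surrogate_pair(hi, lo) == 0x10000
```

Round-tripping works precisely because the two helpers are exact inverses over the supplementary range.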
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From akuchlin@mems-exchange.org Thu Jul 19 14:52:20 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Thu, 19 Jul 2001 09:52:20 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <200107191210.IAA07020@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Jul 19, 2001 at 08:10:08AM -0400 References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com> Message-ID: <20010719095220.A7282@ute.cnri.reston.va.us> On Thu, Jul 19, 2001 at 08:10:08AM -0400, Guido van Rossum wrote: >Untrue: it supports range(0x110000) (in UCS-2 mode this returns a >surrogate pair). Now, maybe that's not what it *should* do... I formed the impression that all of the UCS-4 and surrogate work was for the goal of supporting ISO 10646 (or whatever the number is -- you know, the 31-bit character set), so everything is written with that assumption. Presumably that's wrong. Is ISO 10646 on the roadmap at this point, or is it completely irrelevant? Your other corrections will get applied; thanks! --amk From skip@pobox.com (Skip Montanaro) Thu Jul 19 15:06:03 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 19 Jul 2001 09:06:03 -0500 Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1 In-Reply-To: References: <15190.23163.302172.511620@cj42289-a.reston1.va.home.com> Message-ID: <15190.59723.520322.609204@beluga.mojam.com> Tim> Feel free to check things into the trunk. This close to the Tim> release, though, I strongly advise checking anything into the Tim> maintenance branch, unless you're Thomas Wouters, or one of the Tim> PythonLabs guys (Guido is able to make us stay up until Friday to Tim> fix anything we screw up <0.9 wink>). 
I only checked webbrowser.py into the maintenance branch 'cuz Fred said to. Would someone please flog him for me? ;-) Skip From guido@digicool.com Thu Jul 19 15:09:33 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 19 Jul 2001 10:09:33 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: Your message of "Thu, 19 Jul 2001 15:05:55 +0200." <3B56DB33.71C9161B@lemburg.com> References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com> <3B56DB33.71C9161B@lemburg.com> Message-ID: <200107191409.KAA07785@cj20424-a.reston1.va.home.com> > > Untrue: it supports range(0x110000) (in UCS-2 mode this returns a > > surrogate pair). Now, maybe that's not what it *should* do... > > It should definitely not, unless you want to break code which assumes > that chr() and unichr() always return a single byte/code unit ! Reasonable people can disagree about this. > This was part of the UCS-4 checkins which hadn't had time yet to > review. Should I remove the surrogate part for narrow builds ? Well, this snuck into the 2.2a1, so hopefully we'll get some comments ("love it" / "hate it") from the field to guide our decision. > > > and there's no \code{\e U} notation for embedding characters > > > greater than 65535 in a Unicode string literal. > > > > Not true either -- correct \U has been part of Python since 2.0. It > > does the same thing as unichr() described above. > > Right. > > Note that in this case, the handling of surrogates is needed > to make the unicode-escape encoding roundtrip safe. I don't understand what this means. Can you give an example? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Thu Jul 19 15:14:15 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 19 Jul 2001 10:14:15 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src Makefile.pre.in,1.35.2.1,1.35.2.2 In-Reply-To: Your message of "Thu, 19 Jul 2001 06:21:08 PDT." 
References: Message-ID: <200107191414.KAA07830@cj20424-a.reston1.va.home.com> > Revert the previous two changes, unsetting PYTHONHOME breaks the build > procedure on some platforms. Better safe than sorry! [...]

> *** Makefile.pre.in 2001/07/19 09:28:24 1.35.2.1
> --- Makefile.pre.in 2001/07/19 13:21:05 1.35.2.2
> ***************
> *** 283,288 ****
> # Build the shared modules
> sharedmods: $(PYTHON)
> ! PYTHONPATH= PYTHONHOME= PYTHONSTARTUP= \
> ! ./$(PYTHON) $(srcdir)/setup.py build
>
> # buildno should really depend on something like LIBRARY_SRC
> --- 283,287 ----
> # Build the shared modules
> sharedmods: $(PYTHON)
> ! PYTHONPATH= ./$(PYTHON) $(srcdir)/setup.py build
>
> # buildno should really depend on something like LIBRARY_SRC

It suddenly occurred to me that in the future (like 2.2) perhaps we ought to have a command line option that means "ignore all $PYTHON... environment variables". --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Thu Jul 19 15:14:05 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 19 Jul 2001 10:14:05 -0400 (EDT) Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1 In-Reply-To: <15190.59723.520322.609204@beluga.mojam.com> References: <15190.23163.302172.511620@cj42289-a.reston1.va.home.com> <15190.59723.520322.609204@beluga.mojam.com> Message-ID: <15190.60205.105809.805536@cj42289-a.reston1.va.home.com> Skip Montanaro writes: > I only checked webbrowser.py into the maintenance branch 'cuz Fred said to. > Would someone please flog him for me? > > ;-) Hey, I know that smile... it's the one you use when you're serious but don't want it too obvious. Now I'll have to cower under my desk in fear of the arrival of the rest of the PythonLabs crew, 'cuz I know they'll do as you ask! -Fred -- Fred L. Drake, Jr.
PythonLabs at Digital Creations From simon@netthink.co.uk Thu Jul 19 15:15:49 2001 From: simon@netthink.co.uk (Simon Cozens) Date: Thu, 19 Jul 2001 10:15:49 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <200107191409.KAA07785@cj20424-a.reston1.va.home.com> Message-ID: <20010719101549.B31796@netthink.co.uk> On Thu, Jul 19, 2001 at 10:09:33AM -0400, Guido van Rossum wrote: > > > Untrue: it supports range(0x110000) (in UCS-2 mode this returns a > > > surrogate pair). Now, maybe that's not what it *should* do... > > > > It should definitely not, unless you want to break code which assumes > > that chr() and unichr() always return a single byte/code unit ! > > Reasonable people can disagree about this. It certainly should not, if by UCS-2 you actually mean UCS-2. UCS-2 can't access characters outside the Basic Multilingual Plane, and so shouldn't be using surrogates. If by UCS-2 you actually mean UTF-16, then using surrogates is the right approach. :) Simon From guido@digicool.com Thu Jul 19 15:18:15 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 19 Jul 2001 10:18:15 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: Your message of "Thu, 19 Jul 2001 09:52:20 EDT." <20010719095220.A7282@ute.cnri.reston.va.us> References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com> <20010719095220.A7282@ute.cnri.reston.va.us> Message-ID: <200107191418.KAA07872@cj20424-a.reston1.va.home.com> > I formed the impression that all of the UCS-4 and surrogate work was > for the goal of supporting ISO 10646 (or whatever the number is -- you > know, the 31-bit character set), so everything is written with that > assumption. Presumably that's wrong. Is ISO 10646 on the roadmap at > this point, or is it completely irrelevant? The impression I got from the discussion around this was that ISO 10646 now *also* promises to limit itself to 0x110000 characters forever. MvL or MAL can corroborate.
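For what it's worth, the 0x110000 figure is exactly what UTF-16 can address, so the cap is not arbitrary; a quick check of the arithmetic:

```python
# UTF-16 addresses the 2**16 BMP code points directly, plus one code
# point for every high/low surrogate combination -- which is where the
# 0x110000 (U+0000..U+10FFFF) ceiling comes from.
bmp = 0x10000
high_surrogates = 0xDC00 - 0xD800   # 1024 high surrogates
low_surrogates = 0xE000 - 0xDC00    # 1024 low surrogates
total = bmp + high_surrogates * low_surrogates
print(hex(total))  # 0x110000
```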
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com (Skip Montanaro) Thu Jul 19 15:19:47 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 19 Jul 2001 09:19:47 -0500 Subject: [Python-Dev] Please have a look at proposed doc changes for time epoch In-Reply-To: <3B56A262.C461CB@lemburg.com> References: <3B56A262.C461CB@lemburg.com> Message-ID: <15190.60547.254115.394049@beluga.mojam.com> mal> Tim Peters wrote: >> >> [Skip Montanaro about deficiencies in the time module] mal> Why don't you use mxDateTime ? It provides a platform independent mal> layer on top of all the C lib confusion underneath. mal> Also, the representable time range is mal> -5851455-01-01 00:00:00.00 - 5867440-12-31 00:00:00.00 mal> ... should cover most people's needs ;-) I think we're getting a bit far removed from the original context here. I'm quite well aware of mx.DateTime and use it in my own code. I was assigned a bug report about the calendar module: http://sourceforge.net/tracker/?func=detail&aid=434143&group_id=5470&atid=105470 The tail end of the traceback is a ValueError generated by time.mktime whose message suggests that it accepts years in the range 00-99 and 1900+. I don't think it's reasonable to try and make time.mktime "work", so I propose that we make the documentation and exception messages more forthcoming about its platform-dependence. Personally, I think adding mx.DateTime to the core wouldn't be a bad idea. Python's date manipulation code is in need of some more cojones. 2.2 is probably too near, but that's ultimately for the PythonLabs folks to decide.
Skip From skip@pobox.com (Skip Montanaro) Thu Jul 19 15:28:15 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 19 Jul 2001 09:28:15 -0500 Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1 In-Reply-To: <15190.60205.105809.805536@cj42289-a.reston1.va.home.com> References: <15190.23163.302172.511620@cj42289-a.reston1.va.home.com> <15190.59723.520322.609204@beluga.mojam.com> <15190.60205.105809.805536@cj42289-a.reston1.va.home.com> Message-ID: <15190.61055.368392.280668@beluga.mojam.com> Fred> Now I'll have to cower under my desk in fear of the arrival of the Fred> rest of the PythonLabs crew, 'cuz I know they'll do as you ask! Worse yet, I bcc'd that message to the PS From guido@digicool.com Thu Jul 19 15:44:18 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 19 Jul 2001 10:44:18 -0400 Subject: [Python-Dev] Please have a look at proposed doc changes for time epoch In-Reply-To: Your message of "Thu, 19 Jul 2001 09:19:47 CDT." <15190.60547.254115.394049@beluga.mojam.com> References: <3B56A262.C461CB@lemburg.com> <15190.60547.254115.394049@beluga.mojam.com> Message-ID: <200107191444.f6JEiIk12637@odiug.digicool.com> > Personally, I think adding mx.DateTime to the core wouldn't be a bad idea. I've heard this endorsement before. It looks a bit too unwieldy to me, but it might be a good starting point for something truly Pythonic. > Python's date manipulation code is in need of some more cojones. 2.2 is > probably too near, but that's ultimately for the PythonLabs folks to decide. No, there's plenty of time to add this to 2.2. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Thu Jul 19 15:46:45 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 19 Jul 2001 10:46:45 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: Your message of "Thu, 19 Jul 2001 10:15:49 EDT." 
<20010719101549.B31796@netthink.co.uk> References: <20010719101549.B31796@netthink.co.uk> Message-ID: <200107191446.f6JEkjc12663@odiug.digicool.com> > If by UCS-2 you actually mean UTF-16, then using surrogates is the > right approach. :) But isn't the whole point of UTF-16 to fool code that believes it's manipulating UCS-2 into a false sense of security? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From simon@netthink.co.uk Thu Jul 19 15:50:27 2001 From: simon@netthink.co.uk (Simon Cozens) Date: Thu, 19 Jul 2001 10:50:27 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <200107191446.f6JEkjc12663@odiug.digicool.com> Message-ID: <20010719105027.A32172@netthink.co.uk> On Thu, Jul 19, 2001 at 10:46:45AM -0400, Guido van Rossum wrote: > But isn't the whole point of UTF-16 to fool code that believes it's > manipulating UCS-2 into a false sense of security? :-) Well, sort of. More like fooling into a true sense of insecurity. :) Anyway, the Standard sez that a conforming UCS-2 application will not use characters in the surrogates area. Future versions of ISO10646 and the Unicode Standard will probably require UTF-16 instead of UCS-2. Simon From guido@digicool.com Thu Jul 19 15:58:23 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 19 Jul 2001 10:58:23 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: Your message of "Thu, 19 Jul 2001 10:50:27 EDT." <20010719105027.A32172@netthink.co.uk> References: <20010719105027.A32172@netthink.co.uk> Message-ID: <200107191458.f6JEwNA12824@odiug.digicool.com> > > But isn't the whole point of UTF-16 to fool code that believes it's > > manipulating UCS-2 into a false sense of security? :-) > > Well, sort of. More like fooling into a true sense of insecurity. :) Same difference. :-) > Anyway, the Standard sez that a conforming UCS-2 application will > not use characters in the surrogates area. 
Future versions of ISO10646 > and the Unicode Standard will probably require UTF-16 instead of UCS-2. So the proper way to code *libraries* that use 16-bit data would be not to commit on the issue: don't generate surrogates on your own account, but also don't actively reject them, instead passing them through transparently. This should conform to both UCS-2 and UTF-16. --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Thu Jul 19 15:57:37 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Thu, 19 Jul 2001 10:57:37 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <20010719101549.B31796@netthink.co.uk>; from simon@netthink.co.uk on Thu, Jul 19, 2001 at 10:15:49AM -0400 References: <200107191409.KAA07785@cj20424-a.reston1.va.home.com> <20010719101549.B31796@netthink.co.uk> Message-ID: <20010719105737.D7282@ute.cnri.reston.va.us> On Thu, Jul 19, 2001 at 10:15:49AM -0400, Simon Cozens wrote: >If by UCS-2 you actually mean UTF-16, then using surrogates is the >right approach. :) If a narrow Python uses UTF-16 (and it does seem to, according to PEP 100), then the configure script's --enable-unicode=ucs2 option should be changed, because it's misleading. Here's another pass: %====================================================================== \section{Unicode Changes} Python's Unicode support has been enhanced a bit in 2.2. Unicode strings are usually stored as UTF-16, as 16-bit unsigned integers. Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned integers, as its internal encoding by supplying \longprogramopt{enable-unicode=ucs4} to the configure script. When built to use UCS-4 (a ``wide Python''), the interpreter can natively handle Unicode characters from U+000000 to U+110000. The range of legal values for the \function{unichr()} function has been expanded; it used to only accept values up to 65535, but in 2.2 will accept values from 0 to 0x110000. 
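The surrogate arithmetic behind the narrow-build behaviour under discussion can be checked by hand. A sketch (modern Python 3 spelling, for runnability; a 2.2 narrow build did this arithmetic internally when unichr() returned a pair):

```python
import struct

# UTF-16 surrogate-pair arithmetic for a code point beyond the BMP.
cp = 0x10000                 # first non-BMP code point
v = cp - 0x10000
high = 0xD800 + (v >> 10)    # high (leading) surrogate
low = 0xDC00 + (v & 0x3FF)   # low (trailing) surrogate
assert (high, low) == (0xD800, 0xDC00)

# The utf-16-be codec produces exactly these two 16-bit code units.
units = struct.unpack(">2H", chr(cp).encode("utf-16-be"))
assert units == (high, low)
```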
Using a ``narrow Python'', an interpreter compiled to use UTF-16, values greater than 65535 will result in \function{unichr()} returning a string of length 2: \begin{verbatim} >>> s = unichr(65536) >>> s u'\ud800\udc00' >>> len(s) 2 \end{verbatim} This possibly-confusing behaviour, breaking the intuitive invariant that \function{chr()} and \function{unichr()} always return strings of length 1, may be changed later in 2.2, depending on public reaction. All this is the province of the still-unimplemented PEP 261, ``Support for `wide' Unicode characters''; consult it for further details, and please offer comments and suggestions on the proposal it describes. --amk From loewis@informatik.hu-berlin.de Thu Jul 19 16:37:42 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Thu, 19 Jul 2001 17:37:42 +0200 (MEST) Subject: [Python-Dev] 2.2 Unicode questions Message-ID: <200107191537.RAA29684@pandora.informatik.hu-berlin.de> > The impression I got from the discussion around this was that ISO > 10646 now *also* promises to limit itself to 0x110000 characters > forever. MvL or MAL can corroborate. It appears that the state is still the one of resolution M38.6, as reported in http://209.109.201.97/unicode/reports/tr19/tr19-7.html # WG2 accepts the proposal in document N2175 towards removing the # provision for Private Use Groups and Planes beyond Plane 16 in # ISO/IEC 10646, to ensure internal consistency in the standard # between UCS-4, UTF-8 and UTF-16 encoding formats, and instructs its # project editor [to] prepare suitable text for processing as a future # Technical Corrigendum or an Amendment to 10646-1:2000." 
The original proposal can be found in http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n2175.htm It appears that the promised amendment is PDAM 1 to ISO 10646-1:2000, in http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n2308.pdf which, in 9.1, reserves planes 11 to FF in group 0, and all other groups, for future use, and removes the private use planes E0 to plane FF of group 0, as well as the private use groups 60-7F. In addition, it adds the note # To ensure continued interoperability between the UTF-16 form and # other coded representations of the UCS, it is intended that no other # characters will ever be allocated to code positions above 0010FFFF. However, this amendment is still in the draft stage, with comments in http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n2355.pdf Since voting in ISO usually takes a while, there may be some more months until ISO 10646 is officially restricted to 17 planes - but it is unlikely that this won't happen. Regards, Martin From mal@lemburg.com Thu Jul 19 18:41:47 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 19 Jul 2001 19:41:47 +0200 Subject: [Python-Dev] 2.2 Unicode questions References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com> <3B56DB33.71C9161B@lemburg.com> <200107191409.KAA07785@cj20424-a.reston1.va.home.com> Message-ID: <3B571BDB.9FB77EF4@lemburg.com> Guido van Rossum wrote: > > > > Untrue: it supports range(0x110000) (in UCS-2 mode this returns a > > > surrogate pair). Now, maybe that's not what it *should* do... > > > > It should definitely not, unless you want to break code which assumes > > that chr() and unichr() always return a single byte/code unit ! > > Reasonable people can disagree about this. > > > This was part of the UCS-4 checkins which hadn't had time yet to > > review. Should I remove the surrogate part for narrow builds ? 
> > Well, this snuck into the 2.2a1, so hopefully we'll get some comments > ("love it" / "hate it") from the field to guide our decision. Waiting for comments from the field :-) > > > > and there's no \code{\e U} notation for embedding characters > > > > greater than 65535 in a Unicode string literal. > > > > > > Not true either -- correct \U has been part of Python since 2.0. It > > > does the same thing as unichr() described above. > > > > Right. > > > > Note that in this case, the handling of surrogates is needed > > to make the unicode-escape encoding roundtrip safe. > > I don't understand what this means. Can you give an example? It means that the roundtrip Unicode -> encoding -> Unicode is a 1-1 mapping for all Unicode code points. Other examples for roundtrip safe encodings are UTF-8 and UTF-16. Looking at the code, I found that the unicode-escape encoder does not convert Unicode surrogates to \UXXXXXXXX escapes. I'll fix that. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Thu Jul 19 19:36:58 2001 From: guido@digicool.com (Guido van Rossum) Date: Thu, 19 Jul 2001 14:36:58 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: Your message of "Thu, 19 Jul 2001 19:41:47 +0200." <3B571BDB.9FB77EF4@lemburg.com> References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com> <3B56DB33.71C9161B@lemburg.com> <200107191409.KAA07785@cj20424-a.reston1.va.home.com> <3B571BDB.9FB77EF4@lemburg.com> Message-ID: <200107191836.f6JIawF16908@odiug.digicool.com> > > > Note that in this case, the handling of surrogates is needed > > > to make the unicode-escape encoding roundtrip safe. > > > > I don't understand what this means. Can you give an example? 
> > It means that the roundtrip Unicode -> encoding -> Unicode is a > 1-1 mapping for all Unicode code points. Other examples for > roundtrip safe encodings are UTF-8 and UTF-16. > > Looking at the code, I found that the unicode-escape encoder > does not convert Unicode surrogates to \UXXXXXXXX escapes. > I'll fix that. Ah. I had missed the fact that this was a roundtrip for a specific encoding, the unicode-escape encoding. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim@digicool.com Thu Jul 19 20:01:04 2001 From: tim@digicool.com (Tim Peters) Date: Thu, 19 Jul 2001 15:01:04 -0400 Subject: [Python-Dev] getaddrinfo.c: warnings on Windows Message-ID: If you're mucking with Windows specifically (as the latest patch here was), and you don't have the MS Windows compiler, please upload a patch to SF instead. "NO WARNINGS" is a rule on Windows. C:\Code\python\dist\src\Modules\getaddrinfo.c(418) : warning C4090: 'function' : different 'const' qualifiers C:\Code\python\dist\src\Modules\getaddrinfo.c(418) : warning C4024: 'inet_pton' : different types for formal and actual parameter 2 C:\Code\python\dist\src\Modules\getaddrinfo.c(420) : warning C4101: 'pfx' : unreferenced local variable C:\Code\python\dist\src\Modules\getaddrinfo.c(495) : warning C4101: 'h_error' : unreferenced local variable C:\Code\python\dist\src\Modules\getnameinfo.c(101) : warning C4101: 'pfx' : unreferenced local variable C:\Code\python\dist\src\Modules\getaddrinfo.c(346) : warning C4761: integral size mismatch in argument; conversion supplied From andymac@bullseye.apana.org.au Thu Jul 19 14:40:55 2001 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Thu, 19 Jul 2001 23:40:55 +1000 (EST) Subject: [Python-Dev] experiments with PYMALLOC (long) Message-ID: This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. 
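The roundtrip property Marc-Andre describes can be shown concretely. A sketch in the modern Python 3 spelling (2.2 applied unicode-escape to unicode objects, but the 1-1 property being discussed is the same):

```python
# "Roundtrip safe" means decoding the encoder's output recovers the exact
# original string for every code point, including a non-BMP character.
s = "caf\u00e9 \U00010000 plain ASCII"
for codec in ("unicode_escape", "utf-8", "utf-16"):
    # encode to bytes, decode back, compare against the original
    assert s.encode(codec).decode(codec) == s
```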
---888574987-20871-995550055=:4658 Content-Type: TEXT/PLAIN; charset=US-ASCII [this post is primarily for informational purposes, although I would welcome serious suggestions on possible options for dealing with either the longexp issue or the PYMALLOC performance issue - AIM] In my port of Python to OS/2 using the EMX suite, I encountered the situation of not being able to pass the longexp test in the test suite. The test is simply: >>>NUMREPS = 65580 >>>eval('[' + '2,' * NUMREPS + ']') With the advent of PYMALLOC in 2.1 I hoped that this issue could be dealt with, however defining WITH_PYMALLOC achieved nothing other than to cause Numeric to fail on import (I am led to believe that this is now fixed in Numeric 20.1). Revisiting my earlier diagnostic results reinforced the fact that the longexp test is really a stress test of the parser. In this test, the parser ends up creating humongous numbers of nodes. Each of these nodes is only 20 bytes (+1 for insurance) for which the EMX malloc() returns a chunk 64 bytes long - and there appears to be a minimum of 13 such nodes + a handful of 2+1 byte allocations occupying 12 bytes each for each element in the list being parsed. Not a happy situation, as it is sufficient to exhaust my dev system's swap space, and OS/2 stops dead. I then thought of doctoring Python to use PYMALLOC for _all_ interpreter memory management (the attached patch is all it took, against 2.1). And with the exception of the socket test, which fails the first time with a "no memory" error but succeeds the second time when the .pycs don't need to be recompiled, the completely PYMALLOC managed interpreter passes the regression test _including_ the longexp test. I was starting to think in terms of releasing the (yet to be) 2.1.1 port configured this way. But then I decided to benchmark the two interpreter configurations using the regression test as the benchmark..... 
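The stress test quoted above is easy to reproduce at a smaller scale; a sketch with a much smaller repeat count (the real test used 65580, enough to exhaust the OS/2 system's swap) shows the same parser workload without the memory pressure:

```python
# Same shape as test_longexp: parse a literal list with many elements,
# forcing the parser to build one node cluster per element.
NUMREPS = 1000  # the real test used 65580
result = eval('[' + '2,' * NUMREPS + ']')
assert len(result) == NUMREPS
assert result == [2] * NUMREPS
```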
On my dev system, the average results (of 3 runs) are: no .pyc w/.pyc std malloc 3m 41s 3m 25s (test_longexp skipped) PYMALLOC 6m 12s 5m 25s (test_socket fails in "no .pyc" case) [the skipped longexp test, run standalone on the PYMALLOC interpreter, takes <5s total, so its not a significant factor in the times] :-( :-( I think the OS/2 port is going to have to continue to risk failure on the longexp test on many systems as such a performance hit is hard to justify. Environment: System= AMD K6/2-300, 64M RAM, DMA IDE drive (pre UDMA33) 40MB preallocated swap space, that can expand to 140MB S/ware= OS/2 v4, FP12 EMX 0.9d fix 03, gcc 2.8.1 compile options "-O2 -fomit-frame-pointer" NDEBUG _not_ defined, so all assert()s still active [PS: please cc any replies to me as I'm not subscribed to this list] -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia ---888574987-20871-995550055=:4658 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="pymalloc_all.patch" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: pymalloc_all.patch Content-Disposition: attachment; filename="pymalloc_all.patch" KioqIEluY2x1ZGVccHltZW0uaC5vcmlnCVNhdCBTZXAgIDIgMDk6Mjk6MjYg MjAwMA0KLS0tIEluY2x1ZGVccHltZW0uaAlTdW4gSnVsIDE1IDE3OjQ0OjM4 IDIwMDENCioqKioqKioqKioqKioqKg0KKioqIDI1LDM2ICoqKioNCi0tLSAy NSw0NyAtLS0tDQogICAgIFNlZSB0aGUgY29tbWVudCBibG9jayBhdCB0aGUg ZW5kIG9mIHRoaXMgZmlsZSBmb3IgdHdvIHNjZW5hcmlvcw0KICAgICBzaG93 aW5nIGhvdyB0byB1c2UgdGhpcyB0byB1c2UgYSBkaWZmZXJlbnQgYWxsb2Nh dG9yLiAqLw0KICANCisgI2lmZGVmCVBZTUFMTE9DX0FMTA0KKyAjaWZuZGVm IFB5Q29yZV9NQUxMT0NfRlVOQw0KKyAjdW5kZWYgUHlDb3JlX1JFQUxMT0Nf RlVOQw0KKyAjdW5kZWYgUHlDb3JlX0ZSRUVfRlVOQw0KKyAjZGVmaW5lIFB5 Q29yZV9NQUxMT0NfRlVOQyAgICAgIF9QeUNvcmVfT2JqZWN0TWFsbG9jDQor ICNkZWZpbmUgUHlDb3JlX1JFQUxMT0NfRlVOQyAgICAgX1B5Q29yZV9PYmpl Y3RSZWFsbG9jDQorICNkZWZpbmUgUHlDb3JlX0ZSRUVfRlVOQyAgICAgICAg 
X1B5Q29yZV9PYmplY3RGcmVlDQorICNkZWZpbmUgTkVFRF9UT19ERUNMQVJF X01BTExPQ19BTkRfRlJJRU5ECTENCisgI2VuZGlmDQorICNlbHNlDQogICNp Zm5kZWYgUHlDb3JlX01BTExPQ19GVU5DDQogICN1bmRlZiBQeUNvcmVfUkVB TExPQ19GVU5DDQogICN1bmRlZiBQeUNvcmVfRlJFRV9GVU5DDQogICNkZWZp bmUgUHlDb3JlX01BTExPQ19GVU5DICAgICAgbWFsbG9jDQogICNkZWZpbmUg UHlDb3JlX1JFQUxMT0NfRlVOQyAgICAgcmVhbGxvYw0KICAjZGVmaW5lIFB5 Q29yZV9GUkVFX0ZVTkMgICAgICAgIGZyZWUNCisgI2VuZGlmDQogICNlbmRp Zg0KICANCiAgI2lmbmRlZiBQeUNvcmVfTUFMTE9DX1BST1RPDQoqKiogT2Jq ZWN0c1xvYm1hbGxvYy5jLm9yaWcJTW9uIE1hciAxMiAwNTozNjoxMiAyMDAx DQotLS0gT2JqZWN0c1xvYm1hbGxvYy5jCVRodSBKdWwgMTkgMjM6MjQ6MjQg MjAwMQ0KKioqKioqKioqKioqKioqDQoqKiogNzMsODIgKioqKg0KLS0tIDcz LDg5IC0tLS0NCiAgICogYWxsb2NhdG9yIHdoaWNoIGV4cG9ydHMgZnVuY3Rp b25zIHdpdGggbmFtZXMgX290aGVyXyB0aGFuIHRoZSBzdGFuZGFyZA0KICAg KiBtYWxsb2MsIGNhbGxvYywgcmVhbGxvYywgZnJlZS4NCiAgICovDQorICNp ZmRlZglQWU1BTExPQ19BTEwNCisgI2RlZmluZSBfU1lTVEVNX01BTExPQwkJ bWFsbG9jDQorICNkZWZpbmUgX1NZU1RFTV9DQUxMT0MJCS8qIHVudXNlZCAq Lw0KKyAjZGVmaW5lIF9TWVNURU1fUkVBTExPQwkJcmVhbGxvYw0KKyAjZGVm aW5lIF9TWVNURU1fRlJFRQkJZnJlZQ0KKyAjZWxzZQ0KICAjZGVmaW5lIF9T WVNURU1fTUFMTE9DCQlQeUNvcmVfTUFMTE9DX0ZVTkMNCiAgI2RlZmluZSBf U1lTVEVNX0NBTExPQwkJLyogdW51c2VkICovDQogICNkZWZpbmUgX1NZU1RF TV9SRUFMTE9DCQlQeUNvcmVfUkVBTExPQ19GVU5DDQogICNkZWZpbmUgX1NZ U1RFTV9GUkVFCQlQeUNvcmVfRlJFRV9GVU5DDQorICNlbmRpZg0KICANCiAg LyoNCiAgICogSWYgbWFsbG9jIGhvb2tzIGFyZSBuZWVkZWQsIG5hbWVzIG9m IHRoZSBob29rcycgc2V0ICYgZmV0Y2gNCg== ---888574987-20871-995550055=:4658-- From klm@digicool.com Thu Jul 19 23:43:45 2001 From: klm@digicool.com (Ken Manheimer) Date: Thu, 19 Jul 2001 18:43:45 -0400 (EDT) Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <20010719105737.D7282@ute.cnri.reston.va.us> Message-ID: On Thu, 19 Jul 2001, Andrew Kuchling wrote: > On Thu, Jul 19, 2001 at 10:15:49AM -0400, Simon Cozens wrote: > >If by UCS-2 you actually mean UTF-16, then using surrogates is the > >right approach. 
:) > > If a narrow Python uses UTF-16 (and it does seem to, > according to PEP 100), then the configure script's > --enable-unicode=ucs2 option should be changed, because it's > misleading. (-: I am becoming convinced that Unicode is a multi-national plot to take over the minds of our most gifted (and/or most obsessive) programmers, in pursuit of an elusive, unresolvable, and ultimately, undefinable goal. To what point? To divert those of merit, and enable the emergence of the mediocritocracy - a modest plot to elevate the overshadowed to positions of remotely impressive power. Now that i'm nearly convinced of this conspiracy, perhaps i should be nearly committed... Ken:-) From tim.one@home.com Fri Jul 20 00:05:21 2001 From: tim.one@home.com (Tim Peters) Date: Thu, 19 Jul 2001 19:05:21 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: Message-ID: [Ken Manheimer] > (-: I am becoming convinced that Unicode is a multi-national plot to > take over the minds of our most gifted (and/or most obsessive) > programmers, in pursuit of an elusive, unresolvable, and ultimately, > undefinable goal. > > To what point? I'm afraid the universal adoption of the IEEE-754 floating-point standard took the committee by surprise, and they had to start some other unboundedly detailed yet inherently futile project lest they find themselves in need of real jobs. stare-at-a-zero-width-non-breaking-space-hard-enough-and-you'll- find-kahan-staring-right-back-ly y'rs - tim From barry@digicool.com Fri Jul 20 00:36:53 2001 From: barry@digicool.com (Barry A. 
Warsaw) Date: Thu, 19 Jul 2001 19:36:53 -0400 Subject: [Python-Dev] 2.2 Unicode questions References: <20010719105737.D7282@ute.cnri.reston.va.us> Message-ID: <15191.28437.315103.405941@anthem.wooz.org> >>>>> "KM" == Ken Manheimer writes: KM> (-: I am becoming convinced that Unicode is a multi-national KM> plot to take over the minds of our most gifted (and/or most KM> obsessive) programmers, in pursuit of an elusive, KM> unresolvable, and ultimately, undefinable goal. >From Andrew's (hilarious and wonderful) quotes page: http://www.amk.ca/quotations/python-quotes/page-7.html Unicode: everyone wants it, until they get it. Barry Warsaw, 16 May 2000 From DavidA@ActiveState.com Fri Jul 20 01:07:35 2001 From: DavidA@ActiveState.com (David Ascher) Date: Thu, 19 Jul 2001 17:07:35 -0700 Subject: [Python-Dev] 2.2 Unicode questions References: Message-ID: <3B577647.8F95721A@ActiveState.com> Ken Manheimer wrote: > (-: I am becoming convinced that Unicode is a multi-national plot to take > over the minds of our most gifted (and/or most obsessive) programmers, in > pursuit of an elusive, unresolvable, and ultimately, undefinable goal. Amen brother. Unicode is the first technology I have to deal with which makes me hope I die before I really _really_ *really* need to understand it fully. --david From alex_c@MIT.EDU Fri Jul 20 04:04:14 2001 From: alex_c@MIT.EDU (Alex Coventry) Date: Thu, 19 Jul 2001 23:04:14 -0400 Subject: [Python-Dev] Pointers to python-dev threads pertaining to Patch #441791? Message-ID: <200107200304.XAA15088@opus.mit.edu> Hi, I posted a patch recently, #441791, causing "import foo.bar" to set "sys.modules['foo'].bar = sys.modules['foo.bar']" even if an error is raised during the importing of bar. With this patch, import commands like "import foo.bar; reload(foo.bar)" work in a fashion more consistent with the way "import unpackaged_module; reload(unpackaged_module)" works. 
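The idiom Alex describes has a direct modern spelling (reload() later moved to importlib; json.decoder here is just a stand-in for any package submodule, since the hypothetical foo.bar of the patch doesn't exist). The parent-package binding his patch preserves is exactly what makes the attribute lookup work:

```python
import importlib
import json.decoder   # "import foo.bar" binds the submodule on its package

# After a successful import, the submodule is reachable as an attribute
# of its package, so it can be reloaded in place; reload() reuses and
# re-executes the same module object.
mod = importlib.reload(json.decoder)
assert mod is json.decoder
assert json.loads('{"a": 1}') == {"a": 1}
```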
Thomas Wouters posted a reply saying that this has been discussed on python-dev before. I've searched the archives for the keywords "import.c", "import_submodule" (the function I modify), and "package import" but didn't turn up anything relevant. Could someone point me at a thread which discusses this? This patch has proved very useful to me, as I tend to carry around a lot of data in a long-running python process, and being able to reload submodules of a package has been very useful to me. It'd be nice if the patch got into python itself, so that I can retain my development habits without having to keep an eye on import.c. :) Alex. From akuchlin@mems-exchange.org Fri Jul 20 04:10:26 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Thu, 19 Jul 2001 23:10:26 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <3B577647.8F95721A@ActiveState.com>; from DavidA@activestate.com on Thu, Jul 19, 2001 at 05:07:35PM -0700 References: <3B577647.8F95721A@ActiveState.com> Message-ID: <20010719231026.A500@ute.cnri.reston.va.us> On Thu, Jul 19, 2001 at 05:07:35PM -0700, David Ascher wrote: >Unicode is the first technology I have to deal with which makes me hope >I die before I really _really_ *really* need to understand it fully. Welcome to the quote file (again), David! And Barry, thanks for posting your quote; if you hadn't posted it, I would have. We mustn't forget the third Unicode reference: I never realized it before, but having looked that over I'm certain I'd rather have my eyes burned out by zombies with flaming dung sticks than work on a conscientious Unicode regex engine. -- Tim Peters, 3 Dec 1998 Doesn't anyone have *anything* nice to say about Unicode? 
--amk From simon@netthink.co.uk Fri Jul 20 04:23:33 2001 From: simon@netthink.co.uk (Simon Cozens) Date: Thu, 19 Jul 2001 23:23:33 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <20010719231026.A500@ute.cnri.reston.va.us> Message-ID: <20010719232333.A815@netthink.co.uk> On Thu, Jul 19, 2001 at 11:10:26PM -0400, Andrew Kuchling wrote: > Doesn't anyone have *anything* nice to say about Unicode? Sure: having had to deal with three different Japanese encodings, (at least) two different Japanese character repertoires, one huge chunk of special-casing code and *no* idea what's going on, I'll take the zombies with flaming sticks any day. Thank you. Simon From paulp@ActiveState.com Fri Jul 20 05:22:54 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Thu, 19 Jul 2001 21:22:54 -0700 Subject: [Python-Dev] 2.2 Unicode questions References: Message-ID: <3B57B21E.AB922A9@ActiveState.com> Ken Manheimer wrote: > >... > > (-: I am becoming convinced that Unicode is a multi-national plot to take > over the minds of our most gifted (and/or most obsessive) programmers, in > pursuit of an elusive, unresolvable, and ultimately, undefinable goal. I know that you are half-kidding but if you think that internationalization is hard now you should have seen it before Unicode. Unicode is the *simplification*. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From esr@thyrsus.com Fri Jul 20 05:48:04 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Fri, 20 Jul 2001 00:48:04 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <3B57B21E.AB922A9@ActiveState.com>; from paulp@ActiveState.com on Thu, Jul 19, 2001 at 09:22:54PM -0700 References: <3B57B21E.AB922A9@ActiveState.com> Message-ID: <20010720004804.H6164@thyrsus.com> Paul Prescod : > I know that you are half-kidding but if you think that > internationalization is hard now you should have seen it before Unicode. > Unicode is the *simplification*. Quite. 
If Unicode is a horde of zombies with flaming dung sticks, the hideous intricacies of JIS, Chinese Big-5, Chinese Traditional, KOI-8, et cetera are at least an army of ogres with salt and flensing knives. -- Eric S. Raymond "The best we can hope for concerning the people at large is that they be properly armed." -- Alexander Hamilton, The Federalist Papers at 184-188 From tim.one@home.com Fri Jul 20 06:22:00 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 20 Jul 2001 01:22:00 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <20010719231026.A500@ute.cnri.reston.va.us> Message-ID: [Andrew Kuchling] > Doesn't anyone have *anything* nice to say about Unicode? Europeans do. Unfortunately, the Japanese seem to have little use for it, while the Anglo-Europeans keep repeating how it saves them from the nightmare of dealing with Japanese <0.9 wink>. just-so-long-as-we-don't-have-to-deal-with-the-french-ly y'rs - tim From DavidA@ActiveState.com Fri Jul 20 06:49:52 2001 From: DavidA@ActiveState.com (David Ascher) Date: Thu, 19 Jul 2001 22:49:52 -0700 Subject: [Python-Dev] 2.2 Unicode questions References: Message-ID: <3B57C680.8271C765@ActiveState.com> Tim Peters wrote: > just-so-long-as-we-don't-have-to-deal-with-the-french-ly y'rs - tim I stopped using accents and cedillas in my french writings in 1986 when I was stuck in a foreign land with IBM 3278 terminals on an EBCDIC system. Haven't missed 'em since. My greek graduate student buddy wrote greek in ASCII using a completely made-up transliteration system known only to greek expatriates on the internet. The Newton vs. Graffiti debate has shown for the Nth time that people are more adaptive than computers. 7 bits is enough for anything worth saying. Anything else consists of error-correcting bits. 
=) --david From thomas@xs4all.net Fri Jul 20 10:47:52 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 20 Jul 2001 11:47:52 +0200 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: References: Message-ID: <20010720114752.K2054@xs4all.nl> On Fri, Jul 20, 2001 at 01:22:00AM -0400, Tim Peters wrote: > [Andrew Kuchling] > > Doesn't anyone have *anything* nice to say about Unicode? > > Europeans do. Nonsense (or should I say, 'hypergeneralization' ? :) Most Europeans can deal with ISO8859-1 just fine... I can honestly say I don't care a burning dung stick about unicode ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tim.one@home.com Fri Jul 20 10:53:57 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 20 Jul 2001 05:53:57 -0400 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <20010720114752.K2054@xs4all.nl> Message-ID: [Andrew Kuchling] >>> Doesn't anyone have *anything* nice to say about Unicode? [Tim, severely cut] >> Europeans do. [Thomas Wouters, less severely cut] > Nonsense (or should I say, 'hypergeneralization' ? :) If Paul Prescod and Simon Cozens aren't Europeans, then I suppose I'm not either. QED. grab-some-coffee-and-rediscover-your-sense-of-humor-ly y'rs - tim From moshez@zadka.site.co.il Fri Jul 20 11:06:10 2001 From: moshez@zadka.site.co.il (Moshe Zadka) Date: Fri, 20 Jul 2001 13:06:10 +0300 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <20010720114752.K2054@xs4all.nl> References: <20010720114752.K2054@xs4all.nl>, Message-ID: On Fri, 20 Jul 2001 11:47:52 +0200, Thomas Wouters wrote: > Nonsense (or should I say, 'hypergeneralization' ? :) Most Europeans can > deal with ISO8859-1 just fine... I can honestly say I don't care a burning > dung stick about unicode ;) *West* Europeans! East European languages are in ISO8859-2, so East Europeans have a problem too. 
And, since in the last European Python Meeting I was officially declared to be a European too ;-), I must say that Unicode is a boon as far as Hebrew is concerned. -- gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21 4817 C7FC A636 46D0 1BD6 Insecure (accessible): C5A5 A8FA CA39 AB03 10B8 F116 1713 1BCF 54C4 E1FE From jack@oratrix.nl Fri Jul 20 11:22:40 2001 From: jack@oratrix.nl (Jack Jansen) Date: Fri, 20 Jul 2001 12:22:40 +0200 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: Message by Thomas Wouters , Fri, 20 Jul 2001 11:47:52 +0200 , <20010720114752.K2054@xs4all.nl> Message-ID: <20010720102240.96711303181@snelboot.oratrix.nl> > On Fri, Jul 20, 2001 at 01:22:00AM -0400, Tim Peters wrote: > > [Andrew Kuchling] > > > Doesn't anyone have *anything* nice to say about Unicode? > > > > Europeans do. > > Nonsense (or should I say, 'hypergeneralization' ? :) Most Europeans can > deal with ISO8859-1 just fine... ... until you start supporting software for non-europeans. The various 8-bit macintosh codesets are hell to deal with, especially because you can't really test how things work if you speak a seven-bit-clean language with your computer as well as your loved ones. I assume the same problems apply to the windows codepage blabla. At least unicode gives the whole world a common ground. Or let me rephrase that as "... should give...". -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido@digicool.com Fri Jul 20 14:41:45 2001 From: guido@digicool.com (Guido van Rossum) Date: Fri, 20 Jul 2001 09:41:45 -0400 Subject: [Python-Dev] Pointers to python-dev threads pertaining to Patch #441791? In-Reply-To: Your message of "Thu, 19 Jul 2001 23:04:14 EDT." 
<200107200304.XAA15088@opus.mit.edu> References: <200107200304.XAA15088@opus.mit.edu> Message-ID: <200107201341.JAA09907@cj20424-a.reston1.va.home.com> > Hi, I posted a patch recently, #441791, causing "import foo.bar" to set > "sys.modules['foo'].bar = sys.modules['foo.bar']" even if an error is > raised during the importing of bar. With this patch, import commands > like "import foo.bar; reload(foo.bar)" work in a fashion more consistent > with the way "import unpackaged_module; reload(unpackaged_module)" > works. > > Thomas Wouters posted a reply saying that this has been discussed on > python-dev before. I've searched the archives for the keywords > "import.c", "import_submodule" (the function I modify,) and "package > import" but didn't turn up anything relevant. Could someone point me at > a thread which discusses this? This patch has proved very useful to me, > as I tend to carry around a lot of data in long-running python process, > and being able to reload submodules of a package has been very useful to > me. It'd be nice if the patch got into python itself, so that I can > retain my development habits without having to keep an eye on > import.c. :) I hardly recall such a discussion, and I don't think that much light was shed on the situation. In any case, I agree it would be nice if this was fixed. But I'm too busy to look into this myself -- sorry. Maybe Thomas was thinking of a different issue, where some people want the sys.modules[name] entry to be *removed* when an import fails. I am not for that change, but I haven't recovered the reason (I know I had a good one when I implemented things this way). --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Fri Jul 20 15:21:04 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 20 Jul 2001 10:21:04 -0400 Subject: [Python-Dev] Pointers to python-dev threads pertaining to Patch #441791? 
In-Reply-To: <200107201341.JAA09907@cj20424-a.reston1.va.home.com> References: Your message of "Thu, 19 Jul 2001 23:04:14 EDT." <200107200304.XAA15088@opus.mit.edu> Message-ID: <3B580610.30772.ECBE36E@localhost> [Alex Coventry] > > Hi, I posted a patch recently, #441791, causing "import > > foo.bar" to set "sys.modules['foo'].bar = > > sys.modules['foo.bar']" even if an error is raised during the > > importing of bar. With this patch, import commands like > > "import foo.bar; reload(foo.bar)" work in a fashion more > > consistent with the way "import unpackaged_module; > > reload(unpackaged_module)" works. [Guido] > In any case, I agree it would be nice if this was fixed. Import issues are subtle, but this looks good to me. > Maybe Thomas was thinking of a different issue, where some people > want the sys.modules[name] entry to be *removed* when an import > fails. I am not for that change, but I haven't recovered the > reason (I know I had a good one when I implemented things this > way). Perhaps one could construct a situation with circular imports in which one module ends up with a name (and no error, because the name is being imported) that later turns into an error? There's also the issue of failed relative imports that succeed as absolute imports - you don't want every module in the package hunting around for package.sys. - Gordon From mal@lemburg.com Fri Jul 20 16:08:09 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 20 Jul 2001 17:08:09 +0200 Subject: [Python-Dev] Python 2.2a1 nits Message-ID: <3B584959.8AABCC9E@lemburg.com> Here's a summary of nits I found compiling Python 2.2a1: * configure options: the options should follow a single methodology (either all use --with(out)- or all use --enable-/--disable-) and the defaults should be clearly indicated... 
--without-gcc                    never use gcc
--with-cxx=                      enable C++ support
--with-suffix=.exe               set executable suffix
--with-pydebug                   build with Py_DEBUG defined
--enable-ipv6                    Enable ipv6 (with ipv4) support
--disable-ipv6                   Disable ipv6 support
--with-libs='lib1 ...'           link against additional libs
--with-signal-module             disable/enable signal module
--with-dec-threads               use DEC Alpha/OSF1 thread-safe libraries
--with(out)-threads[=DIRECTORY]  disable/enable thread support
--with(out)-thread[=DIRECTORY]   deprecated; use --with(out)-threads
--with-pth                       use GNU pth threading libraries
--with(out)-cycle-gc             disable/enable garbage collection
--with(out)-pymalloc             disable/enable specialized mallocs
--with-wctype-functions          use wctype.h functions
--with-sgi-dl=DIRECTORY          IRIX 4 dynamic linking
--with-dl-dld=DL_DIR,DLD_DIR     GNU dynamic linking
--with-fpectl                    enable SIGFPE catching
--with-libm=STRING               math library
--with-libc=STRING               C library
--enable-unicode[=ucs2,ucs4]     Enable Unicode strings (default is yes)

I'd suggest going with --with(out)- since this seems to be the most often used one.

* warnings:

In file included from ./Modules/_sre.c:54:
Modules/sre.h:24: warning: `SRE_CODE' redefined
Modules/sre.h:19: warning: this is the location of the previous definition
libpython2.2.a(posixmodule.o): In function `posix_tmpnam':
/home/lemburg/orig/Python-2.2a1/./Modules/posixmodule.c:4262: the use of `tmpnam_r' is dangerous, better use `mkstemp'
libpython2.2.a(posixmodule.o): In function `posix_tempnam':
/home/lemburg/orig/Python-2.2a1/./Modules/posixmodule.c:4217: the use of `tempnam' is dangerous, better use `mkstemp'

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From barry@digicool.com Fri Jul 20 16:13:25 2001
From: barry@digicool.com (Barry A.
Warsaw) Date: Fri, 20 Jul 2001 11:13:25 -0400 Subject: [Python-Dev] 2.2 Unicode questions References: <20010719231026.A500@ute.cnri.reston.va.us> Message-ID: <15192.19093.861241.438673@anthem.wooz.org> >>>>> "TP" == Tim Peters writes: TP> Europeans do. Unfortunately, the Japanese seem to have little TP> use for it, while the Anglo-Europeans keep repeating how it TP> saves them from the nightmare of dealing with Japanese <0.9 TP> wink>. Heck, it's even more regional than that. Us Marylanders have some /serious/ concerns about the characters down in Virginia. -Barry From mal@lemburg.com Fri Jul 20 16:56:17 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 20 Jul 2001 17:56:17 +0200 Subject: [Python-Dev] mail.python.org black listed ?! Message-ID: <3B5854A1.214EFF19@lemburg.com> Since yesterday evening I haven't received any email from mail.python.org. A look in my mail delivery log file showed that sendmail is rejecting mails from mail.python.org (63.102.49.29) with the following message: """ 5.3.0 Mail from 63.102.49.29 rejected - open relay;see http://www.orbs.org """ Checking www.orbs.org I find: """ Due to circumstances beyond our control, the ORBS website is no longer available. """ I've disabled the orbs.org spam filter in sendmail for now, but this could be a problem for others as well... Perhaps someone could find out how to remove mail.python.org from the ORBS black list since this is likely going to affect more people who use sendmail. Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Fri Jul 20 17:03:35 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 20 Jul 2001 18:03:35 +0200 Subject: [Python-Dev] mail.python.org black listed ?! References: <3B5854A1.214EFF19@lemburg.com> Message-ID: <3B585657.AEB8821E@lemburg.com> "M.-A. 
Lemburg" wrote: > > Since yesterday evening I haven't received any email from > mail.python.org. A look in my mail delivery log file showed > that sendmail is rejecting mails from mail.python.org > (63.102.49.29) with the following message: > > """ > 5.3.0 Mail from 63.102.49.29 rejected - open relay;see http://www.orbs.org > """ FYI, starship.python.net [63.102.49.32] has the same problem. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From akuchlin@mems-exchange.org Fri Jul 20 17:10:20 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Fri, 20 Jul 2001 12:10:20 -0400 Subject: [Python-Dev] mail.python.org black listed ?! In-Reply-To: <3B5854A1.214EFF19@lemburg.com>; from mal@lemburg.com on Fri, Jul 20, 2001 at 05:56:17PM +0200 References: <3B5854A1.214EFF19@lemburg.com> Message-ID: <20010720121020.B1470@ute.cnri.reston.va.us> On Fri, Jul 20, 2001 at 05:56:17PM +0200, M.-A. Lemburg wrote: >Since yesterday evening I haven't received any email from >mail.python.org. A look in my mail delivery log file showed >that sendmail is rejecting mails from mail.python.org >(63.102.49.29) with the following message: See http://www.uwsg.iu.edu/hypermail/linux/kernel/0107.1/0929.html . Short explanation: ORBS will, one eleventh of the time, report that any IP address is an open relay. Fix: your sysadmins have to stop using ORBS and switch to some other open relay detection service. --amk From mal@lemburg.com Fri Jul 20 17:23:46 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 20 Jul 2001 18:23:46 +0200 Subject: [Python-Dev] mail.python.org black listed ?! References: <3B5854A1.214EFF19@lemburg.com> <20010720121020.B1470@ute.cnri.reston.va.us> Message-ID: <3B585B12.57B409A6@lemburg.com> Andrew Kuchling wrote: > > On Fri, Jul 20, 2001 at 05:56:17PM +0200, M.-A. 
Lemburg wrote: > > Since yesterday evening I haven't received any email from > >mail.python.org. A look in my mail delivery log file showed > >that sendmail is rejecting mails from mail.python.org > >(63.102.49.29) with the following message: > > See http://www.uwsg.iu.edu/hypermail/linux/kernel/0107.1/0929.html . > > Short explanation: ORBS will, one eleventh of the time, report that any IP > address is an open relay. > > Fix: your sysadmins have to stop using ORBS and switch to some other > open relay detection service. Thanks for the pointer. I've switched off ORBS... and just saw that I missed the nice Unicode thread yesterday ;-( -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Fri Jul 20 17:39:30 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 20 Jul 2001 18:39:30 +0200 Subject: [Python-Dev] 2.2 Unicode questions Message-ID: <3B585EC2.DF9225D@lemburg.com> >From Andrew's new pass: """ Python's Unicode support has been enhanced a bit in 2.2. Unicode strings are usually stored as UTF-16, as 16-bit unsigned integers. """ Please replace UTF-16 with UCS-2. Python's Unicode implementation does not support UTF-16 in a surrogate-aware way; only some of the codecs do this. As a result, the internal storage format of Python is more precisely described as UCS-2. """ Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned integers, as its internal encoding by supplying \longprogramopt{enable-unicode=ucs4} to the configure script. When built to use UCS-4 (a ``wide Python''), the interpreter can natively handle Unicode characters from U+000000 to U+110000. The range of legal values for the \function{unichr()} function has been expanded; it used to only accept values up to 65535, but in 2.2 will accept values from 0 to 0x110000.
Using a ``narrow Python'', an interpreter compiled to use UTF-16, values greater than 65535 will result in \function{unichr()} returning a string of length 2:

\begin{verbatim}
>>> s = unichr(65536)
>>> s
u'\ud800\udc00'
>>> len(s)
2
\end{verbatim}
"""

Same here: UTF-16 -> UCS-2. Note that I very much favour removing the surrogate generation in unichr() for UCS2-builds. If I don't hear strong opposition, I'll disable this feature, which was added as part of the UCS-4 patches. unichr() will then raise an exception as it did in version 2.1.

"""
This possibly-confusing behaviour, breaking the intuitive invariant that \function{chr()} and \function{unichr()} always return strings of length 1, may be changed later in 2.2, depending on public reaction.
"""

Right.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From akuchlin@mems-exchange.org Fri Jul 20 17:42:47 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 20 Jul 2001 12:42:47 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <3B585EC2.DF9225D@lemburg.com>; from mal@lemburg.com on Fri, Jul 20, 2001 at 06:39:30PM +0200
References: <3B585EC2.DF9225D@lemburg.com>
Message-ID: <20010720124247.A1769@ute.cnri.reston.va.us>

On Fri, Jul 20, 2001 at 06:39:30PM +0200, M.-A. Lemburg wrote:
>Same here: UTF-16 -> UCS-2. Note that I very much favour
>removing the surrogate generation in unichr() for UCS2-builds.

Do I understand the new behavior you intend to implement?

* Narrow Python: unichr() accepts values from 0 .. 65535. len(unichr(x)) is always 1.
* Wide Python: unichr() accepts values from 0 .. 0x110000. len(unichr(x)) is also always 1.

--amk

From mal@lemburg.com Fri Jul 20 17:49:04 2001
From: mal@lemburg.com (M.-A.
Lemburg) Date: Fri, 20 Jul 2001 18:49:04 +0200 Subject: [Python-Dev] 2.2 Unicode questions References: <3B585EC2.DF9225D@lemburg.com> <20010720124247.A1769@ute.cnri.reston.va.us> Message-ID: <3B586100.B645A08C@lemburg.com> Andrew Kuchling wrote: > > On Fri, Jul 20, 2001 at 06:39:30PM +0200, M.-A. Lemburg wrote: > >Same here: UTF-16 -> UCS-2. Note that I very much favour > >removing the surrogate generation in unichr() for UCS2-builds. > > Do I understand the new behavior you intend to implement? > * Narrow Python: unichr() accepts values from 0 .. 65535. len(unichr(x)) > is always 1. > * Wide Python: unichr() accepts values from 0 .. 0x110000. len(unichr(x)) > is also always 1. Right. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Fri Jul 20 18:03:32 2001 From: guido@digicool.com (Guido van Rossum) Date: Fri, 20 Jul 2001 13:03:32 -0400 Subject: [Python-Dev] RELEASED: Python 2.1.1 Message-ID: <200107201703.NAA16871@cj20424-a.reston1.va.home.com> I've released Python 2.1.1 today. This is the final version of this bugfix release for Python 2.1, and should be fully compatible with Python 2.1. There should be *no* reason to use 2.1 any more. Many thanks to Thomas Wouters for being the release manager! Pick up your copy here: http://www.python.org/2.1.1/ Python 2.1.1 is GPL-compatible. This means that it is okay to distribute Python binaries linked with GPL-licensed software; Python itself is not released under the GPL but under a less restrictive license which is Open Source compliant. PS: I've noticed some disarray of mail sent through python.org; this seems to have to do with the dysfunctional ORBS "spam-checker". See http://mail.python.org/pipermail/python-dev/2001-July/016151.html for details. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com (Skip Montanaro) Fri Jul 20 19:11:43 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 20 Jul 2001 13:11:43 -0500 Subject: [Python-Dev] mail.python.org black listed ?! In-Reply-To: <20010720121020.B1470@ute.cnri.reston.va.us> References: <3B5854A1.214EFF19@lemburg.com> <20010720121020.B1470@ute.cnri.reston.va.us> Message-ID: <15192.29791.789277.909128@beluga.mojam.com> mal> A look in my mail delivery log file showed that sendmail is rejecting mal> mails from mail.python.org (63.102.49.29) with the following message: amk> See http://www.uwsg.iu.edu/hypermail/linux/kernel/0107.1/0929.html . amk> Short explanation: ORBS will, one eleventh of the time, report that amk> any IP address is an open relay. More telling I think is that Ron Guilmette, the author of the note amk referenced, felt that he couldn't recommend any of the other black list "services". The server that I run a few mailman lists on occasionally has messages rejected by MAPS RBL. Whenever I go there to see what's what, it always tells me my server has never been on their black list. I think the whole black list exercise has been a net waste of time and certainly has done little to stem the flow of email spam. as-opposed-to-python-spam-ly y'rs, Skip From gward@python.net Fri Jul 20 19:32:05 2001 From: gward@python.net (Greg Ward) Date: Fri, 20 Jul 2001 14:32:05 -0400 Subject: [Python-Dev] Python 2.2a1 nits In-Reply-To: <3B584959.8AABCC9E@lemburg.com>; from mal@lemburg.com on Fri, Jul 20, 2001 at 05:08:09PM +0200 References: <3B584959.8AABCC9E@lemburg.com> Message-ID: <20010720143205.A2209@gerg.ca> On 20 July 2001, M.-A. Lemburg said: > * configure options: the options should follow a single > methodology (either all use --with(out)- or all > use --enable-/--disable-) and the > defaults should be clearly indicated... 
Actually, the Autoconf docs say there is a difference between "with/without" and "enable/disable": Some packages pay attention to `--enable-FEATURE' options to `configure', where FEATURE indicates an optional part of the package. They may also pay attention to `--with-PACKAGE' options, where PACKAGE is something like `gnu-as' or `x' (for the X Window System). The `README' should mention any `--enable-' and `--with-' options that the package recognizes. IOW, --enable is for internal features, --with is for interfaces to external features/programs/libraries (as I read it). So I think most of the --with/--enable options are right, but: > --with-pydebug build with Py_DEBUG defined should be --enable-pydebug > --with-signal-module disable/enable signal module should be --enable-signal-module (I think) > --with(out)-cycle-gc disable/enable garbage collection > --with(out)-pymalloc disable/enable specialized mallocs should be --enable-cycle-gc and --enable-pymalloc Greg -- Greg Ward - Linux weenie gward@python.net http://starship.python.net/~gward/ "What do you mean -- a European or an African swallow?" From guido@digicool.com Fri Jul 20 19:31:45 2001 From: guido@digicool.com (Guido van Rossum) Date: Fri, 20 Jul 2001 14:31:45 -0400 Subject: [Python-Dev] mail.python.org black listed ?! In-Reply-To: Your message of "Fri, 20 Jul 2001 13:11:43 CDT." <15192.29791.789277.909128@beluga.mojam.com> References: <3B5854A1.214EFF19@lemburg.com> <20010720121020.B1470@ute.cnri.reston.va.us> <15192.29791.789277.909128@beluga.mojam.com> Message-ID: <200107201831.OAA22144@cj20424-a.reston1.va.home.com> > I think the whole black list exercise has been a net waste of time > and certainly has done little to stem the flow of email spam. Amen. My thoughts exactly. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Fri Jul 20 19:45:45 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Fri, 20 Jul 2001 14:45:45 -0400 (EDT) Subject: [Python-Dev] [Q] patches in maintenace branch Message-ID: <15192.31833.148845.560021@cj42289-a.reston1.va.home.com> Thomas, It came out the other day that you'd rather not have anyone making checkins on the maintenance branch, but would like to migrate patches yourself. Can you add a discussion about this to the bugfix-release PEP? The discussion should include rationale and any information needed to make branch management easier (cookbook-style instructions for multiple merges, for example!), and guidelines so that the managers for bugfix releases won't wait too long to integrate patches. One issue that might need to be addressed is that I'd like to be able to keep a fairly up-to-date version of the patched documentation available at http://python.sourceforge.net/maint-docs/, so I'd like to discourage long periods between integrations (unless of course there's nothing to merge in). I'd also like to thank you -- you did a great job as the release manager for 2.1.1! -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From guido@digicool.com Fri Jul 20 20:48:23 2001 From: guido@digicool.com (Guido van Rossum) Date: Fri, 20 Jul 2001 15:48:23 -0400 Subject: [Python-Dev] Pointers to python-dev threads pertaining to Patch #441791? In-Reply-To: Your message of "Fri, 20 Jul 2001 10:21:04 EDT." <3B580610.30772.ECBE36E@localhost> References: Your message of "Thu, 19 Jul 2001 23:04:14 EDT." <200107200304.XAA15088@opus.mit.edu> <3B580610.30772.ECBE36E@localhost> Message-ID: <200107201948.PAA26281@cj20424-a.reston1.va.home.com> http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=441791 > > In any case, I agree it would be nice if this was fixed. > > Import issues are subtle, but this looks good to me. I've added my own version to the patch, which does the right thing if the module has a SyntaxError, and also conforms to the style guide (PEP 7). 
> > Maybe Thomas was thinking of a different issue, where some people > > want the sys.modules[name] entry to be *removed* when an import > > fails. I am not for that change, but I haven't recovered the > > reason (I know I had a good one when I implemented things this > > way). > > Perhaps one could construct a situation with circular imports > in which one module ends up with a name (and no error, > because the name is being imported) that later turns into an > error? Yes, that's the one (Moshe remembered this too in private mail). > There's also the issue of failed relative imports that succeed > as absolute imports - you don't want every module in the > package hunting around for package.sys. That's a different issue; those create None entries in sys.modules, and obviously those None entries shouldn't be removed on failure. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Fri Jul 20 22:02:42 2001 From: fdrake@acm.org (Fred L. Drake) Date: Fri, 20 Jul 2001 17:02:42 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010720210242.487322892E@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Some additional information for extension writers. From thomas@xs4all.net Sat Jul 21 00:05:33 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sat, 21 Jul 2001 01:05:33 +0200 Subject: [Python-Dev] Re: [Q] patches in maintenace branch In-Reply-To: <15192.31833.148845.560021@cj42289-a.reston1.va.home.com> Message-ID: <20010721010533.B2025@xs4all.nl> On Fri, Jul 20, 2001 at 02:45:45PM -0400, Fred L. Drake, Jr. wrote: > It came out the other day that you'd rather not have anyone making > checkins on the maintenance branch, but would like to migrate patches > yourself. Can you add a discussion about this to the bugfix-release > PEP? Yeah. I plan to update it with a bunch of practical info. 
> One issue that might need to be addressed is that I'd like to be > able to keep a fairly up-to-date version of the patched documentation > available at http://python.sourceforge.net/maint-docs/, so I'd like to > discourage long periods between integrations (unless of course there's > nothing to merge in). You were implicitly allowed to do anything you want in the documentation tree. Documentation doesn't create broken code, and I didn't expect you to check in documentation that was wrong or for features that don't exist in the maintenance branch. (But I kept an eye on what you checked in none the less :) > I'd also like to thank you -- you did a great job as the release > manager for 2.1.1! Pfah, no, I didn't :) It started out good, but I lost energy and focus in the end. This also has a lot to do with lousy planning on my side... My girlfriend was scheduled to go on vacation three weeks ago; instead, she's leaving tomorrow, and she decided to use tonight to rent a truck and move all of our leftover stuff from the old house (which we still lease for free) to the new one (which we've been living in for months :P) That's why I was offline for most of the US afternoon, today. Anyway, better luck next time :) Mental-note-to-self--need-to-add-sourceforge-address-to-'alternates'-ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From thomas@xs4all.net Sat Jul 21 00:21:32 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sat, 21 Jul 2001 01:21:32 +0200 Subject: [Python-Dev] mail.python.org black listed ?! In-Reply-To: <200107201831.OAA22144@cj20424-a.reston1.va.home.com> Message-ID: <20010721012132.A9882@xs4all.nl>
I disagree, but for a very simple reason: I don't block based on blacklists, I flag :) Blocking on the SMTP server is a bad idea, IMO too, though I can understand that people running a single, small SMTP server want to block spam at the earliest moment. But by flagging it, I can save it to a different folder, or send auto-replies, or just colour it in my mail client (mutt). We have a basic procmailrc which does a whole boatload of spamchecks (ORBS, RBL, DUL, RSS, various header-checks for illegal ipadresses, well known spam software, well known addresses like friend@public.com, buffer-overflow attempts, etc) and simply adds a header to my emails, which I then give a scoring and color based on what spamtests it triggered. No single spam test is 100% accurate, but I haven't seen false positives on something that has ORBS or RBL *and* DUL, RSS, or one or more of the others. Sadly, ORBS is exit, and MAPS is turning into a commercial service. We're still debating, at work, whether to pay for it or not :P If people are interested in the procmailrc, and the perl script (sorry) it uses, let me know and I'll see if we can distribute it. It's already available to XS4ALL customers, together with a simple script to report spam to Spamcop, which is especially easy to use from inside mutt :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tim.one@home.com Sat Jul 21 01:31:17 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 20 Jul 2001 20:31:17 -0400 Subject: [Python-Dev] Re: [Q] patches in maintenace branch In-Reply-To: <20010721010533.B2025@xs4all.nl> Message-ID: {Fred, to Thomas] > I'd also like to thank you -- you did a great job as the release > manager for 2.1.1! [Thomas Wouters] > Pfah, no, I didn't :) It started out good, but I lost energy and focus > in the end. Welcome to the club. 
No matter how much you may want to beat yourself up, you did an infinitely better job than the non-existent 2.1.1 release manager you didn't replace. Since the release is out, and on schedule, the only objective thing to be said is that your efforts met with more success than 97.31% of all industry releases! If you're very nice to Guido, he may even let you do it again. From aahz@rahul.net Sat Jul 21 04:54:40 2001 From: aahz@rahul.net (Aahz Maruch) Date: Fri, 20 Jul 2001 20:54:40 -0700 (PDT) Subject: [Python-Dev] Re: [Q] patches in maintenace branch In-Reply-To: <20010721010533.B2025@xs4all.nl> from "Thomas Wouters" at Jul 21, 2001 01:05:33 AM Message-ID: <20010721035440.E659399C83@waltz.rahul.net> Thomas Wouters wrote: > On Fri, Jul 20, 2001 at 02:45:45PM -0400, Fred L. Drake, Jr. wrote: >> >> I'd also like to thank you -- you did a great job as the release >> manager for 2.1.1! > > Pfah, no, I didn't :) As the author of PEP 6, I want to publicly echo Fred and Tim: the whole exercise of 2.0.1 and 2.1.1 turned out better than I actually hoped for. I did not anticipate such a wholesale migration of minor fixes. Quite frankly (and I know Tim will agree with me here ;-), I figured a likely outcome was that the "put up or shut up" of PEP 6 would result in no action. I think it's wonderful that we've already had two maintenance releases go so smoothly! -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. From mal@lemburg.com Sat Jul 21 11:28:39 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 21 Jul 2001 12:28:39 +0200 Subject: [Python-Dev] mail.python.org black listed ?!
References: <20010721012132.A9882@xs4all.nl> Message-ID: <3B595957.1F4D85F5@lemburg.com> Thomas Wouters wrote: > > On Fri, Jul 20, 2001 at 02:31:45PM -0400, Guido van Rossum wrote: > > > I think the whole black list exercise has been a net waste of time > > > and certainly has done little to stem the flow of email spam. > > > Amen. My thoughts exactly. > > I disagree, but for a very simple reason: I don't block based on blacklists, > I flag :) Blocking on the SMTP server is a bad idea, IMO too, though I can > understand that people running a single, small SMTP server want to block > spam at the earliest moment. But by flagging it, I can save it to a > different folder, or send auto-replies, or just colour it in my mail client > (mutt). > > We have a basic procmailrc which does a whole boatload of spamchecks (ORBS, > RBL, DUL, RSS, various header-checks for illegal ipadresses, well known spam > software, well known addresses like friend@public.com, buffer-overflow > attempts, etc) and simply adds a header to my emails, which I then give a > scoring and color based on what spamtests it triggered. No single spam test > is 100% accurate, but I haven't seen false positives on something that has > ORBS or RBL *and* DUL, RSS, or one or more of the others. > > Sadly, ORBS is exit, and MAPS is turning into a commercial service. We're > still debating, at work, whether to pay for it or not :P > > If people are interested in the procmailrc, and the perl script (sorry) it > uses, let me know and I'll see if we can distribute it. It's already > available to XS4ALL customers, together with a simple script to report spam > to Spamcop, which is especially easy to use from inside mutt :) Perhaps we should start a small project for such a tool written in Python (to bring the subject back on topic ;-) and place it on the web somewhere ?! If we separate out the engine from the rest we could also have different backends, e.g. 
one which hooks into .forward as filter, a daemon-style backend which does on-server flagging based on imap, a Mailman filter backend which does the same for mailing lists etc. Would be cool to have python-list mark non-python spam using a special header automagically ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Sat Jul 21 14:40:50 2001 From: guido@digicool.com (Guido van Rossum) Date: Sat, 21 Jul 2001 09:40:50 -0400 Subject: [Python-Dev] Re: [Q] patches in maintenace branch In-Reply-To: Your message of "Fri, 20 Jul 2001 20:54:40 PDT." <20010721035440.E659399C83@waltz.rahul.net> References: <20010721035440.E659399C83@waltz.rahul.net> Message-ID: <200107211340.JAA29547@cj20424-a.reston1.va.home.com> > Quite frankly (and I know Tim will agree with me here ;-), I figured a > likely outcome was that the "put up or shut up" of PEP 6 would result in > no action. You understand the PEP process all too well... :-) But in this case I genuinely thought that we needed to do this. > I think it's wonderful that we've already had two > maintenance releases go so smoothly! It may look smooth to *you*... Behind the scenes, 2.1.1 final was a bit of a last-minute mess. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@rahul.net Sat Jul 21 15:22:59 2001 From: aahz@rahul.net (Aahz Maruch) Date: Sat, 21 Jul 2001 07:22:59 -0700 (PDT) Subject: [Python-Dev] Re: [Q] patches in maintenace branch In-Reply-To: <200107211340.JAA29547@cj20424-a.reston1.va.home.com> from "Guido van Rossum" at Jul 21, 2001 09:40:50 AM Message-ID: <20010721142259.E06A799C83@waltz.rahul.net> Guido van Rossum wrote: >Aahz: >> >> I think it's wonderful that we've already had two >> maintenance releases go so smoothly! > > It may look smooth to *you*...
Behind the scenes, 2.1.1 final was a > bit of a last-minute mess. :-) My "smooth" was referring less to the actual build process than to the lack of wrangling over what went in, despite the large numbers of patches. If in time we get no complaints about the inclusion of any patch and y'all didn't miss any critical patches, I'll consider this a complete, absolute, and unqualified success. I mean, when I wrote the PEP, you'll recall that I originally tried to stipulate a separate mailing list because I was worried that the traffic would overwhelm python-dev. Hasn't been an issue at *all*, in the end. Having been on the inside of major political fights over what should go into a maintenance release (and seeing some of the feature fights here on python-dev), I truly do find the smoothness of 2.0.1 and 2.1.1 absolutely amazing. I think kudos go all around to everyone who participated, and most especially to Moshe and Thomas for orchestrating them. And now I'm off to OSCON. I probably won't be back for a week. Hope to see many of you in person! -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. From thomas@xs4all.net Sat Jul 21 17:02:02 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sat, 21 Jul 2001 18:02:02 +0200 Subject: [Python-Dev] Pointers to python-dev threads pertaining to Patch #441791? In-Reply-To: <200107201341.JAA09907@cj20424-a.reston1.va.home.com> References: <200107200304.XAA15088@opus.mit.edu> <200107201341.JAA09907@cj20424-a.reston1.va.home.com> Message-ID: <20010721180201.A619@xs4all.nl> On Fri, Jul 20, 2001 at 09:41:45AM -0400, Guido van Rossum wrote: > Maybe Thomas was thinking of a different issue, where some people want > the sys.modules[name] entry to be *removed* when an import fails. I was. -- Thomas Wouters Hi! I'm a .signature virus! 
copy me into your .signature file to help me spread! From mal@lemburg.com Sat Jul 21 22:17:13 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 21 Jul 2001 23:17:13 +0200 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types Message-ID: <3B59F159.919DF4C@lemburg.com> I've started playing with making mxDateTime types subclassable and have run into a few problems which the PEP does not seem to have answers to: 1. Are tp_new et al. inherited by subclassed types ? This is important when implementing the slot methods, since they may then see types other than the one for which they are defined (e.g. keeping a free list around will only work for the original types, not subclassed ones). 2. In which order are the allocation/deallocation methods of subclass and base class called (if at all) and how does this depend on whether they are implemented or inherited ? 3. How can I make attributes visible in subclassed types ? Even though I found out that I need to use the generic APIs PyObject_GenericGet|SetAttr() for the tp_get|setattro to make methods visible, attributes cannot be accessed (and this even though dir(instance) displays them). In any case, the new feature looks very promising ! Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Sat Jul 21 23:29:22 2001 From: guido@digicool.com (Guido van Rossum) Date: Sat, 21 Jul 2001 18:29:22 -0400 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types In-Reply-To: Your message of "Sat, 21 Jul 2001 23:17:13 +0200." <3B59F159.919DF4C@lemburg.com> References: <3B59F159.919DF4C@lemburg.com> Message-ID: <200107212229.SAA00894@cj20424-a.reston1.va.home.com> > I've started playing with making mxDateTime types subclassable Cool!!! > and have run into a few problems which the PEP does not seem > to have answers to: > > 1. Are tp_new et al. 
inherited by subclassed types ? My apologies that this stuff is so underdocumented -- there's just so *much* to be documented... in typeobject.c, in inherit_slots(), there's a call to COPYSLOT(tp_new), so the answer is yes. > This is important when implementing the slot methods, since > they may then see types other than the one for which they > are defined (e.g. keeping a free list around will only > work for the original types, not subclassed ones). Yes, I've worked out a scheme to make this work, but I don't think I've written it down anywhere yet. If your tp_new calls tp_alloc, and your tp_dealloc calls tp_free, then a subtype can override tp_alloc *and* tp_free and the right thing will happen. A subtype can also *extend* tp_new and tp_dealloc. (tp_new and tp_dealloc are sort-of each other's companions, and ditto for tp_alloc and tp_free.) > 2. In which order are the allocation/deallocation methods > of subclass and base class called (if at all) and how > does this depend on whether they are implemented or inherited ? Here's the scheme. A subtype's tp_new should call the base type's tp_new, passing the subtype. The base class will call tp_alloc, which is the subtype's version. Similar for deallocation: the subtype's tp_dealloc calls the base type's tp_dealloc which calls tp_free which is the subtype's version. > 3. How can I make attributes visible in subclassed types ? > > Even though I found out that I need to use the generic APIs > PyObject_GenericGet|SetAttr() for the tp_get|setattro to > make methods visible, attributes cannot be accessed (and this > even though dir(instance) displays them). Strange. This should work. Probably something's subtly wrong in your setup. Compare your code to xxsubtype.c. > In any case, the new feature looks very promising ! Thanks! 
--Guido van Rossum (home page: http://www.python.org/~guido/) From barry@scottb.demon.co.uk Sun Jul 22 01:45:19 2001 From: barry@scottb.demon.co.uk (Barry Scott) Date: Sun, 22 Jul 2001 01:45:19 +0100 Subject: [Python-Dev] Leading with XML-RPC In-Reply-To: <018501c109e0$c345a450$4ffa42d5@hagrid> Message-ID: <000001c11247$94f7f2a0$060210ac@private> > (fwiw, my current thinking is that SOAP is a flawed idea, and that the > need for SOAP will go away when people get better XML/Schema tools, > but that's another story. and don't get me started on SOAP BDG...) Do you mean that the main claim to fame of SOAP is its standard encoding and that's just a schema? Don't we need that standard encoding schema? Barry P.S. Any date on your 0.92 SOAP lib? From guido@digicool.com Sun Jul 22 05:36:38 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 22 Jul 2001 00:36:38 -0400 Subject: [Python-Dev] Future division patch available (PEP 238) Message-ID: <200107220436.AAA05323@cj20424-a.reston1.va.home.com> For those interested in the future of the division operator a la PEP 238, I've produced a reasonably complete patch (relative to the CVS trunk, but it probably also works for the descr-branch or the 2.2a1 release). Get it here: http://sourceforge.net/tracker/index.php?func=detail&aid=443474&group_id=5470&atid=305470 It works as follows:
- unconditionally, there's a new operator // that will always do int division (and an in-place companion //=).
- by default, / is unchanged (and so is /=).
- after "from __future__ import division", / is changed to return a float result from int or long operands (and so is /=).
Read the patch description for more details; the implementations of int and float division are semi-lame. There's no warning yet for int division returning a truncated result; I'm not sure if I want such a warning to be part of 2.2 (maybe if it's off by default).
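The three rules above are easy to see in action. Here is a minimal sketch using Python 3 semantics, where PEP 238's future behaviour eventually became the default (so the `__future__` import is no longer needed):

```python
# PEP 238 semantics, as they eventually became the default in Python 3:
# '//' always does floor (int) division, '/' does true division for ints.
print(7 // 2)    # floor division: 3
print(7 / 2)     # true division: 3.5
print(-7 // 2)   # floors toward negative infinity: -4

x = 15
x //= 4          # the in-place companion of '//'
print(x)         # 3
```

Note that `//` floors rather than truncates, so `-7 // 2` is `-4`, not `-3`.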
I'm cc'ing Bruce Sherwood and David Scherer, because they asked for this and used a similar implementation in VPython. When this patch (or something not entirely unlike it) is accepted into Python 2.2, they will no longer have to maintain their own hacked Python. (We've already added 10**-15 returning a float to 2.2a1, also specifically for them; that was easier because it used to be an error, so no backwards compatibility code or future statement is necessary there.) I thought again about the merits of the '//' operator vs. 'div' (either as a function or as a keyword binary operator), and figured that '//' is the best choice: it doesn't introduce a new keyword (which would cause more pain), and it works as an augmented assignment (//=) as well. --Guido van Rossum (home page: http://www.python.org/~guido/) From moshez@zadka.site.co.il Sun Jul 22 06:14:42 2001 From: moshez@zadka.site.co.il (Moshe Zadka) Date: Sun, 22 Jul 2001 08:14:42 +0300 Subject: [Python-Dev] Future division patch available (PEP 238) In-Reply-To: <200107220436.AAA05323@cj20424-a.reston1.va.home.com> References: <200107220436.AAA05323@cj20424-a.reston1.va.home.com> Message-ID: On Sun, 22 Jul 2001 00:36:38 -0400, Guido van Rossum wrote: > For those interested in the future of the division operator a la PEP > 238, I've produced a reasonably complete patch (relative to the CVS > trunk, but it probably also works for the descr-branch or the 2.2a1 > release). Do you want me to update PEP-0238 to reflect the new realities as to "open issues"? (I saw you already added a link to the patch in the PEP. Great!) -- gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21 4817 C7FC A636 46D0 1BD6 Insecure (accessible): C5A5 A8FA CA39 AB03 10B8 F116 1713 1BCF 54C4 E1FE From mal@lemburg.com Sun Jul 22 12:49:45 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Sun, 22 Jul 2001 13:49:45 +0200 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> Message-ID: <3B5ABDD9.1A73D7E4@lemburg.com> Guido van Rossum wrote: > > > I've started playing with making mxDateTime types subclassable > > Cool!!! A few people keep asking me for new features on those types, so I guess enabling this for Python 2.2 would be a real advantage for them. I still haven't found out how to solve the construction problem though (the base type is hard coded into various factory functions and methods)... the factory methods could use self.__class__ to solve this, but the factory functions would need some different tweaking. > > and have run into a few problems which the PEP does not seem > > to have answers to: > > > > 1. Are tp_new et al. inherited by subclassed types ? > > My apologies that this stuff is so underdocumented -- there's just so > *much* to be documented... in typeobject.c, in inherit_slots(), > there's a call to COPYSLOT(tp_new), so the answer is yes. Ok. > > This is important when implementing the slot methods, since > > they may then see types other than the one for which they > > are defined (e.g. keeping a free list around will only > > work for the original types, not subclassed ones). > > Yes, I've worked out a scheme to make this work, but I don't think > I've written it down anywhere yet. If your tp_new calls tp_alloc, and > your tp_dealloc calls tp_free, then a subtype can override tp_alloc > *and* tp_free and the right thing will happen. A subtype can also > *extend* tp_new and tp_dealloc. (tp_new and tp_dealloc are sort-of > each other's companions, and ditto for tp_alloc and tp_free.) So I will have to implement tp_free as well ?! Currently I have tp_new (which calls tp_alloc), tp_alloc, tp_init for the creation procedure and tp_dealloc (which does not call tp_free) for the finalization. 
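At the Python level, this creation protocol corresponds roughly to __new__ (tp_new, which takes care of allocation) and __init__ (tp_init). A minimal sketch with invented class names, showing a subtype extending both while the base type allocates for the actual subtype, as the scheme describes:

```python
class Date:                          # stands in for a C-level base type
    def __new__(cls, *args, **kwargs):
        # like tp_new: allocation happens for the *actual* (sub)type
        obj = super().__new__(cls)
        obj._allocated = True        # marker standing in for tp_alloc work
        return obj

    def __init__(self, value=0):
        # like tp_init: optional per-instance initialization
        self.value = value


class StampedDate(Date):             # a hypothetical subtype
    def __new__(cls, *args, **kwargs):
        # the subtype's tp_new calls the base type's tp_new,
        # passing the subtype along
        return super().__new__(cls, *args, **kwargs)

    def __init__(self, value=0, stamp=None):
        super().__init__(value)
        self.stamp = stamp


d = StampedDate(42, stamp="now")
print(type(d).__name__, d.value, d.stamp)   # StampedDate 42 now
```

The allocation done in the base __new__ lands on the subtype instance, mirroring how the base tp_new ends up calling the subtype's tp_alloc.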
I wonder whether it'd be a good idea to have a tp_del in there as well (the __del__ at C level) which is then called instead of tp_dealloc if set and which must call tp_dealloc if the instance is going to be deleted for good. > > 2. In which order are the allocation/deallocation methods > > of subclass and base class called (if at all) and how > > does this depend on whether they are implemented or inherited ? > > Here's the scheme. A subtype's tp_new should call the base type's > tp_new, passing the subtype. The base class will call tp_alloc, which > is the subtype's version. Similar for deallocation: the subtype's > tp_dealloc calls the base type's tp_dealloc which calls tp_free which > is the subtype's version. Like this... ?

    subtype                   basetype
    ----------------------------------------------------
    Creation

    tp_new(subtype)
                           -> tp_new(subtype)    # calls tp_alloc & tp_init
    tp_alloc(subtype)  <-
                           -> tp_alloc(subtype)
    tp_init(instance)  <-
                           -> tp_init(instance)

    Finalization

    (
    tp_delete(instance)
                           -> tp_delete(instance) # calls tp_dealloc if
                                                  # the instance should
                                                  # be deleted
    )
    tp_dealloc(instance)
                           -> tp_dealloc(instance) # calls tp_free
    tp_free(instance)  <-
                           -> tp_free(instance)

> > 3. How can I make attributes visible in subclassed types ? > > > > Even though I found out that I need to use the generic APIs > > PyObject_GenericGet|SetAttr() for the tp_get|setattro to > > make methods visible, attributes cannot be accessed (and this > > even though dir(instance) displays them). > > Strange. This should work. Probably something's subtly wrong in your > setup. Compare your code to xxsubtype.c. The xxsubtype doesn't define any attributes and neither do lists or dictionaries so there seems to be no precedent. In mxDateTime under Python 2.1, the tp_getattr slot takes care of processing attribute lookup. Now to enable the dynamic goodies in Python 2.2, I have to provide the tp_getattro slot (and set it to the generic APIs mentioned above).
Since tp_getattro overrides the tp_getattr slot, I have to rely on the generic APIs calling back to the tp_getattr slot to process the attributes which are not dynamically set by the user or a subclass. However, the new generic lookup APIs do not call the tp_getattr slot at all and thus the attributes which were "defined" by the tp_getattr in Python 2.1 are no longer visible. - How do I have to implement attribute lookup in Python 2.2 for TP_BASETYPEs (methods are now magically handled by the tp_methods slot, there doesn't seem to be a corresponding feature for attributes though) ? - Could the generic APIs perhaps fall back to tp_getattr to make the transition from classic types to base types a little easier ? Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Sun Jul 22 15:14:18 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 22 Jul 2001 16:14:18 +0200 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> Message-ID: <3B5ADFBA.A058A0D7@lemburg.com> A suggestion after looking at the typeobject.c implementation: wouldn't PyType_InitDict() better be named something like PyType_InitType() ?! -- the API does so much more than only init the tp_dict dictionary... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Sun Jul 22 16:14:34 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 22 Jul 2001 11:14:34 -0400 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types In-Reply-To: Your message of "Sun, 22 Jul 2001 16:14:18 +0200."
<3B5ADFBA.A058A0D7@lemburg.com> References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <3B5ADFBA.A058A0D7@lemburg.com> Message-ID: <200107221514.LAA11364@cj20424-a.reston1.va.home.com> > A suggestion after looking at the typeobject.c implementation: > wouldn't PyType_InitDict() better be named something like > PyType_InitType() ?! -- the API does so much more than only > init the tp_dict dictionary... Yes, absolutely. I just haven't gotten around to it yet... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Sun Jul 22 16:49:55 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 22 Jul 2001 11:49:55 -0400 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types In-Reply-To: Your message of "Sun, 22 Jul 2001 13:49:45 +0200." <3B5ABDD9.1A73D7E4@lemburg.com> References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> Message-ID: <200107221549.LAA11503@cj20424-a.reston1.va.home.com> > A few people keep asking me for new features on those types, so > I guess enabling this for Python 2.2 would be a real advantage for > them. > > I still haven't found out how to solve the construction problem > though (the base type is hard coded into various factory functions > and methods)... the factory methods could use self.__class__ > to solve this, but the factory functions would need some different > tweaking. Using the new "classmethod" feature you can make the factory functions class methods. > > Yes, I've worked out a scheme to make this work, but I don't think > > I've written it down anywhere yet. If your tp_new calls tp_alloc, and > > your tp_dealloc calls tp_free, then a subtype can override tp_alloc > > *and* tp_free and the right thing will happen. A subtype can also > > *extend* tp_new and tp_dealloc. 
(tp_new and tp_dealloc are sort-of > > each other's companions, and ditto for tp_alloc and tp_free.) > > So I will have to implement tp_free as well ?! Currently I have > tp_new (which calls tp_alloc), tp_alloc, tp_init for the creation > procedure and tp_dealloc (which does not call tp_free) for the > finalization. Yes, if your tp_new calls tp_alloc, your tp_dealloc should call tp_free. Otherwise the user can override tp_alloc to use a different heap, and tp_dealloc would mess up. > I wonder whether it'd be a good idea to have a tp_del in there > as well (the __del__ at C level) which is then called instead > of tp_dealloc if set and which must call tp_dealloc if the > instance is going to be deleted for good. I've been thinking about this. I don't think that's quite the right protocol; I don't want to complicate the DECREF macro any more. I think that tp_dealloc must call tp_del and then decide whether to proceed depending on the refcount. > > > 2. In which order are the allocation/deallocation methods > > > of subclass and base class called (if at all) and how > > > does this depend on whether they are implemented or inherited ? > > > > Here's the scheme. A subtype's tp_new should call the base type's > > tp_new, passing the subtype. The base class will call tp_alloc, which > > is the subtype's version. Similar for deallocation: the subtype's > > tp_dealloc calls the base type's tp_dealloc which calls tp_free which > > is the subtype's version. > > Like this... ? > > subtype basetype > ---------------------------------------------------- > Creation > > tp_new(subtype) > -> tp_new(subtype) # calls tp_alloc & tp_init > > tp_alloc(subtype) <- > -> tp_alloc(subtype) Typically, the derived type's tp_alloc shouldn't call the base type's tp_alloc -- tp_alloc is supposed to allocate memory for the actual type, zero it, set the type pointer and reference count, and register it with GC.
Any other initializations that can't be left to tp_init (which is optional) are tp_new's responsibility. > tp_init(instance) <- > -> tp_init(instance) > > Finalization > > ( > tp_delete(instance) > -> tp_delete(instance) # calls tp_dealloc if > # the instance should > # be deleted > ) > tp_dealloc(instance) > -> tp_dealloc(instance) # calls tp_free > > tp_free(instance) <- > -> tp_free(instance) Likewise, tp_free needn't call the base tp_free. > > > 3. How can I make attributes visible in subclassed types ? > > > > > > Even though I found out that I need to use the generic APIs > > > PyObject_GenericGet|SetAttr() for the tp_get|setattro to > > > make methods visible, attributes cannot be accessed (and this > > > even though dir(instance) displays them). > > > > Strange. This should work. Probably something's subtly wrong in your > > setup. Compare your code to xxsubtype.c. > > The xxsubtype doesn't define any attributes and neither do lists > or dictionaries so there seems to be no precedent. > > In mxDateTime under Python 2.1, the tp_gettattr slot takes care of > processing attribute lookup. Now to enable the dynamic goodies in > Python 2.2, I have to provide the tp_getattro slot (and set it to > the generic APIs mentioned above). > > Since tp_getattro override the tp_getattr slots, I have to rely > on the generic APIs calling back to the tp_getattr slots to process > the attributes which are not dynamically set by the user or a > subclass. However, the new generic lookup APIs do not call the > tp_getattr slot at all and thus the attributes which were "defined" > by the tp_getattr in Python 2.1 are no longer visible. > > - How do I have to implement attribute lookup in Python 2.2 > for TP_BASETYPEs (methods are now magically handled by the tp_methods > slot, there doesn't seem to be a corresponding feature for attributes > though) ? Ah, now I see the question. There's a tp_members slot, similar to the tp_methods slot. 
The tp_members slot is a pointer to a NULL-terminated array of the same form that you would pass to PyMember_Get(). If your attributes require custom computation, there's also a tp_getset slot which points to a NULL-terminated array of 'struct getsetlist' items, which specify a name, a getter C function, a setter C function, and a context void *. This means you have to write a pair of (very simple) functions for each writable attribute, or a single function per read-only attribute. (The context pointer gives you a chance to share function implementations, but I haven't found the need for this yet.) Examples of all of these can be found in typeobject.c, look for type_getsets and type_members. > - Could the generic APIs perhaps fall back to tp_getattr to make > the transition from classic types to base types a little easier ? I'd rather not: that would prevent discovery of attributes supported by the classic tp_getattr. The beauty of the new scheme is that *all* attributes (methods and data) are listed in the type's __dict__. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Sun Jul 22 17:28:54 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 22 Jul 2001 18:28:54 +0200 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com> Message-ID: <3B5AFF46.7BBCE245@lemburg.com> Guido van Rossum wrote: > > > A few people keep asking me for new features on those types, so > > I guess enabling this for Python 2.2 would be a real advantage for > > them. > > > > I still haven't found out how to solve the construction problem > > though (the base type is hard coded into various factory functions > > and methods)... the factory methods could use self.__class__ > > to solve this, but the factory functions would need some different > > tweaking. 
> > Using the new "classmethod" feature you can make the factory functions > class methods. Hmm, I don't like these class methods, but it would probably help with the problem... from mx.DateTime import DateTime dt1 = DateTime(2001,1,16) dt2 = DateTime.From("16. Januar 2001") Still looks silly to me... (I don't like these class methods). > > > Yes, I've worked out a scheme to make this work, but I don't think > > > I've written it down anywhere yet. If your tp_new calls tp_alloc, and > > > your tp_dealloc calls tp_free, then a subtype can override tp_alloc > > > *and* tp_free and the right thing will happen. A subtype can also > > > *extend* tp_new and tp_dealloc. (tp_new and tp_dealloc are sort-of > > > each other's companions, and ditto for tp_alloc and tp_free.) > > > > So I will have to implement tp_free as well ?! Currently I have > > tp_new (which calls tp_alloc), tp_alloc, tp_init for the creation > > procedure and tp_dealloc (which does not call tp_free) for the > > finalization. > > Yes, if your tp_new calls tp_alloc, your tp_dealloc should call > tp_free. Otherwise the user can override tp_alloc to use a different > heap, and tp_dealloc would mess up. Ok. > > I wonder whether it'd be a good idea to have a tp_del in there > > as well (the __del__ at C level) which is then called instead > > of tp_dealloc if set and which must call tp_dealloc if the > > instance is going to be deleted for good. > > I've been thinking about this. I don't think that's quite the right > protocol; I don't want to complicate the DECREF macro any more. I > think that tp_dealloc must call tp_del and then decide whether to > proceed depending on the refcount. Have you tried to move the decref action into a separate function (which is only called in case the refcount reaches 0) ? I think that this could in fact enhance the overall performance since the compiler can then decide whether or not to inline the relevant code. I wonder what the impact would be... > > > > 2. 
In which order are the allocation/deallocation methods > > > > of subclass and base class called (if at all) and how > > > > does this depend on whether they are implemented or inherited ? > > > > > > Here's the scheme. A subtype's tp_new should call the base type's > > > tp_new, passing the subtype. The base class will call tp_alloc, which > > > is the subtype's version. Similar for deallocation: the subtype's > > > tp_dealloc calls the base type's tp_dealloc which calls tp_free which > > > is the subtype's version. > > > > Like this... ? > > > > subtype basetype > > ---------------------------------------------------- > > Creation > > > > tp_new(subtype) > > -> tp_new(subtype) # calls tp_alloc & tp_init > > > > tp_alloc(subtype) <- > > -> tp_alloc(subtype) > > Typically, the derived type's tp_alloc shouldn't call the base type's > tp_alloc -- tp_alloc is supposed to allocate memory for the actual > type, zero it, set the type pointer and reference count, and register > it with GC. Any other initializations that can't be left to tp_init > (which is optional) are tp_new's responsibility. Good, so overriding the tp_alloc/free slots is generally not a wise thing to do, I guess. > > tp_init(instance) <- > > -> tp_init(instance) > > > > Finalization > > > > ( > > tp_delete(instance) > > -> tp_delete(instance) # calls tp_dealloc if > > # the instance should > > # be deleted > > ) > > tp_dealloc(instance) > > -> tp_dealloc(instance) # calls tp_free > > > > tp_free(instance) <- > > -> tp_free(instance) > > Likewise, tp_free needn't call the base tp_free. > > > > > 3. How can I make attributes visible in subclassed types ? > > > > > > > > Even though I found out that I need to use the generic APIs > > > > PyObject_GenericGet|SetAttr() for the tp_get|setattro to > > > > make methods visible, attributes cannot be accessed (and this > > > > even though dir(instance) displays them). > > > > > > Strange. This should work. Probably something's subtly wrong in your > > > setup.
Compare your code to xxsubtype.c. > > > > The xxsubtype doesn't define any attributes and neither do lists > > or dictionaries so there seems to be no precedent. > > > > In mxDateTime under Python 2.1, the tp_getattr slot takes care of > > processing attribute lookup. Now to enable the dynamic goodies in > > Python 2.2, I have to provide the tp_getattro slot (and set it to > > the generic APIs mentioned above). > > > > Since tp_getattro overrides the tp_getattr slot, I have to rely > > on the generic APIs calling back to the tp_getattr slot to process > > the attributes which are not dynamically set by the user or a > > subclass. However, the new generic lookup APIs do not call the > > tp_getattr slot at all and thus the attributes which were "defined" > > by the tp_getattr in Python 2.1 are no longer visible. > > > > - How do I have to implement attribute lookup in Python 2.2 > > for TP_BASETYPEs (methods are now magically handled by the tp_methods > > slot, there doesn't seem to be a corresponding feature for attributes > > though) ? > > Ah, now I see the question. There's a tp_members slot, similar to the > tp_methods slot. The tp_members slot is a pointer to a > NULL-terminated array of the same form that you would pass to > PyMember_Get(). If your attributes require custom computation, > there's also a tp_getset slot which points to a NULL-terminated array > of 'struct getsetlist' items, which specify a name, a getter C > function, a setter C function, and a context void *. This means you > have to write a pair of (very simple) functions for each writable > attribute, or a single function per read-only attribute. (The context > pointer gives you a chance to share function implementations, but > I haven't found the need for this yet.) > > Examples of all of these can be found in typeobject.c, look for > type_getsets and type_members. Thanks. I'll take a look at the implementation ...
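For readers following along in Python rather than C, tp_getset is roughly the C-level counterpart of property: one getter/setter function pair per writable attribute, or a single getter for a read-only one. A minimal sketch with an invented class:

```python
class Temperature:                   # invented class for illustration
    def __init__(self, celsius=0.0):
        self._celsius = celsius

    @property
    def celsius(self):               # "getter" half of a tp_getset-style pair
        return self._celsius

    @celsius.setter
    def celsius(self, value):        # "setter" half, for a writable attribute
        self._celsius = float(value)

    @property
    def fahrenheit(self):            # read-only: a single getter function
        return self._celsius * 9 / 5 + 32


t = Temperature(100.0)
print(t.fahrenheit)                  # 212.0
t.celsius = 0
print(t.fahrenheit)                  # 32.0
```

Just as with tp_getset entries, both halves of the pair show up as a single attribute name on the instance.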
> > - Could the generic APIs perhaps fall back to tp_getattr to make > > the transition from classic types to base types a little easier ? > > I'd rather not: that would prevent discovery of attributes supported > by the classic tp_getattr. The beauty of the new scheme is that *all* > attributes (methods and data) are listed in the type's __dict__. Uhm, I think you misunderstood me: tp_getattr is not used anymore once the Python interpreter finds a tp_getattro slot implementation, so there's nothing to prevent ;-): PyObject_GetAttr() does not use tp_getattr if tp_getattro is defined, while PyObject_GetAttrString() prefers tp_getattr over tp_getattro -- something is not symmetric here ! As a result, dir() finds the __members__ attribute which lists the attributes (it uses PyObject_GetAttrString()), but instance.attribute does not work because it uses PyObject_GetAttr(). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Sun Jul 22 17:42:49 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 22 Jul 2001 12:42:49 -0400 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types In-Reply-To: Your message of "Sun, 22 Jul 2001 18:28:54 +0200." <3B5AFF46.7BBCE245@lemburg.com> References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com> <3B5AFF46.7BBCE245@lemburg.com> Message-ID: <200107221642.MAA11705@cj20424-a.reston1.va.home.com> > Hmm, I don't like these class methods, but it would probably > help with the problem... > > from mx.DateTime import DateTime > > dt1 = DateTime(2001,1,16) > dt2 = DateTime.From("16. Januar 2001") > > Still looks silly to me... (I don't like these class methods). Maybe it's time you started to like them.
:-) > > I've been thinking about this. I don't think that's quite the right > > protocol; I don't want to complicate the DECREF macro any more. I > > think that tp_dealloc must call tp_del and then decide whether to > > proceed depending on the refcount. > > Have you tried to move the decref action into a separate function > (which is only called in case the refcount reaches 0) ? I think > that this could in fact enhance the overall performance since > the compiler can then decide whether or not to inline the relevant > code. > > I wonder what the impact would be... My gut tells me that the compiler will usually *not* inline it, and then it will slow deallocation down by one extra function call. And if the compiler *does* inline it, it's code bloat. So either way you lose, my gut tells me. (The dealloc functions for most common types are very fast and I would hate to see them slow down.) > Good, so overriding the tp_alloc/free slots is generally not > a wise thing to do, I guess. If the base type has a custom free list (like the int type does), you *have* to override it if the instances of the subtype are larger than the base type. Currently int doesn't allow subtyping yet because I haven't refactored its code in this area yet. > > > - Could the generic APIs perhaps fall back to tp_getattr to make > > > the transition from classic types to base types a little easier ? > > > > I'd rather not: that would prevent discovery of attributes supported > > by the classic tp_getattr. The beauty of the new scheme is that *all* > > attributes (methods and data) are listed in the type's __dict__. > > Uhm, I think you misunderstood me: tp_getattr is not used anymore > once the Python interpreter finds a tp_getattro slot > implementation, so there's nothing to prevent ;-): > > PyObject_GetAttr() does not use tp_getattr if tp_getattro is > defined, while PyObject_GetAttrString() prefers tp_getattr over > tp_getattro -- something is not symmetric here !
> > As a result, dir() finds the __members__ attribute which lists > the attributes (it uses PyObject_GetAttrString()), but > instance.attribute does not work because it uses PyObject_GetAttr(). The simplified rule is that a type should only provide *either* tp_getattr *or* tp_getattro, and likewise for set. The complete rule is that if you insist on having both tp_getattr and tp_getattro, they should implement the same semantics -- tp_getattr should be faster when PyObject_GetAttrString() is called, and tp_getattro should be faster when PyObject_GetAttr() is called. Apparently you left your tp_getattr implementation in place but added PyObject_GenericGetAttr to the tp_getattro slot -- this simply doesn't follow the rules. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Sun Jul 22 18:41:56 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 22 Jul 2001 19:41:56 +0200 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com> <3B5AFF46.7BBCE245@lemburg.com> <200107221642.MAA11705@cj20424-a.reston1.va.home.com> Message-ID: <3B5B1064.5882398A@lemburg.com> Guido van Rossum wrote: > > > Hmm, I don't like these class methods, but it would probably > > help with the problem... > > > > from mx.DateTime import DateTime > > > > dt1 = DateTime(2001,1,16) > > dt2 = DateTime.From("16. Januar 2001") > > > > Still looks silly to me... (I don't like these class methods). > > Maybe it's time you started to like them. :-) I'll have a hard time finding my way through all these extra dots in the names ;-) > > Good, so overriding the tp_alloc/free slots is generally not > > a wise thing to do, I guess. > > If the base type has a custom free list (like the int type does), you > *have* to override it if the instances of the subtype are larger than > the base type.
Currently int doesn't allow subtyping yet because I > haven't refactored its code in this area yet. What I did was to enhance the base class' tp_alloc and tp_dealloc APIs to only use the free list in case the type being passed to the APIs is a base type; in all other cases, standard processing takes place. Perhaps ints could do the same ? > > > > - Could the generic APIs perhaps fall back to tp_getattr to make > > > > the transition from classic types to base types a little easier ? > > > > > > I'd rather not: that would prevent discovery of attributes supported > > > by the classic tp_getattr. The beauty of the new scheme is that *all* > > > attributes (methods and data) are listed in the type's __dict__. > > > > Uhm, I think you misunderstood me: tp_getattr is not used anymore > > once the Python interpreter finds a tp_getattro slot > > implementation, so there's nothing to prevent ;-): > > > > PyObject_GetAttr() does not use tp_getattr if tp_getattro is > > defined, while PyObject_GetAttrString() prefers tp_getattr over > > tp_getattro -- something is not symmetric here ! > > > > As a result, dir() finds the __members__ attribute which lists > > the attributes (it uses PyObject_GetAttrString(), but > > instance.attribute does not work because it uses PyObject_GetAttr(). > > The simplified rule is that a type should only provide *either* > tp_getattr *or* tp_getattro, and likewise for set. The complete rule > is that if you insist on having both tp_getattr and tp_getattro, they > should implement the same semantics -- tp_getattr should be faster > when PyObject_GetAttrString() is called, and tp_getattro should be > faster when PyObject_GetAttr() is called. Ah, ok, didn't know that rule. > Apparently you left your tp_getattr implementation in place but added > PyObject_GenericGetAttr to the tp_getattro slot -- this simply doesn't > follow the rules. Yep. That's what I did. 
I'll move to the new scheme for 2.2 then and leave the old tp_getattr around for backward compatibility. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Mon Jul 23 03:40:27 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 22 Jul 2001 22:40:27 -0400 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types In-Reply-To: Your message of "Sun, 22 Jul 2001 19:41:56 +0200." <3B5B1064.5882398A@lemburg.com> References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com> <3B5AFF46.7BBCE245@lemburg.com> <200107221642.MAA11705@cj20424-a.reston1.va.home.com> <3B5B1064.5882398A@lemburg.com> Message-ID: <200107230240.WAA14644@cj20424-a.reston1.va.home.com> > What I did was to enhance the base class' tp_alloc and tp_dealloc > APIs to only use the free list in case the type being passed to the > APIs is a base type; in all other cases, standard processing takes > place. > > Perhaps ints could do the same ? Yes, that's what I was planning to do. > > The simplified rule is that a type should only provide *either* > > tp_getattr *or* tp_getattro, and likewise for set. The complete rule > > is that if you insist on having both tp_getattr and tp_getattro, they > > should implement the same semantics -- tp_getattr should be faster > > when PyObject_GetAttrString() is called, and tp_getattro should be > > faster when PyObject_GetAttr() is called. > > Ah, ok, didn't know that rule. Well, I just made it up today. :-) But it's a sensible rule, if you want predictable results. > I'll move to the new scheme for 2.2 then and leave the old tp_getattr > around for backward compatibility. 
You should #ifdef on the Python version, unless you make your tp_getattr do everything that tp_getattro does (possibly by calling on the latter). --Guido van Rossum (home page: http://www.python.org/~guido/) From MarkH@ActiveState.com Tue Jul 24 00:02:41 2001 From: MarkH@ActiveState.com (Mark Hammond) Date: Mon, 23 Jul 2001 16:02:41 -0700 Subject: [Python-Dev] 2.2 Unicode questions In-Reply-To: <3B56DB33.71C9161B@lemburg.com> Message-ID: > Guido van Rossum wrote: > > > > > First, a short one, Mark Hammond's patch for supporting MBCS on > > > Windows. I trust everyone can handle a little bit of TeX markup? > > > > > > % XXX is this explanation correct? > > > \item When presented with a Unicode filename on Windows, Python will > > > now correctly convert it to a string using the MBCS encoding. > > > Filenames on Windows are a case where Python's choice of ASCII as > > > the default encoding turns out to be an annoyance. > > > > > > This patch also adds \samp{et} as a format sequence to > > > \cfunction{PyArg_ParseTuple}; \samp{et} takes both a parameter and > > > an encoding name, and converts it to the given encoding if the > > > parameter turns out to be a Unicode string, or leaves it alone if > > > it's an 8-bit string, assuming it to already be in the desired > > > encoding. (This differs from the \samp{es} format character, which > > > assumes that 8-bit strings are in Python's default ASCII encoding > > > and converts them to the specified new encoding.) > > > > > > (Contributed by Mark Hammond with assistance from Marc-Andr\'e > > > Lemburg.) > > > > I learned something here, so I hope this is correct. :-) > > The last part is... the rest is for Mark to comment on. Sorry for the delay - I hope this response is not too late. The description is technically correct, but may be better phrased as: \item When presented with a Unicode filename on Windows, Python will now convert it to an MBCS-encoded string, as used by the Microsoft file APIs. 
As MBCS is explicitly used by the file APIs, the default Python encoding (be it ASCII or any other encoding explicitly set) is generally not appropriate for these conversions. Mark. From fredrik@pythonware.com Mon Jul 23 08:45:35 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 23 Jul 2001 09:45:35 +0200 Subject: [Python-Dev] 2.2 Unicode questions References: <3B585EC2.DF9225D@lemburg.com> Message-ID: <01ae01c1134b$76ca7820$4ffa42d5@hagrid> mal wrote: > Same here: UTF-16 -> UCS-2. Note that I very much favour > removing the surrogate generation in unichr() for UCS2-builds. > > If I don't hear strong opposition, I'll disable this feature > which was added as part of the UCS-4 patches. unichr() > will then raise an exception as it did in version 2.1. the rationale behind this change was that unichr() should behave like the \U escape. (they both take a 32-bit character code, and turn it into a unicode string; see GvR's mails in the ucs4 thread for more on this topic). don't change one of them without considering if the other one really does the right thing. From mal@lemburg.com Mon Jul 23 09:52:18 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 23 Jul 2001 10:52:18 +0200 Subject: [Python-Dev] 2.2 Unicode questions References: <3B585EC2.DF9225D@lemburg.com> <01ae01c1134b$76ca7820$4ffa42d5@hagrid> Message-ID: <3B5BE5C2.878EB6A2@lemburg.com> Fredrik Lundh wrote: > > mal wrote: > > Same here: UTF-16 -> UCS-2. Note that I very much favour > > removing the surrogate generation in unichr() for UCS2-builds. > > > > If I don't hear strong opposition, I'll disable this feature > > which was added as part of the UCS-4 patches. unichr() > > will then raise an exception as it did in version 2.1. > > the rationale behind this change was that unichr() should > behave like the \U escape. Please note that unichr() is a low-level API which is part of the Unicode implementation. 
The implementation itself does not handle surrogates in any special way, only the codecs do (and after my last checkin unicode-escape and UTF-16 do handle surrogates correctly). To simplify the picture: the implementation itself only sees UCS-2 or UCS-4 depending on the compile time option and these do not treat surrogates in any special way except reserve code points for their usage. Accordingly, unichr() should not create UTF-16 but UCS-2 for narrow builds and UCS-4 on wide builds (unichr() is a constructor for code units, not code points). If an application needs a UTF-16 generating API, then it can easily implement one using the UCS-2 generating unichr() API to create Unicode code units representing isolated surrogates. > (they both take a 32-bit character code, and turn it into > a unicode string; see GvR's mails in the ucs4 thread for more > on this topic). > > don't change one of them without considering if the other > one really does the right thing. For those of you who are not too much into all these code unit vs. code point vs. character discussions, a look at the slides of the talk I gave at the European Python Meeting in Bordeaux may provide some insights: http://www.lemburg.com/python/Unicode-Talk.pdf Cheers, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From fredrik@pythonware.com Mon Jul 23 11:00:16 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 23 Jul 2001 12:00:16 +0200 Subject: [Python-Dev] 2.2 Unicode questions References: <3B585EC2.DF9225D@lemburg.com> <01ae01c1134b$76ca7820$4ffa42d5@hagrid> <3B5BE5C2.878EB6A2@lemburg.com> Message-ID: <032801c1135e$490c4900$4ffa42d5@hagrid> MAL wrote: > Please note that unichr() is a low-level API which is part > of the Unicode implementation. well, I thought unichr() was a built-in Python function... 
> To simplify the picture: the implementation itself only sees > UCS-2 or UCS-4 depending on the compile time option and these > do not treat surrogates in any special way except reserve > code points for their usage. Accordingly, unichr() should not > create UTF-16 but UCS-2 for narrow builds and UCS-4 on wide > builds you didn't answer my question: is there any reason why unichr(0xXXXXXXXX) shouldn't return exactly the same thing as "\UXXXXXXXX" ? in 2.0 and 2.1, it doesn't. in 2.2, it does. > (unichr() is a constructor for code units, not code points). really? according to the documentation, it creates unicode *characters*. so does \U, according to the documentation. imo, it makes more sense to let "characters" mean code points than code units, but that's me. the important thing here is to figure out if \U and unichr are the same thing, and fix the code and the documentation to do/say what we mean. From mal@lemburg.com Mon Jul 23 11:36:38 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 23 Jul 2001 12:36:38 +0200 Subject: [Python-Dev] 2.2 Unicode questions References: <3B585EC2.DF9225D@lemburg.com> <01ae01c1134b$76ca7820$4ffa42d5@hagrid> <3B5BE5C2.878EB6A2@lemburg.com> <032801c1135e$490c4900$4ffa42d5@hagrid> Message-ID: <3B5BFE36.B43F33A5@lemburg.com> Fredrik Lundh wrote: > > MAL wrote: > > To simplify the picture: the implementation itself only sees > > UCS-2 or UCS-4 depending on the compile time option and these > > do not treat surrogates in any special way except reserve > > code points for their usage. Accordingly, unichr() should not > > create UTF-16 but UCS-2 for narrow builds and UCS-4 on wide > > builds > > you didn't answer my question: is there any reason why > unichr(0xXXXXXXXX) shouldn't return exactly the same > thing as "\UXXXXXXXX" ? > > in 2.0 and 2.1, it doesn't. in 2.2, it does. > > > (unichr() is a constructor for code units, not code points). Doesn't this answer your question ? 
The point I wanted to make was that unichr() is a constructor for a single code unit just like chr() is a constructor for a single code unit -- in that sense the storage format used by the implementation defines the outcome: for UCS-2 builds, it can only create UCS-2 values, for UCS-4 builds, UCS-4 values are possible as well. The question of u"\UXXXXXXXX" creating surrogates on UCS-2 builds is different: \UXXXXXXXX is an encoding of a Unicode code point, so the codec has to decide whether to map this to two code units or to raise an exception on UCS-2 builds. > really? according to the documentation, it creates unicode > *characters*. so does \U, according to the documentation. > > imo, it makes more sense to let "characters" mean code points > than code units, but that's me. The term "character" is vastly overloaded. There are three different forms of interpretation: graphemes (this is what a user usually sees as character on her display), code points (this is what Unicode encodes) and code units (this is what the implementation uses as the atom for storing code points). Since Python exposes code units (u[0] gives you direct access to the implementation-defined storage area) and makes no assumption about surrogates, it would not be a good idea to suddenly introduce a break in the meaning of the outcome of indexing into a Unicode string (u[0]) and len(unichr()). I know that the name unichr() does not help in this situation; the correct name would be unicodeunit(). > the important thing here is to > figure out if \U and unichr are the same thing, and fix the code > and the documentation to do/say what we mean. Right. Note that apart from agreeing on a common meaning, we should also think about the consequences of breaking len(unichr())==1, e.g. when creating a Unicode string using unichr() you'd expect to find the generated code unit at the position you appended it to the Unicode object. 
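[Editor's note: the surrogate mapping under discussion -- one non-BMP code point becoming two code units on a narrow build -- is the standard UTF-16 split. A minimal sketch of that algorithm (the helper name is made up; this is not code from the implementation being discussed):

```python
def to_surrogate_pair(cp):
    # Split a supplementary-plane code point (U+10000..U+10FFFF)
    # into the UTF-16 high (lead) and low (trail) surrogate code units.
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    cp -= 0x10000
    high = 0xD800 | (cp >> 10)    # high surrogate: 0xD800..0xDBFF
    low = 0xDC00 | (cp & 0x3FF)   # low surrogate:  0xDC00..0xDFFF
    return high, low

print(tuple(hex(u) for u in to_surrogate_pair(0x10000)))  # -> ('0xd800', '0xdc00')
```

For example, U+10437 splits into the pair (0xD801, 0xDC37); a narrow (UCS-2) build stores exactly those two code units when a codec decodes that character, which is why len() of the result is 2 there and 1 on a wide build.]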
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From thomas@xs4all.net Mon Jul 23 12:04:54 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 23 Jul 2001 13:04:54 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Mac/Distributions/(vise) Python 2.1.vct,1.12,1.12.4.1 In-Reply-To: Message-ID: <20010723130453.A569@xs4all.nl> On Sun, Jul 22, 2001 at 03:10:52PM -0700, Jack Jansen wrote: > Update of /cvsroot/python/python/dist/src/Mac/Distributions/(vise) > In directory usw-pr-cvs1:/tmp/cvs-serv20074/Python/Mac/Distributions/(vise) > Modified Files: > Tag: release21-maint > Python 2.1.vct > Log Message: > Files used for 2.1.1c1 distribution. We really should use different tags for the regular vs. the Mac release (or any other release, for that matter) next time :P > ***** Bogus filespec: Python Note to self: fix this bug in syncmail (and yes, we can fix this bug in syncmail.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From martin@strakt.com Mon Jul 23 13:21:57 2001 From: martin@strakt.com (Martin Sjögren) Date: Mon, 23 Jul 2001 14:21:57 +0200 Subject: [Python-Dev] BEGIN_ALLOW_THREADS Message-ID: <20010723142157.B16665@strakt.com> Hello Is there a reason the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS don't allow an argument specifying what variable to save the state to? I needed this myself so I wrote the following: #ifdef WITH_THREAD # define MY_BEGIN_ALLOW_THREADS(st) \ { st = PyEval_SaveThread(); } # define MY_END_ALLOW_THREADS(st) \ { PyEval_RestoreThread(st); st = NULL; } #else # define MY_BEGIN_ALLOW_THREADS(st) # define MY_END_ALLOW_THREADS(st) { st = NULL; } #endif It works just fine but has one drawback: Whenever Py_BEGIN_ALLOW_THREADS changes, I have to change my macros too. 
Wouldn't it be reasonable to supply two sets of macros, one that allows exactly this, and one that does what Py_BEGIN_ALLOW_THREADS currently does? Martin Sjögren -- Martin Sjögren martin@strakt.com ICQ : 41245059 Phone: +46 (0)31 405242 Cell: +46 (0)739 169191 GPG key: http://www.strakt.com/~martin/gpg.html From m.favas@per.dem.csiro.au Mon Jul 23 20:58:04 2001 From: m.favas@per.dem.csiro.au (Mark Favas) Date: Tue, 24 Jul 2001 03:58:04 +0800 Subject: [Python-Dev] CVS build breakage: snprintf finds its way into socketmodule.c Message-ID: <3B5C81CC.E0F6D3CF@per.dem.csiro.au> In the current CVS of 2.2, a call to snprintf now occurs in socketmodule.c, breaking builds on those systems without such a library call (such as Tru64 Unix, and older Solarises). -- Mark Favas - m.favas@per.dem.csiro.au CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA From barry@zope.com Mon Jul 23 21:39:32 2001 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 23 Jul 2001 16:39:32 -0400 Subject: [Python-Dev] mail.python.org black listed ?! References: <20010721012132.A9882@xs4all.nl> <3B595957.1F4D85F5@lemburg.com> Message-ID: <15196.35716.121063.991831@anthem.wooz.org> >>>>> "M" == M writes: M> Perhaps we should start a small project for such a tool written M> in Python (to bring the subject back on topic ;-) and place it M> on the web somewhere ?! I think that's an excellent idea! M> If we separate out the engine from the rest we could also have M> different backends, e.g. one which hooks into .forward as M> filter, a daemon style backend which does on-server flagging M> based on imap, a Mailman filter backend which does the same for M> mailing lists etc. M> Would be cool to have python-list mark non-python spam using a M> special header automagically ;-) We could go one better in MM2.1. There's now a "topics filter" feature in the alpha codebase (sponsored by Control.com -- thanks guys!) 
and I can easily see how it might be extended to something like: - The filter marks the message with a % confidence of being spam (e.g. X-Spam: 75%) - Each Mailman recipient could specify the threshold above which they do not want to receive the message (e.g. don't send me anything that's spam with a more than 70% confidence level). This only works for regular delivery. -Barry From barry@zope.com Mon Jul 23 21:56:46 2001 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 23 Jul 2001 16:56:46 -0400 Subject: [Python-Dev] CVS build breakage: snprintf finds its way into socketmodule.c References: <3B5C81CC.E0F6D3CF@per.dem.csiro.au> Message-ID: <15196.36750.517951.674031@anthem.wooz.org> >>>>> "MF" == Mark Favas writes: MF> In the current CVS of 2.2, a call to snprintf now occurs in MF> socketmodule.c, breaking builds on those systems without such MF> a library call (such as Tru64 Unix, and older Solarises). I have a GPL'd version of vsnprintf() -- taken from GNU screen -- in Mailman for systems that don't have native support. That's not appropriate for Python, but I seem to remember a few other LGPL or MIT-ish licensed versions floating around when I did a search a couple of years ago. Maybe it's time to add our own which would only be linked in if the platform didn't have native support? -Barry From m.favas@per.dem.csiro.au Mon Jul 23 22:21:05 2001 From: m.favas@per.dem.csiro.au (Mark Favas) Date: Tue, 24 Jul 2001 05:21:05 +0800 Subject: [Python-Dev] Warning on use of "unset VARIABLE_NOT_SET" in Makefiles on FreeBSD 4.3 Message-ID: <3B5C9541.B69285D9@per.dem.csiro.au> It seems that FreeBSD 4.3-RELEASE considers that "unset VARIABLE_NOT_ALREADY_SET" should be an error and sets the shell return code to 1. This causes "make" to exit with an error when executing (for example) unset PYTHONPATH PYTHONHOME PYTHONSTARTUP; \ ./$(PYTHON) $(srcdir)/setup.py build The "unset" is no longer in the CVS version, but is in 2.2a1... 
uname -a
FreeBSD teche 4.3-RELEASE FreeBSD 4.3-RELEASE
sh
$ unset GGGG
$ echo $?
1
GGGG=42
unset GGGG
echo $?
0
-- Mark Favas - m.favas@per.dem.csiro.au CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA From m.favas@per.dem.csiro.au Mon Jul 23 22:26:28 2001 From: m.favas@per.dem.csiro.au (Mark Favas) Date: Tue, 24 Jul 2001 05:26:28 +0800 Subject: [Python-Dev] CVS build breakage: snprintf finds its way into socketmodule.c References: <3B5C81CC.E0F6D3CF@per.dem.csiro.au> <15196.36750.517951.674031@anthem.wooz.org> Message-ID: <3B5C9684.BC43661B@per.dem.csiro.au> "Barry A. Warsaw" wrote: > > >>>>> "MF" == Mark Favas writes: > > MF> In the current CVS of 2.2, a call to snprintf now occurs in > MF> socketmodule.c, breaking builds on those systems without such > MF> a library call (such as Tru64 Unix, and older Solarises). > > I have a GPL'd version of vsnprintf() -- taken from GNU screen -- in > Mailman for systems that don't have native support. That's not > appropriate for Python, but I seem to remember a few other LGPL or > MIT-ish licensed versions floating around when I did a search a couple > of years ago. Maybe it's time to add our own which would only be > linked in if the platform didn't have native support? > > -Barry How about the one at http://www.ijs.si/software/snprintf/ ? From the URL: """ Author Mark Martinec, April 1999, June 2000 Copyright © 1999, Mark Martinec Terms and conditions ... This program is free software; you can redistribute it and/or modify it under the terms of the Frontier Artistic License which comes with this Kit. Features careful adherence to specs regarding flags, field width and precision; good performance for large string handling (large format, large argument or large paddings). 
Performance is similar to system's sprintf and in several cases significantly better (make sure you compile with optimizations turned on, tell the compiler the code is strict ANSI if necessary to give it more freedom for optimizations); return value semantics per ISO/IEC 9899:1999 ("ISO C99"); written in standard ISO/ANSI C - requires an ANSI C compiler. """ -- Mark Favas - m.favas@per.dem.csiro.au CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA From fdrake@acm.org Mon Jul 23 22:42:30 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 23 Jul 2001 17:42:30 -0400 (EDT) Subject: [Python-Dev] Warning on use of "unset VARIABLE_NOT_SET" in Makefiles on FreeBSD 4.3 In-Reply-To: <3B5C9541.B69285D9@per.dem.csiro.au> References: <3B5C9541.B69285D9@per.dem.csiro.au> Message-ID: <15196.39494.845671.730890@cj42289-a.reston1.va.home.com> Mark Favas writes: > The "unset" is no longer in the CVS version, but is in 2.2a1... And don't expect it to return; Neil's implementation of the -E option means we don't have to worry about this any more. (Thanks, Neil!) -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From skip@pobox.com (Skip Montanaro) Mon Jul 23 23:11:22 2001 From: skip@pobox.com (Skip Montanaro) Date: Mon, 23 Jul 2001 17:11:22 -0500 Subject: [Python-Dev] mail.python.org black listed ?! In-Reply-To: <15196.35716.121063.991831@anthem.wooz.org> References: <20010721012132.A9882@xs4all.nl> <3B595957.1F4D85F5@lemburg.com> <15196.35716.121063.991831@anthem.wooz.org> Message-ID: <15196.41226.977676.237807@beluga.mojam.com> BAW> - The filter marks the message with a % confidence of being spam BAW> (e.g. X-Spam: 75%) BAW> - Each Mailman recipient could specify the threshold above which BAW> they do not want to receive the message (e.g. don't send me BAW> anything that's spam with a more than 70% confidence level). BAW> This only works for regular delivery. 
One thing to consider is that many mail filters probably only have crude numeric comparison capability. In procmail I have to filter using regular expressions. (Most of) my mail comes through pobox.com, which modifies the subject header to stuff like Subject: [spam score 9.00/10.0 -pobox] remove me While I'm sure I could create a regular expression that would allow me to classify pobox.com's spam score numerically (or call out to a Python script to do it for me), I'm lazy enough that I simply lump everything that has a pobox.com spam subject together (I think 5.0/10.0 is their minimum criterion for subject mangling), tossing everything with spam.*-pobox in the Subject into the spam-hole. I assume other mail software systems' filtering capabilities are similarly limited. I would therefore suggest that the X-Spam header be simply a three-digit number in the range 000 to 100. (No percent sign, always with any necessary leading zeroes.) It might even be better to create an X-Spam-Value header in unary notation, e.g. make a slightly smaller range (say 0 to 50) and include a header like: X-Spam-Value: sssssssssssssssssssssssssssssssssss to indicate a 70% likelihood (35 "s"s). You could then match it with X-Spam-Value: s{25,50} in procmail to spam-categorize anything with a probability of spamhood >= 50%. You could include a readable X-Spam header like: X-Spam: rated 75% probability of being spam by "Spam Pie v. 0.1" Skip From skip@pobox.com (Skip Montanaro) Mon Jul 23 23:42:33 2001 From: skip@pobox.com (Skip Montanaro) Date: Mon, 23 Jul 2001 17:42:33 -0500 Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together? 
Message-ID: <15196.43097.529737.173915@beluga.mojam.com> There are several active or could-be-active PEPs related to Python's numeric behavior:

    S 211 pep-0211.txt Adding A New Outer Product Operator     Wilson
    S 228 pep-0228.txt Reworking Python's Numeric Model        Zadka
    S 237 pep-0237.txt Unifying Long Integers and Integers     Zadka
    S 238 pep-0238.txt Non-integer Division                    Zadka
    S 239 pep-0239.txt Adding a Rational Type to Python        Zadka
    S 240 pep-0240.txt Adding a Rational Literal to Python     Zadka
    S 242 pep-0242.txt Numeric Kinds                           Dubois

Instead of implementing them piecemeal, shouldn't we be considering them as a related group? For example, implementing any or all of PEPs 237, 239 and 240 might well have an effect on what needs to be done for PEP 238. With slight modifications, the proposals in PEP 242 might well subsume PEP 238's functionality in a different way. If the semantics of arithmetic are going to change, I think they should change in the context of expanded capability in the language. -- Skip Montanaro (skip@pobox.com) http://www.mojam.com/ http://www.musi-cal.com/ From fdrake@acm.org Mon Jul 23 23:04:50 2001 From: fdrake@acm.org (Fred L. Drake) Date: Mon, 23 Jul 2001 18:04:50 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010723220450.33D4428932@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Various minor updates. From martin@loewis.home.cs.tu-berlin.de Tue Jul 24 07:38:25 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 24 Jul 2001 08:38:25 +0200 Subject: [Python-Dev] CVS build breakage: snprintf finds its way into socketmodule.c Message-ID: <200107240638.f6O6cPn03340@mira.informatik.hu-berlin.de> > In the current CVS of 2.2, a call to snprintf now occurs in > socketmodule.c, breaking builds on those systems without such a > library call (such as Tru64 Unix, and older Solarises). 
Following itojun's proposal, I have now added an autoconf test for snprintf, and use sprintf if it is not available. Regards, Martin From mal@lemburg.com Tue Jul 24 11:15:43 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 24 Jul 2001 12:15:43 +0200 Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together? References: <15196.43097.529737.173915@beluga.mojam.com> Message-ID: <3B5D4ACF.F79CD179@lemburg.com> Skip Montanaro wrote: > > There are several active or could-be-active PEPs related to Python's numeric > behavior: > > S 211 pep-0211.txt Adding A New Outer Product Operator Wilson > S 228 pep-0228.txt Reworking Python's Numeric Model Zadka > S 237 pep-0237.txt Unifying Long Integers and Integers Zadka > S 238 pep-0238.txt Non-integer Division Zadka > S 239 pep-0239.txt Adding a Rational Type to Python Zadka > S 240 pep-0240.txt Adding a Rational Literal to Python Zadka > S 242 pep-0242.txt Numeric Kinds Dubois > > Instead of implementing them piecemeal, shouldn't we be considering them as > a related group? For example, implementing any or all of PEPs 237, 239 and > 240 might well have an effect on what needs to be done for PEP 238. With > slight modifications, the proposals in PEP 242 might well subsume PEP 238's > functionality in a different way. > > If the semantics of arithmetic are going to change, I think they should > change in the context of expanded capability in the language. May I suggest that these rather controversial changes be carried out on a separate branch of the Python source tree before adding them to the trunk ?! The reasoning here is that numerics are so low-level that porting applications to a new release implementing these changes will cause a lot of work (mostly due to the dynamic nature of Python). Another suggestion I would like to make is that the new semantics are first implemented using alternative subclassed numeric objects (e.g. 
newint()) which can then live side-by-side with the old semantics types for a few releases until they replace the old types. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Tue Jul 24 11:08:34 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 24 Jul 2001 12:08:34 +0200 Subject: [Python-Dev] Spam flagging filter (mail.python.org black listed ?!) References: <20010721012132.A9882@xs4all.nl> <3B595957.1F4D85F5@lemburg.com> <15196.35716.121063.991831@anthem.wooz.org> Message-ID: <3B5D4922.6FBD8743@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "M" == M writes: > > M> Perhaps we should start a small project for such a tool written > M> in Python (to bring the subject back on topic ;-) and place it > M> on the web somewhere ?! > > I think that's an excellent idea! > > M> If we separate out the engine from the rest we could also have > M> different backends, e.g. one which hooks into .forward as > M> filter, a daemon style backend which does on-server flagging > M> based on imap, a Mailman filter backend which does the same for > M> mailing lists etc. > > M> Would be cool to have python-list mark non-python spam using a > M> special header automagically ;-) > > We could go one better in MM2.1. There's now a "topics filter" > feature in the alpha codebase (sponsored by Control.com -- thanks > guys!) and I can easily see how it might be extended to something > like: > > - The filter marks the message with a % confidence of being spam > (e.g. X-Spam: 75%) I think we ought to consider a format which allows easy mail filtering. Like Skip mentioned, mail filters are usually not very smart about parsing the headers, e.g. Netscape only allows you to do substring matching. 
Ideal would be a format like: X-SpamLevel: 0123456789x (100%) X-SpamLevel: 0123456789 (90%) X-SpamLevel: 0123456 (60%) X-SpamLevel: 0 (0%) A substring filter for e.g. "012" would then move all messages with a spam level of >=20% to Trash. > - Each Mailman recipient could specify the threshhold above which they > do not want to receive the message (e.g. don't sent me anything > that's spam with a more than 70% confidence level). This only works > for regular delivery. Cool (even though I think that client side filtering is more flexible). Could you send me the filter source code, so that I can look into splitting out the engine for use by e.g. procmail ?! Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From jepler@inetnebr.com Tue Jul 24 13:49:16 2001 From: jepler@inetnebr.com (Jeff Epler) Date: Tue, 24 Jul 2001 07:49:16 -0500 Subject: [Python-Dev] cygwin "test_pwd" failure on win98 Message-ID: <20010724074916.52922@bald.inetnebr.com> As described in an ancient cygnus mailing list message, http://www.cygwin.com/ml/cygwin-announce/2000/msg00042.html "getpwnam/getpwuid functions report NULL instead of a fallback entry *if /etc/passwd exist* and the user name or uid is not found in the file" (my emphasis) Once I generated a password file, using mkpasswd.exe > /etc/passwd test_pwd.py seems to succeed as expected. Strangely, the Cygnus install seems to have automatically generated the passwd file into /etc/group at install time, rather than into /etc/passwd. In any case, this test failure is due to incorrect cygwin32 setup, not Python. Jeff PS Tim, sorry about reporting those two known test failures yesterday. I hope this message is more helpful. 
In any case, when can I expect to receive my punishment from the PS -- \/ http://www.slashdot.org/ Jeff Epler jepler@inetnebr.com "One Architecture, One OS" also translates as "One Egg, One Basket". From jason@tishler.net Tue Jul 24 15:05:42 2001 From: jason@tishler.net (Jason Tishler) Date: Tue, 24 Jul 2001 10:05:42 -0400 Subject: [Python-Dev] cygwin "test_pwd" failure on win98 In-Reply-To: <20010724074916.52922@bald.inetnebr.com> Message-ID: <20010724100542.A328@dothill.com> Jeff, On Tue, Jul 24, 2001 at 07:49:16AM -0500, Jeff Epler wrote: > Once I generated a password file, using > mkpasswd.exe > /etc/passwd > test_pwd.py seems to succeed as expected. Thanks for tracking down the above. When I release the next Cygwin Python distribution, I will update the README with this new information. However, I'm surprised that mkpasswd works under Windows 9x/Me. IIRC, it only works under Windows NT/2000, implying that if one desired a passwd file on 9x/Me, they had to create it by hand. I do not have a 9x/Me machine handy, so I cannot verify the current behavior. > Strangely, the Cygnus install seems to have automatically generated the passwd > file into /etc/group at install time, rather than into /etc/passwd. In any > case, this test failure is due to incorrect cygwin32 setup, not Python. If the above is really true, then this is a bug in the current Cygwin installer (i.e., setup.exe 2.78.2.3) and you should report this to the Cygwin mailing list. Note that I just reran the current setup.exe under NT and it generated valid passwd and group files. Please look in /etc/postinstall. Do you see a file called passwd-grp.bat.done? If so, then examining its content will determine the commands that were automatically run during the install. I would be very interested in your findings. 
My passwd-grp.bat.done contains the following: bin\mkpasswd -l > etc\passwd bin\mkgroup -l > etc\group BTW, the cygwin-apps or cygwin mailing list is a more appropriate forum for a Cygwin Python discussion of this nature. Thanks, Jason -- Jason Tishler Director, Software Engineering Phone: 732.264.8770 x235 Dot Hill Systems Corp. Fax: 732.264.8798 82 Bethany Road, Suite 7 Email: Jason.Tishler@dothill.com Hazlet, NJ 07730 USA WWW: http://www.dothill.com From mal@lemburg.com Tue Jul 24 15:07:35 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 24 Jul 2001 16:07:35 +0200 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com> <3B5AFF46.7BBCE245@lemburg.com> <200107221642.MAA11705@cj20424-a.reston1.va.home.com> <3B5B1064.5882398A@lemburg.com> <200107230240.WAA14644@cj20424-a.reston1.va.home.com> Message-ID: <3B5D8127.8BAE1BD@lemburg.com> Guido van Rossum wrote: > > > > The simplified rule is that a type should only provide *either* > > > tp_getattr *or* tp_getattro, and likewise for set. The complete rule > > > is that if you insist on having both tp_getattr and tp_getattro, they > > > should implement the same semantics -- tp_getattr should be faster > > > when PyObject_GetAttrString() is called, and tp_getattro should be > > > faster when PyObject_GetAttr() is called. > > > > Ah, ok, didn't know that rule. > > Well, I just made it up today. :-) > > But it's a sensible rule, if you want predictable results. I'll implement it. > > I'll move to the new scheme for 2.2 then and leave the old tp_getattr > > around for backward compatibility. > > You should #ifdef on the Python version, unless you make your > tp_getattr do everything that tp_getattro does (possibly by calling on > the latter). Sure; that was my plan. 
I have to maintain 1.5.2 compatibility for the packages, that's why I'm trying to keep code redundancy minimal in the code base. So far, that has worked rather well (except for the attribute lookup part). About the typeobject.h struct names: could you tell me the Py-prefixed names of the getset et al. lists ? I'd rather not use the current non-prefixed names. Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From skip@pobox.com (Skip Montanaro) Tue Jul 24 18:21:42 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 24 Jul 2001 12:21:42 -0500 Subject: [Python-Dev] number-sig anyone? Message-ID: <15197.44710.656892.910976@beluga.mojam.com> Dev-ers, I have been guilty of generating as much heat as light the past few days on the subject of integer division (though not quite as much heat as Stephen Horne!). For that I apologize. There are several active PEPs related to various aspects of Python's concept of numbers. Yesterday I found these: S 211 pep-0211.txt Adding A New Outer Product Operator Wilson S 228 pep-0228.txt Reworking Python's Numeric Model Zadka S 237 pep-0237.txt Unifying Long Integers and Integers Zadka S 238 pep-0238.txt Non-integer Division Zadka S 239 pep-0239.txt Adding a Rational Type to Python Zadka S 240 pep-0240.txt Adding a Rational Literal to Python Zadka S 242 pep-0242.txt Numeric Kinds Dubois Today I took a look at http://mail.python.org/mailman/listinfo and could find no math-sig or number-sig mailing list. If Python's number system is going to change in one or more backwards-incompatible ways, I think there may only be one chance to get it right. I think a number-sig mailing list would be a worthwhile forum to discuss these issues. If there's already a group specific to this topic I missed it. Point me and I will start reading archives. 
Skip From fdrake@acm.org Tue Jul 24 18:27:20 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 24 Jul 2001 13:27:20 -0400 (EDT) Subject: [Python-Dev] number-sig anyone? In-Reply-To: <15197.44710.656892.910976@beluga.mojam.com> References: <15197.44710.656892.910976@beluga.mojam.com> Message-ID: <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> Skip Montanaro writes: > Today I took a look at http://mail.python.org/mailman/listinfo and could > find no math-sig or number-sig mailing list. If Python's number system is > going to change in one or more backwards-incompatible I think there may only > be one chance to get it right. I think a number-sig mailing list would be a > worthwhile forum to discuss these issues. There is the python-numerics mailing list on SourceForge; find it from the Python project page there: http://sourceforge.net/projects/python/ -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From skip@pobox.com (Skip Montanaro) Tue Jul 24 18:57:59 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 24 Jul 2001 12:57:59 -0500 Subject: [Python-Dev] number-sig anyone? In-Reply-To: <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> Message-ID: <15197.46887.496488.246536@beluga.mojam.com> Fred> There is the python-numerics mailing list on SourceForge; find it Fred> from the Python project page there: Fred> http://sourceforge.net/projects/python/ I don't suppose there's any chance those three sourceforge-hosted mailing lists could be mentioned on mail.python.org, could they? Seems to me that those three mailing lists are sponsored by the same organization as those hosted on mail.python.org. Skip From skip@pobox.com (Skip Montanaro) Tue Jul 24 19:01:23 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 24 Jul 2001 13:01:23 -0500 Subject: [Python-Dev] number-sig anyone? 
In-Reply-To: <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> Message-ID: <15197.47091.783426.407294@beluga.mojam.com> Damn... Are the archives of the python-numeric list available somewhere as a single mbox file or something? Looks like geocrawler is going to make me wade through the archives message-by-message on their website. Skip From guido@digicool.com Tue Jul 24 19:26:33 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 24 Jul 2001 14:26:33 -0400 Subject: [Python-Dev] number-sig anyone? In-Reply-To: Your message of "Tue, 24 Jul 2001 12:57:59 CDT." <15197.46887.496488.246536@beluga.mojam.com> References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> <15197.46887.496488.246536@beluga.mojam.com> Message-ID: <200107241826.OAA07299@cj20424-a.reston1.va.home.com> > Fred> There is the python-numerics mailing list on SourceForge; find it > Fred> from the Python project page there: > > Fred> http://sourceforge.net/projects/python/ > > I don't suppose there's any chance those three sourceforge-hosted mailing > lists could be mentioned on mail.python.org, could they? Seems to me that > those three mailing lists are sponsored by the same organization as those > hosted on mail.python.org. I think they should be mentioned there. Fred, can you edit the MailingLists.ht file? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Tue Jul 24 19:33:35 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 24 Jul 2001 14:33:35 -0400 Subject: [Python-Dev] PEP 253: Subtyping Built-in Types In-Reply-To: Your message of "Tue, 24 Jul 2001 16:07:35 +0200." 
<3B5D8127.8BAE1BD@lemburg.com> References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com> <3B5AFF46.7BBCE245@lemburg.com> <200107221642.MAA11705@cj20424-a.reston1.va.home.com> <3B5B1064.5882398A@lemburg.com> <200107230240.WAA14644@cj20424-a.reston1.va.home.com> <3B5D8127.8BAE1BD@lemburg.com> Message-ID: <200107241833.OAA07373@cj20424-a.reston1.va.home.com> > About the typeobject.h struct names: could you tell me the Py-prefixed > names of the getset et al. lists ? I'd rather not use the current > non-prefixed names. Argh, there aren't any Py-prefixed names for these yet! Nor for structmember. Since these are just structure names, they aren't visible to the linker, so there shouldn't be any conflicts with 3rd party libraries. But for consistency, and for compile-time as opposed to link-time conflict avoidance, they really should use Py-prefixes. I've added a bug report for myself so I won't forget. --Guido van Rossum (home page: http://www.python.org/~guido/) From alex_c@MIT.EDU Tue Jul 24 20:04:26 2001 From: alex_c@MIT.EDU (Alex Coventry) Date: 24 Jul 2001 15:04:26 -0400 Subject: [Python-Dev] CVS build breakage: snprintf finds its way into socketmodule.c In-Reply-To: "Martin v. Loewis"'s message of "Tue, 24 Jul 2001 08:38:25 +0200" References: <200107240638.f6O6cPn03340@mira.informatik.hu-berlin.de> Message-ID: > Following itojun's proposal, I have now added an autoconf test for > snprintf, and use sprintf if it is not available. In PySocket_getaddrinfo, would it make sense to increase the allocation of pbuf from 10 characters to, say, 30 characters, in case sprintf(pbuf, "%ld", PyInt_AsLong(pobj)); gets run on a 64-bit machine? Alex. From fdrake@acm.org Tue Jul 24 20:20:47 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 24 Jul 2001 15:20:47 -0400 (EDT) Subject: [Python-Dev] number-sig anyone? 
In-Reply-To: <15197.46887.496488.246536@beluga.mojam.com> References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> <15197.46887.496488.246536@beluga.mojam.com> <200107241826.OAA07299@cj20424-a.reston1.va.home.com> Message-ID: <15197.51855.237526.101524@cj42289-a.reston1.va.home.com> Skip Montanaro writes: > I don't suppose there's any chance those three sourceforge-hosted mailing > lists could be mentioned on mail.python.org, could they? Seems to me that > those three mailing lists are sponsored by the same organization as those > hosted on mail.python.org. I don't know any way to do that. If we could more easily create new lists on python.org (i.e., not have to wait for Barry), they never would have been created on SourceForge. Guido van Rossum writes: > I think they should be mentioned there. Fred, can you edit the > MailingLists.ht file? Done. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From guido@digicool.com Tue Jul 24 20:29:12 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 24 Jul 2001 15:29:12 -0400 Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together? In-Reply-To: Your message of "Tue, 24 Jul 2001 12:15:43 +0200." 
<3B5D4ACF.F79CD179@lemburg.com> References: <15196.43097.529737.173915@beluga.mojam.com> <3B5D4ACF.F79CD179@lemburg.com> Message-ID: <200107241929.PAA07684@cj20424-a.reston1.va.home.com> > Skip Montanaro wrote: > > > > There are several active or could-be-active PEPs related to Python's numeric > > behavior: > > > > S 211 pep-0211.txt Adding A New Outer Product Operator Wilson > > S 228 pep-0228.txt Reworking Python's Numeric Model Zadka > > S 237 pep-0237.txt Unifying Long Integers and Integers Zadka > > S 238 pep-0238.txt Non-integer Division Zadka > > S 239 pep-0239.txt Adding a Rational Type to Python Zadka > > S 240 pep-0240.txt Adding a Rational Literal to Python Zadka > > S 242 pep-0242.txt Numeric Kinds Dubois > > > > Instead of implementing them piecemeal, shouldn't we be > > considering them as a related group? For example, implementing > > any or all of PEPs 237, 239 and 240 might well have an effect on > > what needs to be done for PEP 238. With slight modifications, the > > proposals in PEP 242 might well subsume PEP 238's functionality in > > a different way. > > > > If the semantics of arithmetic are going to change, I think they should > > change in the context of expanded capability in the language. I think PEP 211 and PEP 242 don't belong in this list. PEP 211 doesn't affect Python's number system at all, and PEP 242 proposes a set of storage choices, not choices in semantics. PEP 242 is valid regardless of what we decide about int division. The others, however, indeed are connected. In fact the one that's currently generating so much heat, PEP 238, is an essential prerequisite for PEP 228, and so is PEP 237: if the different numeric types are to be made fully interchangeable, as PEP 228 requires, a different answer for 1/2 and 1.0/2.0 is impossible, and likewise, 1L will be treated the same as 1 (in fact, the 'L' suffix will probably be ignored eventually, and the representation choice is made solely based on the numerical value). 
But it's different the other way around: PEP 238 can easily stand on its own. It addresses a problem that exists even without a unified numeric model. Conversely, if PEP 238 is unacceptable, PEP 228 also has no hope, and PEP 239 is much less attractive. Since PEP 238 is the only one that cannot avoid breaking existing code, I want to introduce it as soon as I can, since the others can only be introduced after the full compatibility waiting period for PEP 238, at least two years. The relationship between PEP 238 and PEP 239 is interesting. PEP 238 currently proposes to let int division return a float, because that's the only available type. But I believe that if we decide down the road that int division should return a rational number instead, this will break little or no code, as long as we embed the rationals in the floats. That is, the coercion rules would use this ordering: int -> long -> rational -> float -> complex This is in spite of the fact that floating point numbers are actually representable exactly as rationals! (Using unbounded precision, which Python rationals should definitely have.) When I add a float to a rational number, I want the result to be a float, not a rational, because the float (most likely) represents an approximated value, and turning it into an exact rational seems a mistake in that case. The property which current division lacks, and which I think is an important step towards PEP 228, is the following: In an expression involving only numeric variables and operators, the *mathematical value* of the result (except for accuracy issues due to the fallibility of floating point hardware) should only depend on the mathematical value of the inputs. The *type* of the result should be the first type in the above coercion list that does not come before any of the input types, and that can represent the mathematical value of the result. 
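[Editorial note: Python eventually grew exactly such a rational type — fractions.Fraction, added much later in 2.6 — and its mixed-mode arithmetic follows the ordering described above: in particular, float beats rational, so an approximate value never silently becomes an exact one. A quick illustration:]

```python
from fractions import Fraction

# Coercion ladder: int -> rational -> float -> complex.
# The result type is the first rung not below any operand's type.
r = Fraction(1, 3)

print(type(1 + r).__name__)     # Fraction: int is embedded in rational
print(type(r + 0.5).__name__)   # float: the approximate type wins
print(type(0.5 + 1j).__name__)  # complex
print(1 + Fraction(1, 2))       # 3/2, computed exactly
```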
With "mathematical value" I mean the abstraction of numbers generally used in mathematics, where the integers are embedded in the rationals which are embedded in the reals, etc. Mathematicians may talk about the type of a variable ("let i be an integer, etc.") but they never talk about the type of a *value*: integer literals are used without prejudice in formulas yielding real results. If we introduce rationals, and we redefine int division as returning a rational instead of a float, this will not affect the mathematical value. (BTW, float is a misnomer. I should have called it real, but alas, I was a little *too* much under the influence of C at the time. This is not worth fixing.) [MAL] > May I suggest that these rather controversial changes be carried > out on a separate branch of the Python source tree before adding > them to the trunk ?! Definitely. I am currently maintaining the PEP 238 implementation as a patch; I don't want to start any new branches before we've merged the descr-branch into the trunk. > The reasoning here is that numerics are so low-level that porting > applications to a new release implementing these changes will > cause a lot of work (mostly due to the dynamic nature of Python). I am aware of the amount of work; that's why I want to allow a very generous waiting period before making it law. > Another suggestion I would like to make is that the new semantics > are first implemented using alternative subclassed numeric > objects (e.g. newint()) which can then live side-by-side with the > old semantics types for a few releases until they replace the > old types. Hm, I don't think that that will be very useful. A new-division-aware module could create integer values of the new type, but in order to protect itself against old-style integers passed in as arguments, it would have to force a conversion of all arguments -- in which case the code becomes even uglier than if we added explicit float() coercions to all arguments. 
Have you looked at my PEP-238 patch at all? It solves the side-by-side problem with a future statement and two new division operators: one that forces int results, for //, one that forces float results, for / under the influence of the appropriate future statement, and one that implements the old behavior, for / without the future statement. --Guido van Rossum (home page: http://www.python.org/~guido/) From alex_c@MIT.EDU Tue Jul 24 21:09:36 2001 From: alex_c@MIT.EDU (Alex Coventry) Date: 24 Jul 2001 16:09:36 -0400 Subject: [Python-Dev] CVS build breakage: snprintf finds its way into socketmodule.c In-Reply-To: Alex Coventry's message of "24 Jul 2001 15:04:26 -0400" References: Message-ID: > In PySocket_getaddrinfo, would it make sense to increase the > allocation of pbuf from 10 characters to, say, 30 characters Or even to "char pbuf[sizeof(long)*3];" so no one has to think about it anymore. Alex. From martin@loewis.home.cs.tu-berlin.de Tue Jul 24 21:48:39 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 24 Jul 2001 22:48:39 +0200 Subject: [Python-Dev] IPv6 committed Message-ID: <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de> Hi itojun, As you may have noticed, I just committed the last chunk of your IPv6 patch. Thanks a lot for your contributions, I think you've provided a highly valuable contribution to Python 2.2. We still have to figure out a way to provide documentation, but I expect that we can complete that before 2.2a2. As with all new code, there may occur some problems; I hope you'll be around for the coming weeks and give the professional advise that you've provided throughout the integration of the code. People finding problems in the IPv6 code (ie. with the current socket applications) are encouraged to use the SF bug-reporting procedure as they do for all other problems in the Python libraries; you can assign all such bugs to me. 
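[Editorial note: the client-side pattern this IPv6 support introduced — walking the getaddrinfo() results and trying each address family until a connection succeeds — became the standard idiom in the stdlib's networking clients. A self-contained sketch of that loop, in modern syntax:]

```python
import socket

def connect_any(host, port):
    """Try every address getaddrinfo() returns (IPv6 and IPv4 alike)
    until one connects; re-raise the last error if none do."""
    err = None
    for af, socktype, proto, _canon, sa in socket.getaddrinfo(
            host, port, 0, socket.SOCK_STREAM):
        s = socket.socket(af, socktype, proto)
        try:
            s.connect(sa)
            return s                      # first address that works wins
        except OSError as exc:
            err = exc
            s.close()
    raise err if err else OSError("getaddrinfo returned no addresses")

# Example: connect to a local listener.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
conn = connect_any("127.0.0.1", listener.getsockname()[1])
print(conn.getpeername()[0])              # 127.0.0.1
conn.close()
listener.close()
```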
Kind regards, Martin From fdrake@acm.org Tue Jul 24 21:52:33 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 24 Jul 2001 16:52:33 -0400 (EDT) Subject: [Python-Dev] IPv6 committed In-Reply-To: <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de> References: <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de> Message-ID: <15197.57361.306046.539086@cj42289-a.reston1.va.home.com> Martin v. Loewis writes: > We still have to figure > out a way to provide documentation, but I expect that we can complete > that before 2.2a2. I'll warn you now that I know next to nothing about IPv6, and my attempts to spend enough time reading about it to be useful have been thwarted more than once. I'm afraid I'll be able to provide no more than editorial & markup assistance for the IPv6 documentation. ;-( -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From guido@digicool.com Tue Jul 24 22:02:30 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 24 Jul 2001 17:02:30 -0400 Subject: [Python-Dev] IPv6 committed In-Reply-To: Your message of "Tue, 24 Jul 2001 22:48:39 +0200." <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de> References: <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de> Message-ID: <200107242102.RAA07970@cj20424-a.reston1.va.home.com> Martin and Itojun, I would like to thank you both for adding IPv6 support to Python. It's a big boon for Python as well as for IPv6! --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Tue Jul 24 22:24:13 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 24 Jul 2001 17:24:13 -0400 Subject: [Python-Dev] number-sig anyone? References: <15197.44710.656892.910976@beluga.mojam.com> Message-ID: <15197.59261.754548.28233@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> Today I took a look at http://mail.python.org/mailman/listinfo SM> and could find no math-sig or number-sig mailing list. 
If SM> Python's number system is going to change in one or more SM> backwards-incompatible I think there may only be one chance to SM> get it right. I think a number-sig mailing list would be a SM> worthwhile forum to discuss these issues. +1. If others agree, I'll create the sig. -Barry From barry@zope.com Tue Jul 24 22:26:40 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 24 Jul 2001 17:26:40 -0400 Subject: [Python-Dev] number-sig anyone? References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> Message-ID: <15197.59408.52053.335198@anthem.wooz.org> >>>>> "Fred" == Fred L Drake, Jr writes: Fred> There is the python-numerics mailing list on SourceForge; Fred> find it from the Python project page there: Fred> http://sourceforge.net/projects/python/ Ah. Shouldn't this page http://www.python.org/sigs/ point to this page http://sourceforge.net/mail/?group_id=5470 ??? -Barry From guido@digicool.com Tue Jul 24 22:30:52 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 24 Jul 2001 17:30:52 -0400 Subject: [Python-Dev] number-sig anyone? In-Reply-To: Your message of "Tue, 24 Jul 2001 17:24:13 EDT." <15197.59261.754548.28233@anthem.wooz.org> References: <15197.44710.656892.910976@beluga.mojam.com> <15197.59261.754548.28233@anthem.wooz.org> Message-ID: <200107242130.RAA08105@cj20424-a.reston1.va.home.com> > >>>>> "SM" == Skip Montanaro writes: > > SM> Today I took a look at http://mail.python.org/mailman/listinfo > SM> and could find no math-sig or number-sig mailing list. If > SM> Python's number system is going to change in one or more > SM> backwards-incompatible I think there may only be one chance to > SM> get it right. I think a number-sig mailing list would be a > SM> worthwhile forum to discuss these issues. > > +1. If others agree, I'll create the sig. > > -Barry Sounds like a good plan, but please wait until we have a SIG owner/moderator and a charter. 
Without both of these a SIG will be a failure. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Tue Jul 24 22:41:37 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 24 Jul 2001 17:41:37 -0400 Subject: [Python-Dev] number-sig anyone? References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> <15197.46887.496488.246536@beluga.mojam.com> <200107241826.OAA07299@cj20424-a.reston1.va.home.com> <15197.51855.237526.101524@cj42289-a.reston1.va.home.com> Message-ID: <15197.60305.896122.475136@anthem.wooz.org> >>>>> "Fred" == Fred L Drake, Jr writes: Fred> I don't know any way to do that. If we could more easily Fred> create new lists on python.org (i.e., not have to wait for Fred> Barry), they never would have been created on SourceForge. Actually, creating the lists is no problem, takes just seconds. It's updating the sigs page that's the PITA. Note that in Mailman 2.1, you'll be able to create new lists thru-the-web, and we can delegate that responsibility to a "list creator" password which can be shared by a small group of trusted folks. Updating the sigs page is still a separate process. -Barry From klm@digicool.com Tue Jul 24 22:46:56 2001 From: klm@digicool.com (Ken Manheimer) Date: Tue, 24 Jul 2001 17:46:56 -0400 (EDT) Subject: [Python-Dev] mail.python.org black listed ?! In-Reply-To: <15196.41226.977676.237807@beluga.mojam.com> Message-ID: On Mon, 23 Jul 2001, Skip Montanaro wrote: > BAW> - The filter marks the message with a % confidence of being spam > BAW> (e.g. X-Spam: 75%) > > BAW> - Each Mailman recipient could specify the threshhold above which > BAW> they do not want to receive the message (e.g. don't sent me > BAW> anything that's spam with a more than 70% confidence level). > BAW> This only works for regular delivery. > > [Could use re's to match] > > I would therefore suggest that the X-Spam header be simply a three-digit > number in the range 000 to 100. 
(No percent sign, always with any necessary > leading zeroes.) It might even be better to create an X-Spam-Value header > in one-bit arithmetic, e.g. make a slightly smaller range (say 0 to 50) and > include a header like: > > X-Spam-Value: sssssssssssssssssssssssssssssssssss > > to indicate a 70% likelihood (35 "s"s). You could then match it with > > X-Spam-Value: s{25,50} > > in procmail to spam-categorize anything with a probability of spamhood >= > 50%. You could include a readable X-Spam header like: > > X-Spam: rated 75% probability of being spam by "Spam Pie v. 0.1" Um, yick!-) The idea of using a bar-like representation of the assessment strikes me like suggesting presentation of the info in a graph, and then screen-scraping to evaluate the graph. Aieee! How about a spam-estimate of 0-9? Pretty darn easy to match. I wouldn't imagine the lack of precision is going to be a problem, in this domain... Or is this all too off-topic? Ken klm@digicool.com From skip@pobox.com (Skip Montanaro) Tue Jul 24 22:54:54 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 24 Jul 2001 16:54:54 -0500 Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together? 
In-Reply-To: <200107241929.PAA07684@cj20424-a.reston1.va.home.com> References: <15196.43097.529737.173915@beluga.mojam.com> <3B5D4ACF.F79CD179@lemburg.com> <200107241929.PAA07684@cj20424-a.reston1.va.home.com> Message-ID: <15197.61102.391599.162359@beluga.mojam.com> >>>>> "Guido" == Guido van Rossum writes: >> Skip Montanaro wrote: >> > >> > There are several active or could-be-active PEPs related to Python's numeric >> > behavior: >> > >> > S 211 pep-0211.txt Adding A New Outer Product Operator Wilson >> > S 228 pep-0228.txt Reworking Python's Numeric Model Zadka >> > S 237 pep-0237.txt Unifying Long Integers and Integers Zadka >> > S 238 pep-0238.txt Non-integer Division Zadka >> > S 239 pep-0239.txt Adding a Rational Type to Python Zadka >> > S 240 pep-0240.txt Adding a Rational Literal to Python Zadka >> > S 242 pep-0242.txt Numeric Kinds Dubois Guido> I think PEP 211 and PEP 242 don't belong in this list. PEP 211 Guido> doesn't affect Python's number system at all, and PEP 242 Guido> proposes a set of storage choices, not choices in semantics. PEP Guido> 242 is valid regardless of what we decide about int division. The inclusion of PEP 211 in this message was an oversight. I pasted this list from another message. I included PEP 242 on purpose however. I think Paul gives you a language for perhaps defining other sorts of numeric properties besides numeric precision (which is what my reading led me to believe it was focused on). ... Guido> But it's different the other way around: PEP 238 can easily stand Guido> on its own. It addresses a problem that exists even without a Guido> unified numeric model. Guido> Conversely, if PEP 238 is unacceptable, PEP 228 also has no hope, Guido> and PEP 239 is much less attractive. 
Since PEP 238 is the only Guido> one that cannot avoid breaking existing code, I want to introduce Guido> it as soon as I can, since the others can only be introduced Guido> after the full compatibility waiting period for PEP 238, at least Guido> two years. ... Guido> If we introduce rationals, and we redefine int division as Guido> returning a rational instead of a float, this will not affect the Guido> mathematical value. ... Guido> I am currently maintaining the PEP 238 implementation as a patch; Guido> I don't want to start any new branches before we've merged the Guido> descr-branch into the trunk. I elided a bunch of valuable information, stuff I was previously unaware of. The acceptability or not of PEP 238 in the broader Python community appears to be based on people only looking back. As far as I know most people aren't aware of the long-term motivation. (It may have been there in one of Guido's or Tim's messages, but if so, I missed it.) I certainly wasn't aware of the motivation, and I just read the above PEPs in the past day or two. Connecting all that together (a "meta PEP"?) probably belongs in PEP 228. Here's what I propose. Once the descr-branch has been merged, create a new branch, call it mouse-branch. Add the PEP 238 and other changes there and update PEP 228 (last change: 4 Nov 2000) to include the rationale I deleted from Guido's message. Then urge anyone with an interest in any of these topics to check out the mouse from CVS and play with it. (Just don't squish it, that's the Python's job!) Initially, it will just have the one change that has stirred up such a hornet's nest. Still, even that will be instructive to play with, and in concert with a stronger motivation for the change in PEP 228 (and perhaps PEP 238) should help soften the blow caused by the change. As I mentioned in a previous message, I think you have one chance to make this change. 
If people perceive that "hey, he's going somewhere interesting with this stuff", I think they will be more open to the discomfort of individual changes. Then, once you're ready (I don't know if 2.2 is far enough out), have the Python eat the mouse and start a rat-branch that incorporates all the rational stuff (having never used a programming language that supported rational numbers, I find the prospect both a bit daunting and exciting). That branch will live for a fairly long time, probably at least until 2.4, when the int division change is complete, at which point the Python can eat the rat. Guido> Have you looked at my PEP-238 patch at all? Not yet. Should it be applied to the head branch or the descr-branch? Skip From skip@pobox.com (Skip Montanaro) Tue Jul 24 23:20:59 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 24 Jul 2001 17:20:59 -0500 Subject: [Python-Dev] number-sig anyone? In-Reply-To: <15197.59408.52053.335198@anthem.wooz.org> References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> <15197.59408.52053.335198@anthem.wooz.org> Message-ID: <15197.62667.379233.918619@beluga.mojam.com> Fred> There is the python-numerics mailing list on SourceForge; find it Fred> from the Python project page there: Fred> http://sourceforge.net/projects/python/ BAW> Ah. Shouldn't this page BAW> http://www.python.org/sigs/ BAW> point to this page BAW> http://sourceforge.net/mail/?group_id=5470 BAW> ??? Looks like we have at least three pages that list related info: http://www.python.org/sigs/ http://mail.python.org/ http://sourceforge.net/mail/?group_id=5470 Can they be unified? Skip From skip@pobox.com (Skip Montanaro) Tue Jul 24 23:38:59 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 24 Jul 2001 17:38:59 -0500 Subject: [Python-Dev] mail.python.org black listed ?! 
In-Reply-To: References: <15196.41226.977676.237807@beluga.mojam.com> Message-ID: <15197.63747.743439.630607@beluga.mojam.com> >> X-Spam-Value: sssssssssssssssssssssssssssssssssss Ken> Um, yick!-) The idea of using a bar-like representation of the Ken> assessment strikes me like suggesting presentation of the info in a Ken> graph, and then screen-scraping to evaluate the graph. Aieee! Well, that's not what I had in mind, but if that floats your boat. The fundamental problem I'm trying to solve is that many mail filter programs can't do numeric comparisons at all. I'm used to procmail, which allows me to easily use regular expressions. Somebody else (Barry? Marc-Andre?) suggested X-Spam-Value: 0123456789 (90% probability) or X-Spam-Value: 012345 (50% probability) which can be matched by feeble filters like Netscape's that only support substring matches ("X-Spam-Value: 0123456" would match anything of 60% probability or higher). Ken> How about a spam-estimate of 0-9? Pretty darn easy to match. I Ken> wouldn't imagine the lack of precision is going to be a problem, in Ken> this domain... You're unfortunately back to trying to make numeric comparisons or using fairly complex regular expressions to perform the comparisons. The poor saps using Netscape would have to have four rules to match a 60% or higher spam probability. This discussion almost certainly doesn't belong on python-dev. Is there a more appropriate Python-related list already in existence in which to hatch these ideas? Skip From tim@digicool.com Tue Jul 24 23:49:50 2001 From: tim@digicool.com (Tim Peters) Date: Tue, 24 Jul 2001 18:49:50 -0400 Subject: [Python-Dev] mail.python.org black listed ?! In-Reply-To: <15197.63747.743439.630607@beluga.mojam.com> Message-ID: [Ken Manheimer] > How about a spam-estimate of 0-9? Pretty darn easy to match. [Skip Montanaro] > You're unfortunately back to trying to make numeric comparisons or > using fairly complex regular expressions to perform the comparisons.
> The poor saps using Netscape would have to have four rules to match a > 60% or higher spam probability. I don't use Netscape or know which flavor of regexps it supports, but I don't know of any regexp pkg that lacks character-class support. That is, in Python regexp syntax, r"X-Spam-Whatever:\s+[6-9]" # match >= 60% r"X-Spam-Whatever:\s+[0-5]" # match < 60% r"X-Spam-Whatever:\s+[2357]" # match int(spamprob/10) is prime > This discussion almost certainly doesn't belong on python-dev. Is > there a more appropriate Python-related list already in existence in > which to hatch these ideas? Only one that comes to mind is the numerics list, since this *is* about numeric comparisons, and has the potential to become ugly > From mwh@python.net Tue Jul 24 23:56:32 2001 From: mwh@python.net (Michael Hudson) Date: 24 Jul 2001 18:56:32 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib BaseHTTPServer.py,1.15,1.16 SocketServer.py,1.25,1.26 ftplib.py,1.53,1.54 httplib.py,1.35,1.36 poplib.py,1.14,1.15 smtplib.py,1.36,1.37 telnetlib.py,1.11,1.12 References: Message-ID: <2mvgkiuean.fsf@starship.python.net> "Martin v. Löwis" writes: > Update of /cvsroot/python/python/dist/src/Lib > In directory usw-pr-cvs1:/tmp/cvs-serv11791 > > Modified Files: > BaseHTTPServer.py SocketServer.py ftplib.py httplib.py > poplib.py smtplib.py telnetlib.py > Log Message: > Patch #401196: Use getaddrinfo and AF_INET6 in TCP servers and clients. > ! for res in socket.getaddrinfo(self.host, self.port, 0, socket.SOCK_STREAM): > ! af, socktype, proto, canonname, sa = res > ! try: > ! self.sock = socket.socket(af, socktype, proto) > ! self.sock.connect(sa) > ! except socket.error, msg: > ! self.sock.close() > ! self.sock = None > ! continue > ! break > ! if not self.sock: > ! raise socket.error, msg > ! for res in socket.getaddrinfo(None, 0, self.af, socket.SOCK_STREAM, 0, socket.AI_PASSIVE): > ! af, socktype, proto, canonname, sa = res > ! try: > !
sock = socket.socket(af, socktype, proto) > ! sock.bind(sa) > ! except socket.error, msg: > ! sock.close() > ! sock = None > ! continue > ! break > ! if not sock: > ! raise socket.error, msg > ! for res in socket.getaddrinfo(self.host, self.port, 0, socket.SOCK_STREAM): > ! af, socktype, proto, canonname, sa = res > ! try: > ! self.sock = socket.socket(af, socktype, proto) > ! if self.debuglevel > 0: > ! print "connect: (%s, %s)" % (self.host, self.port) > ! self.sock.connect(sa) > ! except socket.error, msg: > ! if self.debuglevel > 0: > ! print 'connect fail:', (self.host, self.port) > ! self.sock.close() > ! self.sock = None > ! continue > ! break > ! if not self.sock: > ! raise socket.error, msg > ! for res in socket.getaddrinfo(self.host, self.port, 0, socket.SOCK_STREAM): > ! af, socktype, proto, canonname, sa = res > ! try: > ! self.sock = socket.socket(af, socktype, proto) > ! self.sock.connect(sa) > ! except socket.error, msg: > ! self.sock.close() > ! self.sock = None > ! continue > ! break > ! if not self.sock: > ! raise socket.error, msg > ! for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM): > ! af, socktype, proto, canonname, sa = res > ! try: > ! self.sock = socket.socket(af, socktype, proto) > ! if self.debuglevel > 0: print 'connect:', (host, port) > ! self.sock.connect(sa) > ! except socket.error, msg: > ! if self.debuglevel > 0: print 'connect fail:', (host, port) > ! self.sock.close() > ! self.sock = None > ! continue > ! break > ! if not self.sock: > ! raise socket.error, msg > ! for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM): > ! af, socktype, proto, canonname, sa = res > ! try: > ! self.sock = socket.socket(af, socktype, proto) > ! self.sock.connect(sa) > ! except socket.error, msg: > ! self.sock.close() > ! self.sock = None > ! continue > ! break > ! if not self.sock: > ! raise socket.error, msg Excuse my ignorance, but: A case for refactoring? 
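The connect loop that the patch repeats in every module can indeed be written once. Below is a sketch in modern Python (the helper name `open_stream_connection` is mine, not from the patch); the stdlib later gained `socket.create_connection()`, which captures this same pattern:

```python
import socket

def open_stream_connection(host, port, timeout=None):
    """Try each address from getaddrinfo() until one accepts a connection.

    This is the loop repeated throughout the patch above, factored out
    once; it works identically for AF_INET and AF_INET6 results, which
    is the point of using getaddrinfo().
    """
    err = None
    for af, socktype, proto, canonname, sa in socket.getaddrinfo(
            host, port, 0, socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(af, socktype, proto)
            if timeout is not None:
                sock.settimeout(timeout)
            sock.connect(sa)
            return sock                    # success: hand back the socket
        except OSError as exc:             # socket.error is OSError today
            err = exc
            if sock is not None:
                sock.close()               # clean up the failed attempt
    # All candidate addresses failed; re-raise the last error seen.
    raise err if err is not None else OSError("getaddrinfo returned no results")
```

Each caller in the patch would then reduce to something like `self.sock = open_stream_connection(self.host, self.port)`.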
Also this patch introduced some hard tabs, but I guess Tim'll beat these to death with reindent.py at some point... Cheers, M. From guido@digicool.com Wed Jul 25 00:02:06 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 24 Jul 2001 19:02:06 -0400 Subject: [Python-Dev] number-sig anyone? In-Reply-To: Your message of "Tue, 24 Jul 2001 17:20:59 CDT." <15197.62667.379233.918619@beluga.mojam.com> References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> <15197.59408.52053.335198@anthem.wooz.org> <15197.62667.379233.918619@beluga.mojam.com> Message-ID: <200107242302.TAA08477@cj20424-a.reston1.va.home.com> > Looks like we have at least three pages that list related info: > > http://www.python.org/sigs/ > http://mail.python.org/ > http://sourceforge.net/mail/?group_id=5470 > > Can they be unified? I don't see how. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com (Skip Montanaro) Wed Jul 25 00:21:43 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 24 Jul 2001 18:21:43 -0500 Subject: [Python-Dev] mail.python.org black listed ?! In-Reply-To: References: <15197.63747.743439.630607@beluga.mojam.com> Message-ID: <15198.775.101437.858823@beluga.mojam.com> Tim> [Skip Montanaro] >> You're unfortunately back to trying to make numeric comparisons or >> using fairly complex regular expressions to perform the comparisons. >> The poor saps using Netscape would have to have four rules to match a >> 60% or higher spam probability. Tim> I don't use Netscape or know which flavor of regexps it supports, Tim> but I don't know of any regexp pkg that lacks character-class Tim> support. That is, in Python regexp syntax, Yeah, but Netscape apparently doesn't support regexs in its mail filters at all, just substring matches. >> This discussion almost certainly doesn't belong on python-dev. 
Is >> there a more appropriate Python-related list already in existence in >> which to hatch these ideas? Tim> Only one that comes to mind is the numerics list, since this *is* Tim> about numeric comparisons, and has the potential to become ugly Tim> > Hey, not a bad idea. I just subscribed and the archives suggest it's been idle. Perhaps I can hijack it... ;-) S From skip@pobox.com (Skip Montanaro) Wed Jul 25 00:24:33 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 24 Jul 2001 18:24:33 -0500 Subject: [Python-Dev] number-sig anyone? In-Reply-To: <200107242302.TAA08477@cj20424-a.reston1.va.home.com> References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> <15197.59408.52053.335198@anthem.wooz.org> <15197.62667.379233.918619@beluga.mojam.com> <200107242302.TAA08477@cj20424-a.reston1.va.home.com> Message-ID: <15198.945.330390.874100@beluga.mojam.com> >> Looks like we have at least three pages that list related info: >> >> http://www.python.org/sigs/ >> http://mail.python.org/ >> http://sourceforge.net/mail/?group_id=5470 >> >> Can they be unified? Guido> I don't see how. Perhaps they can at least be made to point incestuously to one another (though I imagine the sf page is completely out of our control and thus immune to such incest)... Skip From itojun@iijlab.net Wed Jul 25 00:43:53 2001 From: itojun@iijlab.net (itojun@iijlab.net) Date: Wed, 25 Jul 2001 08:43:53 +0900 Subject: [Python-Dev] IPv6 committed In-Reply-To: guido's message of Tue, 24 Jul 2001 17:02:30 -0400. <200107242102.RAA07970@cj20424-a.reston1.va.home.com> Message-ID: <2069.996018233@itojun.org> >Martin and Itojun, >I would like to thank you both for adding IPv6 support to Python. >It's a big boon for Python as well as for IPv6! no, thank you! and actually the last one mile was all done by Martin, i would really like to thank Martin.
itojun From itojun@iijlab.net Wed Jul 25 00:47:17 2001 From: itojun@iijlab.net (itojun@iijlab.net) Date: Wed, 25 Jul 2001 08:47:17 +0900 Subject: [Python-Dev] Re: IPv6 committed In-Reply-To: martin's message of Tue, 24 Jul 2001 22:48:39 +0200. <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de> Message-ID: <2086.996018437@itojun.org> >Hi itojun, > >As you may have noticed, I just committed the last chunk of your IPv6 >patch. Thanks a lot for your contributions, I think you've provided a >highly valuable contribution to Python 2.2. We still have to figure >out a way to provide documentation, but I expect that we can complete >that before 2.2a2. sorry that i'm delayed about documentation (specifically the socket module). i have no TeX environment now (i had before) and am having trouble checking if i'm typesetting right. do you mind if i send you just plaintext? >As with all new code, there may occur some problems; I hope you'll be >around for the coming weeks and give the professional advice that >you've provided throughout the integration of the code. as for the Lib/*.py changes, there shouldn't be many changes unless you have faulty IPv6 connectivity - the code will try to connect to the IPv6 destination and then IPv4 against FQDN hostnames, so if IPv6 connectivity is faulty you will see more delays. with Lib/ftplib.py people are most likely to see that something is happening, as it will try the protocol-independent FTP commands (EPSV/EPRT) first. anyway... if possible drop me notes. i don't check SF too frequently (i cannot adapt to the SF UI). i'll subscribe to python-dev. itojun
<15198.945.330390.874100@beluga.mojam.com> References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> <15197.59408.52053.335198@anthem.wooz.org> <15197.62667.379233.918619@beluga.mojam.com> <200107242302.TAA08477@cj20424-a.reston1.va.home.com> <15198.945.330390.874100@beluga.mojam.com> Message-ID: <200107250009.UAA08631@cj20424-a.reston1.va.home.com> > >> Looks like we have at least three pages that list related info: > >> > >> http://www.python.org/sigs/ > >> http://mail.python.org/ > >> http://sourceforge.net/mail/?group_id=5470 > >> > >> Can they be unified? > > Guido> I don't see how. > > Perhaps they can at least be made to point incestuously to one another > (though I imagine the sf page is completely out of our control and thus > immune to such incest)... > > Skip And the mailman page is also auto-generated. This leaves the sigs page, which AFAIK already points to the others (incestuously or otherwise :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Wed Jul 25 03:44:58 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 24 Jul 2001 22:44:58 -0400 Subject: [Python-Dev] number-sig anyone? References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> <15197.59408.52053.335198@anthem.wooz.org> <15197.62667.379233.918619@beluga.mojam.com> <200107242302.TAA08477@cj20424-a.reston1.va.home.com> <15198.945.330390.874100@beluga.mojam.com> Message-ID: <15198.12970.318099.891202@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: >> Looks like we have at least three pages that list related info: >> http://www.python.org/sigs/ http://mail.python.org/ >> http://sourceforge.net/mail/?group_id=5470 Can they be unified? Guido> I don't see how. SM> Perhaps they can at least be made to point incestuously to one SM> another (though I imagine the sf page is completely out of our SM> control and thus immune to such incest)... 
Only the /sigs/ page is statically generated, so only it is easy to change. -Barry From guido@digicool.com Wed Jul 25 04:52:26 2001 From: guido@digicool.com (Guido van Rossum) Date: Tue, 24 Jul 2001 23:52:26 -0400 Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together? In-Reply-To: Your message of "Tue, 24 Jul 2001 16:54:54 CDT." <15197.61102.391599.162359@beluga.mojam.com> References: <15196.43097.529737.173915@beluga.mojam.com> <3B5D4ACF.F79CD179@lemburg.com> <200107241929.PAA07684@cj20424-a.reston1.va.home.com> <15197.61102.391599.162359@beluga.mojam.com> Message-ID: <200107250352.XAA01001@cj20424-a.reston1.va.home.com> > Guido> I think PEP 211 and PEP 242 don't belong in this list. PEP 211 > Guido> doesn't affect Python's number system at all, and PEP 242 > Guido> proposes a set of storage choices, not choices in semantics. PEP > Guido> 242 is valid regardless of what we decide about int division. > > The inclusion of PEP 211 in this message was an oversight. I pasted this > list from another message. I included PEP 242 on purpose however. I think > Paul gives you a language for perhaps defining other sorts of numeric > properties besides numeric precision (which is what my reading led me to > believe it was focused on). Maybe, but I don't think a review of PEP 242 is necessary in order to decide on the others. > ... > > Guido> But it's different the other way around: PEP 238 can easily stand > Guido> on its own. It addresses a problem that exists even without a > Guido> unified numeric model. > > Guido> Conversely, if PEP 238 is unacceptable, PEP 228 also has no hope, > Guido> and PEP 239 is much less attractive. Since PEP 238 is the only > Guido> one that cannot avoid breaking existing code, I want to introduce > Guido> it as soon as I can, since the others can only be introduced > Guido> after the full compatibility waiting period for PEP 238, at least > Guido> two years. > > ... 
> > Guido> If we introduce rationals, and we redefine int division as > Guido> returning a rational instead of a float, this will not affect the > Guido> mathematical value. > > ... > > Guido> I am currently maintaining the PEP 238 implementation as a patch; > Guido> I don't want to start any new branches before we've merged the > Guido> descr-branch into the trunk. > > I elided a bunch of valuable information, stuff I was previously unaware of. > The acceptability or not of PEP 238 in the broader Python community appears > to be based on people only looking back. As far as I know most people > aren't aware of the long-term motivation. (It may have been there in one of > Guido's or Tim's messages, but if so, I missed it.) I certainly wasn't > aware of the motivation, and I just read the above PEPs in the past day or > two. Connecting all that together (a "meta PEP"?) probably belongs in PEP > 228. I have Moshe's permission to co-author PEP 238, which I'll do as soon as I'm done with my remote keynote at the O'Reilly conference (due to circumstances beyond my control I'm not in San Diego), sometime tomorrow. > Here's what I propose. Once the descr-branch has been merged, create a new > branch, call it mouse-branch. Add the PEP 238 and other changes there and > update PEP 228 (last change: 4 Nov 2000) to include the rationale I deleted > from Guido's message. Then urge anyone with an interest in any of these > topics to check out the mouse from CVS and play with it. (Just don't squish > it, that's the Python's job!) Initially, it will just have the one change > that has stirred up such a hornet's nest. Still, even that will be > instructive to play with, and in concert with a stronger motivation for the > change in PEP 228 (and perhaps PEP 238) should help soften the blow caused > by the change. As I mentioned in a previous message, I think you have one > chance to make this change. 
If people perceive that "hey, he's going > somewhere interesting with this stuff", I think they will be more open to > the discomfort of individual changes. That's one suggestion. I've noticed that very few people check out branches unless you force them. The PEP-238 changes are localized enough that I can maintain them as a patch in the SF patch manager; that's easier to use for most people. > Then, once you're ready (I don't know if 2.2 is far enough out), have the > Python eat the mouse and start a rat-branch that incorporates all the > rational stuff (having never used a programming language that supported > rational numbers, I find the prospect both a bit daunting and exciting). > That branch will live for a fairly long time, probably at least until 2.4, > when the int division change is complete, at which point the Python can eat > the rat. I don't have time for rationals yet; but I do want to put phase 1 of PEP 238 in the 2.2 release, and preferably sooner (e.g. 2.2a2) rather than later. Phase 1 breaks no code; all it does is add the // operator and the future division statement. I also plan command line options to (1) add warnings for old-style / with int or long args; and (2) make new-style / the default. Both are tools (though not the only ones) for future-proofing code. (One goal is to make the whole library robust under any combination of command line options; this will require a branch or checking things in on the trunk, as it will affect a large number of files.) > Guido> Have you looked at my PEP-238 patch at all? > > Not yet. Should it be applied to the head branch or the descr-branch? It works with either. Also with the 2.2a1 release, I expect. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Wed Jul 25 06:01:26 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 25 Jul 2001 01:01:26 -0400 Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together? 
In-Reply-To: <3B5D4ACF.F79CD179@lemburg.com> Message-ID: [MAL] > May I suggest that these rather controversial changes be carried > out on a separate branch of the Python source tree before adding > them to the trunk ?! Sure, provided you're volunteering to keep the branch in synch with the trunk: branches are both expensive and risky, unless the intent is never to merge in either direction. Much as I hate the obfuscating effects of #ifdefs, these changes are localized enough that it would be a clear net win to use them rather than branches, if Guido gets weary of maintaining a patch. From skip@pobox.com (Skip Montanaro) Wed Jul 25 06:08:13 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 25 Jul 2001 00:08:13 -0500 Subject: [Python-Dev] post mortem after threading deadlock? Message-ID: <15198.21565.83325.86255@beluga.mojam.com> Is there any possibility of getting some post-mortem info out of a multi-threaded system whose threads are deadlocked? I am trying to multi-thread my xmlrpc server methods. It's working okay "almost all the time" and I see a wonderful overall throughput boost because slow operations tend to no longer impede fast ones. I got a deadlock today, however, and am wondering how I am going to go about figuring out what happened the next time this happens. The main thread never deadlocks. It just spins off a thread to handle the current request and goes back to listening for new requests. For the purposes of inspecting the deadlocked threads I plan to add a method to my server that roots around for interesting info without attempting to lock anything (and thus possibly joining the deadlock party). Can one thread get at any state from other threads? I can inspect the values of the various shared locks and semaphores I'm using, but I was hoping to get at perhaps the current frame of each of the deadlocked threads. Any chance of that? Failing that, what about locks and semaphores that can time out and raise exceptions?
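What Skip asks for here (reading the current frame of every other thread without taking any locks) did not exist in 2001, but later CPython versions added exactly that: `sys._current_frames()` (Python 2.5) returns a snapshot mapping thread ids to frames, and lock acquisition grew a timeout (`Lock.acquire(timeout=...)`, Python 3.2). A sketch of a no-lock post-mortem dump using the later API; the function name is made up for illustration:

```python
import sys
import threading
import traceback

def dump_all_thread_stacks(out=sys.stderr):
    """Print every thread's current stack without acquiring any locks.

    sys._current_frames() (CPython 2.5+, i.e. long after this thread)
    returns a {thread_id: frame} snapshot, which is exactly the
    per-thread frame access asked about above.
    """
    names = {t.ident: t.name for t in threading.enumerate()}
    for ident, frame in sys._current_frames().items():
        print("--- thread %s (%s) ---" % (ident, names.get(ident, "?")),
              file=out)
        traceback.print_stack(frame, file=out)
```

Calling this from the main thread (or from a signal handler) when the server looks wedged shows where each worker is blocked.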
I think such timeout-capable locks and semaphores would be useful for debugging if they could be implemented. Thx, Skip From m@moshez.org Wed Jul 25 06:14:01 2001 From: m@moshez.org (m@moshez.org) Date: Wed, 25 Jul 2001 08:14:01 +0300 Subject: [Python-Dev] number-sig anyone? In-Reply-To: <200107242130.RAA08105@cj20424-a.reston1.va.home.com> References: <200107242130.RAA08105@cj20424-a.reston1.va.home.com>, <15197.44710.656892.910976@beluga.mojam.com> <15197.59261.754548.28233@anthem.wooz.org> Message-ID: On Tue, 24 Jul 2001 17:30:52 -0400, Guido van Rossum wrote: > Sounds like a good plan, but please wait until we have a SIG > owner/moderator and a charter. Without both of these a SIG will be a > failure. If we do have a number-sig, I suppose python-numerics@sf.net should die and merge into that, right? I am all for that, but I won't be volunteering to be the champion... I've learned my lesson about spreading myself too thin. -- gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21 4817 C7FC A636 46D0 1BD6 Insecure (accessible): C5A5 A8FA CA39 AB03 10B8 F116 1713 1BCF 54C4 E1FE Learn Python! http://www.ibiblio.org/obp/thinkCSpy From martin@loewis.home.cs.tu-berlin.de Wed Jul 25 07:31:53 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 25 Jul 2001 08:31:53 +0200 Subject: [Python-Dev] Re: IPv6 committed In-Reply-To: <2086.996018437@itojun.org> References: <2086.996018437@itojun.org> Message-ID: <200107250631.f6P6Vr602375@mira.informatik.hu-berlin.de> > sorry that i'm delayed about documentation (specifically the socket module). > i have no TeX environment now (i had before) and am having trouble > checking if i'm typesetting right. do you mind if i send you just > plaintext? That's fine; I'll then put it into the python-tex format - in the end, Fred will compile it into HTML and publish it on SF (devel-docs), so that we can check whether it came out right. > anyway... if possible drop me notes. I'll keep you informed, no problem.
Regards, Martin From martin@strakt.com Wed Jul 25 09:28:05 2001 From: martin@strakt.com (Martin Sjögren) Date: Wed, 25 Jul 2001 10:28:05 +0200 Subject: [Python-Dev] Memory leaks? Message-ID: <20010725102805.A24723@strakt.com> I'm a bit curious about the memory handling of the Py_InitModule4... When adding methods, if PyCFunction_New fails, NULL is returned without the module object being DECREF'd, and similarly, if PyDict_SetItemString fails, NULL is returned with neither the module object nor the function object being DECREF'd. Is this a problem, or is this taken care of somewhere else? -- Martin Sjögren martin@strakt.com ICQ : 41245059 Phone: +46 (0)31 405242 Cell: +46 (0)739 169191 GPG key: http://www.strakt.com/~martin/gpg.html From tanzer@swing.co.at Wed Jul 25 09:37:04 2001 From: tanzer@swing.co.at (Christian Tanzer) Date: Wed, 25 Jul 2001 10:37:04 +0200 Subject: [Python-Dev] Re: Future division patch available (PEP 238) In-Reply-To: Your message of "Sun, 22 Jul 2001 00:36:38 EDT." <200107220436.AAA05323@cj20424-a.reston1.va.home.com> Message-ID: Guido, I deeply respect your language design skills and I appreciate your ongoing improvements. I'm impatiently looking forward to using many of the features new in 2.2. The future division patch is a different issue, though. While I agree that the proposed changes are a definite improvement over the current semantics, the issue of backwards compatibility is a huge problem that is difficult to solve. I see the following problems: - What's going to happen to code released into the wild (i.e., I can change all my code but what about code I gave to others)? - How can one write readable code working correctly in both old and new Python versions? - It takes a potentially huge effort to change all the existing code. - In some cases, warnings might not be seen (e.g., they might land in /dev/null or in some log files nobody looks at).
- If an application jumps versions (e.g., from 2.1 to 2.6), no warnings might be generated at all. - Upgrading to a new version of an application might break user scripts or databases. I have no idea how to tackle all these issues but I'll offer some ideas nevertheless. - If `int` always truncated (instead of truncating or rounding, depending on how the C compiler does it), one could write reasonably readable version independent code for truncating integer division. Compare `int (a/b)` to `divmod (a, b) [0]` or `int (math.floor (a/b))`. - Just a wild idea: the problem you want to solve is that the existing division operator mixes two totally different meanings and thus leads to nasty surprises. What if `/` applied to two integer values returned neither an integer nor a float but an object carrying the float result but behaving like an integer if used in an integer context? For instance: >>> x = 1/2 >>> type(x) >>> print "%d %f %s" % (x, x, x) 0 0.5 0.5 >>> 2 * x 0 >>> 2. * x 1.0 The difficult issue here is how `integer context` is defined. Should multiplication by an integer be considered an integer context? Pro: would preserve correctness of existing code like `(size / 8) * 8`. Con: is incompatible with Rationals which might be added in the future. - Command line options are not a good way of handling this -- in many cases, different modules might need different settings. Even worse, looking at the code of a module won't tell you what option to use. - If there is a possibility of specifying division semantics on a per module case (via a directive or the file extension), it should also be possible to specify the semantics for thingies like `compile`, `execfile`, `exec`, and `eval`. This only works if absence of a semantics indicator means old-style division. I think this would go the farthest to alleviate compatibility problems. I understand your desire to avoid dragging the past around with you wherever you go and I like Python for its cleanliness.
But in this case, it might be worthwhile to carry the ballast. Let me outline the problems faced by my current customer TTTech (I'm working as a consultant for them). [This is going to be long -- sorry.] TTTech provides design and on-line tools for embedded distributed real-time systems in safety critical application domains (e.g., aerospace, automotive, ...). TTTech sells software tools (programmed in Python) to customers worldwide. Currently, there is a major release once a year. Due to various reasons, the shipped tools normally don't use the most recent version of Python. The current release is still based on 1.5.2. We hope to use Python 2.1 for the release planned for the end of the year. Internally, we try to use the most recent Python version. Therefore, our Python code must be compatible with several Python versions. The division change affects: - Python programs - Python scripts - user scripts - design databases Python programs --------------- I just used Skip's div-finder (thanks, Skip) to check the code of three of our applications. It finds 391 uses of division. I know that many of those are meant to be truncating divisions, while many others are meant to be floating divisions. Somebody will have to look at each and every one and fix it -- automatic conversion won't be possible. Unfortunately, the applications also contain lots of code inside of strings fed to eval or exec during run-time. I don't know how many divisions are in those, but somebody will have to look at them as well. This is one area frequently overlooked when the effect of changes is discussed and conversion tools proposed on c.l.py. As these tools are frozen, they don't depend on what Python version the user has installed. Python scripts -------------- Internally, TTTech uses quite a number of Python scripts. Unfortunately, different users have different Python versions installed. Currently, 1.5.2, 2.0, and 2.1 are installed (there might still be the odd 1.5.1 around somewhere, too).
As the scripts are taken from a central file server whereas Python is installed locally, the scripts and the library modules used must be compatible with all the Python versions deployed. That makes migration difficult if the same symbol means grossly different things in different versions. User scripts ------------ Our tools are user scriptable. These scripts are written in Python and executed in the application's context via execfile (they don't work as standalone scripts). Such scripts are written and maintained by unknown customers who may or may not be programmers and who may or may not have Python experience. (One of the nice features of Python is that even a computer naive user can start writing scripts with little knowledge about Python by modifying examples). Quite often, important scripts have been implemented by people who since changed jobs. Various customers use such scripts for interfacing to other tools, creating designs, checking designs for conformance, writing test cases, implementing test frameworks, generating design reports, ... The delivery of a new tool version ***must not break*** such scripts. TTTech simply cannot tell their customers that they have to review all their scripts and change some but not all occurrences of the division operator. We don't want to get stuck with an outdated Python version, either. OTOH, we cannot assume old style semantics in the scripts either as new users might never have heard about how division used to work in warty versions of Python. Design databases ----------------- Design databases store the design of an embedded distributed real-time system as specified by the user. Such databases must stay alive for a looooong time (think of 10+ years for some application domains). Our tools allow the specification of symbolic expressions by the user. Such expressions are fed through eval at the right time (i.e., late) to get a numeric value.
The symbolic expressions are stored in the database as entered by the user. Reading an old database with a new tool version ***must not change*** the semantics. To be honest, for TTTech design databases the change in division probably doesn't pose any problems. Due to user demand, the tools coerced divisions to floating point for a long time. Other companies might be bitten in this way, though. -- Christian Tanzer tanzer@swing.co.at Glasauergasse 32 Tel: +43 1 876 62 36 A-1130 Vienna, Austria Fax: +43 1 877 66 92 From mal@lemburg.com Wed Jul 25 10:42:06 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 25 Jul 2001 11:42:06 +0200 Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together? References: Message-ID: <3B5E946E.1105C79B@lemburg.com> Tim Peters wrote: > > [MAL] > > May I suggest that these rather controversial changes be carried > > out on a separate branch of the Python source tree before adding > > them to the trunk ?! > > Sure, provided you're volunteering to keep the branch in synch with the > trunk: branches are both expensive and risky, unless the intent is never to > merge in either direction. As you may have guessed: I'm not particularly interested in any change to the status quo w/r to Python's treatment of integer division, since I know that I have used the current C-like behaviour in code I've written in the past few years and that finding this code will be a nightmare. PEP 238 doesn't help with this either since it still changes the semantics of '/' instead of keeping them and adding the new semantics using a new operator '//' which wouldn't break anything and still make people happy.
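For reference, the `/` versus `//` split discussed here is what PEP 238 ultimately standardized, except that `/` itself did change meaning (behind a `from __future__ import division` transition period): `/` became true division and `//` took over the old flooring behaviour. In today's Python:

```python
# True division: '/' always yields the mathematically expected value.
assert 7 / 2 == 3.5
assert 6 / 3 == 2.0           # a float even when the division is exact

# Floor division: '//' keeps the old integer behaviour...
assert 7 // 2 == 3
# ...and it floors rather than truncating toward zero:
assert -7 // 2 == -4

# '//' is defined for floats too, and divmod stays consistent with it:
assert 7.0 // 2.0 == 3.0
assert divmod(7, 2) == (7 // 2, 7 % 2)
```

The flooring of negative results is the part that differs from C's implementation-defined truncation, which Tanzer's `int(a/b)` versus `math.floor(a/b)` comparison above also touches on.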
Also, I think that the warning framework will not help much for moving to
PEP 238: if you generate a warning for every source code occurrence of '/'
where integer division takes place, this will render at least some
programs unusable: either due to the slow-down of having to branch through
the warning machinery only to find that the user doesn't want to see the
warning, or by producing stderr messages in quantities which will keep any
user out there from using the program.

OTOH, I wouldn't mind if we add a per-module directive which then tells
the compiler to generate new-style integer division opcodes. Guido's patch
already implements this, except that it uses the magic __future__ import
which will be phased out eventually... how about a "from __semantics__
import non_integer_division" which does not have a timeout attached to it ?!

> Much as I hate the obfuscating effects of #ifdefs, these changes are
> localized enough that it would be a clear net win to use them rather than
> branches, if Guido gets weary of maintaining a patch.

If that's feasible, sure...

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From mal@lemburg.com Wed Jul 25 11:44:55 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 25 Jul 2001 12:44:55 +0200
Subject: [Python-Dev] Daily CVS snapshots
Message-ID: <3B5EA327.F63D37EF@lemburg.com>

Looks like the daily CVS snapshots are not working anymore:

http://python.sourceforge.net/snapshots/

"""
Daily snapshots from Python CVS repository
python-20010501 tar.gz .zip
python-20010430 tar.gz .zip
python-20010429 tar.gz .zip
...
"""

Also, the .zip link points to a .gzip file ?!

Could someone please check this ?
Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From fdrake@acm.org Wed Jul 25 12:48:05 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 25 Jul 2001 07:48:05 -0400 (EDT) Subject: [Python-Dev] Daily CVS snapshots In-Reply-To: <3B5EA327.F63D37EF@lemburg.com> References: <3B5EA327.F63D37EF@lemburg.com> Message-ID: <15198.45557.572298.868990@cj42289-a.reston1.va.home.com> M.-A. Lemburg writes: > Looks like the daily CVS snapshots are not working anymore: ... > python-20010501 tar.gz .zip I suspect this date is tied to a furniture move at the PythonLabs office; Jeremy's workstation was unplugged without his assistance, and so things may not have come back up correctly. I don't think any of us know how he has this set up off-hand. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From loewis@informatik.hu-berlin.de Wed Jul 25 13:03:48 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 25 Jul 2001 14:03:48 +0200 (MEST) Subject: [Python-Dev] Opening sockets in protocol-independent manner (was: BaseHTTPServer.py etc) Message-ID: <200107251203.OAA27691@pandora.informatik.hu-berlin.de> > Excuse my ignorance, but: A case for refactoring? Certainly, but it is debatable what exactly the best refactorization is. Abstractly, these fall into two cases - open a stream connection to some address (aka "client socket") - open a server socket to wait for incoming clients Either of these may find that it opens AF_INET, AF_INET6, or AF_UNIX sockets, depending on the values of host and port, and depending on what name lookup returns. Also, similar procedures are required for opening datagram sockets. In Ruby, this loop is done completely in C code, and there is a number of wrapper classes to access the various options: - TCPSocket opens a client stream socket, i.e. 
does getaddrinfo(), socket(), connect() - UDPSocket opens a client datagram socket. Same as TCPSocket, only that it uses SOCK_DATAGRAM - TCPServer opens a server stream socket, i.e. does getaddrinfo, socket, bind, and listen(5) - UDPServer likewise - UNIXSocket opens a client AF_UNIX socket - UNIXServer opens a server AF_UNIX socket - Socket does socket() only, allowing for subsequent other low-level calls There are some base classes: IPSocket is the base for all {TCP,UDP}{Socket,Server}; BasicSocket is base for UNIX{Socket,Server}, IPSocket, and Socket. I cannot say that I particularly like this API, but I could not easily find other/better generalizations. Therefore, no API is defined, yet. Please note that refactorizing "for internal use only" is not an acceptable solution: This is the Python library, so any function that gets defined has to be supported for quite some time. Any new API probably needs to take the existing SocketServer into account, also. Regards, Martin From mal@lemburg.com Wed Jul 25 13:11:20 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 25 Jul 2001 14:11:20 +0200 Subject: [Python-Dev] Daily CVS snapshots References: <3B5EA327.F63D37EF@lemburg.com> <15198.45557.572298.868990@cj42289-a.reston1.va.home.com> Message-ID: <3B5EB768.9EF49C94@lemburg.com> "Fred L. Drake, Jr." wrote: > > M.-A. Lemburg writes: > > Looks like the daily CVS snapshots are not working anymore: > ... > > python-20010501 tar.gz .zip > > I suspect this date is tied to a furniture move at the PythonLabs > office; Jeremy's workstation was unplugged without his assistance, and > so things may not have come back up correctly. > I don't think any of us know how he has this set up off-hand. Wouldn't it be possible to set up a CRON job on SF which takes care of these snapshots ? I have no idea how to do this myself (and probably don't have the necessary permissions), but since the pep2html.py tool also uploads into the SF web-area, I suppose that this is possible. 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From fdrake@acm.org Wed Jul 25 14:41:22 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 25 Jul 2001 09:41:22 -0400 (EDT) Subject: [Python-Dev] Daily CVS snapshots In-Reply-To: <3B5EB768.9EF49C94@lemburg.com> References: <3B5EA327.F63D37EF@lemburg.com> <15198.45557.572298.868990@cj42289-a.reston1.va.home.com> <3B5EB768.9EF49C94@lemburg.com> Message-ID: <15198.52354.271165.285519@cj42289-a.reston1.va.home.com> M.-A. Lemburg writes: > Wouldn't it be possible to set up a CRON job on SF which takes > care of these snapshots ? I have no idea how to do this myself Sure, it could be done. But we expect Jeremy to be back soon, and fixing this is probably a 5-min operation, whereas anyone else would need to create a new script to do the work and test it. I don't think it's worth worrying about for just a few days of snapshots. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From mal@lemburg.com Wed Jul 25 14:55:51 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 25 Jul 2001 15:55:51 +0200 Subject: [Python-Dev] Daily CVS snapshots References: <3B5EA327.F63D37EF@lemburg.com> <15198.45557.572298.868990@cj42289-a.reston1.va.home.com> <3B5EB768.9EF49C94@lemburg.com> <15198.52354.271165.285519@cj42289-a.reston1.va.home.com> Message-ID: <3B5ECFE7.3807915E@lemburg.com> "Fred L. Drake, Jr." wrote: > > M.-A. Lemburg writes: > > Wouldn't it be possible to set up a CRON job on SF which takes > > care of these snapshots ? I have no idea how to do this myself > > Sure, it could be done. But we expect Jeremy to be back soon, and > fixing this is probably a 5-min operation, whereas anyone else would > need to create a new script to do the work and test it. > I don't think it's worth worrying about for just a few days of > snapshots. 
Right on all accounts :-)

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From mal@lemburg.com Wed Jul 25 15:11:39 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 25 Jul 2001 16:11:39 +0200
Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together?
References: <3B5E946E.1105C79B@lemburg.com>
Message-ID: <3B5ED39B.7A48C0D@lemburg.com>

I just "discovered" the loooong threads on c.l.p about PEP 238 -- looks
like Guido is getting flamed badly here and I certainly don't want to add
to this, so just to summarize my previous post: the only issue I have
with PEP 238 (and all other PEPs trying to change basic numeric
properties in wild ways ;-) is backwards compatibility.

IMHO, these are all great features to have in a nice language; it's just
that the path to these features should be carefully laid out, and this is
probably *much* harder to get right than the features themselves.

BTW, I intend to make the mxNumber types subclassable once the dust has
settled over the PEP 253 (subclassing builtin types) et al. features. I
believe that this should provide a nice base for experimenting with
rationals, long integers, etc. For example, it might turn out that having
int / int create a rational number would solve most of the problems
mentioned on the various threads about PEP 238, since rationals don't
lose precision and simply defer the conversion to either integers or
floats to the point where one of the two interpretations is actually
needed by the code, e.g. an "i" parser marker will invoke truncation to
an integer while float(result) will apply the conversion to a floating
point number. If we make rationals a subtype of integers we wouldn't even
have PyInt_Check() problems at C level.... hmm, I'm getting carried away.
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From mclay@nist.gov Wed Jul 25 15:17:47 2001
From: mclay@nist.gov (Michael McLay)
Date: Wed, 25 Jul 2001 10:17:47 -0400
Subject: [meta-sig] Re: [Python-Dev] number-sig anyone?
In-Reply-To: <15197.62568.918195.19877@beluga.mojam.com>
References: <15197.44710.656892.910976@beluga.mojam.com> <15197.59261.754548.28233@anthem.wooz.org> <15197.62568.918195.19877@beluga.mojam.com>
Message-ID: <0107251017470G.02438@fermi.eeel.nist.gov>

On Tuesday 24 July 2001 06:19 pm, Skip Montanaro wrote:
> SM> Today I took a look at http://mail.python.org/mailman/listinfo and
> SM> could find no math-sig or number-sig mailing list.
>
> BAW> +1. If others agree, I'll create the sig.
>
> In light of the other responses to my mail, perhaps the python-numerics
> list on Sourceforge is as good a place to carry this conversation as a new
> SIG.

How about just holding the conversation on the python-numerics list? The
members of that list will probably be interested in any proposed changes
to the Python numeric model.

From guido@digicool.com Wed Jul 25 15:32:49 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 10:32:49 -0400
Subject: [Python-Dev] Daily CVS snapshots
In-Reply-To: Your message of "Wed, 25 Jul 2001 14:11:20 +0200." <3B5EB768.9EF49C94@lemburg.com>
References: <3B5EA327.F63D37EF@lemburg.com> <15198.45557.572298.868990@cj42289-a.reston1.va.home.com> <3B5EB768.9EF49C94@lemburg.com>
Message-ID: <200107251432.KAA02123@cj20424-a.reston1.va.home.com>

> "Fred L. Drake, Jr." wrote:
> >
> > M.-A. Lemburg writes:
> > > Looks like the daily CVS snapshots are not working anymore:
> > ...
> > > python-20010501 tar.gz .zip > > > > I suspect this date is tied to a furniture move at the PythonLabs > > office; Jeremy's workstation was unplugged without his assistance, and > > so things may not have come back up correctly. > > I don't think any of us know how he has this set up off-hand. [MAL] > Wouldn't it be possible to set up a CRON job on SF which takes > care of these snapshots ? I have no idea how to do this myself > (and probably don't have the necessary permissions), but since the > pep2html.py tool also uploads into the SF web-area, I suppose that > this is possible. SF makes the latest snapshot available to project administrators. I could give you the URL but I don't think you can see them. So there's no need to run anything on SF, I think. BTW, I don't think the date (May 1st) correlated to our furniture move. I dunno *what* happened on May 2nd, nor who makes the tar copies (I thought it was Barry?). --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Wed Jul 25 15:34:41 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 25 Jul 2001 10:34:41 -0400 (EDT) Subject: [Python-Dev] Daily CVS snapshots In-Reply-To: <200107251432.KAA02123@cj20424-a.reston1.va.home.com> References: <3B5EA327.F63D37EF@lemburg.com> <15198.45557.572298.868990@cj42289-a.reston1.va.home.com> <3B5EB768.9EF49C94@lemburg.com> <200107251432.KAA02123@cj20424-a.reston1.va.home.com> Message-ID: <15198.55553.597831.427828@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > SF makes the latest snapshot available to project administrators. I > could give you the URL but I don't think you can see them. So there's > no need to run anything on SF, I think. No; SF makes tarballs of the repository available, but not snapshots of the current state of things. > BTW, I don't think the date (May 1st) correlated to our furniture > move. I dunno *what* happened on May 2nd, nor who makes the tar > copies (I thought it was Barry?). 
I'm pretty sure Barry just pushes the repository backups to tape, but that Jeremy handles the snapshots. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From guido@digicool.com Wed Jul 25 15:38:29 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 25 Jul 2001 10:38:29 -0400 Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together? In-Reply-To: Your message of "Wed, 25 Jul 2001 16:11:39 +0200." <3B5ED39B.7A48C0D@lemburg.com> References: <3B5E946E.1105C79B@lemburg.com> <3B5ED39B.7A48C0D@lemburg.com> Message-ID: <200107251438.KAA02162@cj20424-a.reston1.va.home.com> > I just "discovered" the loooong threads on c.l.p about PEP 238 -- > looks like Guido is getting flamed badly here and I certainly don't > want to add to this, so just to summarize my previous post: the > only issue I have with PEP 238 (and all other PEPs trying to change > basic numeric properties in wild ways ;-) is backwards compatibility. > > IMHO, these are all great feature to have in a nice language, it's > just that the path to these features should be carefully laid > out and this is probably *much* harder to get right than the > features themselves. Yup, and that's what I'm focusing on in my responses. I plan to lay out a very careful compatibility track and discuss *that* with the community in earnest. > BTW, I intend to make the mxNumber types subclassable once the > dust has settled over the PEP 253 (subclassing builtin types) > et al. features. Very cool. All extension types should be subclassable! (Also all built-in types. But that's my job. :-) > I believe that this should provide a nice base for experimenting > with rationals, long integers, etc. 
> For example, it might turn out that having int / int create a rational
> number would solve most of the problems mentioned on the various threads
> about PEP 238 since rationals don't lose precision and simply defer the
> conversion to either integers or floats to the point where one of the
> two interpretations is actually needed by the code, e.g. an "i" parser
> marker will invoke truncation to an integer while float(result) will
> apply the conversion to a floating point number. If we make rationals a
> subtype of integers we wouldn't even have PyInt_Check() problems at C
> level.... hmm, I'm getting carried away.

For the folks concerned about code breakage, it doesn't make much of a
difference whether 1/2 returns a float or a rational -- in both cases the
integer division property that they want is broken.

I actually expect that most conversion jobs will be easy -- all those
folks who suffer from "Extreme Fear of Floating Point" (as Tim calls it)
can simply change every / into a // in their program (using a tool that
properly tokenizes) and they should be done, since most likely their code
never uses floating point. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@digicool.com Wed Jul 25 15:55:51 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 10:55:51 -0400
Subject: [Python-Dev] Memory leaks?
In-Reply-To: Your message of "Wed, 25 Jul 2001 10:28:05 +0200." <20010725102805.A24723@strakt.com>
References: <20010725102805.A24723@strakt.com>
Message-ID: <200107251455.KAA02295@cj20424-a.reston1.va.home.com>

[Martin Sjögren]
> I'm a bit curious about the memory handling of the Py_InitModule4...
>
> When adding methods, if PyCFunction_New fails, NULL is returned
> without the module object being DECREF'd, and similarly, if
> PyDict_SetItemString fails, NULL is returned with neither the module
> object nor the function object being DECREF'd.
>
> Is this a problem, or is this taken care of somewhere else?
The first one is not a problem. The module 'm' is received from PyImport_AddModule(), which (in a comment in the source) emphasizes that the return value does not have its reference count incremented. Instead, the module is kept alive because it is stored in sys.modules. The second one should really DECREF v when PyDict_SetItemString() fails. Ditto for the docstring. I've added a low-priority bug report, because this is very unlikely to happen. --Guido van Rossum (home page: http://www.python.org/~guido/) From perry@stsci.edu Wed Jul 25 16:02:21 2001 From: perry@stsci.edu (Perry Greenfield) Date: Wed, 25 Jul 2001 11:02:21 -0400 Subject: [Python-Dev] A future division proposal Message-ID: Clearly the issue of changing the semantics of division is a very [ahem] divisive one in the Python community. It seems to me that providing a foolproof way of providing backwards compatibility would go a long way to reducing the ire of those with a lot of code to inspect and change. I'm not particularly thrilled with the suggestions made so far. Suggesting that people continue to use an older version of Python if they have a problem with it is especially unsatisfying. Eventually that older version will have to be updated in some manner (resulting in a fork of Python) or they will have to make the necessary changes to their code (albeit over a longer time). Providing a new division operator (//) that handles integer division doesn't really solve the issue of inspecting the code either. There isn't any automatic way of telling when / or // should be used in old code. Command line switches or other mechanisms to indicate that division should have different behavior will confuse those trying to understand source code ("is this '/' a new or old division?"). Why not provide yet another division operator for backwards compatibility purpose? This operator would have exactly the same semantics as the current division operator. 
If this were available, it should be a relatively simple matter to provide a tool to convert all uses of the / operator to the new form for old code. With this solution, the code never has to be manually inspected to work with the new version, instead, it just has to be mechanically translated. The fact that the operator has different semantics will be evident in the translated code. I don't know what the best name or symbol would be (olddiv, ///?) and admittedly it is ugly to have 3 division operators. But the alternatives seem far, far uglier. Isn't this a case where practicality beats purity (for keywords or operators)? Perry Greenfield From barry@zope.com Wed Jul 25 16:21:56 2001 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 25 Jul 2001 11:21:56 -0400 Subject: [Python-Dev] Daily CVS snapshots References: <3B5EA327.F63D37EF@lemburg.com> Message-ID: <15198.58388.537611.447486@anthem.wooz.org> Looks okay to me: % tar ztvf python-20010501.tar.gz | head drwxr-sr-x jhylton/python 0 2001-05-01 08:08:51 Python-20010501/ drwxr-sr-x jhylton/python 0 2001-05-01 08:08:50 Python-20010501/BeOS/ drwxr-sr-x jhylton/python 0 2001-05-01 08:08:50 Python-20010501/BeOS/ar-1.1/ drwxr-sr-x jhylton/python 0 2001-05-01 08:08:50 Python-20010501/BeOS/ar-1.1/docs/ drwxr-sr-x jhylton/python 0 2001-05-01 08:08:50 Python-20010501/Demo/ drwxr-sr-x jhylton/python 0 2001-05-01 08:08:50 Python-20010501/Demo/classes/ -rwxr-xr-x jhylton/python 7816 1997-12-09 14:38:39 Python-20010501/Demo/classes/Complex.py -rwxr-xr-x jhylton/python 7728 1998-09-14 11:34:45 Python-20010501/Demo/classes/Dates.py -rwxr-xr-x jhylton/python 1249 1993-12-17 09:23:52 Python-20010501/Demo/classes/Dbm.py -rw-r--r-- jhylton/python 597 1993-12-17 09:23:52 Python-20010501/Demo/classes/README Also, I run a nightly script to grab the CVS repository, and that looks fine to me too. -Barry From barry@zope.com Wed Jul 25 16:30:18 2001 From: barry@zope.com (Barry A. 
Warsaw) Date: Wed, 25 Jul 2001 11:30:18 -0400 Subject: [Python-Dev] Daily CVS snapshots References: <3B5EA327.F63D37EF@lemburg.com> <15198.45557.572298.868990@cj42289-a.reston1.va.home.com> <3B5EB768.9EF49C94@lemburg.com> <200107251432.KAA02123@cj20424-a.reston1.va.home.com> Message-ID: <15198.58890.633459.435982@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> BTW, I don't think the date (May 1st) correlated to our GvR> furniture move. I dunno *what* happened on May 2nd, nor who GvR> makes the tar copies (I thought it was Barry?). Nope, I pull down the CVS repository snapshots from http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz (I grab a bunch of tarballs, including for Jython, Mailman, and mimelib). These are different than what's on the other page mentioned; those are just CVS working directory snapshots. -Barry From barry@zope.com Wed Jul 25 16:32:07 2001 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 25 Jul 2001 11:32:07 -0400 Subject: [Python-Dev] Daily CVS snapshots References: <3B5EA327.F63D37EF@lemburg.com> <15198.45557.572298.868990@cj42289-a.reston1.va.home.com> <3B5EB768.9EF49C94@lemburg.com> <200107251432.KAA02123@cj20424-a.reston1.va.home.com> <15198.55553.597831.427828@cj42289-a.reston1.va.home.com> Message-ID: <15198.58999.861192.236578@anthem.wooz.org> >>>>> "Fred" == Fred L Drake, Jr writes: Fred> I'm pretty sure Barry just pushes the repository backups Fred> to tape, but that Jeremy handles the snapshots. Yup. From Paul.Moore@atosorigin.com Wed Jul 25 16:53:45 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Wed, 25 Jul 2001 16:53:45 +0100 Subject: [Python-Dev] A future division proposal Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AF31@UKRUX002.rundc.uk.origin-it.com> > I don't know what the best name or symbol would be > (olddiv, ///?) and admittedly it is ugly to have 3 > division operators. But the alternatives seem far, > far uglier. 
> Isn't this a case where practicality beats purity (for keywords or
> operators)?

You could probably write a function to do this. There's no need for
anything built into Python. Actually, when I tried, I got into a bit of a
mess getting the type checks (which you need) right -

    def olddiv(n, m):
        if type(n) == type(m) == type(0):
            return n//m
        else:
            return n/m

But this needs the checks expanded to take longs into account. Which is
where it gets messy. But: a) It can be done, and b) The fact that it's
messy probably exposes what's wrong with the old semantics quite well :-)

Paul.

From skip@pobox.com Wed Jul 25 17:30:19 2001
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 25 Jul 2001 11:30:19 -0500
Subject: [Python-Dev] find method for lists
Message-ID: <15198.62491.532831.221942@beluga.mojam.com>

This has probably been discussed before, but why doesn't the list object
support a find method? Seems like if a non-exception-raising index method
is good enough for strings, it should be good enough for lists as well. I
realize I can use "l.count(x) and l.index(x)" to avoid the possible
ValueError. (Or maybe it's strings that shouldn't have find, but it can't
be deleted now for code breakage reasons?)

I'm mostly just curious. Am I missing something?

Skip

From fdrake@acm.org Wed Jul 25 17:29:41 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 25 Jul 2001 12:29:41 -0400 (EDT)
Subject: [Python-Dev] find method for lists
In-Reply-To: <15198.62491.532831.221942@beluga.mojam.com>
References: <15198.62491.532831.221942@beluga.mojam.com>
Message-ID: <15198.62453.125734.953636@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
> This has probably been discussed before, but why doesn't the list object
> support a find method? Seems like if a non-exception-raising index method
> is good enough for strings, it should be good enough for lists as well.

I've seen this brought up, but I'm not sure how important it is.
It certainly seems like this would be handy.

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Digital Creations

From guido@digicool.com Wed Jul 25 17:37:09 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 12:37:09 -0400
Subject: [Python-Dev] find method for lists
In-Reply-To: Your message of "Wed, 25 Jul 2001 11:30:19 CDT." <15198.62491.532831.221942@beluga.mojam.com>
References: <15198.62491.532831.221942@beluga.mojam.com>
Message-ID: <200107251637.MAA07861@cj20424-a.reston1.va.home.com>

> This has probably been discussed before, but why doesn't the list object
> support a find method? Seems like if a non-exception-raising index method
> is good enough for strings, it should be good enough for lists as well. I
> realize I can use "l.count(x) and l.index(x)" to avoid the possible
> ValueError. (Or maybe it's strings that shouldn't have find, but can't be
> deleted not for code breakage reasons?)
>
> I'm mostly just curious. Am I missing something?

List searching is much less common, and the string functions (both
index() and find()) have different semantics: they look for substrings,
while list.index() only searches for a particular item. With lists, if
you need this, you're probably using the wrong data structure. With
strings, substring matching is a standard pattern.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From SBrunning@trisystems.co.uk Wed Jul 25 18:00:55 2001
From: SBrunning@trisystems.co.uk (Simon Brunning)
Date: Wed, 25 Jul 2001 18:00:55 +0100
Subject: [Python-Dev] Small feature request - optional argument for string.strip()
Message-ID: <31575A892FF6D1118F5800600846864D78BEFD@intrepid>

Is it OK to post small feature requests directly to this list, or is
there some other mechanism for them? What I have in mind certainly isn't
worth a PEP.

The .split method on strings splits at whitespace by default, but takes
an optional argument allowing splitting by other strings.
The .strip method (and its siblings) always strip whitespace - on more
than one occasion I would have found it useful if these methods also took
an optional argument allowing other strings to be stripped. For example,
to strip, say, asterisks from a file you could do:

    >>> fred = '**word**word**'
    >>> fred.strip('*')
    word**word

Does this sound sensible/useful?

Cheers,
Simon Brunning.

-----------------------------------------------------------------------
The information in this email is confidential and may be legally
privileged. It is intended solely for the addressee. Access to this email
by anyone else is unauthorised. If you are not the intended recipient,
any disclosure, copying, distribution, or any action taken or omitted to
be taken in reliance on it, is prohibited and may be unlawful. TriSystems
Ltd. cannot accept liability for statements made which are clearly the
sender's own.

From mal@lemburg.com Wed Jul 25 18:04:41 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 25 Jul 2001 19:04:41 +0200
Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together?
References: <3B5E946E.1105C79B@lemburg.com> <3B5ED39B.7A48C0D@lemburg.com> <200107251438.KAA02162@cj20424-a.reston1.va.home.com>
Message-ID: <3B5EFC29.B4B30CC2@lemburg.com>

Guido van Rossum wrote:
> ...
> I actually expect that most conversion jobs will be easy -- all those
> folks who suffer from "Extreme Fear of Floating Point" (as Tim calls
> it) can simply change every / into a // in their program (using a tool
> that properly tokenizes) and they should be done, since most likely
> their code never uses floating point. :-)

Well, that would break floating points then... unless float // float
works like float / float does now. Perhaps you should simply add a
nb_altdivide slot to the numeric set of slots which is then called for
a // b. Floats would then reuse their nb_divide for //.
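The nb_altdivide slot proposed here is roughly what later landed as the
nb_floor_divide slot, surfaced in Python as __floordiv__ alongside
__truediv__ for the new '/'. A toy type (all names hypothetical, written
against modern Python) showing that the two operators dispatch through
independent hooks:

```python
class Wrapped:
    """Toy numeric wrapper giving '/' and '//' separate behaviour."""
    def __init__(self, value):
        self.value = value
    def __truediv__(self, other):   # plays the role of nb_divide
        return self.value / other
    def __floordiv__(self, other):  # plays the role of the extra slot
        return self.value // other

w = Wrapped(27)
print(w / 2)   # 13.5 -- true division hook
print(w // 2)  # 13   -- floor division hook
```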
BTW, my idea about rationals turns out not to work too well: 1/6 + 5/6
would give 6/6 == 1 while the current semantics return 0 in this case.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From guido@digicool.com Wed Jul 25 18:22:53 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 13:22:53 -0400
Subject: [Python-Dev] Small feature request - optional argument for string.strip()
In-Reply-To: Your message of "Wed, 25 Jul 2001 18:00:55 BST." <31575A892FF6D1118F5800600846864D78BEFD@intrepid>
References: <31575A892FF6D1118F5800600846864D78BEFD@intrepid>
Message-ID: <200107251722.NAA08073@cj20424-a.reston1.va.home.com>

> Is it OK to post small feature requests directly to this list, or is
> there some other mechanism for them? What I have in mind certainly
> isn't worth a PEP.

It's better to use the SF feature request tracker:

http://sourceforge.net/tracker/?atid=355470&group_id=5470&func=browse

> The .split method on strings splits at whitespace by default, but
> takes an optional argument allowing splitting by other strings. The
> .strip method (and its siblings) always strip whitespace - on more
> than one occasion I would have found it useful if these methods also
> took an optional argument allowing other strings to be stripped. For
> example, to strip, say, asterisks from a file you could do:
>
> >>> fred = '**word**word**'
> >>> fred.strip('*')
> word**word
>
> Does this sound sensible/useful?

Marginally.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@digicool.com Wed Jul 25 18:26:28 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 13:26:28 -0400
Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together?
In-Reply-To: Your message of "Wed, 25 Jul 2001 19:04:41 +0200."
<3B5EFC29.B4B30CC2@lemburg.com>
References: <3B5E946E.1105C79B@lemburg.com> <3B5ED39B.7A48C0D@lemburg.com> <200107251438.KAA02162@cj20424-a.reston1.va.home.com> <3B5EFC29.B4B30CC2@lemburg.com>
Message-ID: <200107251726.NAA08088@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> > ...
> > I actually expect that most conversion jobs will be easy -- all those
> > folks who suffer from "Extreme Fear of Floating Point" (as Tim calls
> > it) can simply change every / into a // in their program (using a tool
> > that properly tokenizes) and they should be done, since most likely
> > their code never uses floating point. :-)
>
> Well, that would break floating points then...

Not under the assumption that they will never use floating point.

> unless float // float works like float / float does now.

No, that would be a bad idea. float//float should either raise an
exception or return a rounded-towards-minus-infinity result.

> Perhaps you should simply
> add a nb_altdivide slot to the numeric set of slots which is then
> called for a // b. Floats would then reuse their nb_divide
> for //.

Something like this is part of the implementation plan (not yet part of
the patch).

> BTW, my idea about rationals turns out not to work too well:
> 1/6 + 5/6 would give 6/6 == 1 while the current semantics
> return 0 in this case.

Indeed, rationals can't ease the pain of PEP 238 -- but PEP 238 is
required before rationals can make sense.
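The rational-arithmetic point quoted above is easy to check with an exact
rational type; the fractions module (added to Python much later, used here
only as a stand-in) makes the mismatch concrete:

```python
from fractions import Fraction

# Exact rationals: 1/6 + 5/6 reduces to exactly 1 ...
print(Fraction(1, 6) + Fraction(5, 6))  # 1

# ... while classic (floor) integer division gives 0 + 0 = 0:
print(1 // 6 + 5 // 6)  # 0
```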
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com (Skip Montanaro) Wed Jul 25 18:48:51 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 25 Jul 2001 12:48:51 -0500 Subject: [Python-Dev] A future division proposal In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AF31@UKRUX002.rundc.uk.origin-it.com> References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF31@UKRUX002.rundc.uk.origin-it.com> Message-ID: <15199.1667.970769.894705@beluga.mojam.com> Paul> Actually, when I tried, I got into a bit of a mess getting the Paul> type checks (which you need) right - Paul> def olddiv(n,m): Paul> if type(n) == type(m) == type(0): Paul> return n//m Paul> else: Paul> return n/m Paul> But this needs the checks expanded to take longs into Paul> account. Which is where it gets messy. Wouldn't this work for ints and longs? def olddiv(n,m): ints = [type(0), type(0L)] if type(n) in ints and type(m) in ints: return n//m else: return n/m -- Skip Montanaro (skip@pobox.com) http://www.mojam.com/ http://www.musi-cal.com/ From perry@stsci.edu Wed Jul 25 18:48:10 2001 From: perry@stsci.edu (Perry Greenfield) Date: Wed, 25 Jul 2001 13:48:10 -0400 Subject: [Python-Dev] A future division proposal In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AF31@UKRUX002.rundc.uk.origin-it.com> Message-ID: [Paul Moore] > You could probably write a function to do this. There's no need > for anything > built into Python. > Sure, a functional form would be just as feasible and not require another operator. On the other hand there are perhaps a couple reasons not to do it this way: 1) It can make a mess of the expressions (if automatically translated) and make the code far less readable. Some may object to this. 2) If I recall some objected to a functional version on the basis of speed, but I'm not sure about that. Perry From mal@lemburg.com Wed Jul 25 19:14:15 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Wed, 25 Jul 2001 20:14:15 +0200 Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together? References: <3B5E946E.1105C79B@lemburg.com> <3B5ED39B.7A48C0D@lemburg.com> <200107251438.KAA02162@cj20424-a.reston1.va.home.com> <3B5EFC29.B4B30CC2@lemburg.com> <200107251726.NAA08088@cj20424-a.reston1.va.home.com> Message-ID: <3B5F0C77.DEE608F8@lemburg.com> Guido van Rossum wrote: > > > Guido van Rossum wrote: > > > ... > > > I actually expect that most conversion jobs will be easy -- all those > > > folks who suffer from "Extreme Fear of Floating Point" (as Tim calls > > > it) can simply change every / into a // in their program (using a tool > > > that properly tokenizes) and they should be done, since most likely > > > their code never uses floating point. :-) > > > > Well, that would break floating points then... > > Not under the assumption that they will never use floating point. Verifying such an assumption will be just as hard as auditing the code itself, I'm afraid. > > unless float // float works like float / float does now. > > No, that would be a bad idea. float//float should either raise an > exception or return a rounded-towards-minus-infinity result. Hmm, it would assure that your tool doesn't accidentally break floating point code. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@digicool.com Wed Jul 25 20:05:47 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 25 Jul 2001 15:05:47 -0400 Subject: [Python-Dev] A future division proposal In-Reply-To: Your message of "Wed, 25 Jul 2001 13:48:10 EDT." References: Message-ID: <200107251905.PAA08305@cj20424-a.reston1.va.home.com> Can we please keep this discussion out of python-dev? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Wed Jul 25 20:11:06 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 25 Jul 2001 15:11:06 -0400 Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together? In-Reply-To: Your message of "Wed, 25 Jul 2001 20:14:15 +0200." <3B5F0C77.DEE608F8@lemburg.com> References: <3B5E946E.1105C79B@lemburg.com> <3B5ED39B.7A48C0D@lemburg.com> <200107251438.KAA02162@cj20424-a.reston1.va.home.com> <3B5EFC29.B4B30CC2@lemburg.com> <200107251726.NAA08088@cj20424-a.reston1.va.home.com> <3B5F0C77.DEE608F8@lemburg.com> Message-ID: <200107251911.PAA08352@cj20424-a.reston1.va.home.com> > > Not under the assumption that they will never use floating point. > > Verifying such an assumption will be just as hard as auditing the > code itself, I'm afraid. Not for the biggest cry-babies -- I've seen several claims from folks who say that they never use floating point, and I believe them. > > > unless float // float works like float / float does now. > > > > No, that would be a bad idea. float//float should either raise an > > exception or return a rounded-towards-minus-infinity result. > > Hmm, it would assure that your tool doesn't accidentally > break floating point code. A better idea then would be to make float//float raise an exception. --Guido van Rossum (home page: http://www.python.org/~guido/) From gward@python.net Wed Jul 25 20:15:57 2001 From: gward@python.net (Greg Ward) Date: Wed, 25 Jul 2001 15:15:57 -0400 Subject: [Python-Dev] Small feature request - optional argument for string.strip() In-Reply-To: <31575A892FF6D1118F5800600846864D78BEFD@intrepid>; from SBrunning@trisystems.co.uk on Wed, Jul 25, 2001 at 06:00:55PM +0100 References: <31575A892FF6D1118F5800600846864D78BEFD@intrepid> Message-ID: <20010725151557.A2013@gerg.ca> On 25 July 2001, Simon Brunning said: > >>>fred = '**word**word**' > >>>fred.strip('*') > word**word > > Does this sound sensible/useful? 
Not really. I can't recall ever having a need for such a feature in any programming language I've ever used. Greg -- Greg Ward - programmer-at-big gward@python.net http://starship.python.net/~gward/ I haven't lost my mind; I know exactly where I left it. From gward@python.net Wed Jul 25 21:23:05 2001 From: gward@python.net (Greg Ward) Date: Wed, 25 Jul 2001 16:23:05 -0400 Subject: [Python-Dev] Branches here, branches there, branches everywherte Message-ID: <20010725162305.A2390@gerg.ca> I've finally started reviewing the changes made to the Distutils during my extended leave-of-absence, and making a few minor commits. So far I've just made these commits on the trunk, because I don't really understand anything else. (Yes, yes, I've read the CVS docs many many times. It just takes a while to sink in.) Am I right in doing this? Ie. will 2.2a2 be released from the trunk? Or should I be doing commits that I want in 2.2a2 on the 22a1 branch? Greg -- Greg Ward - nerd gward@python.net http://starship.python.net/~gward/ All of science is either physics or stamp collecting. From guido@digicool.com Wed Jul 25 21:44:39 2001 From: guido@digicool.com (Guido van Rossum) Date: Wed, 25 Jul 2001 16:44:39 -0400 Subject: [Python-Dev] Branches here, branches there, branches everywherte In-Reply-To: Your message of "Wed, 25 Jul 2001 16:23:05 EDT." <20010725162305.A2390@gerg.ca> References: <20010725162305.A2390@gerg.ca> Message-ID: <200107252044.QAA08895@cj20424-a.reston1.va.home.com> > I've finally started reviewing the changes made to the Distutils during > my extended leave-of-absence, and making a few minor commits. Great! > So far I've just made these commits on the trunk, because I don't > really understand anything else. (Yes, yes, I've read the CVS docs > many many times. It just takes a while to sink in.) Am I right in > doing this? Ie. will 2.2a2 be released from the trunk? Or should I > be doing commits that I want in 2.2a2 on the 22a1 branch? 
You needn't worry about the branches at all. Everything checked in on the trunk will be merged into the branch. In fact, if you check it in on the trunk *and* on the branch, you'd end up creating more pain for the bot doing the merges. Checking in on the branch should only be done if the change specifically applies to the branch only. For example, only type/class unification changes should be checked in on the descr-branch. And yes, I plan to merge the descr-branch back into the trunk, hopefully (but not yet certainly) before 2.2a2 is due. Branches are a necessary evil. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Wed Jul 25 21:52:04 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 25 Jul 2001 16:52:04 -0400 Subject: [Python-Dev] Branches here, branches there, branches everywherte In-Reply-To: <20010725162305.A2390@gerg.ca> Message-ID: [Greg Ward] > I've finally started reviewing the changes made to the Distutils during > my extended leave-of-absence, and making a few minor commits. Welcome back! I hope you've recovered from Candianness . > So far I've just made these commits on the trunk, Good! > because I don't really understand anything else. That's the way Guido likes it . > (Yes, yes, I've read the CVS docs many many times. It just takes a > while to sink in.) Am I right in doing this? Yes. > Ie. will 2.2a2 be released from the trunk? Unknown at this time. > Or should I be doing commits that I want in 2.2a2 on the 22a1 branch? Definitely not. Trunk. There is no 22a1 branch, BTW, 22a1 is just a tag applied at the time of the 2.2a1 release. 2.2a1 was released from the descr-branch. It's my job to magically merge trunk checkins into descr-branch while you sleep. This approach will become a nightmare if people check stuff into descr-branch themselves (except for Guido and Fred and me, who are doing some work *specific* to descr-branch). 
The best thing you can do to help is look for massive clumps of merge checkins (usually early AM EDT on Saturdays), then run whatever distutils tests you have *from* a descr-branch checkout. I don't believe the ongoing distutils merges on descr-branch get tested at all now, and that's not good. From martin@loewis.home.cs.tu-berlin.de Wed Jul 25 22:06:03 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 25 Jul 2001 23:06:03 +0200 Subject: [Python-Dev] post mortem after threading deadlock? Message-ID: <200107252106.f6PL63301643@mira.informatik.hu-berlin.de> > Is there any possibility of getting some post-mortem info out of a > multi-threaded system whose threads are deadlocked? You could attach to the process using a C debugger (e.g. gdb), and have a look at the C stacks of each thread. Then, you can look into the variables of the eval_code invocations to get a clue of what the Python stack is. Regards, Martin P.S. Isn't this off-topic for python-dev, and rather a question to python-list or python-tutor? From bckfnn@worldonline.dk Wed Jul 25 22:27:37 2001 From: bckfnn@worldonline.dk (Finn Bock) Date: Wed, 25 Jul 2001 21:27:37 GMT Subject: [Python-Dev] zipfiles on sys.path Message-ID: <3b5f2b11.50733180@mail.wanadoo.dk> Hi, We have recently added support for .zip files on sys.path to Jython. Now, after the fact, I wondered what prior art exists for such a feature and the semantic that is used. We came up with a solution where: - It is the name (as a string) of the zipfile that can be added to sys.path. - The zipfile is opened on the next import that checks this sys.path entry and kept open until all references to the zipfile is gone (including references from packages). - A side effect of the implementation is that the identity of a string on sys.path or __path__ might change during import. The value of the string stay the same. 
- The __path__ vrbl in a package 'foo.bar' loaded from zipfile.zip will have the value ['zipfile.zip!foo/bar'] and this same syntax can also be used when adding entries to sys.path and __path__. I hope it doesn't conflict too much with the solutions that already exist or the solution (if any) that CPython might choose to adopt. regards, finn From skip@pobox.com (Skip Montanaro) Wed Jul 25 22:57:03 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 25 Jul 2001 16:57:03 -0500 Subject: [Python-Dev] Re: post mortem after threading deadlock? In-Reply-To: <200107252106.f6PL63301643@mira.informatik.hu-berlin.de> References: <200107252106.f6PL63301643@mira.informatik.hu-berlin.de> Message-ID: <15199.16559.543625.870439@beluga.mojam.com> [suggestions elided - thanks, I will look into gdb's thread debugging capabilities] Martin> P.S. Isn't this off-topic for python-dev, and rather a question Martin> to python-list or python-tutor? Well sort of. However, if you read my problem as a thinly veiled enhancement request, the people most likely to be able to implement such a thing are on this list. I sort of suspect that from the Python level about all I can do today is what I'm already doing - poking around the various locks and semaphores that the threads all share. Skip From jack@oratrix.nl Wed Jul 25 22:58:25 2001 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 25 Jul 2001 23:58:25 +0200 Subject: [Python-Dev] zipfiles on sys.path In-Reply-To: Message by bckfnn@worldonline.dk (Finn Bock) , Wed, 25 Jul 2001 21:27:37 GMT , <3b5f2b11.50733180@mail.wanadoo.dk> Message-ID: <20010725215830.2F49D14A25D@oratrix.oratrix.nl> Recently, bckfnn@worldonline.dk (Finn Bock) said: > Hi, > > We have recently added support for .zip files on sys.path to Jython. > Now, after the fact, I wondered what prior art exists for such a feature > and the semantic that is used. MacPython uses a similar scheme, but slightly different.
If there is a file on sys.path it will be inspected for "PYC " resources with the module name. (The main use for this feature is that you can put the application itself in sys.path, compile all your modules into PYC resources and you have a frozen Python program without having used a C compiler. A boon on a platform where all C compilers cost money or are arcane). I'll go thru the issues one by one: > We came up with a solution where: > > - It is the name (as a string) of the zipfile that can be added to > sys.path. Same. > - The zipfile is opened on the next import that checks this sys.path > entry and kept open until all references to the zipfile is gone > (including references from packages). Different, MacPython opens it every time. With the exception of the application itself, which is already open (and this is checked). What MacPython does do, and what speeds up imports immensely, is that it interns all sys.path strings, and keeps a cache of the sys.path entries that are known to be files, not directories. This forestalls the import code testing many non-existing paths for existence (/path/to/myfile.zip/mod.py, path/o/myfile.zip/mod.pyc, etc). > - A side effect of the implementation is that the identity of a string > on sys.path or __path__ might change during import. The value of the > string stay the same. > > - The __path__ vrbl in a package 'foo.bar' loaded from zipfile.zip > will have the value ['zipfile.zip!foo/bar'] and this same syntax can > also be used when adding entries to sys.path and __path__. __path__ is set to the package name. I'm not sure of the exact rationale for this (Just did the package support) but it seems to work fine. 
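[Editor's note: the semantics Finn and Jack compare above eventually landed in CPython itself, via the built-in zipimport mechanism -- an anachronism relative to this 2001 thread. A sketch of the user-visible behavior in modern Python, where the archive's file name on sys.path acts like a directory (the names lib.zip, foo, and VALUE are illustrative):]

```python
import os
import sys
import tempfile
import zipfile

# Build a small archive containing a package.
tmp = tempfile.mkdtemp()
archive = os.path.join(tmp, "lib.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("foo/__init__.py", "")
    zf.writestr("foo/bar.py", "VALUE = 42\n")

# The archive's *name* goes on sys.path, just as in the Jython scheme.
sys.path.insert(0, archive)
import foo.bar

print(foo.bar.VALUE)   # 42
print(foo.__path__)    # e.g. ['.../lib.zip/foo'] -- the zip acts as a directory
```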
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From tim.one@home.com Wed Jul 25 23:11:16 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 25 Jul 2001 18:11:16 -0400 Subject: [Python-Dev] Re: post mortem after threading deadlock? In-Reply-To: <15199.16559.543625.870439@beluga.mojam.com> Message-ID: [Skip Montanaro] > ... > However, if you read my problem as a thinly veiled enhancement request, > the people most likely to be able to implement such a thing are on this > list. I sort of suspect that from the Python level about all I can do > today is what I'm already doing - poking around the various locks > and semaphores that the threads all share. I've got better advice : Never use semaphores for anything. Never use locks except for dirt-simple one- or two-line critical sections. For everything but the latter, always use condition variables. They're the only synch protocol I've seen that non-specialist thread programmers can use without routinely screwing themselves. The genius of the condvar protocol is that, used correctly, you *always* run-time test your crucial assumptions about non-local state (and automatically do so under the protection of a critical section), and *always* loop back to try again if your hopes or assumptions turn out not to be true. This saves you from a universe of possible problems with non-local state changing in unanticipated ways. 
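[Editor's note: Tim's condvar protocol maps directly onto Python's threading.Condition -- shown here with the modern context-manager API; in 2001 you would spell out the acquire/release pairs. The crucial points are that the predicate is tested under the lock and re-tested in a loop after every wait:]

```python
import threading

class Slot:
    """A one-item mailbox built on the condvar protocol."""

    def __init__(self):
        self.cond = threading.Condition()
        self.item = None

    def put(self, item):
        with self.cond:
            # Always *test* the assumption, and loop back if it fails:
            while self.item is not None:
                self.cond.wait()
            self.item = item
            self.cond.notify_all()

    def get(self):
        with self.cond:
            while self.item is None:
                self.cond.wait()
            item, self.item = self.item, None
            self.cond.notify_all()
            return item

slot = Slot()
t = threading.Thread(target=slot.put, args=("hello",))
t.start()
print(slot.get())   # hello
t.join()
```

This is essentially what the standard Queue module does internally, which is why it is safe to hand to non-specialists.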
if-you-had-used-condvars-you-wouldn't-be-debugging-now-ly y'rs - tim From just@letterror.com Wed Jul 25 23:11:06 2001 From: just@letterror.com (Just van Rossum) Date: Thu, 26 Jul 2001 00:11:06 +0200 Subject: [Python-Dev] zipfiles on sys.path In-Reply-To: <20010725215830.2F49D14A25D@oratrix.oratrix.nl> Message-ID: <20010726001112-r01010700-776380a4-0910-010c@213.84.27.177> Jack Jansen wrote: > > - The __path__ vrbl in a package 'foo.bar' loaded from zipfile.zip > > will have the value ['zipfile.zip!foo/bar'] and this same syntax can > > also be used when adding entries to sys.path and __path__. > > __path__ is set to the package name. I'm not sure of the exact > rationale for this (Just did the package support) but it seems to work > fine. I don't know the rationale either (or at least: not anymore ;-), I just copied the behavior of frozen packages (as in freeze.py) from import.c. PyImport_ImportFrozenModule() contains this snippet: if (ispackage) { /* Set __path__ to the package name */ ... Just From guido@zope.com Wed Jul 25 23:16:45 2001 From: guido@zope.com (Guido van Rossum) Date: Wed, 25 Jul 2001 18:16:45 -0400 Subject: [Python-Dev] Re: post mortem after threading deadlock? In-Reply-To: Your message of "Wed, 25 Jul 2001 18:11:16 EDT." References: Message-ID: <200107252216.SAA09493@cj20424-a.reston1.va.home.com> > I've got better advice : Never use semaphores for anything. Never > use locks except for dirt-simple one- or two-line critical sections. For > everything but the latter, always use condition variables. They're the only > synch protocol I've seen that non-specialist thread programmers can use > without routinely screwing themselves. The genius of the condvar protocol > is that, used correctly, you *always* run-time test your crucial assumptions > about non-local state (and automatically do so under the protection of a > critical section), and *always* loop back to try again if your hopes or > assumptions turn out not to be true. 
This saves you from a universe of > possible problems with non-local state changing in unanticipated ways. I believe that Aahz, in his thread tutorial, has even more radical advice: use the Queue module for all inter-thread communication. It is even higher level than semaphores, and has the same nice properties. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Wed Jul 25 23:31:01 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 25 Jul 2001 18:31:01 -0400 Subject: [Python-Dev] Re: post mortem after threading deadlock? In-Reply-To: <200107252216.SAA09493@cj20424-a.reston1.va.home.com> Message-ID: [Guido van Rossum] > I believe that Aahz, in his thread tutorial, has even more radical > advice: use the Queue module for all inter-thread communication. It > is even higher level than semaphores, and has the same nice > properties. If they're *flexible* enough for Skip, I endorse Queues too. Else condvars are the bee's second-prettiest knees. From barry@scottb.demon.co.uk Wed Jul 25 23:59:32 2001 From: barry@scottb.demon.co.uk (Barry Scott) Date: Wed, 25 Jul 2001 23:59:32 +0100 Subject: [Python-Dev] Please have a look at proposed doc changes for time epoch In-Reply-To: <15189.61907.883300.127987@beluga.mojam.com> Message-ID: <001c01c1155d$77c64970$060210ac@private> If you use the POSIX.1 functions the base time is always 1 Jan 1970 even if the OS you are running on has a different epoch. Barry From skip@pobox.com (Skip Montanaro) Thu Jul 26 00:04:11 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 25 Jul 2001 18:04:11 -0500 Subject: [Python-Dev] Re: post mortem after threading deadlock? In-Reply-To: <200107252216.SAA09493@cj20424-a.reston1.va.home.com> References: <200107252216.SAA09493@cj20424-a.reston1.va.home.com> Message-ID: <15199.20587.527593.965607@beluga.mojam.com> Tim> I've got better advice : Never use semaphores for anything. 
Tim> Never use locks except for dirt-simple one- or two-line critical Tim> sections. I didn't find either particularly difficult to work with. Guess I was fooling myself. ;-) Guido> I believe that Aahz, in his thread tutorial, has even more Guido> radical advice: use the Queue module for all inter-thread Guido> communication. It is even higher level than semaphores, and has Guido> the same nice properties. Ah, thanks! I saw the mention of Queues in his slides and thought he was talking about a queue class that he wrote as an add-on. It never occurred to me that it would be a core library module. doh! A queue is really what I want anyway. I'm sharing a limited pool of database connections between a (potentially large) set of threads. one-regulation-head-slap-has-been-administered-sir!-ly y'rs, Skip From paulp@ActiveState.com Thu Jul 26 00:29:32 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Wed, 25 Jul 2001 16:29:32 -0700 Subject: [Python-Dev] Re: Static method and class method comments References: <9jn382$lqf$1@license1.unx.sas.com> Message-ID: <3B5F565C.5F452AC5@ActiveState.com> >Kevin Smith wrote: > > I am very glad to see the new features of Python 2.2, but I do have a minor > gripe about the implementation of static and class methods. My issue stems > from the fact that when glancing over Python code that uses static or class > methods, you cannot tell that a method is a static or class method by looking > at the point where it is defined. > ... Agree strongly. This will also be a problem for documentation generation tools, type extraction tools and class browsers. I believe it would be easy to add a contextual keyword > class C: > def static foo(x, y): > print "classmethod", x, y -- Take a recipe. Leave a recipe. Python Cookbook! 
http://www.ActiveState.com/pythoncookbook From greg@cosc.canterbury.ac.nz Thu Jul 26 00:31:53 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Jul 2001 11:31:53 +1200 (NZST) Subject: Python 3 (Re: [Python-Dev] shouldn't we be considering all pending numeric proposals together?) In-Reply-To: <3B5E946E.1105C79B@lemburg.com> Message-ID: <200107252331.LAA04164@s454.cosc.canterbury.ac.nz> "M.-A. Lemburg" : > how about a "from __semantics__ import > non_integer_division" which does not have a timeout attached > to it ?! If we're to have some form of version declaration in perpetuity, I hope we can find a MUCH nicer syntax for it than that! I suggest simply putting python 3 at the top of the module (and calling the first release which supports it 3.0). This would completely eliminate all backwards-compatibility objections at a stroke; there wouldn't even be any need for warnings. And we wouldn't necessarily be committing to "dragging the past around forever", since there's always the possibility of dropping support for older versions in some release suitably far in the future. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From paulp@ActiveState.com Thu Jul 26 00:38:01 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Wed, 25 Jul 2001 16:38:01 -0700 Subject: [Python-Dev] number-sig anyone? References: <15197.44710.656892.910976@beluga.mojam.com> Message-ID: <3B5F5859.EBDFDC84@ActiveState.com> Skip Montanaro wrote: > > Dev-ers, > >... > > Today I took a look at http://mail.python.org/mailman/listinfo and could > find no math-sig or number-sig mailing list. If Python's number system is > going to change in one or more backwards-incompatible I think there may only > be one chance to get it right. That implies there is a "right". 
There isn't. There are just a bunch of opinions. And I can't imagine that a SIG would lead to a convergence of opinions because people come from such radically different backgrounds. I would rather see a rational-sig, float-division-sig, decimal-sig and so forth. Each could come up with a "locally coherent" plan and Guido could pick and choose. Otherwise it might as well be called numeric-flame-flame-flame-sig. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From skip@pobox.com (Skip Montanaro) Thu Jul 26 00:54:05 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 25 Jul 2001 18:54:05 -0500 Subject: [Python-Dev] Re: Static method and class method comments In-Reply-To: <3B5F565C.5F452AC5@ActiveState.com> References: <9jn382$lqf$1@license1.unx.sas.com> <3B5F565C.5F452AC5@ActiveState.com> Message-ID: <15199.23581.653505.336350@beluga.mojam.com> Paul> Agree strongly. This will also be a problem for documentation Paul> generation tools, type extraction tools and class browsers. I Paul> believe it would be easy to add a contextual keyword >> class C: >> def static foo(x, y): >> print "classmethod", x, y Even better yet, why not simply reuse the class keyword in this context: class C: def class foo(x, y): print "classmethod", x, y -- Skip Montanaro (skip@pobox.com) http://www.mojam.com/ http://www.musi-cal.com/ From skip@pobox.com (Skip Montanaro) Thu Jul 26 05:05:42 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 25 Jul 2001 23:05:42 -0500 Subject: [Python-Dev] number-sig anyone? In-Reply-To: <3B5F5859.EBDFDC84@ActiveState.com> References: <15197.44710.656892.910976@beluga.mojam.com> <3B5F5859.EBDFDC84@ActiveState.com> Message-ID: <15199.38678.456223.981364@beluga.mojam.com> Skip> Today I took a look at http://mail.python.org/mailman/listinfo and Skip> could find no math-sig or number-sig mailing list.
If Python's Skip> number system is going to change in one or more backwards- Skip> incompatible [ways] I think there may only be one chance to get it Skip> right. Paul> That implies there is a "right". There isn't. There are just a Paul> bunch of opinions. And I can't imagine that a SIG would lead to a Paul> convergence of opinions because people come from such radically Paul> different backgrounds. I would rather see a rational-sig, Paul> float-division-sig, decimal-sig and so forth. Each could come up Paul> with a "locally coherent" plan and Guido could pick and choose. Paul, My operational definition of "right" in this context is perhaps different than yours. I realize there is no obviously right numeric model. If there was, most programming languages would use it and we wouldn't need bots like Tim to help guide us through minefields like IEEE 754. By "right" I mean that we can arrive at a long-term stable numeric model that will be accepted by both the Python community as a whole *and* by the decision makers who will vote thumbs up or down on adopting Python in their organizations. One of the most vocal opponents to PEP 238 (I won't mention his name, but his initials are S.H. ;-) lamented loudly that he'd be a laughing stock in his company because of that "division thing". He mentioned something about being a "right arse" I think. By having a well-considered overall plan for Python's numeric behavior, if you have to make an incompatible change today, another next year and a third two years after that, you can point to the plan that shows people where you're headed, how you plan to get there, and how they can write their programs in the meantime so as to be as resilient as possible. Without such a plan -- or with several potentially competing plans as you proposed -- every change proposed or made will simply fuel the fires of those people who dismiss Python because "it's unstable". 
The funny thing is, Python's semantics changed so little for so long that by comparison the rate of change does seem pretty high, but it's still much better than many applications or application libraries (such as the relatively recent glibc upheaval or the API changes Gtk is undergoing now). And let's not even mention the folks in Redmond... Skip From tim.one@home.com Thu Jul 26 05:41:02 2001 From: tim.one@home.com (Tim Peters) Date: Thu, 26 Jul 2001 00:41:02 -0400 Subject: [Python-Dev] number-sig anyone? In-Reply-To: <15199.38678.456223.981364@beluga.mojam.com> Message-ID: Briefly: [Skip Montanaro] > ... > I realize there is no obviously right numeric model. There are many that are reasonable, though -- and that's the other half of the problem. > If there was, most programming languages would use it and we wouldn't > need bots like Tim to help guide us through minefields like IEEE 754. You've never seen a language that supports 754 properly (== as the committee intended). Certainly not Python, C or Java. It's far less a minefield when properly supported, and was designed to be much saner than previous binary f.p. systems. One problem is that languages only support the *corner* of 754 that intersects with 1950's Fortran; the other is that very few chips other than Pentium support the 754 80-bit extended format that's key to making binary f.p. much safer for non-experts. OTOH, the "proper 754 support" in the C99 Annex is a minefield of its own. > By "right" I mean that we can arrive at a long-term stable numeric > model that will be accepted by both the Python community as a whole > *and* by the decision makers who will vote thumbs up or down on > adopting Python in their organizations. The danger I see here is that Scheme's "numeric tower" is almost obviously a reasonable numeric model, but in practice is so vague that you can't really count on anything beyond simple small-int arithmetic working the same way across Scheme implementations. 
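[Editor's note: Tim's point that reasonable numeric models still disagree observably is easy to make concrete, using the fractions module from much later Pythons as a stand-in for the rationals under discussion:]

```python
from fractions import Fraction

# 1/49 has no exact double-precision representation, so a round trip drifts:
print((1 / 49) * 49 == 1)          # False with IEEE-754 doubles

# An exact rational type round-trips by construction:
print(Fraction(1, 49) * 49 == 1)   # True
```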
Guido appears to have come to an appreciation of that model in the abstract, but hoping that there's not much difference between floats and rationals in practice "because they represent the same mathematical values" just isn't going to pan out (IMO). 1/49*49 equals 1 or it doesn't; it doesn't using IEEE doubles, it does using rationals, and the difference will be significant to programs. Certainly better to switch from floats to rationals someday than to move in the other direction, though. I've come to suspect the issues *may* be complicated . From mal@lemburg.com Thu Jul 26 09:32:28 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 26 Jul 2001 10:32:28 +0200 Subject: [Python-Dev] Re: Static method and class method comments References: <9jn382$lqf$1@license1.unx.sas.com> <3B5F565C.5F452AC5@ActiveState.com> <15199.23581.653505.336350@beluga.mojam.com> Message-ID: <3B5FD59C.5FA52611@lemburg.com> Skip Montanaro wrote: > > Paul> Agree strongly. This will also be a problem for documentation > Paul> generation tools, type extraction tools and class browsers. I > Paul> believe it would be easy to add a contextual keyword > > >> class C: > >> def static foo(x, y): > >> print "classmethod", x, y > > Even better yet, why not simply reuse the class keyword in this context: > > class C: > def class foo(x, y): > print "classmethod", x, y AFAIK, the only way to add classmethods to a class is by doing
In that sense you don't have a problem with parsing doc-extraction tools at all: they don't have a chance of finding the class methods anyway ;-) Importing doc-extraction tools won't have a problem with these though and neither will human doc-extraction tools, since these will note that the class methods are special :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From thomas.heller@ion-tof.com Thu Jul 26 10:22:59 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 26 Jul 2001 11:22:59 +0200 Subject: [Python-Dev] Re: Static method and class method comments References: <9jn382$lqf$1@license1.unx.sas.com> <3B5F565C.5F452AC5@ActiveState.com> <15199.23581.653505.336350@beluga.mojam.com> <3B5FD59C.5FA52611@lemburg.com> Message-ID: <017d01c115b4$8ff5ed00$e000a8c0@thomasnotebook> From: "M.-A. Lemburg" > AFAIK, the only way to add classmethods to a class is by doing > so after creation of the class object. Wrong IMO: C:\>c:\python22\python.exe Python 2.2a1 (#21, Jul 18 2001, 04:25:46) [MSC 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. >>> class X: ... def foo(*args): return args ... goo = classmethod(foo) ... global x ... x = (foo, goo) ... >>> print x (, ) >>> print X.foo, X.goo > The classmethod is created before the class is done, it is converted into a method bound to the class when you access it. Thomas From mal@lemburg.com Thu Jul 26 11:10:36 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 26 Jul 2001 12:10:36 +0200 Subject: [Python-Dev] Re: Static method and class method comments References: <9jn382$lqf$1@license1.unx.sas.com> <3B5F565C.5F452AC5@ActiveState.com> <15199.23581.653505.336350@beluga.mojam.com> <3B5FD59C.5FA52611@lemburg.com> <017d01c115b4$8ff5ed00$e000a8c0@thomasnotebook> Message-ID: <3B5FEC9C.4248B663@lemburg.com> Thomas Heller wrote:
>
> From: "M.-A. Lemburg"
> > AFAIK, the only way to add classmethods to a class is by doing
> > so after creation of the class object.
> Wrong IMO:
>
> C:\>c:\python22\python.exe
> Python 2.2a1 (#21, Jul 18 2001, 04:25:46) [MSC 32 bit (Intel)] on win32
> Type "copyright", "credits" or "license" for more information.
> >>> class X:
> ...     def foo(*args): return args
> ...     goo = classmethod(foo)
> ...     global x
> ...     x = (foo, goo)
> ...
> >>> print x
> (<function foo at 0x...>, <classmethod object at 0x...>)
> >>> print X.foo, X.goo
>
> The classmethod is created before the class is done,
> it is converted into a method bound to the class
> when you access it.

Touché :-)

-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From martin@strakt.com Thu Jul 26 12:12:23 2001 From: martin@strakt.com (Martin Sjögren) Date: Thu, 26 Jul 2001 13:12:23 +0200 Subject: [Python-Dev] Import hassle Message-ID: <20010726131222.A30459@strakt.com> Hello I've been writing quite a few mails lately, all concerning import problems. I thought I'd write a little longer mail to explain what I'm doing and what I find strange here. Basically all (at least the 10-20 ones I've checked) the C modules in the distribution have one thing in common: if something in their initFoo() function fails, they return without freeing any memory. I.e. they return an incomplete module.
The only way I can think of that one of the standard modules could fail is when you're out of memory, and that's kinda hard to simulate, so I put in a faked failure, i.e. I raised an exception and returned prematurely (in one of my own C modules, not one in the distribution!). The code looks like this:

PyErr_SetString(PyExc_ImportError, "foo");
return;
/* do other things here, this "fails" */

>>> import Foo
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: foo
>>> import Foo
>>> dir()
['Foo', '__builtins__', '__doc__', '__name__']

Huh?! How did this happen? What is Foo doing there? Even more interesting, say that I create a submodule and throw in a bunch of PyCFunctions in it (I stole the code from InitModule since I don't know how to fake submodules in a C module in another way, is there a way?). I create the module, fail on inserting it into the dictionary and DECREF it. Now, that ought to free the darn submodule, doesn't it? Anyway, I wrote a simple "mean" script to test this:

try: import Foo
except: import Foo
while 1:
    try: reload(Foo)
    except: pass

And this leaks memory like I-don't-know-what! What memory doesn't get freed? Now to my questions: What exactly SHOULD I do when loading my module fails halfway through? Common sense says I should free the memory I've used and the module object ought to be unusable. Why-oh-why can I import Foo, catch the exception, import it again and it shows up in the dictionary? What's the purpose of this? How do I work with submodules in a C module? I find the import semantics really weird here, something is not quite right...
Regards, Martin Sjögren -- Martin Sjögren martin@strakt.com ICQ : 41245059 Phone: +46 (0)31 405242 Cell: +46 (0)739 169191 GPG key: http://www.strakt.com/~martin/gpg.html From Donald Beaudry Thu Jul 26 15:00:46 2001 From: Donald Beaudry (Donald Beaudry) Date: Thu, 26 Jul 2001 10:00:46 -0400 Subject: [Python-Dev] Re: Static method and class method comments References: <9jn382$lqf$1@license1.unx.sas.com> <3B5F565C.5F452AC5@ActiveState.com> Message-ID: <200107261400.KAA28632@localhost.localdomain> Paul Prescod wrote,
> >Kevin Smith wrote:
> >
> > I am very glad to see the new features of Python 2.2, but I do have a minor
> > gripe about the implementation of static and class methods. My issue stems
> > from the fact that when glancing over Python code that uses static or class
> > methods, you cannot tell that a method is a static or class method by looking
> > at the point where it is defined.
> > ...
>
> Agree strongly. This will also be a problem for documentation generation
> tools, type extraction tools and class browsers. I believe it would be
> easy to add a contextual keyword
>
> > class C:
> >     def static foo(x, y):
> >         print "classmethod", x, y

My favorite way to spell this is:

class C:
    class __class__:
        def foo(c, x, y):
            print "class method", x, y

Or in words, class methods defined in their own name space, inside the class __class__. As for the distinction between "static methods" and "class methods", I haven't been able to convince myself that it's useful. -- Donald Beaudry Ab Initio Software Corp. 201 Spring Street donb@abinito.com Lexington, MA 02421 ...So much code, so little time... From guido@zope.com Thu Jul 26 16:25:51 2001 From: guido@zope.com (Guido van Rossum) Date: Thu, 26 Jul 2001 11:25:51 -0400 Subject: [Python-Dev] Import hassle In-Reply-To: Your message of "Thu, 26 Jul 2001 13:12:23 +0200."
<20010726131222.A30459@strakt.com> References: <20010726131222.A30459@strakt.com> Message-ID: <200107261525.LAA11553@cj20424-a.reston1.va.home.com>
> I've been writing quite a few mails lately, all concerning import
> problems. I thought I'd write a little longer mail to explain what I'm
> doing and what I find strange here.

Martin, Why does this interest you? This never happens in reality unless your memory allocator is broken, and then you have worse problems than "leaks". Also, why are you posting to python-dev?

> Basically all (at least the 10-20 ones I've checked) the C modules in the
> distribution have one thing in common: if something in their initFoo()
> function fails, they return without freeing any memory. I.e. they return
> an incomplete module.
>
> The only way I can think of that one of the standard modules could
> fail is when you're out of memory, and that's kinda hard to
> simulate, so I put in a faked failure, i.e. I raised an exception
> and returned prematurely (in one of my own C modules, not one in the
> distribution!).
>
> The code looks like this:
> PyErr_SetString(PyExc_ImportError, "foo");
> return;
> /* do other things here, this "fails" */
>
> >>> import Foo
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> ImportError: foo
> >>> import Foo
> >>> dir()
> ['Foo', '__builtins__', '__doc__', '__name__']
>
> Huh?! How did this happen? What is Foo doing there?

In general, when import fails after a certain point, the module has already been created in sys.modules. There is a reason for this, having to do with recursive imports.

> Even more interesting, say that I create a submodule and throw in a
> bunch of PyCFunctions in it (I stole the code from InitModule since
> I don't know how to fake submodules in a C module in another way, is
> there a way?). I create the module, fail on inserting it into the
> dictionary and DECREF it. Now, that ought to free the darn
> submodule, doesn't it?
> Anyway, I wrote a simple "mean" script to test this:
>
> try: import Foo
> except: import Foo
> while 1:
>     try: reload(Foo)
>     except: pass
>
> And this leaks memory like I-don't-know-what!
> What memory doesn't get freed?

Memory leaks are hard to find. I prefer to focus on memory leaks that occur in real situations, rather than theoretical leaks.

> Now to my questions: What exactly SHOULD I do when loading my module fails
> halfway through? Common sense says I should free the memory I've used and
> the module object ought to be unusable.

You should free the memory if you care. "Disabling" the module is unnecessary -- in practice, the program usually quits when an import fails anyway.

> Why-oh-why can I import Foo, catch the exception, import it again and it
> shows up in the dictionary? What's the purpose of this?
>
> How do I work with submodules in a C module?
>
> I find the import semantics really weird here, something is not quite right...

Consider two modules, A and B, where A imports B and B imports A. This is perfectly legal, and works fine as long as B's module initialization doesn't use names defined in A. In order to make this work, sys.modules['A'] is initialized to an empty module and filled with names during A's initialization; ditto for sys.modules['B']. Now suppose A triggers an exception after it has successfully loaded and imported B. B already has a reference to A. A is not completely initialized, but it's not empty either. Should we delete B's reference to A? No -- that's interference with B's namespace, and we don't know whether B might have stored references to A elsewhere, so we don't know if this would be effective. Should we delete sys.modules['A']? I don't think so. If we delete sys.modules['A'], and later someone attempts to import A again, the following will happen: when A imports B, it finds sys.modules['B'], so it doesn't reload B; it will use the existing B. But now B has a reference to the *old* A, not the new one.
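[Editorially, the two-module scenario described here is easy to reproduce in a modern Python. The sketch below uses hypothetical module names (a_mod, b_mod) written to a temp directory; it shows that while a module is still executing, other importers find it in sys.modules in a partially initialized state.]

```python
import os, sys, tempfile, textwrap

# Write two mutually importing modules to a scratch directory.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "a_mod.py"), "w") as f:
    f.write(textwrap.dedent("""\
        before = 1        # set before the circular import
        import b_mod      # b_mod imports a_mod right here
        after = 1         # set afterwards -- b_mod never saw this
    """))
with open(os.path.join(tmp, "b_mod.py"), "w") as f:
    f.write(textwrap.dedent("""\
        import a_mod      # fetched half-built from sys.modules
        seen = set(dir(a_mod))
    """))
sys.path.insert(0, tmp)

import a_mod, b_mod
print("before" in b_mod.seen)  # True: already defined when b_mod looked
print("after" in b_mod.seen)   # False: a_mod was not finished yet
```

The point is exactly Guido's: the partially filled module object lives in sys.modules for the duration of its initialization, which is what makes recursive imports work at all.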
There are now two possibilities: either the second import of A somehow succeeds (this could only happen if somehow the problem that caused it to trigger an exception was repaired before the second attempted import), or the second import of A fails again. If it succeeds, the situation is still broken, because B references the old, incomplete A. If it fails, we may end up in an infinite loop, attempting to reimport A, failing, and catching the exception forever. Neither is good. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@zope.com Thu Jul 26 16:38:18 2001 From: guido@zope.com (Guido van Rossum) Date: Thu, 26 Jul 2001 11:38:18 -0400 Subject: [Python-Dev] number-sig anyone? In-Reply-To: Your message of "Thu, 26 Jul 2001 00:41:02 EDT." References: Message-ID: <200107261538.LAA11658@cj20424-a.reston1.va.home.com>
> The danger I see here is that Scheme's "numeric tower" is almost obviously a
> reasonable numeric model, but in practice is so vague that you can't really
> count on anything beyond simple small-int arithmetic working the same way
> across Scheme implementations.

I certainly expect that we'll be able to do better than Scheme in our cross-implementation semantics -- Scheme is infamous for this.

> Guido appears to have come to an
> appreciation of that model in the abstract, but hoping that there's not much
> difference between floats and rationals in practice "because they represent
> the same mathematical values" just isn't going to pan out (IMO). 1/49*49
> equals 1 or it doesn't; it doesn't using IEEE doubles, it does using
> rationals, and the difference will be significant to programs. Certainly
> better to switch from floats to rationals someday than to move in the other
> direction, though.

Indeed, my only assumption is that switching from floats to rationals shouldn't be very disruptive.
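[Tim's 1/49*49 example is easy to check today: the rationals that eventually landed in the stdlib as the fractions module (long after this thread) behave exactly as he describes.]

```python
from fractions import Fraction

# IEEE doubles: 1/49 is rounded to the nearest double, and multiplying
# back by 49 does not recover exactly 1.
print((1 / 49) * 49 == 1)         # False

# Exact rationals: nothing is lost in 1/49, so the product is exactly 1.
print(Fraction(1, 49) * 49 == 1)  # True
```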
In my ideal numeric model, rationals auto-convert to floats but not the other way around, and str() and repr() of rationals would yield a decimal floating point representation similar to that of floats. (This is more or less what ABC did, except that for floats it added an annoying "~" as inexactness indicator.) To get a rational to print as x/y, you'd have to extract the numerator and denominator explicitly, or use some standard method.

> I've come to suspect the issues *may* be complicated.

Sure. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@zope.com Thu Jul 26 17:01:07 2001 From: guido@zope.com (Guido van Rossum) Date: Thu, 26 Jul 2001 12:01:07 -0400 Subject: [Python-Dev] number-sig anyone? In-Reply-To: Your message of "Wed, 25 Jul 2001 23:05:42 CDT." <15199.38678.456223.981364@beluga.mojam.com> References: <15197.44710.656892.910976@beluga.mojam.com> <3B5F5859.EBDFDC84@ActiveState.com> <15199.38678.456223.981364@beluga.mojam.com> Message-ID: <200107261601.MAA11741@cj20424-a.reston1.va.home.com>
> By "right" I mean that we can arrive at a long-term stable numeric
> model that will be accepted by both the Python community as a whole
> *and* by the decision makers who will vote thumbs up or down on
> adopting Python in their organizations. One of the most vocal
> opponents to PEP 238 (I won't mention his name, but his initials are
> S.H. ;-) lamented loudly that he'd be a laughing stock in his
> company because of that "division thing". He mentioned something
> about being a "right arse" I think.

I'm not so worried. While many of the opponents tried to explain their position by arguing that int division was "right", their real worry was backwards compatibility. PEP 238 represents the *only* serious backwards incompatibility in the transition to a new numeric model that I can imagine. The transition plan that I hope to be checking into PEP 238 deals with the fears of the opponents by putting the sea change off until Python 3.0.
> By having a well-considered overall plan for Python's numeric
> behavior, if you have to make an incompatible change today, another
> next year and a third two years after that, you can point to the
> plan that shows people where you're headed, how you plan to get
> there, and how they can write their programs in the meantime so as
> to be as resilient as possible. Without such a plan -- or with
> several potentially competing plans as you proposed -- every change
> proposed or made will simply fuel the fires of those people who
> dismiss Python because "it's unstable". The funny thing is,
> Python's semantics changed so little for so long that by comparison
> the rate of change does seem pretty high, but it's still much better
> than many applications or application libraries (such as the
> relatively recent glibc upheaval or the API changes Gtk is
> undergoing now). And let's not even mention the folks in Redmond...

Any additional serious incompatibilities will also be put off till Python 3.0. But I repeat, I don't expect any. Let me review:

Int/long unification:

- This "breaks" code that counts on the OverflowError.
- This changes the meaning of left shift (the only operation that silently throws away bits rather than raising OverflowError).
- This changes the meaning of octal and hex constants that set the sign bit in short integers -- 0xffffffff is currently a fancy way of writing -1 on a 32-bit machine, but after unification it will be the same as 0xffffffffL (i.e., 2**32-1).

None of these is a big deal I think.

Rationals:

- The introduction of a new rational type in itself doesn't break anything.
- Making 1/2 return a rational instead of a float could break some things but not at the scale of PEP 238.
- Making 0.5 be a rational instead of a float will break more; we'll have to discuss this.

I should note that the inclusion of rationals in the new numeric model is far from certain.
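[For reference, the int/long unification items Guido lists describe what later Pythons actually did; in a modern interpreter (not the 2.2 of this thread) each point can be checked directly.]

```python
# No more OverflowError: ints promote to arbitrary precision silently.
assert isinstance(2**100, int)

# Left shift no longer throws away bits.
assert (1 << 100) == 2**100

# 0xffffffff is no longer a fancy way of writing -1 on a 32-bit machine;
# it equals 2**32-1, exactly as the old 0xffffffffL did.
assert 0xffffffff == 2**32 - 1
```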
There are potential problems with rationals that may require them to remain a separate type forever. This is about the extent of the changes to the numeric model that I'm contemplating; I don't think that the new numeric model should change much else. (I don't care much for some of the details of PEP 228, but I have to think more about it.) In other words, while the plan isn't spelled out yet, the only disruption is PEP 238. --Guido van Rossum (home page: http://www.python.org/~guido/) From cgw@alum.mit.edu Thu Jul 26 20:40:23 2001 From: cgw@alum.mit.edu (Charles G Waldman) Date: Thu, 26 Jul 2001 14:40:23 -0500 Subject: [Python-Dev] Problems building info documentation Message-ID: <15200.29223.607996.271764@nyx.dyndns.org> I did a CVS update and am trying to rebuild the info docs, and am getting the following errors. Any suggestions (other than the obvious one of rewriting html2texi.pl as html2texi.py)? cd /home/cgw/Python/python/dist/src/Doc/info/ make -k ../tools/mkinfo ../html/api/api.html perl -I/home/cgw/Python/python/dist/src/Doc/tools /home/cgw/Python/python/dist/src/Doc/tools/html2texi.pl /home/cgw/Python/python/dist/src/Doc/html/api/api.html /usr/lib/perl5/site_perl/5.6.1/HTML/Element.pm:2091: function main::collect_if_text expected 3 arguments, got 5: Front Matter 1 1 HTML::Element=HASH(0x81bf70c) 0 make: *** [python-api.info] Error 255 ../tools/mkinfo ../html/ext/ext.html perl -I/home/cgw/Python/python/dist/src/Doc/tools /home/cgw/Python/python/dist/src/Doc/tools/html2texi.pl /home/cgw/Python/python/dist/src/Doc/html/ext/ext.html /usr/lib/perl5/site_perl/5.6.1/HTML/Element.pm:2091: function main::collect_if_text expected 3 arguments, got 5: Front Matter 1 1 HTML::Element=HASH(0x81bf6b8) 0 make: *** [python-ext.info] Error 255 ../tools/mkinfo ../html/lib/lib.html perl -I/home/cgw/Python/python/dist/src/Doc/tools /home/cgw/Python/python/dist/src/Doc/tools/html2texi.pl /home/cgw/Python/python/dist/src/Doc/html/lib/lib.html 
/usr/lib/perl5/site_perl/5.6.1/HTML/Element.pm:2091: function main::collect_if_text expected 3 arguments, got 5: Front Matter 1 1 HTML::Element=HASH(0x81bf76c) 0 ... From paulp@ActiveState.com Thu Jul 26 20:52:29 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Thu, 26 Jul 2001 12:52:29 -0700 Subject: [Python-Dev] number-sig anyone? References: <15197.44710.656892.910976@beluga.mojam.com> <3B5F5859.EBDFDC84@ActiveState.com> <15199.38678.456223.981364@beluga.mojam.com> Message-ID: <3B6074FD.3822B771@ActiveState.com> Skip Montanaro wrote:
> >...
> > By "right" I mean that we can arrive at a long-term stable numeric model
> that will be accepted by both the Python community as a whole *and* by the
> decision makers who will vote thumbs up or down on adopting Python in their
> organizations.

But I don't think that there is any numeric model that will be accepted by the whole Python community. Some will like any change and some will dislike it. Likely there will appear to be equal numbers on either side of any issue because each flame is answered by one or more counter-flames.

>...
> By having a well-considered overall plan for Python's numeric behavior, if
> you have to make an incompatible change today, another next year and a third
> two years after that, you can point to the plan that shows people where
> you're headed, how you plan to get there, and how they can write their
> programs in the meantime so as to be as resilient as possible.

I'm not against such a plan but I don't think it can be designed in a committee. It would be largely the vision of one or two like-minded people with the same weighting of factors such as performance, ease of use, backwards compatibility and so forth. If you put a representation-obsessed engineer in the same committee with a purity-obsessed mathematician you'll find that the only thing they can agree on is to disagree. -- Take a recipe. Leave a recipe. Python Cookbook!
http://www.ActiveState.com/pythoncookbook From mclay@nist.gov Thu Jul 26 08:40:36 2001 From: mclay@nist.gov (Michael McLay) Date: Thu, 26 Jul 2001 03:40:36 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python Message-ID: <01072603403600.02216@fermi.eeel.nist.gov>

PEP: XXX
Title: Adding a Decimal type to Python
Version: $Revision:$
Author: mclay@nist.gov
Status: Draft
Type: ??
Created: 25-Jul-2001
Python-Version: 2.2

Abstract

Several PEPs have been written about fixing Python's numerical types. The proposed changes raise issues about breaking backwards compatibility in the process. Changing the existing numerical types can be avoided by introducing a decimal number type. This change will also enhance the utility of Python for several key markets. A decimal type is also a natural super-type of both integers and floating point numbers. This makes it an important root type for an inheritance tree of numerical types. This PEP suggests adding the decimal number type to Python in such a way that the existing number types will be the default type for .py files and the python command, and the new decimal number type will be used for .dp files and the dpython command.

Rationale

Conflicts surface in the discussion of language design when programming goals differ. One example of this is found when selecting the best method for interpreting numerical values. The correct answer is dependent on the application domain of the software. While Python is very good at providing a simple generalized language, it is not an ideal language in all cases. For developers of scientific applications, binary numbers are often important for performance reasons. The developers of financial applications need to use decimal numbers in order to control roundoff errors. Decimal numbers are also best for newbie users because decimal numbers have simpler rules and fewer surprises.
The current implementation of numbers in Python is limited to a binary floating point type (both imaginary and real) and two types of integers. This makes the language suitable for scientific programming. Python is also suitable for domains which do not make use of numerical types. Changing the existing Python implementation to use decimal numbers as the default type for literals is likely to irritate scientific programmers. Having to use special notation for decimal literals will make financial application developers second-class citizens. Both groups can coexist and share compiled modules by making the parser of Python sensitive to the context of the syntax. This can be done by adding a new decimal type and then selectively changing the definition of default literals (that is, literals without a type suffix). In the proposed implementation the .py files and the python command would continue to parse numerical literals as they currently are interpreted. The new decimal type would be used for number literals in .dp files and the dpython command.

Proposal

A new decimal type will be added to Python. The new type will be based on the ANSI standard for decimal numbers. The proposal will also add two new literals for representing numbers. A decimal literal will have a 'd' appended to the number, and a float literal or an integer literal will have an 'f' appended to the number. The current '.py' file and the use of the python command will continue to use the existing float and integer types for the number literals without a suffix. The proposed change will add support for a second file type with a '.dp' suffix. There will also be an alternative command name, 'dpython', for the Python executable. The decimal number will be used for the interpretation of numerical literals in a '.dp' file and when using the 'dpython' command. The following examples illustrate the two commands.
$ ./dpython
Python 2.2a1 (#87, Jul 26 2001, 11:07:58) [GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> type(21.2)
<type 'decimal'>
>>> type(21.2f)
<type 'float'>
>>> type(21f)
<type 'int'>
>>> 21.2f
21.199999999999999
>>> 21.2
2.12
>>> 1f/2f
0
>>> 1/2
0.5
>>>

$ ./python
Python 2.2a1 (#87, Jul 26 2001, 11:07:58) [GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> type(21.2)
<type 'float'>
>>> type(21.2f)
<type 'float'>
>>> type(21.2d)
<type 'decimal'>
>>> 1/2
0
>>> 21.2
21.199999999999999
>>> 21.2d
21.2

The new decimal type is a "super-type" of float, integer, and long, so when decimal math is used there are only decimal numbers, regardless of whether it is an integer or a floating point number. Newbies and developers of financial applications would use the dpython command and the '.dp' suffix for modules. The language will remain unchanged for existing programs. The addition of a decimal type that can be sub-classed may eliminate the need to add inheritance to float or integer types. The inheritance from float and integer is likely to be challenging. How will the inheritance from the float or integer type work? The definition and implementation of these types are dependent on the C compiler used to compile the interpreter. By contrast, a new decimal type could be designed to be highly customizable. The type could be implemented like class instances with a dictionary that starts out with three members: a sign, a coefficient, and an exponent. This basic type could be extended with flags that set the type of rounding to be used, or by adding a member that sets the precision of the numbers, or perhaps a minimum and maximum value member could be added. Adding the new file type is also an opportunity to fix some other ugliness in Python. The tab character could be eliminated from block indentation. The default character type could be set to Unicode.
(In dpython a 'b' would be added to the front of strings that are sequences of bytes.) Using Unicode as the default has one important downside. The change would limit the viewing of the '.dp' files to display devices that are Unicode enabled. This may have been a problem five years ago. Would it be today? --- need to add other improvements that could be done in dpython ---

Backwards Compatibility

The proposed change is backward compatible with the existing syntax when the python command is used. The new dpython command would be used to take advantage of the new language syntax. The python command will have access to the decimal number type and the dpython command will have access to the traditional float and integer types. Both versions of the language could be used to write exactly the same programs that generate exactly the same byte code output. The only difference will be a few syntax improvements in the dpython language.

Prototype Implementation

An implementation of this PEP has been started, but has not been completed. The parsing works as described, and a partial implementation of a decimal type has been started. The prototype implementation of the decimal object is sufficient for testing the approach of mingling dpython and python. The design of the current implementation does not support sub-classing. This minimal implementation of a decimal type could be completed with a day's work. The development of an extendable type, as was described above, could take place in a later release. The interpretation of a number literal that does not have a suffix is determined in the parsetok() function. The function adds a 'd' or 'f' flag to any numerical literal that does not already have a number type suffix. The suffix attached to the numerical literal is based on the command used to invoke the parser or the suffix of the filename. The parsenumber() function in the compile.c file was modified to key off the number type suffix.
This type indicator is used in a switch statement for compiling the text of the literal into the correct type of number. The implementation of the decimal type was created by copying the complexobject.[hc] files and then doing a global replace of the word complex with the word decimal. The PyDecimal_FromString method in decimalobject.c interprets the string encoding of a decimal number correctly and populates the data structure that contains the sign, coefficient, and exponent values of a decimal number. A minimal printing of the decimal number has been enabled. It is hard-coded to just print out a scientific notation of the number. The only operator that works properly at this time is the negation operator defined in decimal_neg(). The d_sum() and d_prod() functions have been started, but they are very broken. No work has been done on implementing the d_quot() function. The example that shows integer division working properly above was done by editing the output. The format of the echoed decimal number was also edited. When a directory in the path contains a '.dp' module and a '.py' module with the same module name, the '.dp' module is used. The prototype implementation is available at http://www.gencam.org/python The implementation has only been tested on Mandrake Linux 8.0.

Known Problems and Questions

The parsetok.c file was duplicated and renamed to parsetok2.c because the pgen program could not resolve the Py_GetProgramName() function. The dpython repr() function should probably return a number with a suffix of 'd' for decimal types if the module is a '.py' module or if the python command is used. Should the repr() function add the 'f' suffix to float and integer values when accessed from a '.dp' module or the dpython command is used?

Common Objections

Adding a new type results in more rules to remember regarding the use of numbers in Python.
Response: In general the rules for using the decimal number type will be simpler than the rules governing the current set of numerical types. This should make it easier for newbies to learn the dpython language. The benefits to the users who need a decimal type are significant and the added rules will primarily impact these users. The decimal numbers are more precise, which is essential for some application domains. The decimal number rules will tend to simplify the use of Python for these applications. The types used in an application will most likely be selected to match the user's requirements. Crossover between the new decimal types and the classic types will be infrequent. For cases where types must be mixed, the language will be explicit. There will be no automatic coercion between the types. Exceptions will be raised if an explicit conversion isn't used.

Having two languages will confuse users.

Response: This is unlikely to be a problem because there will rarely be a python module that requires both types of numbers. If number types must be mixed in a module, the proposed syntax provides an easy method to visually distinguish between the different number types. When types are mixed, the choice between python and dpython will probably be dictated by the domain of the application developer. The distinction between python and dpython disappears once the language syntax has been compiled. The only problem that might occur is in recognizing which language version is being used when editing a module. An IDE can minimize the chances of confusion by using different background colors or highlighting schemes to distinguish between the versions of the language. Anyone still using vi on a black and white monitor will just have to remember the name of the file being edited. (Which is probably how they think it should be :-)

Shouldn't the root numerical type be a rational type?

Response: ???

References

[1] ANSI standard X3.274-1996.
(See http://www2.hursley.ibm.com/decimal/deccode.html) Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil End: From tim@digicool.com Thu Jul 26 21:52:18 2001 From: tim@digicool.com (Tim Peters) Date: Thu, 26 Jul 2001 16:52:18 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python In-Reply-To: <01072603403600.02216@fermi.eeel.nist.gov> Message-ID: > [1] ANSI standard X3.274-1996. > (See http://www2.hursley.ibm.com/decimal/deccode.html) Michael, this is merely a standard for *encoding* decimal numbers; it doesn't say anything about semantics, or exceptions, or anything else visible to users. Are you aware that Aahz is implementing "the real" spec for Python, a level up at http://www2.hursley.ibm.com/decimal/ under "Base specification"? There are so few people working on the decimal idea that I hate to see it fragmented already. From barry@zope.com Thu Jul 26 22:06:19 2001 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 26 Jul 2001 17:06:19 -0400 Subject: [Python-Dev] Breakage in CVS on SF? Message-ID: <15200.34379.61043.569809@yyz.digicool.com> Has anybody noticed this problem? % cvs -q up -P -d P Mac/Lib/findertools.py cvs [update aborted]: cannot open .new.findertoo: Permission denied write stdout: Broken pipe From mclay@nist.gov Thu Jul 26 22:38:27 2001 From: mclay@nist.gov (Michael McLay) Date: Thu, 26 Jul 2001 17:38:27 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python Message-ID: <01072617382702.02216@fermi.eeel.nist.gov> On Thursday 26 July 2001 04:52 pm, Tim Peters wrote: > > [1] ANSI standard X3.274-1996. > > (See http://www2.hursley.ibm.com/decimal/deccode.html) > > Michael, this is merely a standard for *encoding* decimal numbers; it > doesn't say anything about semantics, or exceptions, or anything else > visible to users. This was a proposal for a mechanism for mingling types safely. 
It was not intended as a definition of how decimal numbers should be implemented. My implementation tests the interaction of the current number types with the decimal type and I only completed enough of the decimal type implementation to support this testing. I was not expecting to discuss how decimal types should work. That has been discussed already. I was primarily interested in testing the effects of adding a new number type as I described in the PEP. What did you think of the idea of adding a new command and file format? > Are you aware that Aahz is implementing "the real" spec for Python, a level > up at > > http://www2.hursley.ibm.com/decimal/ > > under "Base specification"? There are so few people working on the decimal > idea that I hate to see it fragmented already. Yes, I have played with the Decimal.py module. I developed decimalobject.c so I could test the impact of introducing an additional command and file format to Python. I expect this code to be replaced. As I said in the PEP, I also think the decimal number implementation will evolve into a type that supports inheritance. From tim@digicool.com Thu Jul 26 22:43:31 2001 From: tim@digicool.com (Tim Peters) Date: Thu, 26 Jul 2001 17:43:31 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python In-Reply-To: <01072617382702.02216@fermi.eeel.nist.gov> Message-ID: [Michael McLay] > ... > What did you think of the idea of adding a new command and file format? I haven't gotten that far yet -- really, I just skimmed the top and the bottom so far. Too much to do; will read later, though. From guido@zope.com Thu Jul 26 23:21:31 2001 From: guido@zope.com (Guido van Rossum) Date: Thu, 26 Jul 2001 18:21:31 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python In-Reply-To: Your message of "Thu, 26 Jul 2001 17:38:27 EDT."
<01072617382702.02216@fermi.eeel.nist.gov> References: <01072617382702.02216@fermi.eeel.nist.gov> Message-ID: <200107262221.SAA21517@cj20424-a.reston1.va.home.com> [Michael] > This was a proposal for a mechanism for mingling types safely. It > was not intended as a definition of how decimal numbers should be > implemented. My implementation tests the interaction of the current > number types with the decimal type and I only completed enough of > the decimal type implementation to support this testing. I was not > expecting to discuss how decimal types should work. That has been > discussed already. I was primarily interested in testing the effects > of adding a new number type as I described in the PEP. Can you summarize the rules you used for mixed arithmetic? I forget what your PEP said would happen when you add a decimal 1 to a binary 1. Is the result decimal 2 or binary 2? Why? > What did you think of the idea of adding a new command and file format? I don't think that would be necessary. I'd prefer the 'd' and 'f' (or maybe 'b'?) suffixes to be explicit, perhaps combined with an optional per-module directive to set the default. This would be more robust than keying on the filename extension. If you have to change the default globally, I'd prefer a command line option. After all it's only the scanner that needs to know about the different defaults, right? I wonder about the effectiveness of the default though. If you write a module for decimal arithmetic, how do you prevent a caller from passing in a binary number? > > Are you aware that Aahz is implementing "the real" spec for > > Python, a level up at > > > > http://www2.hursley.ibm.com/decimal/ > > > > under "Base specification"? There are so few people working on > > the decimal idea that I hate to see it fragmented already. > > Yes, I have played with the Decimal.py module. I developed > decimalobject.c so I could test the impact of introducing an > additional command and file format to Python.
I expect this code to > be replaced. As I said in the PEP I also think the decimal number > implementation will evolve into a type that supports inheritance. Please, please, please, unify all these efforts. A decimal PEP would be a good one, but there should be only one. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg@cosc.canterbury.ac.nz Fri Jul 27 02:51:54 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Jul 2001 13:51:54 +1200 (NZST) Subject: [Python-Dev] PEP for adding a decimal type to Python In-Reply-To: <01072617382702.02216@fermi.eeel.nist.gov> Message-ID: <200107270151.NAA04533@s454.cosc.canterbury.ac.nz> Michael McLay (by way of himself): > This was a proposal for a mechanism for mingling types safely. It was not > intended as a definition of how decimal numbers should be > implemented. Perhaps you could make it clearer in the introduction that this is the part of the problem your PEP is addressing. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one@home.com Fri Jul 27 06:27:27 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 27 Jul 2001 01:27:27 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python In-Reply-To: <200107262221.SAA21517@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > ... > Please, please, please, unify all these efforts. A decimal PEP would > be a good one, but there should be only one. I elect Michael . Note there is *no* decimal PEP now -- not even a decimal PEP number assigned. Aahz isn't going to write one, either. I was hoping to write one instead if time allowed, but that looks increasingly unlikely by the hour. 
From tim.one@home.com Fri Jul 27 06:35:08 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 27 Jul 2001 01:35:08 -0400 Subject: [Python-Dev] Breakage in CVS on SF? In-Reply-To: <15200.34379.61043.569809@yyz.digicool.com> Message-ID: [Barry A. Warsaw] > Has anybody noticed this problem? > > % cvs -q up -P -d > P Mac/Lib/findertools.py > cvs [update aborted]: cannot open .new.findertoo: Permission denied > write stdout: Broken pipe I have not. Do you still see it? if-so-get-a-real-os-with-a-real-cvs-ly y'rs - tim From mclay@erols.com Fri Jul 27 06:44:33 2001 From: mclay@erols.com (Michael McLay) Date: Fri, 27 Jul 2001 01:44:33 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python Message-ID: <01072701443301.05085@localhost.localdomain> The PEP I posted yesterday, which currently doesn't have a number, addresses the syntactic issues of adding a decimal number type to Python and it investigates how to safely introduce the new type in a language with a large base of legacy code. The PEP does not address the definition of how decimal numbers should be implemented in Python. This topic has been the subject of other PEPs. The PEP also proposes the definition of a new language dialect that makes some small improvements on the syntax of the classic Python language. The changes to the numerical model are tailored to make the language attractive to two very important markets. Many of the users attracted from these markets may initially have little or no interest in classic Python. They may not even know that the Python language exists. They will happily use a language called dpython that works very well for their profession. The interesting thing about the proposed language is how little effort will be required to create and maintain it. The additions to Python were straightforward and the total patch was only a few hundred lines. The prototype implementation uses the following rules when interpreting the type to be created from a number literal.
literal value   '.py' file   '.dp' file   interactive python   interactive dpython
2.2b            float        float        float                float
2b              int          int          int                  int
2.2             float        decimal      float                decimal
2               int          decimal      int                  decimal
2.2d            decimal      decimal      decimal              decimal
2d              decimal      decimal      decimal              decimal

Based on a comment from Guido I've decided to change the 'f' to 'b' in the next version of dpython. That will be more descriptive of the distinction between the types. [Michael] >> This was a proposal for a mechanism for mingling types safely. It >> was not intended as a definition of how decimal numbers should be >> implemented. My implementation tests the interaction of the current >> number types with the decimal type and I only completed enough of >> the decimal type implementation to support this testing. I was not >> expecting to discuss how decimal types should work. That has been >> discussed already. I was primarily interested in testing the effects >> of adding a new number type as I described in the PEP. > Can you summarize the rules you used for mixed arithmetic? I forget > what your PEP said would happen when you add a decimal 1 to a binary > 1. Is the result decimal 2 or binary 2? Why? The rule is very simple. You can't mix the types. You must explicitly cast a binary to a decimal or a decimal to a binary. This introduces the least chance of error. This pedantic behavior is very important in fields like accounting. I want accountants to think of the proposed dpython language as the COBOL version of Python:-) This approach is also the correct one to take for newbies. They will get a nice clean exception if they mix the types. This error will be something they can easily look up in the documentation. An unexpected answer, like 1/2=0, will just leave them scratching their head. This proposal tries to be consistent with what I like about Python and what I think makes Python a great language.
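The no-coercion rule described here can be previewed with the decimal module that later entered Python's standard library (a sketch only, not the dpython prototype; note that unlike the prototype, Decimal does accept plain ints):

```python
from decimal import Decimal

a = Decimal("1.10")

# Mixing a decimal with a binary float raises TypeError, mirroring
# the "no automatic coercion" rule: the mix must be made explicit.
try:
    a + 0.1
    mixed_ok = True
except TypeError:
    mixed_ok = False

# An explicit conversion states the programmer's intent.
total = a + Decimal("0.1")

print(mixed_ok)   # False: the silent mix was rejected
print(total)      # 1.20
```

The explicit-cast requirement turns a subtle rounding bug into an immediate, documented exception, which is exactly the trade-off argued for above.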
The implementation maintains complete backwards compatibility and it requires that programmers explicitly state that they want to do something rather than have bad things happen unexpectedly. Mixing different types of numbers can lead to bugs that are very difficult to identify. The nature of the errors that would occur when binary numbers are used instead of decimals would be particularly difficult to detect. The answers would always be very close, and sometimes they would be correct. Without the use of an explicit cast these errors would be silent. The price paid for being pedantic will be the occasional need to add an int() or float() around a decimal number or a decimal() around a float or int. >> What did you think of the idea of adding a new command and file format? > I don't think that would be necessary. I'd prefer the 'd' and 'f' (or > maybe 'b'?) suffixes to be explicit, perhaps combined with an optional > per-module directive to set the default. This would be more robust > than keying on the filename extension. Why do you think a directive statement would be more robust than using a file suffix or command name as the directive? I'll try to explain why I think the opposite is true. Take the example of teaching a newbie to program. They must be told some basic things. For instance, they will have only been told to use a specific suffix for the file name in order to create a new module. So how do you make sure that the newbie always uses decimal numbers? If a directive statement is required then the newbie must remember to always add this statement at the top of a file. If they forget, they will not get the expected results. With the file-suffix-based approach they will have to use a '.dp' suffix for the file name of a new module.
If they are told to use a '.dp' suffix from the outset then the chances of their accidentally typing '.py' instead of '.dp' are very small, whereas forgetting to add a directive would be a silent error that could easily go unnoticed. Your request to have an explicit 'd' and 'f' is already implemented. The prototype implementation allows an explicit 'd' or 'f' to be used at any time. The rules on the interpretation of the values that have no suffix were defined earlier. The prototype implementation simply uses the suffix of the module file and the name of the command as the directive. This approach provides a very natural language experience to someone working in a profession that normally uses decimal numbers. They are not treated as second class citizens who must endure the clutter of magic directive statements at the top of every module they create. They just use their special command and the file extension. > If you have to change the > default globally, I'd prefer a command line option. After all it's > only the scanner that needs to know about the different defaults, > right? I think there would be a problem with only using a command line option. It would work for files that are named on the command line and for code being interpreted in an interactive session. However, for imported modules the meaning of a number literal must be based on the author's intentions when the module was created. This means that the interpreter must recognize the type of file so it can determine how to compile the literals defined in the module. If the command line option determines how a scanner is to convert the number literals then a module source file could incorrectly be converted if the wrong command line option were used. > I wonder about the effectiveness of the default though. If you write > a module for decimal arithmetic, how do you prevent a caller from passing > in a binary number?
Since the module is written with decimal numbers an exception would be raised if a binary number was used where a decimal number was required. For instance:

---------------
#File spam.py
a = 1.0
---------------
#File eggs.dp
import spam
c = a + 1.0
---------------

The name 'a' was compiled into a float type object when the spam.py file was scanned. So when the expression being assigned to 'c' is executed it would result in a TypeError being raised because a float was being added to a decimal. >> decimalobject.c so I could test the impact of introducing an >> additional command and file format to Python. I expect this code to >> be replaced. As I said in the PEP I also think the decimal number >> implementation will evolve into a type that supports inheritance. > Please, please, please, unify all these efforts. A decimal PEP would > be a good one, but there should be only one. Absolutely. The PEP process is supposed to formalize the capture of ideas so they can be referenced. This PEP is mostly orthogonal to Aahz's proposal. They can be merged, or we can reference each other's PEPs. I'm probably not the best choice for doing the implementation of the decimal number semantics, so I'd be happy to work with Aahz. From mclay@erols.com Fri Jul 27 06:52:29 2001 From: mclay@erols.com (Michael McLay) Date: Fri, 27 Jul 2001 01:52:29 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python In-Reply-To: <01072701443301.05085@localhost.localdomain> References: <01072701443301.05085@localhost.localdomain> Message-ID: <01072701522902.05085@localhost.localdomain> Oops. On Friday 27 July 2001 01:44 am, you wrote:

> ---------------
> #File spam.py
> a = 1.0
>
> ---------------
> #File eggs.dp
> import spam
> c = a + 1.0

that should have been c = spam.a + 1.0. Time for bed:-) From fdrake@acm.org Fri Jul 27 00:17:07 2001 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 26 Jul 2001 19:17:07 -0400 (EDT) Subject: [Python-Dev] Import hassle In-Reply-To: <20010726131222.A30459@strakt.com> References: <20010726131222.A30459@strakt.com> Message-ID: <15200.42227.309819.315626@cj42289-a.reston1.va.home.com> Martin Sjögren writes: > Even more interesting, say that I create a submodule and throw in a bunch > of PyCFunctions in it (I stole the code from InitModule since I don't know > how to fake submodules in a C module in another way, is there a way?). I Martin, You can take a look at the code in pyexpat.c's init function; this creates a couple of module objects to hold constants in segregated namespaces. I'm not sure that it does everything it needs to properly build up all the namespaces, but it should do reasonably well. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake@acm.org Fri Jul 27 00:09:05 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 26 Jul 2001 19:09:05 -0400 (EDT) Subject: [Python-Dev] Problems building info documentation In-Reply-To: <15200.29223.607996.271764@nyx.dyndns.org> References: <15200.29223.607996.271764@nyx.dyndns.org> Message-ID: <15200.41745.113986.278767@cj42289-a.reston1.va.home.com> Charles G Waldman writes: > I did a CVS update and am trying to rebuild the info docs, and am > getting the following errors. Any suggestions (other than the obvious > one of rewriting html2texi.pl as html2texi.py)? This has been reported before, but I don't know of a fix. I don't think anyone has spent any time on it. -Fred -- Fred L. Drake, Jr.
PythonLabs at Zope Corporation From martin@strakt.com Fri Jul 27 09:34:13 2001 From: martin@strakt.com (Martin Sjögren) Date: Fri, 27 Jul 2001 10:34:13 +0200 Subject: [Python-Dev] Import hassle In-Reply-To: <200107261525.LAA11553@cj20424-a.reston1.va.home.com> References: <20010726131222.A30459@strakt.com> <200107261525.LAA11553@cj20424-a.reston1.va.home.com> Message-ID: <20010727103413.A10266@strakt.com> On Thu, Jul 26, 2001 at 11:25:51AM -0400, Guido van Rossum wrote: > > I've been writing quite a few mails lately, all concerning import > > problems. I thought I'd write a little longer mail to explain what I'm > > doing and what I find strange here. > > Martin, > > Why does this interest you? This never happens in reality unless your > memory allocator is broken, and then you have worse problems than > "leaks". Short answer: I want to do it Right. Long answer: I'm curious about how it works, and I found the import statement very odd, what with importing "broken" modules and reloading them and so on. I could easily get python to leak lots and lots of memory by catching the import and then catch the reload() in an infinite loop. Basically, I like Python but Python could be better ;) > Also, why are you posting to python-dev? Good question. I never seemed to get the answers I wanted from python-list, so my first thought was to mail you personally seeing as "he created the thing, surely he knows" but then I thought that it'd be a bit rude so I thought "who else would know a lot about this?" and I came up with the answer python-dev. If I shouldn't have done this, I apologize, but after fooling around with import and trying to figure out where and what I should free when the init failed, I was mildly confuddled. I posted this to python-dev too since I started out there, making my error worse I guess.
Feel free to flame me :-) [snip] > > Even more interesting, say that I create a submodule and throw in a > > bunch of PyCFunctions in it (I stole the code from InitModule since > > I don't know how to fake submodules in a C module in another way, is > > there a way?). I create the module, fail on inserting it into the > > dictionary and DECREF it. Now, that ought to free the darn > > submodule, doesn't it? Anyway, I wrote a simple "mean" script to > > test this:

> > try: import Foo
> > except: import Foo
> > while 1:
> >     try: reload(Foo)
> >     except: pass

> > And this leaks memory like I-don't-know-what! > > What memory doesn't get freed? > Memory leaks are hard to find. I prefer to focus on memory leaks that > occur in real situations, rather than theoretical leaks. Agreed, though it's nice to do it Right, especially when I get asked on the code review at work "shouldn't you free memory here?" and the only thing I can reply is "nobody else does", and my boss says "just because nobody else does it Right, there's no reason you shouldn't". But, what IS the Right way to do this anyway? > > Now to my questions: What exactly SHOULD I do when loading my module fails > > halfway through? Common sense says I should free the memory I've used and > > the module object ought to be unusable. > You should free the memory if you care. "Disabling" the module is > unnecessary -- in practice, the program usually quits when an import > fails anyway. Okay, so how about the situation where an import fails halfway through but the things you need are initialized "before" that. Say that you catch the exception on import and check whether the things you need are there. If they are, fine. If they aren't, fail. I don't see this situation as something that shows up all the time, but it certainly is possible, isn't it? In that situation it would be nice if there were no memory leaks... Then again, maybe I'm just foolish.
> > Why-oh-why can I import Foo, catch the exception, import it again and it > > shows up in the dictionary? What's the purpose of this? > > > > How do I work with submodules in a C module? > > > > I find the import semantics really weird here, something is not quite > > right... > Consider two modules, A and B, where A imports B and B imports A. > This is perfectly legal, and works fine as long as B's module > initialization doesn't use names defined in A. > > In order to make this work, sys.modules['A'] is initialized to an empty > module and filled with names during A's initialization; ditto for > sys.modules['B']. > > Now suppose A triggers an exception after it has successfully loaded > and imported B. B already has a reference to A. A is not completely > initialized, but it's not empty either. Should we delete B's > reference to A? No -- that's interference with B's namespace, and we > don't know whether B might have stored references to A elsewhere, so > we don't know if this would be effective. Should we delete > sys.modules['A']? I don't think so. If we delete sys.modules['A'], > and later someone attempts to import A again, the following will > happen: when A imports B, it finds sys.modules['B'], so it doesn't > reload B; it will use the existing B. But now B has a reference to > the *old* A, not the new one. > > There are now two possibilities: either the second import of A somehow > succeeds (this could only happen if somehow the problem that caused it > to trigger an exception was repaired before the second attempted > import), or the second import of A fails again. If it succeeds, the > situation is still broken, because B references the old, incomplete > A. If it fails, we may end up in an infinite loop, attempting to > reimport A, failing, and catching the exception forever. Neither is > good. Ah-hah. Now I get it, thank you!
Martin -- Martin Sjögren martin@strakt.com ICQ : 41245059 Phone: +46 (0)31 405242 Cell: +46 (0)739 169191 GPG key: http://www.strakt.com/~martin/gpg.html From mal@lemburg.com Fri Jul 27 11:13:02 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 27 Jul 2001 12:13:02 +0200 Subject: [Python-Dev] PEP for adding a decimal type to Python References: <01072701443301.05085@localhost.localdomain> Message-ID: <3B613EAE.D1F4AEC4@lemburg.com> Just a suggestion which might also open the door for other numeric type extensions to play along nicely: Would it make sense to have an extensible registry of constructors for numeric types which maps number literal modifiers to constructors? I am thinking of

123L    -> long("123")
123i    -> int("123")
123.45f -> float("123.45")

The registry would map 'L' to long(), 'i' to int(), 'f' to float() and be extensible in the sense that e.g. an extension like mxNumber could register its own mappings which would make the types defined in these extensions much more accessible without having to patch the interpreter. mxNumber for example could then register 'r' to map to mx.Number.Rational(), and a user could then write 1/2r, which would map to 1 / mx.Number.Rational("2") and generate a Rational number object for 1/2. The registry would have to be made smart enough to separate integer notations from floating point ones and use two separate default mappings for these, e.g. '' -> int() and '' -> float(). The advantage of such a mechanism would be that a user could easily change the literal semantics to his/her taste. Note that I don't think that we really need a separate interpreter just to add decimals or rationals to the core. All that is needed is some easy way to construct these number objects without too much programming overhead (i.e. number of keys to hit ;-).
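A minimal sketch of such a registry in Python, with hypothetical names (register_suffix and make_literal are illustrations, not an existing API) and the standard library's fractions.Fraction standing in for mx.Number.Rational:

```python
from fractions import Fraction

# Hypothetical registry mapping a literal suffix to a constructor.
_suffix_registry = {
    "L": int,     # 123L    -> long("123") (Python 2's long; int stands in here)
    "i": int,     # 123i    -> int("123")
    "f": float,   # 123.45f -> float("123.45")
}

def register_suffix(suffix, constructor):
    """Let an extension claim a new literal suffix."""
    _suffix_registry[suffix] = constructor

def make_literal(text):
    """Build a number object from a (possibly suffixed) literal string."""
    suffix = text[-1]
    if suffix in _suffix_registry:
        return _suffix_registry[suffix](text[:-1])
    # Default mappings: float for floating point notation, int otherwise.
    return float(text) if ("." in text or "e" in text.lower()) else int(text)

# An extension such as mxNumber could register its own mapping:
register_suffix("r", Fraction)

print(1 / make_literal("2r"))   # 1/2
```

This is the separation the proposal asks for: the scanner only needs to hand the suffixed literal string to the registry; the registered constructor decides what object to build.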
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@zope.com Fri Jul 27 14:08:51 2001 From: guido@zope.com (Guido van Rossum) Date: Fri, 27 Jul 2001 09:08:51 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python In-Reply-To: Your message of "Fri, 27 Jul 2001 12:13:02 +0200." <3B613EAE.D1F4AEC4@lemburg.com> References: <01072701443301.05085@localhost.localdomain> <3B613EAE.D1F4AEC4@lemburg.com> Message-ID: <200107271308.JAA23972@cj20424-a.reston1.va.home.com> > Just a suggestion which might also open the door for other numeric > type extensions to play along nicely: > > Would it make sense to have an extensible registry of constructors > for numeric types which maps number literal modifiers to constructors? > > I am thinking of > > 123L -> long("123") > 123i -> int("123") > 123.45f -> float("123.45") > > The registry would map 'L' to long(), 'i' to int(), 'f' to float() > and be extensible in the sense that e.g. an extension like > mxNumber could register its own mappings which would make > the types defined in these extensions much more accessible > without having to patch the interpreter. mxNumber for example could > then register 'r' to map to mx.Number.Rational() and a user could > then write 1/2r, which would map to 1 / mx.Number.Rational("2") and > generate a Rational number object for 1/2. > > The registry would have to be made smart enough to separate > integer notations from floating point ones and use two separate > default mappings for these, e.g. '' -> int() and '' -> > float(). > > The advantage of such a mechanism would be that a user could > easily change the literal semantics to his/her taste. > > Note that I don't think that we really need a separate interpreter > just to add decimals or rationals to the core.
> All that is needed > is some easy way to construct these number objects without too > much programming overhead (i.e. number of keys to hit ;-). Funny, I had a similar idea today in the shower (always the best place to think :-). I'm not sure exactly how it would work yet -- currently, literals are converted to values at compile-time, so the registry would have to be available to the compiler, but the concept seems to make more sense if it is available and changeable at runtime. Nevertheless, we should keep this in mind. --Guido van Rossum (home page: http://www.python.org/~guido/) From mclay@nist.gov Fri Jul 27 14:21:01 2001 From: mclay@nist.gov (Michael McLay) Date: Fri, 27 Jul 2001 09:21:01 -0400 Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python In-Reply-To: <3B613EAE.D1F4AEC4@lemburg.com> References: <01072701443301.05085@localhost.localdomain> <3B613EAE.D1F4AEC4@lemburg.com> Message-ID: <01072708200303.02216@fermi.eeel.nist.gov> On Friday 27 July 2001 06:13 am, M.-A. Lemburg wrote: > Just a suggestion which might also open the door for other numeric > type extensions to play along nicely: > > Would it make sense to have an extensible registry of constructors > for numeric types which maps number literal modifiers to constructors? > > I am thinking of > > 123L -> long("123") > 123i -> int("123") > 123.45f -> float("123.45") With the changes made in the prototype this would be relatively easy to implement. Using an 'i' suffix could be confused with an imaginary number. It would be very easy for someone to mistakenly type 12i instead of 12j and get an integer instead of an imaginary number. The next implementation of my PEP will change the 'f' to a 'b', as in binary number. The same suffix is used for both integer and float because they work together as a binary implementation of numbers. With the decimal number implementation there is only one type for both integer and float.
123b    -> int("123")
123.45b -> float("123.45")

> > The registry would map 'L' to long(), 'i' to int(), 'f' to float() > and be extensible in the sense that e.g. an extension like > mxNumber could register its own mappings which would make > the types defined in these extensions much more accessible > without having to patch the interpreter. mxNumber for example could > then register 'r' to map to mx.Number.Rational() and a user could > then write 1/2r, which would map to 1 / mx.Number.Rational("2") and > generate a Rational number object for 1/2. > > The registry would have to be made smart enough to separate > integer notations from floating point ones and use two separate > default mappings for these, e.g. '' -> int() and '' -> > float(). The tokenizer just passes a number with a suffix as a string to a function in compiler.c. The number in the string could be any valid number, e.g. 123, 123.45, 123.45e-3, or .123. The function processing the string then determines what type of number object to create based on the suffix. It would be the responsibility of the function that processes the 'r' suffix to accept or reject the number encoded in the string. > The advantage of such a mechanism would be that a user could > easily change the literal semantics to his/her taste. > > Note that I don't think that we really need a separate interpreter > just to add decimals or rationals to the core. All that is needed > is some easy way to construct these number objects without too > much programming overhead (i.e. number of keys to hit ;-). I wasn't suggesting creating a separate interpreter, I was suggesting adding a simple mechanism for allowing a new dialect of Python to be added to the existing interpreter. This new dialect would be easier to use for certain types of programming activities. The use of a decimal number type as the default type in this new language dialect is only one change that was proposed. Another would be to use Unicode as the default character set.
This would allow Unicode characters to be in strings without needing to escape them. The proposal also suggests removing the tab character from indentation of blocks. The goal is to create a language that would clean up some of the warts in the Python syntax and take advantage of the capabilities of modern IDE environments. The idea of adding a new language on top of the existing infrastructure isn't that unusual. The gcc compiler can process many languages to produce common machine-dependent object code. I can envision taking my simple changes a few steps further and turning the entire tokenizer into a replaceable unit. This approach would allow projects to build other languages on top of the Python byte code interpreter. Imagine having Javascript, VBasic, or sh tokenizer frontends generating Python bytecodes. Think of it as the pyNET architecture:-) This change probably belongs in Python4k. Perhaps the PEP should be split into two parts. The first PEP would be to add decimal literals with a 'd' suffix and also allow suffix characters to be added to the default float and integer types. I think everyone agrees that this change is needed. The second PEP will cover the proposed creation of the dpython dialect. This PEP would be a container for proposed changes to the Python syntax that would make the language easier to teach to newbies and easier to use in financial applications. Your suggestion to allow additional numerical types to be added by users would be included in the first PEP if the BDFL thinks this is a good idea.
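The scanning rules from the table earlier in this thread can be sketched as a small classifier (literal_type is a hypothetical helper written for illustration, not code from the prototype; exponent notation is ignored for brevity):

```python
def literal_type(token, dpython=False):
    """Classify a number literal per the proposed rules.

    dpython=True corresponds to a '.dp' file or the dpython
    interactive prompt; False to a classic '.py' file.
    """
    if token.endswith("d"):           # explicit decimal: 2d, 2.2d
        return "decimal"
    if token.endswith("b"):           # explicit binary: 2b, 2.2b
        token = token[:-1]
        return "float" if "." in token else "int"
    if dpython:                       # unsuffixed literals default to decimal
        return "decimal"
    return "float" if "." in token else "int"

print(literal_type("2.2", dpython=True))   # decimal
print(literal_type("2.2"))                 # float
```

Only the defaults for unsuffixed literals differ between the two dialects; an explicit 'b' or 'd' suffix means the same thing everywhere.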
From juergen.erhard@gmx.net Fri Jul 27 10:46:42 2001 From: juergen.erhard@gmx.net ("Jürgen A. Erhard") Date: Fri, 27 Jul 2001 11:46:42 +0200 Subject: [Python-Dev] Re: Future division patch available (PEP 238) In-Reply-To: (tanzer@swing.co.at) References: Message-ID: <27072001.2@wanderer.local.jae.dyndns.org> >>>>> "Christian" == Christian Tanzer writes: [snipperoonio... lots of interesting stuff about real-life and seemingly highly dynamic Python deployments] Christian> To be honest, for TTTech design databases the change in Christian> division probably doesn't pose any problems. Due to Christian> user demand, the tools coerced divisions [in Christian> customer-written code] to floating point for a long Christian> time. "Due to customer demand"... well, seems to me you have given great support to PEP 238 with this. ;-) Bye, J PS: No, I'm not seeing you in the (raving) anti-PEP-238 camp, Christian. Your post was much too level-headed for this confusion to happen. ;-) -- Jürgen A. Erhard (juergen.erhard@gmx.net, jae@users.sourceforge.net) My WebHome: http://members.tripod.com/Juergen_Erhard "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- B.
Franklin From gward@python.net Fri Jul 27 15:14:00 2001 From: gward@python.net (Greg Ward) Date: Fri, 27 Jul 2001 10:14:00 -0400 Subject: [Python-Dev] Advice in stat.py Message-ID: <20010727101400.A1016@gerg.ca> stat.py in 2.2a1 starts with the following sage advice: """Constants/functions for interpreting results of os.stat() and os.lstat(). Suggested usage: from stat import * """ Is this still the suggested usage? Greg -- Greg Ward - geek gward@python.net http://starship.python.net/~gward/ A man without religion is like a fish without a bicycle. From fdrake@acm.org Fri Jul 27 15:27:54 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 27 Jul 2001 10:27:54 -0400 (EDT) Subject: [Python-Dev] Advice in stat.py In-Reply-To: <20010727101400.A1016@gerg.ca> References: <20010727101400.A1016@gerg.ca> Message-ID: <15201.31338.499533.710253@cj42289-a.reston1.va.home.com> Greg Ward writes: > Suggested usage: from stat import * > """ > > Is this still the suggested usage? Well, I would never suggest that, but I didn't write the stat module. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mal@lemburg.com Fri Jul 27 15:27:43 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Fri, 27 Jul 2001 16:27:43 +0200 Subject: [Python-Dev] PEP for adding a decimal type to Python References: <01072701443301.05085@localhost.localdomain> <3B613EAE.D1F4AEC4@lemburg.com> <200107271308.JAA23972@cj20424-a.reston1.va.home.com> Message-ID: <3B617A5F.E193B2CC@lemburg.com> Guido van Rossum wrote: > > > Just a suggestion which might also open the door for other numeric > > type extensions to play along nicely: > > > > Would it make sense to have an extensible registry of constructors > > for numeric types which maps number literal modifiers to constructors ? > > > > I am thinking of > > > > 123L -> long("123") > > 123i -> int("123") > > 123.45f -> float("123.45") > > > > The registry would map 'L' to long(), 'i' to int(), 'f' to float() > > and be extensible in the sense, that e.g. an extension like > > mxNumber could register its own mappings which would make > > the types defined in these extensions much more accessible > > without having to patch the interpreter. mxNumber for example could > > then register 'r' to map to mx.Number.Rational(), and a user could > > then write 1/2r, which would map to 1 / mx.Number.Rational("2") and > > generate a Rational number object for 1/2. > > > > The registry would have to be made smart enough to separate > > integer notations from floating point ones and use two separate > > default mappings for these, e.g. '' -> int() and '' -> > > float(). > > > > The advantage of such a mechanism would be that a user could > > easily change the literal semantics at his/her taste. > > > > Note that I don't think that we really need a separate interpreter > > just to add decimals or rationals to the core. All that is needed > > is some easy way to construct these number objects without too > > much programming overhead (i.e. number of keys to hit ;-). > > Funny, I had a similar idea today in the shower (always the best place > to think :-). 
I'm not sure exactly how it would work yet -- > currently, literals are converted to values at compile-time, so the > registry would have to be available to the compiler, but the concept > seems to make more sense if it is available and changeable at runtime. True, but deferring the conversion to runtime (by e.g. using literal descriptors ;-) would cause a significant slowdown. So, I believe that the compiler would have to be told before starting the compile process, or within the process by looking at some magical constant/comment in the source code (I think that this ought to be a per-file overrideable setting, since some code may simply fail to work if it suddenly starts to work with different types). > Nevertheless, we should keep this in mind. I could reformat the above into a PEP, or Michael could simply add the idea as a section to his PEP. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From paul@pfdubois.com Fri Jul 27 15:56:25 2001 From: paul@pfdubois.com (Paul F. Dubois) Date: Fri, 27 Jul 2001 07:56:25 -0700 Subject: [Python-Dev] PEP for adding a decimal type to Python Message-ID: In dpython, what is 2.0j? Is the "standard" way of writing complex numbers, 3.0 + 2.0j, valid? From guido@zope.com Fri Jul 27 16:47:13 2001 From: guido@zope.com (Guido van Rossum) Date: Fri, 27 Jul 2001 11:47:13 -0400 Subject: [Python-Dev] Advice in stat.py In-Reply-To: Your message of "Fri, 27 Jul 2001 10:14:00 EDT." <20010727101400.A1016@gerg.ca> References: <20010727101400.A1016@gerg.ca> Message-ID: <200107271547.LAA24634@cj20424-a.reston1.va.home.com> > stat.py in 2.2a1 starts with the following sage advice: > > """Constants/functions for interpreting results of os.stat() and os.lstat(). > > Suggested usage: from stat import * > """ > > Is this still the suggested usage? I don't see why not. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@zope.com Fri Jul 27 16:50:15 2001 From: guido@zope.com (Guido van Rossum) Date: Fri, 27 Jul 2001 11:50:15 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python In-Reply-To: Your message of "Fri, 27 Jul 2001 16:27:43 +0200." <3B617A5F.E193B2CC@lemburg.com> References: <01072701443301.05085@localhost.localdomain> <3B613EAE.D1F4AEC4@lemburg.com> <200107271308.JAA23972@cj20424-a.reston1.va.home.com> <3B617A5F.E193B2CC@lemburg.com> Message-ID: <200107271550.LAA24662@cj20424-a.reston1.va.home.com> > > Funny, I had a similar idea today in the shower (always the best place > > to think :-). I'm not sure exactly how it would work yet -- > > currently, literals are converted to values at compile-time, so the > > registry would have to be available to the compiler, but the concept > > seems to make more sense if it is available and changeable at runtime. > > True, but deferring the conversion to runtime (by e.g. using > literal descriptors ;-) would cause a significant slowdown. > > So, I believe that the compiler would have be told before starting > the compile process or within the process by looking at some magical > constant/comment in the source code (I think that this ought to be > a per-file overrideable setting, since some code may simply fail > to work if it suddenly starts to work with different types). This may be the first place where a 'directive' statement actually makes sense to me. > > Nevertheless, we should keep this in mind. > > I could reformat the above into a PEP or Michael could simply > the idea as section to his PEP. I'm not optimistic about Michael's PEP. He seems to insist on a total separation between decimal and binary numbers that I don't believe can work. I haven't replied to him yet because I can't explain it well enough yet -- but I don't believe there's much of a future in his particular idea. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@zope.com Fri Jul 27 17:35:34 2001 From: guido@zope.com (Guido van Rossum) Date: Fri, 27 Jul 2001 12:35:34 -0400 Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python In-Reply-To: Your message of "Fri, 27 Jul 2001 09:21:01 EDT." <01072708200303.02216@fermi.eeel.nist.gov> References: <01072701443301.05085@localhost.localdomain> <3B613EAE.D1F4AEC4@lemburg.com> <01072708200303.02216@fermi.eeel.nist.gov> Message-ID: <200107271635.MAA25690@cj20424-a.reston1.va.home.com> [me] > > Note that I don't think that we really need a separate interpreter > > just to add decimals or rationals to the core. All that is needed > > is some easy way to construct these number objects without too > > much programming overhead (i.e. number of keys to hit ;-). [Michael] > I wasn't suggesting creating a separate interpreter, I was > suggesting adding a simple mechanism for allowing a new dialect of > Python to be added to the existing interpreter. Understood. I see no big difference in having two binaries or one binary with a command line option; the two binaries effectively contain the same functionality, just with a different default. I would vote for one binary; if you really think it's too much for your users to say "python -d" instead of "dpython", give them a script. (I know that the -d option currently means something else. That's a detail to worry about later.) > This new dialect would be easier to use for certain types of > programming activities. The use of a decimal number type as the > default type in this new language dialect is only one change that > was proposed. I'm not very fond of having multiple dialects. There are lots of contexts where the dialect in use is not explicitly mentioned (e.g. when people discuss fragments of Python code). > Another would be to use Unicode as the default character set. 
This > would allow Unicode characters to be in strings without needing to > escape them. That's not a dialect, that's a different input encoding. MAL already has a PEP for that. > The proposal also suggests removing the tab character from > indentation of blocks. The goal is to create a language that would > clean up some of the warts in the Python syntax and take advantage > of the capabilities of modern IDE environments. What does removing tab characters have to do with decimal numbers? One topic per PEP, please! > The idea of adding a new language on top of the existing > infrastructure isn't that unusual. The gcc compiler can process many > languages to produce a common machine-dependent object code. I can > envision taking my simple changes a few steps further and turning > the entire tokenizer into a replaceable unit. This approach would > allow projects to build other languages on top of the Python byte > code interpreter. Imagine having Javascript, VBasic, or sh > tokenizer frontends generating Python bytecodes. Think of it as the > pyNET architecture:-) This change probably belongs in Python4k. Or in Python .NET. Decoupling the various parts of the parse+compile pipeline is something I've considered. But again this has nothing to do with decimal numbers: your proposal allows the mixing of decimal and binary numbers (as long as one of them uses an explicit base indicator) so you don't really need two parsers -- you need one tokenizer plus a way to specify the default numeric base for literals. > Perhaps the PEP should be split into two parts. The first PEP would > be to add decimal literals with a 'd' suffix and also allow suffix > characters to be added to the default float and integer types. I > think everyone agrees that this change is needed. It's needed *if* we agree that we need a decimal data type. > The second PEP will cover the proposed creation of the dpython > dialect. 
This PEP would be a container for proposed changes to the > Python syntax that would make the language easier to teach to > newbies and easier to use in a financial application. I'll have to go back to your defense of the two dialect approach, but I think it's neither sufficient nor necessary. > Your suggestion to allow additional numerical types to be added by > users would be included in the first PEP if the BDFL thinks this is > a good idea. Well, sometimes more generality than you need hurts. I'm not convinced that we need an open-ended set of numeric literals. But in the light of the unified numeric model, we may need ways to make exactness or inexactness explicit, and/or we may need a way to specify rational numbers. If we can fit all of these in the number-with-letter-suffix mold, that would be nice for the lexer, I suppose. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@zope.com Fri Jul 27 17:38:33 2001 From: guido@zope.com (Guido van Rossum) Date: Fri, 27 Jul 2001 12:38:33 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python In-Reply-To: Your message of "Fri, 27 Jul 2001 01:27:27 EDT." References: Message-ID: <200107271638.MAA25779@cj20424-a.reston1.va.home.com> > I elect Michael . Note there is *no* decimal PEP now -- not even a > decimal PEP number assigned. I thought all PEPs had decimal numbers? :) > Aahz isn't going to write one, either. I was hoping to write one > instead if time allowed, but that looks increasingly unlikely by the > hour. I'm not sure that Michael would write the PEP we want. You may be "it", after all, if you want this done right. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From gward@python.net Fri Jul 27 17:51:20 2001 From: gward@python.net (Greg Ward) Date: Fri, 27 Jul 2001 12:51:20 -0400 Subject: [Python-Dev] Advice in stat.py In-Reply-To: <200107271547.LAA24634@cj20424-a.reston1.va.home.com>; from guido@zope.com on Fri, Jul 27, 2001 at 11:47:13AM -0400 References: <20010727101400.A1016@gerg.ca> <200107271547.LAA24634@cj20424-a.reston1.va.home.com> Message-ID: <20010727125120.O677@gerg.ca> On 27 July 2001, Guido van Rossum said: > > stat.py in 2.2a1 starts with the following sage advice: > > > > """Constants/functions for interpreting results of os.stat() and os.lstat(). > > > > Suggested usage: from stat import * > > """ > > > > Is ths still the suggested usage? > > I don't see why not. My understanding was that it's generally considered Bad Form to do this at module level, while doing it at function level is tricky (or a performance hit? whatever...) because of nested scopes. Greg -- Greg Ward - Unix geek gward@python.net http://starship.python.net/~gward/ No animals were harmed in transmitting this message. From guido@zope.com Fri Jul 27 17:57:00 2001 From: guido@zope.com (Guido van Rossum) Date: Fri, 27 Jul 2001 12:57:00 -0400 Subject: [Python-Dev] PEP for adding a decimal type to Python In-Reply-To: Your message of "Thu, 26 Jul 2001 17:38:27 EDT." <01072617382702.02216@fermi.eeel.nist.gov> References: <01072617382702.02216@fermi.eeel.nist.gov> Message-ID: <200107271657.MAA26156@cj20424-a.reston1.va.home.com> Michael, Your PEP doesn't spell out what happens when a binary and a decimal number are the input for a numerical operator. I believe you said that this would be an unconditional error. But I foresee serious problems. Most standard library modules use numbers. Most of the modules using numbers occasionally use a literal (e.g. 0 or 1). According to your PEP, literals in module files ending with .py default to binary. 
This means that almost any use of a standard library module from your "dpython" will fail as soon as a literal is used. I can't believe that this will work satisfactorily. Another example of the kind of problem your approach runs into: what should the type of len("abc") be? 3d or 3b? Should it depend on the default mode? I suppose sequence indexing has to accept decimal as well as binary integers as indexes -- certainly in a decimal program you will want to be able to use decimal integers for indexes. The whole thing seems screwed. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@zope.com Fri Jul 27 18:14:36 2001 From: guido@zope.com (Guido van Rossum) Date: Fri, 27 Jul 2001 13:14:36 -0400 Subject: [Python-Dev] Advice in stat.py In-Reply-To: Your message of "Fri, 27 Jul 2001 12:51:20 EDT." <20010727125120.O677@gerg.ca> References: <20010727101400.A1016@gerg.ca> <200107271547.LAA24634@cj20424-a.reston1.va.home.com> <20010727125120.O677@gerg.ca> Message-ID: <200107271714.NAA26383@cj20424-a.reston1.va.home.com> > > > Suggested usage: from stat import * > > > > > > Is ths still the suggested usage? > > > > I don't see why not. > > My understanding was that it's generally considered Bad Form to do this > at module level, while doing it at function level is tricky (or a > performance hit? whatever...) because of nested scopes. Generally yes, but there's an explicit disclaimer "unless the module is written for this". And stat.py is (hence the recommendation in the docstring). Inside a function, from ... import * is always bad form. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Fri Jul 27 18:12:54 2001 From: barry@zope.com (Barry A. 
Warsaw) Date: Fri, 27 Jul 2001 13:12:54 -0400 Subject: [Python-Dev] Advice in stat.py References: <20010727101400.A1016@gerg.ca> <200107271547.LAA24634@cj20424-a.reston1.va.home.com> <20010727125120.O677@gerg.ca> Message-ID: <15201.41238.543442.486438@anthem.wooz.org> >>>>> "GW" == Greg Ward writes: GW> My understanding was that it's generally considered Bad Form GW> to do this at module level, while doing it at function level GW> is tricky (or a performance hit? whatever...) because of GW> nested scopes. Yes, but some modules are designed for from-import-* so they're less evil. types, stat, and Tkinter are the three most common ones for me. Usually though, if I'm importing fewer than about 3 symbols, I'll import them explicitly. -Barry From jepler@inetnebr.com Fri Jul 27 19:59:16 2001 From: jepler@inetnebr.com (Jeff Epler) Date: Fri, 27 Jul 2001 13:59:16 -0500 Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python In-Reply-To: <200107271635.MAA25690@cj20424-a.reston1.va.home.com>; from guido@zope.com on Fri, Jul 27, 2001 at 12:35:34PM -0400 References: <01072701443301.05085@localhost.localdomain> <3B613EAE.D1F4AEC4@lemburg.com> <01072708200303.02216@fermi.eeel.nist.gov> <200107271635.MAA25690@cj20424-a.reston1.va.home.com> Message-ID: <20010727135914.B18280@inetnebr.com> On Fri, Jul 27, 2001 at 12:35:34PM -0400, Guido van Rossum wrote: > But again this has nothing to do with decimal numbers: your proposal > allows the mixing of decimal and binary numbers (as long as one of > them uses an explicit base indicator) so you don't really need two > parsers -- you need one tokenizer plus a way to specify the default > numeric base for literals. If this were possible, then could it be a per-module decision what "1/2" produces, depending on whether unadorned whole-number literals correspond to ClassicInt or NewInt ? That sounds miles better than writing "1//2" to me. 
Jeff From mclay@nist.gov Fri Jul 27 20:51:38 2001 From: mclay@nist.gov (Michael McLay) Date: Fri, 27 Jul 2001 15:51:38 -0400 Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python In-Reply-To: <200107271635.MAA25690@cj20424-a.reston1.va.home.com> References: <01072701443301.05085@localhost.localdomain> <01072708200303.02216@fermi.eeel.nist.gov> <200107271635.MAA25690@cj20424-a.reston1.va.home.com> Message-ID: <01072715513805.02216@fermi.eeel.nist.gov> On Friday 27 July 2001 12:35 pm, Guido van Rossum wrote: > [me] > > > I wasn't suggesting creating a separate interpreter, I was > > suggesting adding a simple mechanism for allowing a new dialect of > > Python to be added to the existing interpreter. > > Understood. I see no big difference in having two binaries or one > binary with a command line option; the two binaries effectively > contain the same functionality, just with a different default. I > would vote for one binary; if you really think it's too much for your > users to say "python -d" instead of "dpython", give them a script. (I > know that the -d option currently means something else. That's a > detail to worry about later.) I decided to use a symbolic link to a different command name to set the default encoding of numerical literals. I did this because referring to the 'dpython' command is more concise than "python -d". The executable could also have command options to select between python and dpython modes. > I'm not very fond of having multiple dialects. There are lots of > contexts where the dialect in use is not explicitly mentioned > (e.g. when people discuss fragments of Python code). I'm not fond of dialects when they don't serve a significant purpose. However, I believe it would be useful to at least discuss creating a special purpose "safe" mode for the Python lexer. This mode would be attractive to newbies and financial programmers. Calling this a new dialect is an overstatement. 
It is more like defining a subset of the language that uses a special vocabulary for working with decimal types. > > Another would be to use Unicode as the default character set. This > > would allow Unicode characters to be in strings without needing to > > escape them. > > That's not a dialect, that's a different input encoding. MAL already > has a PEP for that. I know about the PEP. I was referring to making it the default string type for a '.dp' file. There would be no prefix 'u' required. I'll remove this and the other unrelated items from the decimal type PEP. If you don't agree with the idea of adding a dpython lexer mode, then there is no point in discussing the features that would be in that mode. > > The idea of adding a new language on top of the existing > > infrastructure isn't that unusual. The gcc compiler can process many > > languages to produce a common machine-dependent object code. I can > > envision taking my simple changes a few steps further and turning > > the entire tokenizer into a replaceable unit. This approach would > > allow projects to build other languages on top of the Python byte > > code interpreter. Imagine having Javascript, VBasic, or sh > > tokenizer frontends generating Python bytecodes. Think of it as the > > pyNET architecture:-) This change probably belongs in Python4k. > > Or in Python .NET. Decoupling the various parts of the parse+compile > pipeline is something I've considered. Did you decide against it, or has it just not been a high enough priority? > But again this has nothing to do with decimal numbers: your proposal > allows the mixing of decimal and binary numbers (as long as one of > them uses an explicit base indicator) so you don't really need two > parsers -- you need one tokenizer plus a way to specify the default > numeric base for literals. That is exactly what I implemented. The dpython command and the '.dp' extension cause the Py_USE_DECIMAL_AS_DEFAULT[1] flag to be set. 
When this flag is set, decimal numbers are used for literals. > > I'll have to go back to your defense of the two dialect approach, but > I think it's neither sufficient nor necessary. I have mixed too many ideas into a PEP. I'll rework the PEP to remove the cruft and focus on the addition of decimal numbers. I'll move the other ideas into a separate PEP. > Well, sometimes more generality than you need hurts. I'm not > convinced that we need an open-ended set of numeric literals. But in > the light of the unified numeric model, we may need ways to make > exactness or inexactness explicit, and/or we may need a way to specify > rational numbers. If we can fit all of these in the > number-with-letter-suffix mold, that would be nice for the lexer, I > suppose. I worry about a "unified numerical model" getting overly complex. I think decimal numbers help because they are a better choice than binary numbers for a significant percentage of all software applications. I know that rational numbers are important in some applications. Am I overlooking some huge class of applications that use rationals? While Tim and some of the other Pythoneers can probably think of dozens of specialized numerical types, I would venture to guess that binary types and a decimal type probably cover 90% of all the user's requirements. [1] I'll be renaming the flag to this in the next version. The flag is currently called Py_NEW_PARSER. I named it that because at one time I was creating a new parser. I trimmed the changes down to just a few edits of the tokenizer and compile.c. From guido@zope.com Fri Jul 27 20:49:57 2001 From: guido@zope.com (Guido van Rossum) Date: Fri, 27 Jul 2001 15:49:57 -0400 Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12 Message-ID: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> Here's a new revision of PEP 238. 
I've incorporated clarifications of issues that were brought up during the discussion of rev 1.10 -- from typos via rewording of ambiguous phrasing to the addition of new open issues. I've decided not to go for the "quotient and ratio" terminology -- my rationale is in the PEP. I'm posting this also to c.l.py and c.l.py.a, to make sure enough people see it. Feel free to discuss it either in c.l.py or here in python-dev, but please don't change the subject. --Guido van Rossum (home page: http://www.python.org/~guido/) PEP: 238 Title: Changing the Division Operator Version: $Revision: 1.12 $ Author: pep@zadka.site.co.il (Moshe Zadka), guido@python.org (Guido van Rossum) Status: Draft Type: Standards Track Created: 11-Mar-2001 Python-Version: 2.2 Post-History: 16-Mar-2001, 26-Jul-2001, 27-Jul-2001 Abstract The current division (/) operator has an ambiguous meaning for numerical arguments: it returns the floor of the mathematical result of division if the arguments are ints or longs, but it returns a reasonable approximation of the division result if the arguments are floats or complex. This makes expressions expecting float or complex results error-prone when integers are not expected but possible as inputs. We propose to fix this by introducing different operators for different operations: x/y to return a reasonable approximation of the mathematical result of the division ("true division"), x//y to return the floor ("floor division"). We call the current, mixed meaning of x/y "classic division". Because of severe backwards compatibility issues, not to mention a major flamewar on c.l.py, we propose the following transitional measures (starting with Python 2.2): - Classic division will remain the default in the Python 2.x series; true division will be standard in Python 3.0. - The // operator will be available to request floor division unambiguously. 
- The future division statement, spelled "from __future__ import division", will change the / operator to mean true division throughout the module. - A command line option will enable run-time warnings for classic division applied to int or long arguments; another command line option will make true division the default. - The standard library will use the future division statement and the // operator when appropriate, so as to completely avoid classic division. Motivation The classic division operator makes it hard to write numerical expressions that are supposed to give correct results from arbitrary numerical inputs. For all other operators, one can write down a formula such as x*y**2 + z, and the calculated result will be close to the mathematical result (within the limits of numerical accuracy, of course) for any numerical input type (int, long, float, or complex). But division poses a problem: if the expressions for both arguments happen to have an integral type, it implements floor division rather than true division. The problem is unique to dynamically typed languages: in a statically typed language like C, the inputs, typically function arguments, would be declared as double or float, and when a call passes an integer argument, it is converted to double or float at the time of the call. Python doesn't have argument type declarations, so integer arguments can easily find their way into an expression. The problem is particularly pernicious since ints are perfect substitutes for floats in all other circumstances: math.sqrt(2) returns the same value as math.sqrt(2.0), 3.14*100 and 3.14*100.0 return the same value, and so on. Thus, the author of a numerical routine may only use floating point numbers to test his code, and believe that it works correctly, and a user may accidentally pass in an integer input value and get incorrect results. 
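[Editor's note: the pitfall described above can be sketched in modern terms. Classic division no longer exists in Python 3, so `//` stands in below for what a bare `/` computed on two ints under the classic rules; the function names are invented for illustration.]

```python
def normalize(x, total):
    # The author tests only with floats: normalize(1.0, 4.0) looks correct.
    return x / total

def normalize_classic(x, total):
    # What the same "/" computed under classic rules when both arguments
    # happened to be ints: floor division.
    return x // total

assert normalize(1.0, 4.0) == 0.25    # float inputs: the intended result
assert normalize_classic(1, 4) == 0   # int inputs, classic rules: silently wrong
```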
Another way to look at this is that classic division makes it difficult to write polymorphic functions that work well with either float or int arguments; all other operators already do the right thing. No algorithm that works for both ints and floats has a need for truncating division in one case and true division in the other. The correct work-around is subtle: casting an argument to float() is wrong if it could be a complex number; adding 0.0 to an argument doesn't preserve the sign of the argument if it was minus zero. The only solution without either downside is multiplying an argument (typically the first) by 1.0. This leaves the value and sign unchanged for float and complex, and turns int and long into a float with the corresponding value. It is the opinion of the authors that this is a real design bug in Python, and that it should be fixed sooner rather than later. Assuming Python usage will continue to grow, the cost of leaving this bug in the language will eventually outweigh the cost of fixing old code -- there is an upper bound to the amount of code to be fixed, but the amount of code that might be affected by the bug in the future is unbounded. Another reason for this change is the desire to ultimately unify Python's numeric model. This is the subject of PEP 228[0] (which is currently incomplete). A unified numeric model removes most of the user's need to be aware of different numerical types. This is good for beginners, but also takes away concerns about different numeric behavior for advanced programmers. (Of course, it won't remove concerns about numerical stability and accuracy.) In a unified numeric model, the different types (int, long, float, complex, and possibly others, such as a new rational type) serve mostly as storage optimizations, and to some extent to indicate orthogonal properties such as inexactness or complexity. 
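[Editor's note: the multiply-by-1.0 work-around described above can be checked directly under modern Python, where the minus-zero and complex cases behave as the PEP states.]

```python
import math

def to_inexact(x):
    # Multiplying by 1.0 turns an int into a float, but leaves float and
    # complex values, including the sign of minus zero, unchanged.
    return 1.0 * x

assert to_inexact(7) == 7.0 and isinstance(to_inexact(7), float)
assert to_inexact(1 + 2j) == 1 + 2j                  # complex survives; float() would raise
assert math.copysign(1.0, to_inexact(-0.0)) == -1.0  # sign of -0.0 preserved
```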
In a unified model, the integer 1 should be indistinguishable from the floating point number 1.0 (except for its inexactness), and both should behave the same in all numeric contexts. Clearly, in a unified numeric model, if a==b and c==d, a/c should equal b/d (taking some liberties due to rounding for inexact numbers), and since everybody agrees that 1.0/2.0 equals 0.5, 1/2 should also equal 0.5. Likewise, since 1//2 equals zero, 1.0//2.0 should also equal zero. Variations Aesthetically, x//y doesn't please everyone, and hence several variations have been proposed: x div y, or div(x, y), sometimes in combination with x mod y or mod(x, y) as an alternative spelling for x%y. We consider these solutions inferior, on the following grounds. - Using x div y would introduce a new keyword. Since div is a popular identifier, this would break a fair amount of existing code, unless the new keyword was only recognized under a future division statement. Since it is expected that the majority of code that needs to be converted is dividing integers, this would greatly increase the need for the future division statement. Even with a future statement, the general sentiment against adding new keywords unless absolutely necessary argues against this. - Using div(x, y) makes the conversion of old code much harder. Replacing x/y with x//y or x div y can be done with a simple query replace; in most cases the programmer can easily verify that a particular module only works with integers so all occurrences of x/y can be replaced. (The query replace is still needed to weed out slashes occurring in comments or string literals.) Replacing x/y with div(x, y) would require a much more intelligent tool, since the extent of the expressions to the left and right of the / must be analyzed before the placement of the "div(" and ")" part can be decided. Alternatives In order to reduce the amount of old code that needs to be converted, several alternative proposals have been put forth. 
Here is a brief discussion of each proposal (or category of proposals). If you know of an alternative that was discussed on c.l.py that isn't mentioned here, please mail the second author. - Let / keep its classic semantics; introduce // for true division. This still leaves a broken operator in the language, and invites use of the broken behavior. It also shuts off the road to a unified numeric model a la PEP 228[0]. - Let int division return a special "portmanteau" type that behaves as an integer in integer context, but like a float in a float context. The problem with this is that after a few operations, the int and the float value could be miles apart, it's unclear which value should be used in comparisons, and of course many contexts (like conversion to string) don't have a clear integer or float context. - Use a directive to use specific division semantics in a module, rather than a future statement. This retains classic division as a permanent wart in the language, requiring future generations of Python programmers to be aware of the problem and the remedies. - Use "from __past__ import division" to use classic division semantics in a module. This also retains the classic division as a permanent wart, or at least for a long time (eventually the past division statement could raise an ImportError). - Use a directive (or some other way) to specify the Python version for which a specific piece of code was developed. This requires future Python interpreters to be able to emulate *exactly* several previous versions of Python, and moreover to do so for multiple versions within the same interpreter. This is way too much work. A much simpler solution is to keep multiple interpreters installed. 
API Changes

During the transitional phase, we have to support *three* division operators within the same program: classic division (for / in modules without a future division statement), true division (for / in modules with a future division statement), and floor division (for //). Each operator comes in two flavors: regular, and as an augmented assignment operator (/= or //=).

The names associated with these variations are:

- Overloaded operator methods: __div__(), __floordiv__(), __truediv__(); __idiv__(), __ifloordiv__(), __itruediv__().

- Abstract API C functions: PyNumber_Divide(), PyNumber_FloorDivide(), PyNumber_TrueDivide(); PyNumber_InPlaceDivide(), PyNumber_InPlaceFloorDivide(), PyNumber_InPlaceTrueDivide().

- Byte code opcodes: BINARY_DIVIDE, BINARY_FLOOR_DIVIDE, BINARY_TRUE_DIVIDE; INPLACE_DIVIDE, INPLACE_FLOOR_DIVIDE, INPLACE_TRUE_DIVIDE.

- PyNumberMethod slots: nb_divide, nb_floor_divide, nb_true_divide, nb_inplace_divide, nb_inplace_floor_divide, nb_inplace_true_divide.

The added PyNumberMethod slots require an additional flag in tp_flags; this flag will be named Py_TPFLAGS_HAVE_NEWDIVIDE and will be included in Py_TPFLAGS_DEFAULT.

The true and floor division APIs will look for the corresponding slots and call that; when that slot is NULL, they will raise an exception. There is no fallback to the classic divide slot.

In Python 3.0, the classic division semantics will be removed; the classic division APIs will become synonymous with true division.

Command Line Option

The -D command line option takes a string argument that can take three values: "old", "warn", or "new". The default is "old" in Python 2.2 but will change to "warn" in later 2.x versions. The "old" value means the classic division operator acts as described. The "warn" value means the classic division operator issues a warning (a DeprecationWarning using the standard warning framework) when applied to ints or longs.
The "new" value changes the default globally so that the / operator is always interpreted as true division. The "new" option is only intended for use in certain educational environments, where true division is required, but asking the students to include the future division statement in all their code would be a problem.

This option will not be supported in Python 3.0; Python 3.0 will always interpret / as true division.

(Other names have been proposed, like -Dclassic, -Dclassic-warn, -Dtrue, or -Dold_division etc.; these seem more verbose to me without much advantage. After all the term classic division is not used in the language at all (only in the PEP), and the term true division is rarely used in the language -- only in __truediv__.)

Semantics of Floor Division

Floor division will be implemented in all the Python numeric types, and will have the semantics of

    a // b == floor(a/b)

except that the result type will be the common type into which a and b are coerced before the operation. Specifically, if a and b are of the same type, a//b will be of that type too. If the inputs are of different types, they are first coerced to a common type using the same rules used for all other arithmetic operators.

In particular, if a and b are both ints or longs, the result has the same type and value as for classic division on these types (including the case of mixed input types; int//long and long//int will both return a long). For floating point inputs, the result is a float. For example:

    3.5//2.0 == 1.0

For complex numbers, // raises an exception, since float() of a complex number is not allowed.

For user-defined classes and extension types, all semantics are up to the implementation of the class or type.

Semantics of True Division

True division for ints and longs will convert the arguments to float and then apply a float division. That is, even 2/1 will return a float (2.0), not an int. For floats and complex, it will be the same as classic division.
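The floor-division semantics just defined can be checked in today's Python, where they became the default behavior. A brief sketch (the complex-number case raises TypeError in current CPython):

```python
import math

# a // b == floor(a / b); the result type is the coerced common type.
for a, b in [(7, 2), (-7, 2), (7.0, 2.5), (-7.5, 2.0)]:
    assert a // b == math.floor(a / b)

assert 3.5 // 2.0 == 1.0            # the PEP's own example
assert isinstance(7 // 2, int)      # same-type inputs keep their type
assert isinstance(7.0 // 2, float)  # mixed types coerce to float
assert -7 // 2 == -4                # floors toward negative infinity

# // on complex numbers raises, since floor of a complex is undefined.
try:
    (1 + 2j) // (1 + 0j)
except TypeError:
    pass
```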
Note that for long arguments, true division may lose information; this is in the nature of true division (as long as rationals are not in the language). Algorithms that consciously use longs should consider using //.

If and when a rational type is added to Python (see PEP 239[2]), true division for ints and longs should probably return a rational. This avoids the problem with true division of longs losing information. But until then, for consistency, float is the only choice for true division.

The Future Division Statement

If "from __future__ import division" is present in a module, or if -Dnew is used, the / and /= operators are translated to true division opcodes; otherwise they are translated to classic division (until Python 3.0 comes along, where they are always translated to true division).

The future division statement has no effect on the recognition or translation of // and //=.

See PEP 236[4] for the general rules for future statements.

(It has been proposed to use a longer phrase, like "true_division" or "modern_division". These don't seem to add much information.)

Open Issues

- It has been proposed to call // the quotient operator, and the / operator the ratio operator. I'm not sure about this -- for some people quotient is just a synonym for division, and ratio suggests rational numbers, which is wrong. I prefer the terminology to be slightly awkward if that avoids ambiguity. Also, for some folks "quotient" suggests truncation towards zero, not towards infinity as "floor division" says explicitly.

- It has been argued that a command line option to change the default is evil. It can certainly be dangerous in the wrong hands: for example, it would be impossible to combine a 3rd party library package that requires -Dnew with another one that requires -Dold. But I believe that the VPython folks need a way to enable true division by default, and other educators might need the same.
These usually have enough control over the library packages available in their environment.

- For very large long integers, the definition of true division as returning a float causes problems, since the range of Python longs is much larger than that of Python floats. This problem will disappear if and when rational numbers are supported. In the interim, maybe the long-to-float conversion could be made to raise OverflowError if the long is out of range.

FAQ

Q. Why isn't true division called float division?

A. Because I want to keep the door open to *possibly* introducing rationals and making 1/2 return a rational rather than a float. See PEP 239[2].

Q. Why is there a need for __truediv__ and __itruediv__?

A. We don't want to make user-defined classes second-class citizens. Certainly not with the type/class unification going on.

Q. How do I write code that works under the classic rules as well as under the new rules without using // or a future division statement?

A. Use x*1.0/y for true division, divmod(x, y)[0] for int division. Especially the latter is best hidden inside a function. You may also write float(x)/y for true division if you are sure that you don't expect complex numbers. If you know your integers are never negative, you can use int(x/y) -- while the documentation of int() says that int() can round or truncate depending on the C implementation, we know of no C implementation that doesn't truncate, and we're going to change the spec for int() to promise truncation. Note that for negative ints, classic division (and floor division) round towards negative infinity, while int() rounds towards zero.

Q. How do I specify the division semantics for input(), compile(), execfile(), eval() and exec?

A. They inherit the choice from the invoking module. PEP 236[4] lists this as a partially resolved problem.

Q. What about code compiled by the codeop module?

A. Alas, this will always use the default semantics (set by the -D command line option).
This is a general problem with the future statement; PEP 236[4] lists it as an unresolved problem. You could have your own clone of codeop.py that includes a future division statement, but that's not a general solution.

Q. Will there be conversion tools or aids?

A. Certainly, but these are outside the scope of the PEP.

Q. Why is my question not answered here?

A. Because we weren't aware of it. If it's been discussed on c.l.py and you believe the answer is of general interest, please notify the second author. (We don't have the time or inclination to answer every question sent in private email, hence the requirement that it be discussed on c.l.py first.)

Implementation

A very early implementation (not yet following the above spec, but supporting // and the future division statement) is available from the SourceForge patch manager[5].

References

[0] PEP 228, Reworking Python's Numeric Model
    http://www.python.org/peps/pep-0228.html

[1] PEP 237, Unifying Long Integers and Integers, Zadka,
    http://www.python.org/peps/pep-0237.html

[2] PEP 239, Adding a Rational Type to Python, Zadka,
    http://www.python.org/peps/pep-0239.html

[3] PEP 240, Adding a Rational Literal to Python, Zadka,
    http://www.python.org/peps/pep-0240.html

[4] PEP 236, Back to the __future__, Peters,
    http://www.python.org/peps/pep-0236.html

[5] Patch 443474, from __future__ import division
    http://sourceforge.net/tracker/index.php?func=detail&aid=443474&group_id=5470&atid=305470

Copyright

This document has been placed in the public domain.
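The version-portable idioms from the FAQ above can be sketched as plain functions; both behave the same under classic and new division rules:

```python
def true_div(x, y):
    # x*1.0/y forces float (true) division even where / is classic
    return x * 1.0 / y

def int_div(x, y):
    # divmod(x, y)[0] is the floor quotient under either rule;
    # the FAQ suggests hiding exactly this inside a function
    return divmod(x, y)[0]

assert true_div(1, 2) == 0.5
assert int_div(7, 2) == 3
assert int_div(-7, 2) == -4   # floors toward negative infinity, like //
assert int_div(7, 2) == 7 // 2
```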
Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

From tim.one@home.com Sun Jul 1 02:58:29 2001
From mclay@nist.gov Fri Jul 27 21:30:56 2001
From: mclay@nist.gov (Michael McLay)
Date: Fri, 27 Jul 2001 16:30:56 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
Message-ID: <01072716305607.02216@fermi.eeel.nist.gov>

On Friday 27 July 2001 12:57 pm, Guido van Rossum wrote:
> Michael,
>
> Your PEP doesn't spell out what happens when a binary and a decimal
> number are the input for a numerical operator. I believe you said
> that this would be an unconditional error.
>
> But I foresee serious problems. Most standard library modules use
> numbers. Most of the modules using numbers occasionally use a literal
> (e.g. 0 or 1). According to your PEP, literals in module files ending
> with .py default to binary. This means that almost any use of a
> standard library module from your "dpython" will fail as soon as a
> literal is used.

No, because the '.py' file will generate bytecodes for number literals as binary numbers when the module is compiled. If a '.dp' file imports the contents of a '.py' file the binary numbers will be imported as binary numbers. If the '.dp' file needs to use a binary number in a calculation with a decimal number, the binary number will have to be cast to a decimal number.

---------------------
#gui.py
BLUE = 155
x_axis = 1024
y_axis = 768
--------------------
#calculator.dp
import gui
ytd_interest = 0.04
# ytd_interest is now a decimal number
win = gui.open_window(gui.bg, x_size=gui.x_axis, y_size=gui.y_axis)
app = win.dialog("Bank Balance", bankbalance_callback)
bb = app.get_bankbalance()
# bb now contains a string
newbalance = decimal(bb) * ytd_interest
# now update the display
app.set_bankbalance(str(newbalance))
-------------------

In the example the gui module was used in the calculator module, but its numbers were always handled as binary numbers.
The parser did not convert them to decimal numbers because they had been parsed into a gui.pyc file prior to being loaded into calculator.dp.

> I can't believe that this will work satisfactorily.

I think it will. There will be some cases where it might be necessary to add modules of convenience functions to make it easier to use applications that cross boundaries, but I think these cases will be rare. Immediately following the introduction of the decimal number types all binary modules will work as they work today. There will be no additional pain to continue using those modules. There will be no decimal modules, so there is no problem with making them work with the binary modules. As decimal module users start developing applications they will develop techniques for working with the binary modules. Initially it may require a significant effort, but eventually boundaries will be created and the two domains will coexist.

> Another example of the kind of problem your approach runs into: what
> should the type of len("abc") be? 3d or 3b? Should it depend on the
> default mode?

That is an interesting question. With my current proposal the following would be required:

    stlen = decimal(len("abc"))

A dlen() function could be added, or perhaps allowing the automatic promotion of int to a decimal would be a reasonable exception. That is one case where there is no chance of data loss. I'm not opposed to automatic conversions if there is no danger of errors being introduced.

> I suppose sequence indexing has to accept decimal as well as binary
> integers as indexes -- certainly in a decimal program you will want to
> be able to use decimal integers for indexes.

That is how I would expect it to work.
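The decimal() constructor in Michael's examples is from his proposal, not from any shipped Python. A rough modern analogue of the explicit-conversion style he describes is the decimal module that Python eventually grew in 2.4; a sketch, reusing the bank-balance example (the string value is invented for illustration):

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 exactly -- the kind of silent
# error the proposal wants to make impossible to hit by accident:
assert 0.1 + 0.2 != 0.3

# Decimal arithmetic on values constructed from strings is exact:
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")

# Conversion is explicit, much as in the calculator.dp example:
bb = "1000.00"                  # bank balance arrives as a string
ytd_interest = Decimal("0.04")
newbalance = Decimal(bb) * ytd_interest
assert str(newbalance) == "40.0000"

# And mixing Decimal with binary floats is refused outright:
try:
    Decimal("1") * 0.5
except TypeError:
    pass
```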
From mclay@nist.gov Fri Jul 27 21:32:42 2001
From: mclay@nist.gov (Michael McLay)
Date: Fri, 27 Jul 2001 16:32:42 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: <200107271550.LAA24662@cj20424-a.reston1.va.home.com>
References: <01072701443301.05085@localhost.localdomain> <3B617A5F.E193B2CC@lemburg.com> <200107271550.LAA24662@cj20424-a.reston1.va.home.com>
Message-ID: <01072716324208.02216@fermi.eeel.nist.gov>

On Friday 27 July 2001 11:50 am, Guido van Rossum wrote:
> I'm not optimistic about Michael's PEP. He seems to insist on a total
> separation between decimal and binary numbers that I don't believe can
> work.

I'm not insisting on total separation. I propose that we start with a requirement that an explicit call be made to a conversion function. These functions would allow a decimal type to be converted to a float or to an int. There would also be conversion functions going from a float or an int to a decimal type.

What I would like to avoid is creating a decimal type in Python that enables silent errors that are difficult to recognize. Allowing automatic coercion between the binary and decimal types will open the door to errors that would be detected if a conversion is required. If at some point in the future it becomes apparent that a particular form of coercion is safe and useful it could be added. I'd like to move slowly on opening up this potential trouble spot.

> I haven't replied to him yet because I can't explain it well
> enough yet -- but I don't believe there's much of a future in his
> particular idea.

I guess I'm not understanding something about the direction you are taking Python. As I understood the goals of the CP4E project you were attempting to make Python appealing to a wider audience and make it possible for everyone to learn to write programs. And then there are occasional references to a Python 3k which will fix some Python warts.
My proposal moves Python towards these goals, while retaining full backwards compatibility. I am not trying to create a new interpreter. I'm trying to make the current interpreter useful to a wider market. What is it you are trying to accomplish in the process of "unifying the numerical types" in Python?

From mclay@nist.gov Fri Jul 27 21:40:38 2001
From: mclay@nist.gov (Michael McLay)
Date: Fri, 27 Jul 2001 16:40:38 -0400
Subject: [Python-Dev] dpython interaction with imaginary types
Message-ID: <01072716403809.02216@fermi.eeel.nist.gov>

Paul F. Dubois writes:
> In dpython, what is 2.0j? Is the "standard" way of writing complex numbers,
> 3.0 + 2.0j, valid?

If this expression is placed in a module with a '.py' extension it will work exactly as it does today. An exception will be raised if the expression is in a '.dp' module, because the 3.0 would be a decimal number and the complex type is expecting a binary number for the real portion.

From guido@zope.com Fri Jul 27 22:06:37 2001
From: guido@zope.com (Guido van Rossum)
Date: Fri, 27 Jul 2001 17:06:37 -0400
Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python
In-Reply-To: Your message of "Fri, 27 Jul 2001 15:51:38 EDT." <01072715513805.02216@fermi.eeel.nist.gov>
References: <01072701443301.05085@localhost.localdomain> <01072708200303.02216@fermi.eeel.nist.gov> <200107271635.MAA25690@cj20424-a.reston1.va.home.com> <01072715513805.02216@fermi.eeel.nist.gov>
Message-ID: <200107272106.RAA27755@cj20424-a.reston1.va.home.com>

[Michael]
> I'm not fond of dialects when they don't serve a significant
> purpose. However, I believe it would be useful to at least discuss
> creating a special purpose "safe" mode for the Python lexer. This
> mode would be attractive to newbies and financial programmers.
> Calling this a new dialect is an overstatement. It is more like
> defining a subset of the language that uses a special vocabulary for
> working with decimal types.

Sounds like a dialect to me.
But alright, I'll take your word for it. :-)

[Michael]
> > > Another would be to use Unicode as the default character set. This
> > > would allow Unicode characters to be in strings without needing to
> > > escape them.

[Guido]
> > That's not a dialect, that's a different input encoding. MAL already
> > has a PEP for that.

[Michael]
> I know about the PEP. I was referring to making it the default string type
> for a '.dp' file. There would be no prefix 'u' required.

Have you thought this through? What would be the input encoding? How do you expect your programmers to edit their Unicode files?

Otherwise, the only effect of making all string literals Unicode strings is to break most of the standard library. You can get this effect with "python -U" today. It's not pretty. (That option exists to see how much progress has been made with Python's Unicodification, not for anything very practical.)

> I'll remove this and the other unrelated items from the decimal type PEP

It would indeed be better to focus on one idea at a time.

> If you don't agree with the idea of adding dpython lexer mode then
> there is no point in discussing the features that would be in that
> mode.

Maybe you can rewrite the PEP to explain the idea better. It wasn't very clear the first time.

> > Or in Python .NET. Decoupling the various part of the parse+compile
> > pipeline is something I've considered.
>
> Did you decide against it, or has it just not been a high enough priority?

It's one of those many "would-be-nice" things that I never get to...

> > But again this has nothing to do with decimal numbers: your proposal
> > allows the mixing of decimal and binary numbers (as long as one of
> > them uses an explicit base indicator) so you don't really need two
> > parsers -- you need one tokenizer plus a way to specify the default
> > numeric base for literals.
>
> That is exactly what I implemented. The dpython command and the
> '.dp' cause the Py_USE_DECIMAL_AS_DEFAULT[1] flag to be set.
> When this flag is set decimal numbers are used for literals.

Where is this flag set? Is it a global variable? If my main program has the .dp extension, does the flag remain set for all other modules that it imports?

> > I'll have to go back to your defense of the two dialect approach, but
> > I think it's neither sufficient nor necessary.
>
> I have mixed too many ideas into a PEP. I'll rework the PEP to remove the
> cruft and focus on the addition of decimal numbers. I'll move the other
> ideas into a separate PEP.

Posterity will be grateful.

> > Well, sometimes more generality than you need hurts. I'm not
> > convinced that we need an open-ended set of numeric literals. But in
> > the light of the unified numeric model, we may need ways to make
> > exactness or inexactness explicit, and/or we may need a way to specify
> > rational numbers. If we can fit all of these in the
> > number-with-letter-suffix mold, that would be nice for the lexer, I
> > suppose.
>
> I worry about a "unified numerical model" getting overly complex.

Funny. I think that a unified numeric model will take away some complexity from the current model; for example the programmer would no longer have to be aware of the limit on int values, so nobody would have to learn about long any more.

> I think decimal numbers help because they are a better choice than
> binary numbers for a significant percentage of all software
> applications.

(Just not for most of the apps that are likely to be written in Python today. :-)

> I know that rational numbers are important in some applications. Am
> I overlooking some huge class of applications that use rationals?

I doubt it -- if I was allowed to add exactly *one* numeric type to Python, and I had to choose between decimal and rational, I'd choose decimal. Practicality beats purity.
> While Tim and some of the other Pythoneers can probably think of
> dozens of specialized numerical types, I would venture to guess that
> binary types and a decimal type probably cover 90% of all the user's
> requirements.

Add rational, and I'd agree.

> [1] I'll be renaming the flag to this in the next version. The flag
> is currently called Py_NEW_PARSER. I named it that because at one
> time I was creating a new parser. I trimmed the changes down to
> just a few edits of the tokenizer and compile.c

Why does a flag variable have an UPPER_CASE name? That normally means the name is a preprocessor symbol.

[Next message]

[Guido]
> > But I foresee serious problems. Most standard library modules use
> > numbers. Most of the modules using numbers occasionally use a
> > literal (e.g. 0 or 1). According to your PEP, literals in module
> > files ending with .py default to binary. This means that almost
> > any use of a standard library module from your "dpython" will fail
> > as soon as a literal is used.

[Michael]
> No, because the '.py' file will generate bytecodes for number
> literals as binary numbers when the module is compiled. If a '.dp'
> file imports the contents of a '.py' file the binary numbers will be
> imported as binary numbers. If the '.dp' file needs to use a
> binary number in a calculation with a decimal number, the binary
> number will have to be cast to a decimal number.

I understood all that. But what if the decimal module wants to pass some numbers into a binary module? Then it has to make sure all the arguments it passes are decimal.
> ---------------------
> #gui.py
> BLUE = 155
> x_axis = 1024
> y_axis = 768
>
> --------------------
> #calculator.dp
> import gui
> ytd_interest = 0.04
> # ytd_interest is now a decimal number
> win = gui.open_window(gui.bg, x_size=gui.x_axis, y_size=gui.y_axis)
> app = win.dialog("Bank Balance", bankbalance_callback)
> bb = app.get_bankbalance()
> # bb now contains a string
> newbalance = decimal(bb) * ytd_interest
> # now update the display
> app.set_bankbalance(str(newbalance))
>
> -------------------
>
> In the example the gui module was used in the calculator module, but they
> were always handled as binary numbers. The parser did not convert them to
> decimal numbers because they had been parsed into a gui.pyc file prior to
> being loaded into calculator.dp.

Blech. That means that whenever you use a library module that does something useful with your data, you have to convert all your data explicitly to binary, even if it's just integers. Yuck. Bah. (Need I say more? OK, one more then. Argh! :-)

> > I can't believe that this will work satisfactorily.
>
> I think it will. There will be some cases where it might be
> necessary to add modules of convenience functions to make it easier
> to use applications that cross boundaries, but I think these
> cases will be rare.

I would be much more comfortable if there was just one integer type, or if at least binary ints would mix freely with decimal ints. I see a lot of use for decimal *floating point* (more predictable arithmetic, calculator style), and also a lot of use for decimal *fixed point* (money calculations), but I don't see the need for distinguishing the radix of integers.

> Immediately following the introduction of the decimal number types
> all binary modules will work as they work today. There will be no
> additional pain to continue using those modules. There will be no
> decimal modules, so there is no problem with making them work with
> the binary modules.
> As decimal module users start developing applications they will
> develop techniques for working with the binary modules. Initially
> it may require a significant effort, but eventually boundaries will
> be created and the two domains will coexist.

You make it sound as if most of the standard library would not be useful for decimal users. I doubt that. Decimal users also need to parse XML, do bisection on lists, use database files, and so on.

> > Another example of the kind of problem your approach runs into: what
> > should the type of len("abc") be? 3d or 3b? Should it depend on the
> > default mode?
>
> That is an interesting question. With my current proposal the following
> would be required:
>
>     stlen = decimal(len("abc"))
>
> A dlen() function could be added, or perhaps allowing the automatic
> promotion of int to a decimal would be a reasonable exception. That
> is one case where there is no chance of data loss. I'm not opposed
> to automatic conversions if there is no danger of errors being
> introduced.

OK, then we agree. Let's freely allow mixing decimal and binary integers. That makes much more sense.

> > I suppose sequence indexing has to accept decimal as well as
> > binary integers as indexes -- certainly in a decimal program you
> > will want to be able to use decimal integers for indexes.
>
> That is how I would expect it to work.

But it contradicts your original assertion that decimal and binary numbers were two incompatible types. Glad we sorted that out.

[Next message]

[Guido]
> > I'm not optimistic about Michael's PEP. He seems to insist on a
> > total separation between decimal and binary numbers that I don't
> > believe can work.

[Michael]
> I'm not insisting on total separation. I propose that we start with
> a requirement that an explicit call be made to a conversion
> function. These functions would allow a decimal type to be
> converted to a float or to an int.
> There would also be conversion functions going from a float or an
> int to a decimal type.

(Except for ints, we have now established.)

> What I would like to avoid is creating a decimal type in Python that
> enables silent errors that are difficult to recognize. Allowing
> automatic coercion between the binary and decimal types will open
> the door to errors that would be detected if a conversion is
> required. If at some point in the future it becomes apparent that a
> particular form of coercion is safe and useful it could be added.
> I'd like to move slowly on opening up this potential trouble spot.

I recommend that you make a more complete analysis of what errors you want to avoid. Every binary number can be represented in decimal if you allow enough digits. On the other hand, if you are thinking of decimal floating point, some decimal calculations will also lose precision. If you never want to lose precision, the radix of the numbers is a red herring, and you might as well use rationals under the covers.

If you allow the kind of precision loss that decimal floating point can cause, I would like to understand more about what it *is* that you are trying to avoid with your Draconian separation rule. Floating point decimal arithmetic cannot avoid loss of precision for division (e.g. 1d/3d cannot be represented exactly with a finite number of decimal digits). Fixed point decimal arithmetic isn't any better.

> > I haven't replied to him yet because I can't explain it well
> > enough yet -- but I don't believe there's much of a future in his
> > particular idea.
>
> I guess I'm not understanding something about the direction you are
> taking Python. As I understood the goals of the CP4E project you
> were attempting to make Python appealing to a wider audience and
> make it possible for everyone to learn to write programs. And then
> there are occasional references to a Python 3k which will fix some
> Python warts.
> My proposal moves Python towards these goals, while retaining full
> backwards compatibility. I am not trying to create a new
> interpreter.

I think you haven't completely thought through the rules you are proposing, and you haven't stated your underlying goals very clearly. I believe the rules that you *claim* to propose won't further your goals, but it seems that you aren't sure of the rules you propose and maybe you aren't sure of your goal either. Under these adverse circumstances I'm trying to tease out a set of rules that might further the kind of goal I *think* you want to obtain, but it's hard because you have overspecified your "solution".

> I'm trying to make the current interpreter useful to a wider market.

Adding an Oracle module to the standard library would probably do more to further that goal than any wrangling with the numeric model that we can carry out here... :-)

> What is it you are trying to accomplish in the process of "unifying
> the numerical types" in Python?

Removing specific warts of the current numeric system that require the programmer to be aware of more details than necessary. We will never be able to remove the need for careful numerical analysis of algorithms involving floating point (be it binary or decimal). But we can certainly remove the need to be aware of the number of bits in a machine word (long/int unification, PEP 237) or the need to explicitly promote ints to floats in certain cases (PEP 238).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From ping@lfw.org Sat Jul 28 00:37:08 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Fri, 27 Jul 2001 16:37:08 -0700 (PDT)
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
In-Reply-To: <200107271949.PAA27171@cj20424-a.reston1.va.home.com>
Message-ID: 

This all looks pretty good! Nice work, Guido -- especially given the minefield of compatibility issues you have been tiptoeing through.
On Fri, 27 Jul 2001, Guido van Rossum wrote:
> - Overloaded operator methods:
>
>       __div__(), __floordiv__(), __truediv__();

I'm concerned about this. Does this mean that a/b will call __truediv__? So code that today expects a/b to call __div__ will be permanently broken?

I think you might want to provide a little table in the PEP. Here is my stab at describing the current proposal, so you can correct it:

                     in Python 2.1        in Python 2.2         in Python 3.0 [*]

    / on numbers     classic division     classic division      true division
    // on numbers    nothing              floor division        floor division
    / on instances   __div__              __div__               __truediv__?
    // on instances  nothing              __floordiv__?         __floordiv__
    / API call       PyNumber_Divide      PyNumber_Divide       PyNumber_TrueDivide?
    // API call      nothing              PyNumber_FloorDivide  PyNumber_FloorDivide
    / AsNumber slot  nb_divide            nb_divide             nb_true_divide?
    // AsNumber slot nothing              nb_floor_divide       nb_floor_divide
    / opcode         BINARY_DIVIDE        BINARY_DIVIDE         BINARY_TRUE_DIVIDE
    // opcode        nothing              BINARY_FLOOR_DIVIDE   BINARY_FLOOR_DIVIDE

    [*] or in Python >= 2.2 with "from __future__ import division"

I'm thinking that nb_true_divide and __truediv__ should be replaced with just nb_divide and __div__ in the above table.

> Semantics of Floor Division
[...]
> For complex numbers, // raises an exception, since float() of a
> complex number is not allowed.

I assume you meant "floor()" here rather than "float()".

-- ?!ng

From fdrake@acm.org Sat Jul 28 04:44:31 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 27 Jul 2001 23:44:31 -0400 (EDT)
Subject: [Python-Dev] [OT] Expat 1.95.2 released
Message-ID: <15202.13599.964445.917473@cj42289-a.reston1.va.home.com>

Slightly off-topic, but this may be interesting to a few of you... In case anyone is interested, Expat 1.95.2 has been released, with both a source archive for Unix users and a handy installer for Windows victims (thanks to Tim Peters for getting me started!).
This release fixes some small bugs and improves the portability of the build process (and there is one for Windows this time). You can pick up the 1.95.2 release at:

    http://sourceforge.net/projects/expat/

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

_______________________________________________
XML-SIG maillist - XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig

From michel@digicool.com Sat Jul 28 08:41:21 2001
From: michel@digicool.com (Michel Pelletier)
Date: Sat, 28 Jul 2001 00:41:21 -0700
Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python
References: <01072701443301.05085@localhost.localdomain> <01072708200303.02216@fermi.eeel.nist.gov> <200107271635.MAA25690@cj20424-a.reston1.va.home.com> <01072715513805.02216@fermi.eeel.nist.gov>
Message-ID: <3B626CA1.617A5121@digicool.com>

Michael McLay wrote:
>
> On Friday 27 July 2001 12:35 pm, Guido van Rossum wrote:
>
> > I'm not very fond of having multiple dialects. There are lots of
> > contexts where the dialect in use is not explicitly mentioned
> > (e.g. when people discuss fragments of Python code).
>
> I'm not fond of dialects when they don't serve a significant purpose.
> However, I believe it would be useful to at least discuss creating a special
> purpose "safe" mode for the Python lexer. This mode would be attractive to
> newbies and financial programmers. Calling this a new dialect is an
> overstatement. It is more like defining a subset of the language that uses a
> special vocabulary for working with decimal types.

I don't know nothin about no number theory, but I did use a similar dialect technique to implement a PEP 245 prototype using mobius. Like what I've read so far about dpython, its objects from *.pyi files (a superset of python) could be easily intermingled with objects from *.py files. I'm all for no dialects at large, but some people may find need to implement new languages on top of python's run time engine.
Especially people embedding python into specialized applications. Mobius was a way to control the python language using the language itself, it would be cool to have this kind of thing stock in python. -Michel From tim.one@home.com Sat Jul 28 09:12:47 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 28 Jul 2001 04:12:47 -0400 Subject: [Python-Dev] Advice in stat.py In-Reply-To: <200107271714.NAA26383@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > ... > Inside a function, from ... import * is always bad form. Worse, according to the Reference Manual, The "from" form with "*" may only occur in a module scope. From mwh@python.net Sat Jul 28 10:35:01 2001 From: mwh@python.net (Michael Hudson) Date: 28 Jul 2001 05:35:01 -0400 Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12 In-Reply-To: Guido van Rossum's message of "Fri, 27 Jul 2001 15:49:57 -0400" References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> Message-ID: <2mn15pwg56.fsf@starship.python.net> Not directly relevant to the PEP, but... Guido van Rossum writes: > Q. What about code compiled by the codeop module? > > A. Alas, this will always use the default semantics (set by the -D > command line option). This is a general problem with the > future statement; PEP 236[4] lists it as an unresolved > problem. You could have your own clone of codeop.py that > includes a future division statement, but that's not a general > solution. Did you look at my Nasty Hack(tm) to bodge around this? It's at http://starship.python.net/crew/mwh/hacks/codeop-hack.diff if you haven't. I'm not sure it will work with what you're planning for division, but it works for generators (and worked for nested scopes when that was relevant). There are a host of saner ways round this, of course - like adding an optional "flags" argument to compile, for instance. Cheers, M. -- ARTHUR: Why should a rock hum? FORD: Maybe it feels good about being a rock.
-- The Hitch-Hikers Guide to the Galaxy, Episode 8 From guido@zope.com Sat Jul 28 14:54:21 2001 From: guido@zope.com (Guido van Rossum) Date: Sat, 28 Jul 2001 09:54:21 -0400 Subject: [Python-Dev] Advice in stat.py In-Reply-To: Your message of "Sat, 28 Jul 2001 04:12:47 EDT." References: Message-ID: <200107281354.JAA30808@cj20424-a.reston1.va.home.com> > Worse, according to the Reference Manual, > > The "from" form with "*" may only occur in a module scope. I don't know when that snuck in, but it's not enforced. If we're serious, we should at least add a warning! I'll add a bug report. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@zope.com Sat Jul 28 14:57:28 2001 From: guido@zope.com (Guido van Rossum) Date: Sat, 28 Jul 2001 09:57:28 -0400 Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12 In-Reply-To: Your message of "28 Jul 2001 05:35:01 EDT." <2mn15pwg56.fsf@starship.python.net> References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> <2mn15pwg56.fsf@starship.python.net> Message-ID: <200107281357.JAA30859@cj20424-a.reston1.va.home.com> > Not directly relavent to the PEP, but... > > Guido van Rossum writes: > > > Q. What about code compiled by the codeop module? > > > > A. Alas, this will always use the default semantics (set by the -D > > command line option). This is a general problem with the > > future statement; PEP 236[4] lists it as an unresolved > > problem. You could have your own clone of codeop.py that > > includes a future division statement, but that's not a general > > solution. > > Did you look at my Nasty Hack(tm) to bodge around this? It's at > > http://starship.python.net/crew/mwh/hacks/codeop-hack.diff > > if you haven't. I'm not sure it will work with what you're planning > for division, but it works for generators (and worked for nested > scopes when that was relavent). Ouch. Nasty. Hat off to you for thinking of this! 
> There are a host of saner ways round this, of course - like adding an > optional "flags" argument to compile, for instance. We'll have to keep that in mind. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@zope.com Sat Jul 28 16:28:55 2001 From: guido@zope.com (Guido van Rossum) Date: Sat, 28 Jul 2001 11:28:55 -0400 Subject: [Python-Dev] Ready to merge descr-branch back into the trunk? Message-ID: <200107281528.LAA31137@cj20424-a.reston1.va.home.com> Is it time to merge the descr-branch (from which 2.2a1 was built) back into the trunk? In the fray over PEP 238 I haven't seen too much feedback on the alpha release, but there have been plenty of downloads. Telling from the bug reports, a few people have clearly been kicking the tires quite a bit. I don't think we'll have to withdraw the type/class unification, and I'd like to fire Tim from his branch-merge duties. :) I'll post a query about this on c.l.py too. --Guido van Rossum (home page: http://www.python.org/~guido/) From paulp@ActiveState.com Sat Jul 28 17:40:37 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sat, 28 Jul 2001 09:40:37 -0700 Subject: [Python-Dev] pep-discuss Message-ID: <3B62EB05.396DF4D7@ActiveState.com> We've talked about having a mailing list for general PEP-related discussions. Two things make me think that revisiting this would be a good idea right now. First, the recent loosening up of the python-dev rules threatens the quality of discussion about bread and butter issues such as patch discussions and process issues. Second, the flamewar on python-list basically drowned out the usual newbie questions and would give a person coming new to Python a very negative opinion about the language's future and the friendliness of the community. I would rather redirect as much as possible of that to a list that only interested participants would have to endure. -- Take a recipe. Leave a recipe. Python Cookbook!
http://www.ActiveState.com/pythoncookbook From paulp@ActiveState.com Sat Jul 28 18:03:37 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sat, 28 Jul 2001 10:03:37 -0700 Subject: [Python-Dev] ActiveCobolScript? Message-ID: <3B62F069.D0CB002E@ActiveState.com> http://www.cobolscript.com/ Too bad this up and coming language already has a corporate benefactor. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From paulp@ActiveState.com Sat Jul 28 18:24:42 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sat, 28 Jul 2001 10:24:42 -0700 Subject: [Python-Dev] ActiveCobolScript? References: <3B62F069.D0CB002E@ActiveState.com> Message-ID: <3B62F55A.28EE3866@ActiveState.com> Sorry guys, I meant to send this to ActiveState's internal lists where we plot the takeover of the world...my brain hasn't totally recovered from my trip to Tijuana (er, I mean San Diego). -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From paulp@ActiveState.com Sat Jul 28 20:08:17 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sat, 28 Jul 2001 12:08:17 -0700 Subject: [Python-Dev] Ready to merge descr-branch back into the trunk? References: <200107281528.LAA31137@cj20424-a.reston1.va.home.com> Message-ID: <3B630DA1.6523B177@ActiveState.com> Guido van Rossum wrote: > > Is it time to merge the descr-branch (from which 2.2a1 was built) back > into the trunk? In the fray over PEP 238 I haven't seen too much > feedback on the alpha release, but there have been plenty of > downloads. Telling from the bug reports, a few people have clearly > been kicking the tires quite a bit. I'm not deeply concerned from a backwards compatibility standpoint but I would like to see more documentation and more widespread understanding of the feature before we say "yes, this is the right way." I wonder how many people truly understand all of the changes. 
When you added metaclasses you labelled the feature experimental so you could change it once people got a sense of it. I propose you do the same thing in this case. I could even imagine using the warnings framework to tell people that they are playing with stuff that may change. As smart as you are, you are only one person, with experience with a certain set of problems. Wider understanding, experimentation and discussion might help you to improve the design...but they may break some of the features you have already added. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From tim.one@home.com Sat Jul 28 21:13:53 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 28 Jul 2001 16:13:53 -0400 Subject: [Python-Dev] Picking on platform fmod Message-ID: Here's your chance to prove your favorite platform isn't a worthless pile of monkey crap . Please run the attached. If it prints anything other than 0 failures in 10000 tries it will probably print a lot. In that case I'd like to know which flavor of C+libc+libm you're using, and the OS; a few of the failures it prints may be helpful too. If it only prints one or two failures, it's probably a bug in *my* code, so I especially want to know about that. If it dies with an assertion error, that would be mondo interesting to know, and then the flavor of HW chip may also be relevant. I already know MSVC 6 under Win98SE passes ("0 failures") on two distinct boxes , so no need for more about that one. What you're looking for: Given finite doubles x>0 and y>0, there's a unique integer N and unique real number R such that x = N*y + R exactly and 0 <= R < y. N may not be *representable* as an integer (or long, or long long, or double) on the machine, but it's A Theorem that the infinitely-precise value of R is exactly representable as a machine double. 
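That theorem can be spot-checked independently with exact rational arithmetic; a sketch using the later fractions module (not available in 2001), with exact_remainder as our illustrative helper name:

```python
import math
from fractions import Fraction

def exact_remainder(x, y):
    # Exact R with x = N*y + R and 0 <= R < y, for finite doubles x, y > 0.
    fx, fy = Fraction(x), Fraction(y)  # floats convert to Fractions exactly
    n = fx // fy                       # the (possibly huge) integer N
    r = fx - n * fy                    # exact rational remainder
    return float(r)                    # exactly representable, per the theorem

# fmod should reproduce the exact remainder even for wildly mismatched scales.
for x, y in [(1e300, 3.1), (7612675.5666046143, 16016.095533924272)]:
    assert math.fmod(x, y) == exact_remainder(x, y)
```

On a platform whose fmod is exact (as C99's rationale intends), the assertions hold for any positive finite pair.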
The C stds have never (IMO) been clear about whether fmod(x, y) must return this exact R, although the C99 *rationale* is clear that this is the intent (why committees don't fold these subtleties into the bodies of their stds is beyond me). The program below generates nasty test cases at random, computes the exact R in a clever (read "on the edge of not working but pretty fast") way using Python, and compares it to the platform fmod result. It takes about 6 seconds to run on a 866MHz box using current CVS Python, so it shouldn't be a burden to try. If you get no failure, of course I'd like to hear about that too. it's-not-like-you-had-anything-fun-to-do-this-weekend-ly y'rs - tim

from math import frexp as _frexp, ldexp as _ldexp

# ffmod is a Pythonic variant of fmod, returning a remainder with the
# same sign as y.  Excepting infs and NaNs, the result is exact.
def ffmod(x, y):
    if y == 0:
        raise ZeroDivisionError("ffmod: divide by 0")
    remainder = abs(x)
    yabs = abs(y)
    if remainder >= yabs:
        dexp = _frexp(remainder)[1] - _frexp(yabs)[1]
        assert dexp >= 0
        yshifted = _ldexp(yabs, dexp)  # exact
        for i in xrange(dexp + 1):
            # compute one bit of the quotient (but not materialized;
            # we only care about the remainder at the end)
            if remainder >= yshifted:
                assert remainder < yshifted * 2.0
                remainder -= yshifted  # exact
            yshifted *= 0.5  # exact
        assert yshifted * 2.0 == yabs
    assert remainder < yabs
    if y < 0 and remainder > 0:
        remainder -= yabs  # exact
        assert remainder < 0
    return remainder

# ffmod and C99 fmod should agree whenever x>0 and y>0.  Try one, and
# return 1 iff they don't agree.
def _tryone(x, y, dump=0):
    n = math.fmod(x, y)
    e = ffmod(x, y)
    if dump:
        print "fmod" + `x, y`
        print "native:", `n`
        print " exact:", `e`
    return n != e

# Test n random inputs, in the sense of random mantissas and random
# exponents in range(-300, 301).  The hardest cases have x much larger
# than y, and this will generate lots of those.
def _test(n, showresults=0):
    from random import random, randrange
    nfail = 0
    for i in xrange(n):
        x = _ldexp(random(), randrange(-300, 301))
        y = _ldexp(random(), randrange(-300, 301))
        if x < y:
            x, y = y, x
        if _tryone(x, y, showresults):
            nfail += 1
            _tryone(x, y, 1)
    print nfail, "failures in", n, "tries"

if __name__ == "__main__":
    import math
    _test(10000, 0)

From tim.one@home.com Sun Jul 29 12:10:45 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 29 Jul 2001 07:10:45 -0400 Subject: [Python-Dev] Advice in stat.py In-Reply-To: <200107281354.JAA30808@cj20424-a.reston1.va.home.com> Message-ID: [Tim] > Worse, according to the Reference Manual, > > The "from" form with "*" may only occur in a module scope. [Guido] > I don't know when that snuck in, On Friday, 14 Aug 1992: it's in rev 1.1 of ref6.tex, and was there in the 0.98 release. This proved tedious to trace backwards, because you and Fred went through an amazing variety of ways to mark up "*" . > but it's not enforced. If we're serious, we should at least add a warning! I thought we had agreed to do this back when the nested-scopes warnings were being added; guess not. > I'll add a bug report. aka-the-retroactive-todo-list-ly y'rs - tim From thomas@xs4all.net Sun Jul 29 16:33:43 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sun, 29 Jul 2001 17:33:43 +0200 Subject: [Python-Dev] Advice in stat.py In-Reply-To: <200107281354.JAA30808@cj20424-a.reston1.va.home.com> Message-ID: <20010729173343.H21770@xs4all.nl> On Sat, Jul 28, 2001 at 09:54:21AM -0400, Guido van Rossum wrote: > > Worse, according to the Reference Manual, > > The "from" form with "*" may only occur in a module scope. > I don't know when that snuck in, but it's not enforced. If we're serious, we should at least add a warning!
Eh, last I looked, you and Jeremy were most serious about this :) It came up during the nested-scopes change in 2.1, where it was first made illegal, and later just illegal in the presence of a nested scope: (without future statement) >>> def spam(x): ... from stat import * ... def eggs(): ... print x ... :1: SyntaxWarning: local name 'x' in 'spam' shadows use of 'x' as global in nested scope 'eggs' :1: SyntaxWarning: import * is not allowed in function 'spam' because it contains a nested function with free variables (with future statement) >>> def spam(x): ... from stat import * ... def eggs(): ... print x ... File "", line 2 SyntaxError: import * is not allowed in function 'spam' because it contains a nested function with free variables > I'll add a bug report. Should we warn about exec (without 'in' clause) in functions as well ? (without future statement) >>> def spam(x,y): ... exec y ... def eggs(): ... print x ... :1: SyntaxWarning: local name 'x' in 'spam' shadows use of 'x' as global in nested scope 'eggs' :1: SyntaxWarning: unqualified exec is not allowed in function 'spam' it contains a nested function with free variables (with future statement) >>> def spam(x,y): ... exec y ... def eggs(): ... print x ... File "", line 2 SyntaxError: unqualified exec is not allowed in function 'spam' it contains a nested function with free variables The warnings *only* occur in the presence of a nested scope, though. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From thomas@xs4all.net Sun Jul 29 17:13:59 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sun, 29 Jul 2001 18:13:59 +0200 Subject: [Python-Dev] Picking on platform fmod In-Reply-To: References: Message-ID: <20010729181359.I21770@xs4all.nl> On Sat, Jul 28, 2001 at 04:13:53PM -0400, Tim Peters wrote: > Here's your chance to prove your favorite platform isn't a worthless pile of > monkey crap . Please run the attached. 
If it prints anything other > than > 0 failures in 10000 tries > it will probably print a lot. Worked fine on BSDI, FreeBSD and Linux on Intel hardware here, as well as Solaris on SPARC hardware and Linux on (IBM) PPC and (Compaq) Alpha hardware, at least the ones in the SourceForge compilefarm :) However, on this sourceforge compilefarm machine: Linux usf-cf-sparc-linux-1 2.2.18pre21 #1 SMP Wed Nov 22 17:27:17 EST 2000 sparc64 unknown (Linux on SPARC hardware, somewhat different hardware than the Solaris compilefarm machine, though both are UltraSparcs (sun4u), so I assume they have the same FPU hardware too.) cpu : TI UltraSparc II (BlackBird) fpu : UltraSparc II integrated FPU promlib : Version 3 Revision 17 prom : 3.17.0 type : sun4u ncpus probed : 2 ncpus active : 2 It fails like so: Python-2.1.1/SPARC-linux/python fmodtest.py Traceback (most recent call last): File "fmodtest.py", line 60, in ? _test(10000, 0) File "fmodtest.py", line 53, in _test if _tryone(x, y, showresults): File "fmodtest.py", line 33, in _tryone n = math.fmod(x, y) OverflowError: math range error This is most often on the first time through _tryone, or otherwise on the second time through. 
Since it didn't print the oodles of info you wanted, here are some values that cause this: x: 1.9855727039972493e-39 y: 3.3665190124762732e-65 x: 9.5227191085185764e+47 y: 4.2603743746337035e-20 x: 5.9222419270524289e+19 y: 1.1515096079336105e-17 x: 1.0095372277815077e+37 y: 5.1347483313106109e-23 x: 7612675.5666046143 y: 16016.095533924272 x: 1.9710117673387707e+27 y: 3.8974792352555581e-75 x: 7.2481762337961828e-72 y: 6.9275805608109076e-91 x: 444606.5185310659 y: 0.040252210139028341 Here are some that it didn't break on: x: 6.4925064019277635e+82 y: 3.3863081542612738e+39 x: 1.5537102518838885e+28 y: 7.4706363056326852e+21 x: 6.0545201466539534e+82 y: 1.0632674821830584e+22 x: 2.4744658600351291e+51 y: 3.8431582369146088e+39 x: 5.019729019166613e+56 y: 1.18286034219559e+48 Notice how none of these have a '-' in them... -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From cgw@alum.mit.edu Sun Jul 29 17:24:31 2001 From: cgw@alum.mit.edu (Charles G Waldman) Date: Sun, 29 Jul 2001 11:24:31 -0500 Subject: [Python-Dev] Picking on platform fmod Message-ID: <15204.14527.823870.226422@nyx.dyndns.org> > If you get no failure, of course I'd like to hear about that too. Got "no failure" on: SunOS 5.8 Generic_108529-08 Linux 2.2.18 / glibc 2.1.3 From aahz@rahul.net Sun Jul 29 17:44:28 2001 From: aahz@rahul.net (Aahz Maruch) Date: Sun, 29 Jul 2001 09:44:28 -0700 (PDT) Subject: [Python-Dev] Re: post mortem after threading deadlock? In-Reply-To: <200107252216.SAA09493@cj20424-a.reston1.va.home.com> from "Guido van Rossum" at Jul 25, 2001 06:16:45 PM Message-ID: <20010729164429.4FA5D99C90@waltz.rahul.net> Guido van Rossum wrote: > > I believe that Aahz, in his thread tutorial, has even more radical > advice: use the Queue module for all inter-thread communication. It > is even higher level than semaphores, and has the same nice > properties. 
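The Queue recommendation looks like this in practice; a minimal sketch in today's spelling (the module was renamed queue in Python 3):

```python
import queue
import threading

def worker(q, results):
    # The Queue does all the locking and blocking; no explicit mutexes needed.
    while True:
        item = q.get()
        if item is None:   # sentinel value signalling shutdown
            break
        results.append(item * 2)

q = queue.Queue()
results = []
t = threading.Thread(target=worker, args=(q, results))
t.start()
for i in range(5):
    q.put(i)               # put/get are thread-safe; no race conditions here
q.put(None)
t.join()
assert results == [0, 2, 4, 6, 8]
```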
Not only that, Queue.Queue has the especially nice property of handling both data protection (mutexes) and synchronization. I'm following up primarily to announce that I've just uploaded my OSCON slides (new and improved!) to http://starship.python.net/crew/aahz/ -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. From guido@zope.com Sun Jul 29 18:00:59 2001 From: guido@zope.com (Guido van Rossum) Date: Sun, 29 Jul 2001 13:00:59 -0400 Subject: [Python-Dev] Advice in stat.py In-Reply-To: Your message of "Sun, 29 Jul 2001 17:33:43 +0200." <20010729173343.H21770@xs4all.nl> References: <20010729173343.H21770@xs4all.nl> Message-ID: <200107291700.NAA07488@cj20424-a.reston1.va.home.com> > On Sat, Jul 28, 2001 at 09:54:21AM -0400, Guido van Rossum wrote: > > > Worse, according to the Reference Manual, > > > > The "from" form with "*" may only occur in a module scope. > > > I don't know when that snuck in, but it's not enforced. If we're > > serious, we should at least add a warning! > > Eh, last I looked, you and Jeremy were most serious about this :) It came up > during the nested-scopes change in 2.1, where it was first made illegal, and > later just illegal in the presence of a nested scope: > > (without future statement) > >>> def spam(x): > ... from stat import * > ... def eggs(): > ... print x > ... > :1: SyntaxWarning: local name 'x' in 'spam' shadows use of 'x' as > global in nested scope 'eggs' > :1: SyntaxWarning: import * is not allowed in function 'spam' because > it contains a nested function with free variables > > (with future statement) > >>> def spam(x): > ... from stat import * > ... def eggs(): > ... print x > ...
> File "", line 2 > SyntaxError: import * is not allowed in function 'spam' because it contains > a nested function with free variables > > > I'll add a bug report. Hm. I'm curious why it was not made a warning without a nested function. Perhaps because too much 3rd party code would trigger the warning? (I have a feeling that lots of amateur programmers are a lot fonder of import * than they should be :-( ). > Should we warn about exec (without 'in' clause) in functions as well ? > > (without future statement) > >>> def spam(x,y): > ... exec y > ... def eggs(): > ... print x > ... > :1: SyntaxWarning: local name 'x' in 'spam' shadows use of 'x' as > global in nested scope 'eggs' > :1: SyntaxWarning: unqualified exec is not allowed in function 'spam' > it contains a nested function with free variables > > (with future statement) > >>> def spam(x,y): > ... exec y > ... def eggs(): > ... print x > ... > File "", line 2 > SyntaxError: unqualified exec is not allowed in function 'spam' it contains > a nested function with free variables > > The warnings *only* occur in the presence of a nested scope, though. That one is just fine I think. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@rahul.net Sun Jul 29 18:19:27 2001 From: aahz@rahul.net (Aahz Maruch) Date: Sun, 29 Jul 2001 10:19:27 -0700 (PDT) Subject: [Python-Dev] Small feature request - optional argument for string.strip() In-Reply-To: <31575A892FF6D1118F5800600846864D78BEFD@intrepid> from "Simon Brunning" at Jul 25, 2001 06:00:55 PM Message-ID: <20010729171927.9516699C90@waltz.rahul.net> Simon Brunning wrote: > > The .split method on strings splits at whitespace by default, but takes an > optional argument allowing splitting by other strings. The .strip method > (and its siblings) always strip whitespace - on more than one occasion I > would have found it useful if these methods also took an optional argument > allowing other strings to be stripped. 
For example, to strip, say, asterisks > from a file you could do: > > >>>fred = '**word**word**' > >>>fred.strip('*') > word**word > > Does this sound sensible/useful? I've never seen a case where this was wanted except to delete *all* such characters. string.translate() does that, but in an awkward way. Perhaps a wrapper for string.translate() might make sense, called something like string.delete(). -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. From thomas@xs4all.net Sun Jul 29 19:23:37 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sun, 29 Jul 2001 20:23:37 +0200 Subject: [Python-Dev] Advice in stat.py In-Reply-To: <200107291700.NAA07488@cj20424-a.reston1.va.home.com> References: <20010729173343.H21770@xs4all.nl> <200107291700.NAA07488@cj20424-a.reston1.va.home.com> Message-ID: <20010729202337.M570@xs4all.nl> On Sun, Jul 29, 2001 at 01:00:59PM -0400, Guido van Rossum wrote: > > (with future statement) > > >>> def spam(x): > > ... from stat import * > > ... def eggs(): > > ... print x > > ... > > File "", line 2 > > SyntaxError: import * is not allowed in function 'spam' because it contains > > a nested function with free variables > Hm. I'm curious why it was not made a warning without a nested > function. Perhaps because too much 3rd party code would trigger the > warning? Yes. > (I have a feeling that lots of amateur programmers are a lot > fonder of import * than they should be :-( ). Oh yeah. If ActiveState's mailing-list statistics were extended to show how many of my posts preach against using 'import *', I'd be top dog in the python-list stats :-) I also still owe Fred a tutorial chapter on why not to use import * :) > > >>> def spam(x,y): > > ... exec y > > ... def eggs(): > > ... print x > That one is just fine I think.
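On the .strip() proposal quoted above: today's Python adopted exactly that signature, and translate handles the delete-all-occurrences case Aahz describes; a sketch:

```python
fred = '**word**word**'

# strip(chars) removes leading and trailing occurrences only:
assert fred.strip('*') == 'word**word'

# deleting *all* occurrences, the case Aahz describes:
assert fred.translate({ord('*'): None}) == 'wordword'
assert fred.replace('*', '') == 'wordword'  # the simpler spelling today
```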
Why is 'import *' inside a function fine, but a bare exec isn't ? Weren't you going to deprecate bare exec's altogether ? -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From gball@cfa.harvard.edu Sun Jul 29 22:25:37 2001 From: gball@cfa.harvard.edu (Greg Ball) Date: Sun, 29 Jul 2001 17:25:37 -0400 (EDT) Subject: [Python-Dev] Picking on platform fmod Message-ID: I got no failure on OSF1 cfata6.harvard.edu V4.0 878 alpha SunOS cfa0 5.8 Generic_108528-06 sun4u sparc SUNW,Ultra-Enterprise building with or without gcc. --Greg Ball From guido@zope.com Sun Jul 29 23:40:22 2001 From: guido@zope.com (Guido van Rossum) Date: Sun, 29 Jul 2001 18:40:22 -0400 Subject: [Python-Dev] Advice in stat.py In-Reply-To: Your message of "Sun, 29 Jul 2001 20:23:37 +0200." <20010729202337.M570@xs4all.nl> References: <20010729173343.H21770@xs4all.nl> <200107291700.NAA07488@cj20424-a.reston1.va.home.com> <20010729202337.M570@xs4all.nl> Message-ID: <200107292240.SAA08051@cj20424-a.reston1.va.home.com> > > > >>> def spam(x,y): > > > ... exec y > > > ... def eggs(): > > > ... print x > > > That one is just fine I think. > > Why is 'import *' inside a function fine, but a bare exec isn't ? Weren't > you going to deprecate bare exec's altogether ? You mean the other way around don't you? I proposed a warning for import * but not for bare exec. I guess for me the difference is that import * is just stupid (potentially lots of work going on every time you call the function) while the main problem with bare exec is that it gets in the way of optimizers and the like. Since we don't have an optimizer (yet) I don't care so much (yet). --Guido van Rossum (home page: http://www.python.org/~guido/) From tim@zope.com Sun Jul 29 23:55:19 2001 From: tim@zope.com (Tim Peters) Date: Sun, 29 Jul 2001 18:55:19 -0400 Subject: [Python-Dev] Python Windows installer: good news! 
Message-ID: Wise Solutions generously offered PythonLabs use of their InstallerMaster 8.1 system. Every PythonLabs Windows installer produced to date used Wise 5.0a, and while that's done a great job for us over the years, some of you have noticed that it was starting to show its age. I've completed upgrading our Windows installation procedures to InstallerMaster 8.1, and we'll release the next alpha of Windows Python 2.2 using it. Even if you have no interest in *testing* 2.2a2 at that time, if you're running on a Windows system please download the installer (when it's released) just to be sure it works for you! As always, we have direct access to only a few Windows boxes, so we rely on cheerful volunteers to uncover surprises. Some things to note: + The installer it produces is a 32-bit program, so this should be the end of "failure in 16-bit subsystem" deaths some people see on Win2K (at least 4 reports of that, and no real handle on why). + The uninstaller has a new "repair" option. The install.log saves away file fingerprints at installation time, and so long as you still have the original installer .exe, the repair option can detect installed files that changed since installation, and (optionally) restore them from the original .exe. + Aborting an installation in midstream no longer (necessarily) leaves a bunch of crap sitting around. Instead you get a new dialog box offering to roll back the changes made so far. This even works if you hit the "Cancel" button on the final "installation finished" screen. + A Backup directory is created under the root of the Python installation, where the installer stores files it changes or replaces. We don't do much of that, but it *does* allow the uninstaller to restore Start Menu entries too -- nice for alpha and beta testers (before, whatever pre-existing Start Menu entries they had were simply wiped out by an uninstall).
+ Since IDLE is an essential part of the Windows Experience for most PythonLabs users, I folded the old Tcl/Tk component into the main Python interpreter component -- one less checkbox to worry about. Also removed the time-wasting "Welcome!" dialog, and made a few cosmetic improvements. Other than those, the look and feel are much the same, it just runs better! It's slick -- I think you'll like it. and-if-you-don't-write-a-pep-ly y'rs - tim From aahz@rahul.net Mon Jul 30 00:32:38 2001 From: aahz@rahul.net (Aahz Maruch) Date: Sun, 29 Jul 2001 16:32:38 -0700 (PDT) Subject: [Python-Dev] PEP for adding a decimal type to Python In-Reply-To: <01072701443301.05085@localhost.localdomain> from "Michael McLay" at Jul 27, 2001 01:44:33 AM Message-ID: <20010729233239.56D6999C85@waltz.rahul.net> Michael McLay wrote: > > Absolutely. The PEP process is supposed to formalize the capture of > ideas so they can be referenced. This PEP is mostly orthogonal to > Aahz's proposal. They can be merged, or we can reference each other's > PEPs. I'm probably not the best choice for doing the implementation of the > decimal number semantics, so I'd be happy to work with Aahz. Note that I am unwilling to discuss this in the context of any PEP until/unless I finish my implementation. There is already a spec for what I'm doing (Cowlishaw), and I see no point in talking until code is ready for use. If someone wants to take over my work, I won't complain; I've already done the easy work. ;-) -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine.
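This effort eventually landed as the standard decimal module (Python 2.4), implementing the Cowlishaw spec Aahz mentions; a small sketch of what it looks like:

```python
from decimal import Decimal, getcontext

# Decimal arithmetic is exact where binary floats are not:
assert 0.1 + 0.2 != 0.3
assert Decimal('0.1') + Decimal('0.2') == Decimal('0.3')

# Precision is governed by an explicit context, per Cowlishaw's spec:
getcontext().prec = 6
assert Decimal(1) / Decimal(7) == Decimal('0.142857')
```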
From paulp@ActiveState.com Mon Jul 30 00:43:00 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 29 Jul 2001 16:43:00 -0700 Subject: [Python-Dev] Python on Playstation 2 Message-ID: <3B649F84.5D241B38@ActiveState.com> It isn't publicly available but Python has been ported to the Playstation 2 video game console. The only weakness is that a binary distribution wouldn't be useful because the format of Playstation CDs isn't portable. Jason Asbahr told me about it. Perhaps at the python conference he can slip us some bootleg disks with a raw interpreter prompt "game". It might be somewhat tedious programming with a gamepad thingee but it would nevertheless be cool to be able to. The real goal of the port is to use Python as a scripting language for a game with a silly-sounding name that I can't remember. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From greg@cosc.canterbury.ac.nz Mon Jul 30 00:47:20 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Jul 2001 11:47:20 +1200 (NZST) Subject: [Python-Dev] Advice in stat.py In-Reply-To: <200107271547.LAA24634@cj20424-a.reston1.va.home.com> Message-ID: <200107292347.LAA00409@s454.cosc.canterbury.ac.nz> Guido: > > Suggested usage: from stat import * > > """ > > > > Is this still the suggested usage? > > I don't see why not. Because it flies in the face of the usual advice, which is never to use import *. How are we supposed to convince impressionable newbies to stay away from the evil drug of import * if the docs for one of the standard modules are brazenly advocating its use? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Mon Jul 30 01:21:33 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Jul 2001 12:21:33 +1200 (NZST) Subject: [Python-Dev] Picking on platform fmod In-Reply-To: Message-ID: <200107300021.MAA00429@s454.cosc.canterbury.ac.nz> Tim: > Please run the attached. SunOS s454 5.7 Generic_106541-10 sun4m sparc SUNW,SPARCstation-4: 0 failures in 10000 tries SunOS pc200 5.8 Generic_108529-03 i86pc i386 i86pc: 0 failures in 10000 tries Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aahz@rahul.net Mon Jul 30 03:51:47 2001 From: aahz@rahul.net (Aahz Maruch) Date: Sun, 29 Jul 2001 19:51:47 -0700 (PDT) Subject: [Python-Dev] Advice in stat.py In-Reply-To: <200107271547.LAA24634@cj20424-a.reston1.va.home.com> from "Guido van Rossum" at Jul 27, 2001 11:47:13 AM Message-ID: <20010730025148.3279199C85@waltz.rahul.net> Guido van Rossum wrote: >Greg Ward: >> >> Suggested usage: from stat import * >> >> Is this still the suggested usage? > > I don't see why not. Here's why not: from stat import * from threading import * # Look at the docs if you don't believe me from Tkinter import * from types import * If you have a single module that imports all four of these (and I don't think that's particularly bizarre), tracing back any random symbol to its source becomes an annoying trek through *five* modules. There are probably a few other modules I don't know about that are declared "safe" for import *. IMO, this quickly leads to disaster, particularly when trying to debug someone else's code (and I've wasted more time than I'd like over this). It just plain goes against "explicit is better than implicit".
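Aahz's debugging complaint is easy to reproduce: each import * can silently rebind names an earlier import (or the builtins) provided; a sketch using math.pow shadowing the built-in pow:

```python
assert pow(2, 10, 100) == 24  # the built-in pow takes an optional modulus

from math import *            # silently rebinds pow, among dozens of names

try:
    pow(2, 10, 100)           # math.pow accepts exactly two arguments
    rebound = False
except TypeError:
    rebound = True
assert rebound
assert pow(2, 10) == 1024.0   # and results are now floats, not ints
```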
I think we should declare a universal policy of NEVER recommending import *, except for interactive use. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. From barry@zope.com Mon Jul 30 04:21:03 2001 From: barry@zope.com (Barry A. Warsaw) Date: Sun, 29 Jul 2001 23:21:03 -0400 Subject: [Python-Dev] Advice in stat.py References: <200107271547.LAA24634@cj20424-a.reston1.va.home.com> <20010730025148.3279199C85@waltz.rahul.net> Message-ID: <15204.53919.982508.201595@anthem.wooz.org> >>>>> "AM" == Aahz Maruch writes: AM> If you have a single module that imports all four of these AM> (and I don't think that's particularly bizarre), tracing back AM> any random symbol to its source becomes an annoying trek AM> through *five* modules. There are probably a few other AM> modules I don't know about that are declared "safe" for import AM> *. IMO, this quickly leads to disaster, particularly when AM> trying to debug someone else's code (and I've wasted more time AM> than I'd like over this). AM> It just plain goes against "explicit is better than implicit". AM> I think we should declare a universal policy of NEVER AM> recommending import *, except for interactive use. Just because you can doesn't mean you should. :) I think it's a good thing that those modules you mention are declared safe for import-* but certainly in the situation you describe it isn't a good idea to use them that way. I don't remember a situation where I've ever import-*'d more than a couple of modules in any single file. There are often good reasons to use import-* at the module global level. Mailman has two places where this is used effectively. The more interesting place is in a configuration file called mm_cfg.py. 
This file is where users are supposed to put all their customizations overriding out-of-the-box defaults. At the top of the file there's a line like from Defaults import * Which brings all the symbols from the out-of-the-box default file (i.e. Defaults.py) into mm_cfg.py. Overrides go after this import line. Mailman modules always import mm_cfg and never import Defaults, so it makes for a very convenient way to arrange things so users only have to care about overriding specific variables, and never have to worry about the installation procedure overwriting their defaults ("make install" may write a new Defaults.py but never a mm_cfg.py). import-* is often good for creating this kind of transparent aliasing of one module's namespace into a second. I know no one's talking about outlawing from-import-*. It needs to be used judiciously, but it definitely has its uses. -Barry From tim.one@home.com Mon Jul 30 05:46:23 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 30 Jul 2001 00:46:23 -0400 Subject: [Python-Dev] Picking on platform fmod In-Reply-To: <200107300021.MAA00429@s454.cosc.canterbury.ac.nz> Message-ID: Thanks all for torturing your boxes with fmod()! Would still like to hear about Platforms from Mars (Macs , Tru64, HP-UX), but it got a clean bill of health on all these: Win98SE + MSVC 6 SuSE 7.2 system Linux mira 2.4.4-4GB #1 Wed May 16 00:37:55 GMT 2001 i686 unknown glibc-2.2.2-38 gcc-2.95.3-52 Python 2.2a0 (#383, Jul 24 2001, 09:26:51) Solaris 8 SunOS pandora 5.8 Generic_108528-06 sun4u sparc SUNW,Ultra-Enterprise Python 2.1.1 (#1, Jul 21 2001, 20:59:12) [GCC 2.95.2 19991024 (release)] on sunos5 Reliant ReliantUNIX-Y deukalion 5.45 B0032 RM600 4/512 R4000 Python 2.0 (#6, Apr 10 2001, 13:20:15) [C] on reliantunix-y5 Linux % uname -a Linux anthem 2.2.18 #21 SMP Mon Jan 8 00:33:29 EST 2001 i686 unknown % rpm -q libc libc-5.3.12-31 % gcc --version egcs-2.91.66 Linux-Mandrake 7.2 Linux kernel 2.2.17 GNU libc 2.1.3 (includes libm), and GCC 2.95.3. 
SunOS 5.8 Generic_108529-08 Linux 2.2.18 / glibc 2.1.3 OSF1 cfata6.harvard.edu V4.0 878 alpha SunOS cfa0 5.8 Generic_108528-06 sun4u sparc SUNW,Ultra-Enterprise building with or without gcc SunOS s454 5.7 Generic_106541-10 sun4m sparc SUNW,SPARCstation-4 SunOS pc200 5.8 Generic_108529-03 i86pc i386 i86pc BSDI, FreeBSD and Linux on Intel hardware Solaris on SPARC hardware Linux on (IBM) PPC (SourceForge?) Linux on (Compaq) Alpha (SourceForge) Only one failure report, from Thomas Wouters on: Linux usf-cf-sparc-linux-1 2.2.18pre21 #1 SMP Wed Nov 22 17:27:17 EST 2000 sparc64 unknown (SourceForge) dying with OverflowError in math.fmod(). Looks like a *badly* buggy fmod() to me! If we had tried this a decade ago, *most* platforms would have gotten a wrong answer on almost every try. Assuming we don't get a failure report on Mac, the major platforms do this correctly now, so we can save Python from growing another few hundred lines of excruciating workaround code: http://www.netlib.org/fdlibm/e_fmod.c when-even-windows-gets-it-right-there's-no-excuse-ly y'rs - tim From fdrake@acm.org Mon Jul 30 06:12:17 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 30 Jul 2001 01:12:17 -0400 (EDT) Subject: [Python-Dev] Advice in stat.py In-Reply-To: <200107292347.LAA00409@s454.cosc.canterbury.ac.nz> References: <200107271547.LAA24634@cj20424-a.reston1.va.home.com> <200107292347.LAA00409@s454.cosc.canterbury.ac.nz> Message-ID: <15204.60593.503331.453900@cj42289-a.reston1.va.home.com> Greg Ewing writes: > How are we supposed to convince impressionable newbies to stay away > from the evil drug of import * if the docs for one of the standard > modules are brazenly advocating its use? If Guido will back off on saying that it's acceptable to use it that way, I can assure you the docs will be corrected. But if he's still advocating that usage (silly Dutchman), I don't think I should touch it. Even if it is silly. -Fred -- Fred L. Drake, Jr.
PythonLabs at Zope Corporation From fdrake@acm.org Mon Jul 30 06:23:39 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 30 Jul 2001 01:23:39 -0400 (EDT) Subject: [Python-Dev] Advice in stat.py In-Reply-To: <20010730025148.3279199C85@waltz.rahul.net> References: <200107271547.LAA24634@cj20424-a.reston1.va.home.com> <20010730025148.3279199C85@waltz.rahul.net> <15204.53919.982508.201595@anthem.wooz.org> Message-ID: <15204.61275.24195.153670@cj42289-a.reston1.va.home.com> Aahz Maruch writes: > If you have a single module that imports all four of these (and I don't > think that's particularly bizarre), tracing back any random symbol to Only if you import-* them all, and that *is* a pathological case. Any time you import-* *two* modules, you have a pathological case on your hands, just ready to explode. > It just plain goes against "explicit is better than implicit". I think > we should declare a universal policy of NEVER recommending import *, > except for interactive use. I'd be willing to give it up even there, esp. now that we have import-as. Barry sez: > There are often good reasons to use import-* at the module global > level. Mailman has two places where this is used effectively. The > more interesting place is in a configuration file called mm_cfg.py. > This file is where users are supposed to put all their customizations > overriding out-of-the-box defaults. At the top of the file there's a This is about the only kind of thing I've ever found it useful for: re-implementing a module's interface, but when I only want to change a few things. Needing to do this never feels like a good solution, and probably indicates that some object needs to accept a parameter that offers the module's interface instead of finding it by name. So yeah, I'd give up import-*. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake@acm.org Mon Jul 30 06:26:43 2001 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 30 Jul 2001 01:26:43 -0400 (EDT) Subject: [Python-Dev] Python on Playstation 2 In-Reply-To: <3B649F84.5D241B38@ActiveState.com> References: <3B649F84.5D241B38@ActiveState.com> Message-ID: <15204.61459.319425.84281@cj42289-a.reston1.va.home.com> Paul Prescod writes: > It isn't publicly available but Python has been ported to the > Playstation 2 video game console. The only weakness is that a binary > distribution wouldn't be useful because the format of Playstation CDs > isn't portable. Jason Asbahr told me about. Perhaps at the python I'm curious: Is it the filesystem format or the lower-level tracking format? If it's only the former, a prepared image should be useful. (And no, I don't have a PS2 waiting to boot up Python!) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@zope.com Mon Jul 30 06:38:33 2001 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 30 Jul 2001 01:38:33 -0400 Subject: [Python-Dev] Advice in stat.py References: <200107271547.LAA24634@cj20424-a.reston1.va.home.com> <20010730025148.3279199C85@waltz.rahul.net> <15204.53919.982508.201595@anthem.wooz.org> <15204.61275.24195.153670@cj42289-a.reston1.va.home.com> Message-ID: <15204.62169.600623.580141@anthem.wooz.org> >>>>> "Fred" == Fred L Drake, Jr writes: Fred> This is about the only kind of thing I've ever found it Fred> useful for: re-implementing a module's interface, but when I Fred> only want to change a few things. I call it "aliasing" a module (i.e. aliasing module A's symbols exported through module B). Fred> Needing to do this never feels like a good solution, and Fred> probably indicates that some object needs to accept a Fred> parameter that offers the module's interface instead of Fred> finding it by name. Um, sure, but it can be pretty inconvenient to export 193 symbols this way :). >>> len(dir(Defaults)) 193 I've often thought that it would be nice to have better delegation support in Python, and no __getattr__() doesn't really hack it.
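[Editorial note: the module aliasing Barry describes — expose one module's whole namespace through another, overriding only a few names — can at least be sketched with the simple `__getattr__` fallback he finds wanting. This is a hedged, hypothetical illustration in modern Python: the `_Alias` class, the overridden value, and the use of the stdlib `string` module in place of Mailman's `Defaults` are all invented for the example.]

```python
# Hypothetical sketch of module aliasing via __getattr__ delegation.
# The stdlib "string" module stands in for Mailman's Defaults module;
# _Alias and the overridden "digits" value are illustrative only.
import string

class _Alias:
    def __init__(self, base, **overrides):
        self._base = base
        # Overrides live directly in the instance dict, so normal
        # attribute lookup finds them first.
        self.__dict__.update(overrides)

    def __getattr__(self, name):
        # Called only when normal lookup fails: fall through to the
        # wrapped module for every name that was not overridden.
        return getattr(self._base, name)

cfg = _Alias(string, digits='0123456789abcdef')

print(cfg.digits)           # the override wins
print(cfg.ascii_lowercase)  # everything else falls through to string
```

This avoids `from Defaults import *` entirely, at the cost of the wrapper no longer being a real module — which is roughly why Barry says `__getattr__` "doesn't really hack it" for the 193-symbol case.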
I'm encouraged that some of the Py2.2 descr-branch stuff might actually make this valid programming technique more useful and then even I could see (eventually) giving up on import-*. -Barry From tim.one@home.com Mon Jul 30 06:43:36 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 30 Jul 2001 01:43:36 -0400 Subject: [Python-Dev] Advice in stat.py In-Reply-To: <15204.60593.503331.453900@cj42289-a.reston1.va.home.com> Message-ID: [Fred L. Drake, Jr., on import *] > If Guido will back off on saying that it's acceptable to use it that > way, I can assure you the docs will be corrected. But if he's still > advocating that usage (silly Dutchman), I don't think I should touch > it. > Even if it is silly. Guido never advocates it, but you're not going to get him to say it's Evil either. The thing is, an intelligent adult can use import-* profitably and safely, in the handful of cases an intelligent adult realizes it's profitable and safe to do so. When Jeremy played w/ "import *"-inside-functions warnings, most popped up in Guido's code. This was most often from Tkinter import * in a module's _test() function, where doing so was handy and harmless. I never use it myself -- but then I never write Tkinter code, and "stat" is some Unix abomination. It couldn't hurt to add an "intelligent adult" warning to the docs! I suggest an icon showing a refined English lady trying hard not to notice a West Virginian next to her picking his nose. perfect-images-are-too-rare-to-pass-up-ly y'rs - tim From shang@cc.jyu.fi Mon Jul 30 07:56:09 2001 From: shang@cc.jyu.fi (Sami Hangaslammi) Date: Mon, 30 Jul 2001 09:56:09 +0300 (EET DST) Subject: [Python-Dev] Iterator addition? Message-ID: Since iterator objects work like sequences in several contexts, maybe they could support sequence-like operations such as addition.
This would let you write

    for x in iter1 + iter2:
        do_something(x)

instead of

    for x in iter1:
        do_something(x)

    for x in iter2:
        do_something(x)

or the slightly better

    for i in iter1,iter2:
        for x in i:
            do_something(x)

-- Sami Hangaslammi --

From m@moshez.org Mon Jul 30 09:38:45 2001 From: m@moshez.org (Moshe Zadka) Date: Mon, 30 Jul 2001 11:38:45 +0300 Subject: [Python-Dev] Iterator addition? In-Reply-To: References: Message-ID: On Mon, 30 Jul 2001, Sami Hangaslammi wrote:

> Since iterator objects work like sequences in several contexts, maybe they
> could support sequence-like operations such as addition. This would let
> you write
>
>     for x in iter1 + iter2:
>         do_something(x)
>
> instead of
>
>     for x in iter1:
>         do_something(x)
>
>     for x in iter2:
>         do_something(x)
>
> or the slightly better
>
>     for i in iter1,iter2:
>         for x in i:
>             do_something(x)

No, instead of:

    class concat:

        def __init__(self, *iterators):
            self.iterators = list(iterators)

        def __iter__(self): return self

        def next(self):
            while self.iterators:
                try:
                    return self.iterators[0].next()
                except StopIteration:
                    del self.iterators[0]
            else:
                raise StopIteration

    for x in concat(iter1, iter2):
        do_something(x)

(Note that the first n-2 lines can be refactored. Wasn't there talk about having an iterator module with useful stuff like that?) -- gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21 4817 C7FC A636 46D0 1BD6 Insecure (accessible): C5A5 A8FA CA39 AB03 10B8 F116 1713 1BCF 54C4 E1FE Learn Python! http://www.ibiblio.org/obp/thinkCSpy From just@letterror.com Mon Jul 30 09:53:35 2001 From: just@letterror.com (Just van Rossum) Date: Mon, 30 Jul 2001 10:53:35 +0200 Subject: [Python-Dev] Iterator addition?
In-Reply-To: Message-ID: <20010730105341-r01010700-3955aee6-0910-010c@10.0.0.2> Moshe Zadka wrote:

> No, instead of:
>
> class concat:
>
>     def __init__(self, *iterators):
>         self.iterators = list(iterators)
>
>     def __iter__(self): return self
>
>     def next(self):
>         while self.iterators:
>             try:
>                 return self.iterators[0].next()
>             except StopIteration:
>                 del self.iterators[0]
>         else:
>             raise StopIteration
>
> for x in concat(iter1, iter2):
>     do_something(x)

Or:

    from __future__ import generators

    def concat(*iterators):
        for i in iterators:
            for x in i:
                yield x

    for x in concat(iter1, iter2):
        do_something(x)

Just

From shang@cc.jyu.fi Mon Jul 30 10:20:54 2001 From: shang@cc.jyu.fi (Sami Hangaslammi) Date: Mon, 30 Jul 2001 12:20:54 +0300 (EET DST) Subject: [Python-Dev] Iterator addition? In-Reply-To: <20010730105341-r01010700-3955aee6-0910-010c@10.0.0.2> Message-ID: Just van Rossum wrote:

> from __future__ import generators
>
> def concat(*iterators):
>     for i in iterators:
>         for x in i:
>             yield x
>
> for x in concat(iter1, iter2):
>     do_something(x)

Yes, this is the solution that I eventually ended up with too. However, the real point I was trying to raise was, whether iterators should look like sequences regarding addition, since the two are already exchangeable in so many places (e.g. tuple unpacking).

Moshe Zadka wrote:

> Wasn't there talk about having an iterator module with useful stuff
> like that?

This would be a great idea. I've ended up with a sizeable bunch of small utility functions when playing around with generators/iterators in 2.2a1. -- Sami Hangaslammi --

From m@moshez.org Mon Jul 30 12:05:58 2001 From: m@moshez.org (Moshe Zadka) Date: Mon, 30 Jul 2001 14:05:58 +0300 Subject: [Python-Dev] Nostalgic Versions Message-ID: Where can I find *really* old Python versions? I managed to find 1.2, but I want to get my hands on <1.0 versions if at all possible... Thanks.
-- gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21 4817 C7FC A636 46D0 1BD6 Insecure (accessible): C5A5 A8FA CA39 AB03 10B8 F116 1713 1BCF 54C4 E1FE Learn Python! http://www.ibiblio.org/obp/thinkCSpy From guido@zope.com Mon Jul 30 13:42:49 2001 From: guido@zope.com (Guido van Rossum) Date: Mon, 30 Jul 2001 08:42:49 -0400 Subject: [Python-Dev] Iterator addition? In-Reply-To: Your message of "Mon, 30 Jul 2001 12:20:54 +0300." References: Message-ID: <200107301242.IAA09350@cj20424-a.reston1.va.home.com> > the real point I was trying to raise was, whether iterators should look > like sequences regarding addition, since the two are already exchangeable > in so many places (e.g. tuple unpacking). No. Adding a + operator would conflict in the case an iterator is also a user-defined object. Adding a * operator can't work because an iterator cannot be restarted (you have to extract the iterator afresh from the original object). Adding any other sequence operation (slicing, indexing, len() etc.) flies in the face of the "forward-only" nature of iterators. The *only* thing that iterators and sequences have in common is that they can be iterated over. So they are substitutable in all contexts where that's all you do -- including sequence (not tuple!) unpacking. And not in any other contexts. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@zope.com Mon Jul 30 13:46:39 2001 From: guido@zope.com (Guido van Rossum) Date: Mon, 30 Jul 2001 08:46:39 -0400 Subject: [Python-Dev] Nostalgic Versions In-Reply-To: Your message of "Mon, 30 Jul 2001 14:05:58 +0300." References: Message-ID: <200107301246.IAA09379@cj20424-a.reston1.va.home.com> > Where can I find *really* old Python versions? I managed to find > 1.2, but I want to get my hands on <1.0 versions if at all possible... You can try to check out by symbolic release tag from the CVS. I think it goes back to 0.9.8 and maybe even before.
Building may be problematic: I think the oldest Makefiles got lost, and some files were renamed -- CVS logs leave no trails of renaming. For what purpose, may I ask? --Guido van Rossum (home page: http://www.python.org/~guido/)

From loewis@informatik.hu-berlin.de Mon Jul 30 14:10:23 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Mon, 30 Jul 2001 15:10:23 +0200 (MEST) Subject: [Python-Dev] Nostalgic Versions Message-ID: <200107301310.PAA18619@pandora.informatik.hu-berlin.de> > Where can I find *really* old Python versions? Here's what I found: ftp://ftp.enst.fr/pub/unix/lang/python/src/python1.1.tar.gz ftp://ftp.warwick.ac.uk/pub/misc/python1.0.1.tar.gz Regards, Martin

From m@moshez.org Mon Jul 30 14:24:56 2001 From: m@moshez.org (Moshe Zadka) Date: Mon, 30 Jul 2001 16:24:56 +0300 Subject: [Python-Dev] Nostalgic Versions In-Reply-To: <200107301310.PAA18619@pandora.informatik.hu-berlin.de> References: <200107301310.PAA18619@pandora.informatik.hu-berlin.de> Message-ID: On Mon, 30 Jul 2001, Martin von Loewis wrote: > Here's what I found: > > ftp://ftp.enst.fr/pub/unix/lang/python/src/python1.1.tar.gz > ftp://ftp.warwick.ac.uk/pub/misc/python1.0.1.tar.gz Thanks a lot. -- gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21 4817 C7FC A636 46D0 1BD6 Insecure (accessible): C5A5 A8FA CA39 AB03 10B8 F116 1713 1BCF 54C4 E1FE Learn Python! http://www.ibiblio.org/obp/thinkCSpy

From mal@lemburg.com Mon Jul 30 13:56:32 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 30 Jul 2001 14:56:32 +0200 Subject: [Python-Dev] Python API version & optional features Message-ID: <3B655980.948BCDEF@lemburg.com> Martin has uploaded a patch which modifies the Python API level number depending on the setting of the compile time option for internal Unicode width (UCS-2/UCS-4): https://sourceforge.net/tracker/?func=detail&aid=445717&group_id=5470&atid=305470 I am not sure whether this is the right way to approach this problem, though, since it affects all extensions -- not only ones using Unicode.
If at all possible, I'd prefer some other means to handle this situation (extension developers are certainly not going to start shipping binaries for narrow and wide Python versions if their extension does not happen to use Unicode). Any ideas ? Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From thomas@xs4all.net Mon Jul 30 14:39:09 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 30 Jul 2001 15:39:09 +0200 Subject: [Python-Dev] Advice in stat.py In-Reply-To: <15204.61275.24195.153670@cj42289-a.reston1.va.home.com> Message-ID: <20010730153909.A20676@xs4all.nl> On Mon, Jul 30, 2001 at 01:23:39AM -0400, Fred L. Drake, Jr. wrote: > Aahz Maruch writes: > > If you have a single module that imports all four of these (and I don't > > think that's particularly bizarre), tracing back any random symbol to > > Only if you import-* them all, and that *is* a pathelogical case. > Any time you import-* *two* modules, you have a pathelogical case on > your hands, just ready to explode. We could generate a warning if the compiler detects two or more import *'s in the same codeblock ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From fdrake@acm.org Mon Jul 30 14:40:25 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 30 Jul 2001 09:40:25 -0400 (EDT) Subject: [Python-Dev] Python API version & optional features In-Reply-To: <3B655980.948BCDEF@lemburg.com> References: <3B655980.948BCDEF@lemburg.com> Message-ID: <15205.25545.353887.299167@cj42289-a.reston1.va.home.com> M.-A. Lemburg writes: > I am not sure whether this is the right way to approach this > problem, though, since it affects all extensions -- not only > ones using Unicode. 
Given that unicodeobject.h defines many macros and size-sensitive types in the public API, I don't see any way around this. If the API always used UCS4 (including in the macros), or defined both UCS2 and UCS4 versions of everything affected, then we could get around it. That seems like a high price to pay. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From thomas@xs4all.net Mon Jul 30 14:52:26 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 30 Jul 2001 15:52:26 +0200 Subject: [Python-Dev] Python on Playstation 2 In-Reply-To: <15204.61459.319425.84281@cj42289-a.reston1.va.home.com> Message-ID: <20010730155226.B20676@xs4all.nl> On Mon, Jul 30, 2001 at 01:26:43AM -0400, Fred L. Drake, Jr. wrote: > Paul Prescod writes: > > It isn't publicly available but Python has been ported to the > > Playstation 2 video game console. The only weakness is that a binary > > distribution wouldn't be useful because the format of Playstation CDs > > isn't portable. Jason Asbahr told me about. Perhaps at the python I'd *love* a copy of that :) > I'm curious: Is it the filesystem format or the lower-level > tracking format? If its only the former, a prepared image should be > useful. A prepared image won't help. Playstation CD's are copy-protected using a gimmick that makes images useless (unless you perform or get someone to perform a probably illegal operation on your console ;) > (And no, I don't have a PS2 waiting to boot up Python!) I do.. It's not exactly doing nothing right now, but it can sure use some Python :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal@lemburg.com Mon Jul 30 14:56:51 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 30 Jul 2001 15:56:51 +0200 Subject: [Python-Dev] Python API version & optional features References: <3B655980.948BCDEF@lemburg.com> <15205.25545.353887.299167@cj42289-a.reston1.va.home.com> Message-ID: <3B6567A3.E386EAB9@lemburg.com> "Fred L. Drake, Jr." 
wrote: > > M.-A. Lemburg writes: > > I am not sure whether this is the right way to approach this > > problem, though, since it affects all extensions -- not only > > ones using Unicode. > > Given that unicodeobject.h defines many macros and size-sensitive > types in the public API, I don't see any way around this. If the API > always used UCS4 (including in the macros), or defined both UCS2 and > UCS4 versions of everything affected, then we could get around it. > That seems like a high price to pay. I think Guido suggested using macros to turn the Unicode APIs into e.g. PyUnicodeUCS4_Encode() vs. PyUnicodeUCS2_Encode(). That would prevent loading of non-compatible extensions using Unicode APIs (it doesn't catch the argument parser usage, though, e.g. "u"). Perhaps that's the way to go ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From gward@python.net Mon Jul 30 15:08:20 2001 From: gward@python.net (Greg Ward) Date: Mon, 30 Jul 2001 10:08:20 -0400 Subject: [Python-Dev] Picking on platform fmod In-Reply-To: ; from tim.one@home.com on Sat, Jul 28, 2001 at 04:13:53PM -0400 References: Message-ID: <20010730100815.A1031@gerg.ca> On 28 July 2001, Tim Peters said: > Here's your chance to prove your favorite platform isn't a worthless pile of > monkey crap. Please run the attached. If it prints anything other > than I tried it on a 64-bit SGI box running IRIX 6.5. It dumped core. But there seem to be a *lot* of problems with 2.2a1 on this platform; it died pretty early in the test suite. Guess I'll go file some bug reports... Greg -- Greg Ward - Unix weenie gward@python.net http://starship.python.net/~gward/ Life is too short for ordinary music.
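[Editorial note: Tim's attached test script did not survive in this archive. As a hedged sketch only — the function name, input ranges, and tries count are guesses, not the original — a torture test in the same spirit checks the defining properties of fmod on random inputs and reports a count in the thread's "N failures in M tries" format. For a correct fmod(x, y), the result r satisfies |r| < |y| and has the sign of x (or is zero).]

```python
# Hedged reconstruction of the kind of fmod() torture test being run;
# not Tim's actual attachment.  Counts inputs where math.fmod violates
# its defining properties: |r| < |y|, and r has the sign of x or is 0.
import math
import random

def torture_fmod(ntries=10000, seed=20010730):
    rng = random.Random(seed)  # fixed seed for reproducibility
    failures = 0
    for _ in range(ntries):
        x = rng.uniform(-1e6, 1e6)
        y = rng.uniform(-1e6, 1e6)
        if y == 0.0:
            continue  # fmod(x, 0) is an error, skip it
        r = math.fmod(x, y)
        if abs(r) >= abs(y) or (r != 0.0 and (r > 0) != (x > 0)):
            failures += 1
    return failures

print("%d failures in %d tries" % (torture_fmod(), 10000))
```

On a platform with a correct C-library fmod this prints a zero failure count, matching the reports quoted in the thread.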
From guido@zope.com Mon Jul 30 15:27:32 2001 From: guido@zope.com (Guido van Rossum) Date: Mon, 30 Jul 2001 10:27:32 -0400 Subject: [Python-Dev] Python API version & optional features In-Reply-To: Your message of "Mon, 30 Jul 2001 15:56:51 +0200." <3B6567A3.E386EAB9@lemburg.com> References: <3B655980.948BCDEF@lemburg.com> <15205.25545.353887.299167@cj42289-a.reston1.va.home.com> <3B6567A3.E386EAB9@lemburg.com> Message-ID: <200107301427.f6UERW802779@odiug.digicool.com> > > > I am not sure whether this is the right way to approach this > > > problem, though, since it affects all extensions -- not only > > > ones using Unicode. > > > > Given that unicodeobject.h defines many macros and size-sensitive > > types in the public API, I don't see any way around this. If the API > > always used UCS4 (including in the macros), or defined both UCS2 and > > UCS4 versions of everything affected, then we could get around it. > > That seems like a high price to pay. > > I think Guido suggested using macros to turn the Unicode APIs > into e.g. PyUnicodeUCS4_Encode() vs. PyUnicodeUCS2_Encode(). > > That would prevent loading of non-compatible extensions using Unicode > APIs (it doesn't catch the argument parser usage, though, e.g. > "u"). > > Perhaps that's the way to go ?! Hm, the "u" argument parser is a nasty one to catch. How likely is this to be the *only* reference to Unicode in a particular extension? I'm trying to convince myself that the magic number patch is okay, and here's what I come up with. If someone builds a Python with a non-standard Unicode width and accidentally uses a directory full of extensions built for the standard Unicode width on his platform, he deserves a warning. Since most extensions come with source anyway, people who want to experiment with UCS4 will have to be more adventurous and build all the extensions they need from source. The warnings will remind them. 
If there's a particular extension that they can only get in binary *and* that extension doesn't use Unicode, they can train themselves to ignore that warning. These warnings should use the warnings framework, by the way, to make it easier to ignore a specific warning. Currently it's a hard write to stderr. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas@xs4all.net Mon Jul 30 15:51:13 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 30 Jul 2001 16:51:13 +0200 Subject: [Python-Dev] Picking on platform fmod In-Reply-To: <20010730100815.A1031@gerg.ca> References: <20010730100815.A1031@gerg.ca> Message-ID: <20010730165113.C20676@xs4all.nl> On Mon, Jul 30, 2001 at 10:08:20AM -0400, Greg Ward wrote: > On 28 July 2001, Tim Peters said: > > Here's your chance to prove your favorite platform isn't a worthless pile of > > monkey crap . Please run the attached. If it prints anything other > > than > I tried it on a 64-bit SGI box running IRIX 6.5. It dumped core. But > there seem to be *lot* of problems with 2.2a1 on this platform; it died > pretty early in the test suite. Guess I'll go file some bug reports... Note that I didn't see Tim asking for a test on 2.2a1, and I didn't test it on 2.2 on all but my own Linux box. Instead, I used 2.1.1 (since I already had binaries for all SourceForge compilefarm machines and all of our production machines :) and Tim hasn't complained yet. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal@lemburg.com Mon Jul 30 15:59:38 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 30 Jul 2001 16:59:38 +0200 Subject: [Python-Dev] Python API version & optional features References: <3B655980.948BCDEF@lemburg.com> <15205.25545.353887.299167@cj42289-a.reston1.va.home.com> <3B6567A3.E386EAB9@lemburg.com> <200107301427.f6UERW802779@odiug.digicool.com> Message-ID: <3B65765A.9706A4A2@lemburg.com> Guido van Rossum wrote: > > > > > I am not sure whether this is the right way to approach this > > > > problem, though, since it affects all extensions -- not only > > > > ones using Unicode. > > > > > > Given that unicodeobject.h defines many macros and size-sensitive > > > types in the public API, I don't see any way around this. If the API > > > always used UCS4 (including in the macros), or defined both UCS2 and > > > UCS4 versions of everything affected, then we could get around it. > > > That seems like a high price to pay. > > > > I think Guido suggested using macros to turn the Unicode APIs > > into e.g. PyUnicodeUCS4_Encode() vs. PyUnicodeUCS2_Encode(). > > > > That would prevent loading of non-compatible extensions using Unicode > > APIs (it doesn't catch the argument parser usage, though, e.g. > > "u"). > > > > Perhaps that's the way to go ?! > > Hm, the "u" argument parser is a nasty one to catch. How likely is > this to be the *only* reference to Unicode in a particular extension? It is not very likely but IMHO possible for e.g. extensions which rely on the fact that wchar_t == Py_UNICODE and then do direct interfacing to some other third party code. I guess one could argue that extension writers should check for narrow/wide builds in their extensions before using Unicode. Since the number of Unicode extension writers is much smaller than the number of users, I think that this approach would be reasonable, provided that we document the problem clearly in the NEWS file. > I'm trying to convince myself that the magic number patch is okay, and > here's what I come up with.
If someone builds a Python with a > non-standard Unicode width and accidentally uses a directory full of > extensions built for the standard Unicode width on his platform, he > deserves a warning. Since most extensions come with source anyway, > people who want to experiment with UCS4 will have to be more > adventurous and build all the extensions they need from source. The > warnings will remind them. If there's a particular extension that > they can only get in binary *and* that extension doesn't use Unicode, > they can train themselves to ignore that warning. Hmm, that would probably not make UCS-4 builds very popular ;-) > These warnings should use the warnings framework, by the way, to make > it easier to ignore a specific warning. Currently it's a hard write > to stderr. Using the warnings framework would indeed be a good idea (many older extensions work just fine even with later API levels; the warnings are annoying, though) ! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From paulp@ActiveState.com Mon Jul 30 16:09:04 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Mon, 30 Jul 2001 08:09:04 -0700 Subject: [Python-Dev] Python on Playstation 2 References: <3B649F84.5D241B38@ActiveState.com> <15204.61459.319425.84281@cj42289-a.reston1.va.home.com> Message-ID: <3B657890.98ED00CB@ActiveState.com> "Fred L. Drake, Jr." wrote: > > Paul Prescod writes: > > It isn't publicly available but Python has been ported to the > > Playstation 2 video game console. The only weakness is that a binary > > distribution wouldn't be useful because the format of Playstation CDs > > isn't portable. Jason Asbahr told me about it. Perhaps at the python > > I'm curious: Is it the filesystem format or the lower-level > tracking format? If it's only the former, a prepared image should be > useful.
> (And no, I don't have a PS2 waiting to boot up Python!) Most likely those "in the know" aren't even allowed to tell us that much. For all I know the filesystem is encrypted. Remember that these game manufacturers do NOT want an independent third party market to arise. They want you to come to them for the specs. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From gward@python.net Mon Jul 30 16:46:06 2001 From: gward@python.net (Greg Ward) Date: Mon, 30 Jul 2001 11:46:06 -0400 Subject: [Python-Dev] Picking on platform fmod In-Reply-To: <20010730100815.A1031@gerg.ca>; from gward@python.net on Mon, Jul 30, 2001 at 10:08:20AM -0400 References: <20010730100815.A1031@gerg.ca> Message-ID: <20010730114606.B1031@gerg.ca> On 30 July 2001, I said: > I tried it on a 64-bit SGI box running IRIX 6.5. It dumped core. OK, it worked this time. I guess I improved my py-karma by building seventeen different ways and submitting bug reports for all the problems I had building on this platform. Details: $ uname -a IRIX64 mouldy 6.5 10181058 IP27 $ hinv 4 180 MHZ IP27 Processors CPU: MIPS R10000 Processor Chip Revision: 2.6 FPU: MIPS R10010 Floating Point Chip Revision: 0.0 [...] $ time ./python ~/ffmod.py 0 failures in 10000 tries 51.671u 0.083s 0:51.99 99.5% 0+0k 0+0io 0pf+0w Greg -- Greg Ward - nerd gward@python.net http://starship.python.net/~gward/ There are no stupid questions -- only stupid people. From gward@python.net Mon Jul 30 16:48:06 2001 From: gward@python.net (Greg Ward) Date: Mon, 30 Jul 2001 11:48:06 -0400 Subject: [Python-Dev] Picking on platform fmod In-Reply-To: ; from tim.one@home.com on Sat, Jul 28, 2001 at 04:13:53PM -0400 References: Message-ID: <20010730114806.C1031@gerg.ca> On 28 July 2001, Tim Peters said: > Here's your chance to prove your favorite platform isn't a worthless pile of > monkey crap . Please run the attached. 
If it prints anything other > than Oops, another data point: I didn't see an AMD Athlon or Linux 2.4 in your list of successes, so here's one: $ uname -a Linux cthulhu 2.4.2 #1 Thu May 3 14:30:48 EST 2001 i686 unknown $ cat /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 6 model : 4 model name : AMD Athlon(tm) Processor stepping : 2 cpu MHz : 807.197 cache size : 256 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow bogomips : 1608.90 $ time python ffmod.py 0 failures in 10000 tries python ffmod.py 5.68s user 0.01s system 92% cpu 6.153 total Greg -- Greg Ward - Unix weenie gward@python.net http://starship.python.net/~gward/ God is real, unless declared integer. From guido@zope.com Mon Jul 30 16:47:43 2001 From: guido@zope.com (Guido van Rossum) Date: Mon, 30 Jul 2001 11:47:43 -0400 Subject: [Python-Dev] Python API version & optional features In-Reply-To: Your message of "Mon, 30 Jul 2001 16:59:38 +0200." <3B65765A.9706A4A2@lemburg.com> References: <3B655980.948BCDEF@lemburg.com> <15205.25545.353887.299167@cj42289-a.reston1.va.home.com> <3B6567A3.E386EAB9@lemburg.com> <200107301427.f6UERW802779@odiug.digicool.com> <3B65765A.9706A4A2@lemburg.com> Message-ID: <200107301547.f6UFlhB02991@odiug.digicool.com> > > Hm, the "u" argument parser is a nasty one to catch. How likely is > > this to be the *only* reference to Unicode in a particular extension? > > It is not very likely but IMHO possible for e.g. extensions > which rely on the fact that wchar_t == Py_UNICODE and then do > direct interfacing to some other third party code. > > I guess one could argue that extension writers should check > for narrow/wide builds in their extensions before using Unicode. 
> > Since the number of Unicode extension writers is much smaller > than the number of users, I think that this approach would be > reasonable, provided that we document the problem clearly in the > NEWS file. OK. I approve. > Hmm, that would probably not make UCS-4 builds very popular ;-) Do you have any reason to assume that it would be popular otherwise? :-) :-) :-) > > These warnings should use the warnings framework, by the way, to make > > it easier to ignore a specific warning. Currently it's a hard write > > to stderr. > > Using the warnings framework would indeed be a good idea (many older > extensions work just fine even with later API levels; the warnings > are annoying, though) ! Exactly. I'm not going to make the change, but it should be a two-liner in Python/modsupport.c:Py_InitModule4(). --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@rahul.net Mon Jul 30 16:49:36 2001 From: aahz@rahul.net (Aahz Maruch) Date: Mon, 30 Jul 2001 08:49:36 -0700 (PDT) Subject: [Python-Dev] pep-discuss In-Reply-To: <3B62EB05.396DF4D7@ActiveState.com> from "Paul Prescod" at Jul 28, 2001 09:40:37 AM Message-ID: <20010730154936.AE36899C94@waltz.rahul.net> Paul Prescod wrote: > > We've talked about having a mailing list for general PEP-related > discussions. Two things make me think that revisiting this would be a > good idea right now. > > First, the recent loosening up of the python-dev rules threatens the > quality of discussion about bread and butter issues such as patch > discussions and process issues. > > Second, the flamewar on python-list basically drowned out the usual > newbie questions and would give a person coming new to Python a very > negative opinion about the language's future and the friendliness of the > community. I would rather redirect as much as possible of that to a list > that only interested participants would have to endure.
While what you say makes sense, overall, there are a lot of people (me included) who prefer discussion on newsgroups, and I can't quite see creating a newsgroup for PEP discussions yet. Call me -0.25 for kicking discussion off c.l.py and +0.25 for getting it off python-dev. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. From esr@thyrsus.com Mon Jul 30 04:56:44 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Sun, 29 Jul 2001 23:56:44 -0400 Subject: [Python-Dev] Picking on platform fmod In-Reply-To: <20010730114806.C1031@gerg.ca>; from gward@python.net on Mon, Jul 30, 2001 at 11:48:06AM -0400 References: <20010730114806.C1031@gerg.ca> Message-ID: <20010729235644.A13628@thyrsus.com> Greg Ward : > On 28 July 2001, Tim Peters said: > > Here's your chance to prove your favorite platform isn't a worthless pile of > > monkey crap . Please run the attached. If it prints anything other > > than > > Oops, another data point: I didn't see an AMD Athlon or Linux 2.4 in > your list of successes, so here's one: My RH 7.1 success report should have been more specific. Also 2.4.2, running on a Dual Pentium II box. -- Eric S. Raymond Before a standing army can rule, the people must be disarmed, as they are in almost every kingdom in Europe. The supreme power in America cannot enforce unjust laws by the sword, because the people are armed, and constitute a force superior to any band of regular troops. 
-- Noah Webster From aahz@rahul.net Mon Jul 30 17:20:48 2001 From: aahz@rahul.net (Aahz Maruch) Date: Mon, 30 Jul 2001 09:20:48 -0700 (PDT) Subject: [Python-Dev] Picking on platform fmod In-Reply-To: from "Tim Peters" at Jul 28, 2001 04:13:53 PM Message-ID: <20010730162048.E922999C9F@waltz.rahul.net> Tim Peters wrote: > > Here's your chance to prove your favorite platform isn't a worthless pile of > monkey crap. Please run the attached. If it prints anything other > than > > 0 failures in 10000 tries > > it will probably print a lot. In that case I'd like to know which flavor of > C+libc+libm you're using, and the OS; a few of the failures it prints may be > helpful too. Successful with Python 2.0 on NetBSD (unknown CPU) and Win98SE with Athlon. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. From mclay@nist.gov Mon Jul 30 16:06:52 2001 From: mclay@nist.gov (Michael McLay) Date: Mon, 30 Jul 2001 11:06:52 -0400 Subject: [Python-Dev] Revised decimal type PEP Message-ID: <0107301106520A.02216@fermi.eeel.nist.gov> PEP: 2XX Title: Adding a Decimal type to Python Version: $Revision:$ Author: mclay@nist.gov Status: Draft Type: ?? Created: 25-Jul-2001 Python-Version: 2.2 Introduction This PEP describes the addition of a decimal number type to Python. Rationale The original Python numerical model included int, float, and long. By popular request the imaginary type was added to improve support for engineering and scientific applications. The addition of a decimal number type to Python will improve support for business applications as well as improve the utility of Python as a teaching language. The number types currently used in Python are encoded as base two binary numbers.
The base 2 arithmetic used by binary numbers closely approximates the decimal number system and for many applications the differences in the calculations are unimportant. The decimal number type encodes numbers as decimal digits and uses base 10 arithmetic. This is the number system taught to the general public and it is the system used by businesses when making financial calculations. For financial and accounting applications the difference between binary and decimal types is significant. Consequently the computer languages used for business application development, such as COBOL, use decimal types. The decimal number type meets the expectations of non-computer scientists when making calculations. For these users the rounding errors that occur when using binary numbers are a source of confusion and irritation. Implementation The tokenizer will be modified to recognize number literals with a 'd' suffix and a decimal() function will be added to __builtins__. A decimal number can be used to represent integers and floating point numbers and decimal numbers can also be displayed using scientific notation. Examples of decimal numbers include:

    1234d  -1234d  1234.56d  -1234.56d  1234.56e2d  -1234.56e-2d

The type returned by either a decimal floating point or a decimal integer is the same:

    >>> type(12.2d)
    <type 'decimal'>
    >>> type(12d)
    <type 'decimal'>
    >>> type(-12d+12d)
    <type 'decimal'>
    >>> type(12d+12.0d)
    <type 'decimal'>

This proposal will also add an optional 'b' suffix to the representation of binary float type literals and binary int type literals.

    >>> float(12b)
    12.0
    >>> type(12.2b)
    <type 'float'>
    >>> type(float(12b))
    <type 'float'>
    >>> type(12b)
    <type 'int'>

The decimal() conversion function added to __builtins__ will support conversions of strings and binary types to decimal.

    >>> type(decimal("12d"))
    <type 'decimal'>
    >>> type(decimal("12"))
    <type 'decimal'>
    >>> type(decimal(12b))
    <type 'decimal'>
    >>> type(decimal(12.0b))
    <type 'decimal'>
    >>> type(decimal(123456789123L))
    <type 'decimal'>

The conversion functions int() and float() in the __builtin__ module will support conversion of decimal numbers to the binary number types.

    >>> type(int(12d))
    <type 'int'>
    >>> type(float(12.0d))
    <type 'float'>

Expressions that mix integers with decimals will automatically convert the integer to decimal and the result will be a decimal number.

    >>> type(12d + 4b)
    <type 'decimal'>
    >>> type(12b + 4d)
    <type 'decimal'>
    >>> type(12d + len('abc'))
    <type 'decimal'>
    >>> 3d/4b
    0.75

Expressions that mix binary floats with decimals introduce the possibility of unexpected results because the two number types use different internal representations for the same numerical value. The severity of this problem is dependent on the application domain. For applications that normally use binary numbers the error may not be important and the conversion should be done silently. For newbie programmers a warning should be issued so the newbie will be able to locate the source of a discrepancy between the expected results and the results that were achieved. For financial applications the mixing of decimal numbers with binary floats should raise an exception. To accommodate the three possible usage models the Python interpreter command line options will be used to set the level for warning and error messages. The three levels are:

    promiscuous mode, -f or --promiscuous
    safe mode, -s or --safe
    pedantic mode, -p or --pedantic

The default setting will be the safe setting. In safe mode mixing decimal and binary floats in a calculation will trigger a warning message.

    >>> type(12.3d + 12.2b)
    Warning: the calculation mixes decimal numbers with binary floats
    <type 'decimal'>

In promiscuous mode warnings will be turned off.

    >>> type(12.3d + 12.2b)
    <type 'decimal'>

In pedantic mode warnings from safe mode will be turned into exceptions.

    >>> type(12.3d + 12.2b)
    Traceback (innermost last):
      File "<stdin>", line 1, in ?
    TypeError: the calculation mixes decimal numbers with binary floats

Semantics of Decimal Numbers ??
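The rounding-error half of McLay's rationale is easy to demonstrate with today's Python: the decimal module that later entered the standard library (PEP 327, Python 2.4) plays the role of the proposed decimal type below, with Decimal("0.1") standing in for the proposed 0.1d literal. A minimal sketch, not the implementation this PEP describes:

```python
from decimal import Decimal

# 0.1 has no exact base-2 representation, so binary floats accumulate
# exactly the kind of error that surprises non-computer scientists:
print(0.1 + 0.1 + 0.1 == 0.3)   # False

# A decimal type stores base-10 digits exactly, so the same sum behaves
# the way pencil-and-paper arithmetic says it should:
print(Decimal("0.1") + Decimal("0.1") + Decimal("0.1") == Decimal("0.3"))   # True
```

The COBOL comparison in the rationale rests on the same point: business arithmetic is specified in base 10, so a base-10 representation makes the stored value match the written one.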
From skip@pobox.com (Skip Montanaro) Mon Jul 30 17:53:44 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 30 Jul 2001 11:53:44 -0500 Subject: [Python-Dev] Nostalgic Versions In-Reply-To: <200107301246.IAA09379@cj20424-a.reston1.va.home.com> References: <200107301246.IAA09379@cj20424-a.reston1.va.home.com> Message-ID: <15205.37144.824975.214559@beluga.mojam.com> >> Where can I find *really* old Python versions? I managed to find >> 1.2, but I want to get my hands on <1.0 versions if at all possible... Guido> You can try to check out ... Guido> For what purpose, may I ask? I'll wager Moshe is planning on "fixing" division. ;-) Skip From fdrake@acm.org Mon Jul 30 17:51:57 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 30 Jul 2001 12:51:57 -0400 (EDT) Subject: [Python-Dev] Advice in stat.py In-Reply-To: <15204.62169.600623.580141@anthem.wooz.org> References: <200107271547.LAA24634@cj20424-a.reston1.va.home.com> <20010730025148.3279199C85@waltz.rahul.net> <15204.53919.982508.201595@anthem.wooz.org> <15204.61275.24195.153670@cj42289-a.reston1.va.home.com> <15204.62169.600623.580141@anthem.wooz.org> Message-ID: <15205.37037.725140.801916@cj42289-a.reston1.va.home.com> Barry A. Warsaw writes: > Um, sure, but it can be pretty inconvenient to export 193 symbols > this way :). Yeah, that's a lot. ;-) > I've often thought that it would be nice to have better delegation > support in Python, and no, __getattr__() doesn't really hack it. I'm > encouraged that some of the Py2.2 descr-branch stuff might actually > make this valid programming technique more useful and then even I > could see (eventually) giving up on import-*. Definitely; it would be good to have nicer delegation support. -Fred -- Fred L. Drake, Jr.
PythonLabs at Zope Corporation From guido@zope.com Mon Jul 30 18:40:06 2001 From: guido@zope.com (Guido van Rossum) Date: Mon, 30 Jul 2001 13:40:06 -0400 Subject: [Python-Dev] pep-discuss In-Reply-To: Your message of "Mon, 30 Jul 2001 08:49:36 PDT." <20010730154936.AE36899C94@waltz.rahul.net> References: <20010730154936.AE36899C94@waltz.rahul.net> Message-ID: <200107301740.f6UHe6K03226@odiug.digicool.com> > Paul Prescod wrote: > > > > We've talked about having a mailing list for general PEP-related > > discussions. Two things make me think that revisiting this would be a > > good idea right now. > > > > First, the recent loosening up of the python-dev rules threatens the > > quality of discussion about bread and butter issues such as patch > > discussions and process issues. > > > > Second, the flamewar on python-list basically drowned out the usual > > newbie questions and would give a person coming new to Python a very > > negative opinion about the language's future and the friendliness of the > > community. I would rather redirect as much as possible of that to a list > > that only interested participants would have to endure. > > While what you say makes sense, overall, there are a lot of people (me > included) who prefer discussion on newsgroups, and I can't quite see > creating a newsgroup for PEP discussions yet. Call me -0.25 for kicking > discussion off c.l.py and +0.25 for getting it off python-dev. For me personally, it would just be another list to follow, no matter where it happens, so consider me -0. I won't object if a majority on python-dev wants this though. --Guido van Rossum (home page: http://www.python.org/~guido/) From esr@thyrsus.com Mon Jul 30 06:48:59 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Mon, 30 Jul 2001 01:48:59 -0400 Subject: [Python-Dev] Parrot -- should life imitate satire? 
Message-ID: <20010730014859.A15971@thyrsus.com> The 2001 O'Reilly Open Source Convention was, as usual, a very stimulating event and a forum for a lot of valuable high-level conversations between the principal developers of many open-source projects. Many of you know that I maintain friendly and relatively close relations with a number of senior Perl hackers, including both Larry Wall himself and others like Chip Salzenberg, Randal Schwartz, Tom Christiansen, Adam Turoff, and more recently Simon Cozens (who lurks on this list these days). At OSCon I believe I got a pretty good picture of what the leaders of the Perl community are thinking and planning these days. They have definitely come out of the slump they were in a year ago -- there's a much-renewed sense of energy over there. I think their plans offer both the Perl and Python communities some large strategic opportunities. Specifically, I'm urging the Python community's leadership to seriously explore the possibility of helping make the Parrot hoax into a working reality. I have discussed this with Guido by phone, and though he is skeptical about such an implementation being actually possible, he also thinks the idea has tremendous potential and says he is willing to support it in public. The Perl people have blocked out an architecture for Perl 6 that envisages a new bytecode level, designed and implemented from scratch. They're very serious about this; I participated in some discussions of the bytecode design (and, incidentally, argued that the bytecode should emulate a stack rather than a register machine because the cost/speed disparities that justify register architectures in hardware don't exist in a software VM). The Perl people are receptive to -- indeed, some of them are actively pushing -- the idea that their new bytecode should not be Perl-specific. Dan Sugalski, the current lead for the bytecode interpreter project, has named it Parrot.
At the Perl 6 talk I attended, Chip Salzenberg speculated in public about possibly supporting a common runtime for Perl, Python, Ruby, and Intercal(!). One of the things that makes this an unprecedented opportunity is that the design of Perl 6 is not yet set in stone -- and Larry has already shown a willingness to move it in a Pythonward direction. Syntactically, Perl 5's -> syntax is going away to be replaced by a Python-like dot with implicit dereferencing (and Larry said in public this was Python's influence, not Java's). The languages have of course converged in other ways recently -- Python's new lexical scoping actually brings it closer to Perl "use strict" semantics. I believe the way is open for Python's leading technical people to be involved as co-equals in the design and implementation of the Parrot bytecode interpreter. I have even detected some willingness to use Python's existing bytecode as a design model for Parrot, and perhaps even portions of the Python interpreter codebase! One bold but possibly workable proposal would be to offer Dan and the Parrot project the Python bytecode interpreter as a base for the Parrot code, and then be willing to incorporate whatever (probably relatively minor) extensions are needed to support Perl 6's primitives. Following my conversation with Guido, I've put doing an architectural comparison of the existing Python and Perl bytecodes at the top of my priority list. I'm flying to Taipei tomorrow and will have a lot of hours on airplanes with my laptop to do this. Committing a common runtime with Perl would mean relinquishing exclusive design control of our bytecode level, but the Perl people seem themselves willing to give up *their* exclusive control to make this work. It is rather remarkable how respectful of Python they have become, and I can't emphasize enough that I think they are truly ready for us to come to the project as equal partners. 
(One important place where I think everybody understands the Python side of the force would clearly win out in a final Parrot design is in the extension-and-embedding facilities. Perl XS is acknowledged to be a nasty mess. My guess is the Perl guys would drop it like a hot rock for our stuff -- that would be as clear a win for them as co-opting Perl-style regexps was for us.) I think the benefits of a successful unification at the bytecode level, together with Larry's willingness to Pythonify the upper level of Perl 6 a bit, could be vast -- both for the Python community in particular and for scripting-language users in general.

1. Mixed-language programming in Perl and Python could become almost seamless, with all that implies for both languages getting the use of each other's libraries.

2. The prospects for getting decent Python compilation to native code would improve if both the Python and Perl communities were strongly motivated to solve the bytecode-compilation problem.

3. More generally, the fact remains that Perl's user/developer base is still much larger than ours. Successful unification would co-opt a lot of that energy for Python. Because the brain drain between Perl and Python is pretty much unidirectional in Python's favor (a fact even the top Perl hackers ruefully acknowledge), I don't think we need worry about being subsumed in that larger community either.

I think there is a wonderful opportunity here for the Python and Perl developers to lead the open-source world. If we can do a good Parrot design together, I think it will be hard for the smaller scripting language communities to resist its pull. Ultimately, the common Parrot runtime could become the open-source community's answer -- a very competitive answer -- to the common runtime Microsoft is pushing for .NET. I think trying to make Parrot work would be worth some serious effort. -- Eric S. Raymond The Bible is not my book, and Christianity is not my religion.
I could never give assent to the long, complicated statements of Christian dogma. -- Abraham Lincoln From mwh@python.net Mon Jul 30 19:16:26 2001 From: mwh@python.net (Michael Hudson) Date: 30 Jul 2001 14:16:26 -0400 Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12 In-Reply-To: Guido van Rossum's message of "Sat, 28 Jul 2001 09:57:28 -0400" References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> <2mn15pwg56.fsf@starship.python.net> <200107281357.JAA30859@cj20424-a.reston1.va.home.com> Message-ID: <2m3d7emged.fsf@starship.python.net> Guido van Rossum writes: > > Not directly relevant to the PEP, but... > > > > Guido van Rossum writes: > > > > > Q. What about code compiled by the codeop module? > > > > > > A. Alas, this will always use the default semantics (set by the -D > > > command line option). This is a general problem with the > > > future statement; PEP 236[4] lists it as an unresolved > > > problem. You could have your own clone of codeop.py that > > > includes a future division statement, but that's not a general > > > solution. > > > > Did you look at my Nasty Hack(tm) to bodge around this? It's at > > > > http://starship.python.net/crew/mwh/hacks/codeop-hack.diff > > > > if you haven't. I'm not sure it will work with what you're planning > > for division, but it works for generators (and worked for nested > > scopes when that was relevant). > > Ouch. Nasty. Hat off to you for thinking of this! I'll choose to take this as a positive remark :-) > > There are a host of saner ways round this, of course - like adding an > > optional "flags" argument to compile, for instance. > > We'll have to keep that in mind. Here's a fairly short pre-PEP on the issue. If I haven't made any gross editorial blunders, can Barry give it a number and check the sucker in?
PEP: XXXX Title: Supporting __future__ statements in simulated shells Version: $Version:$ Author: Michael Hudson Status: Draft Type: Standards Track Requires: 0236 Created: 30-Jul-2001 Python-Version: 2.2 Post-History: Abstract As noted in PEP 236, there is no clear way for "simulated interactive shells" to simulate the behaviour of __future__ statements in "real" interactive shells, i.e. have __future__ statements' effects last the life of the shell. This short PEP proposes to make this possible by adding an optional fourth argument to the builtin function "compile" and adding machinery to the standard library modules "codeop" and "code" to make the construction of such shells easy. Specification I propose adding a fourth, optional, "flags" argument to the builtin "compile" function. If this argument is omitted, there will be no change in behaviour from that of Python 2.1. If it is present it is expected to be an integer, representing various possible compile time options as a bitfield. The bitfields will have the same values as the PyCF_* flags #defined in Include/pythonrun.h (at the time of writing there are only two - PyCF_NESTED_SCOPES and PyCF_GENERATORS). These are currently not exposed to Python, so I propose adding them to codeop.py (because it's already here, basically). XXX Should the supplied flags be or-ed with the flags of the calling frame, or do we override them? I'm for the former, slightly. I also propose adding a pair of classes to the standard library module codeop. One - probably called Compile - will sport a __call__ method which will act much like the builtin "compile" of 2.1 with the difference that after it has compiled a __future__ statement, it "remembers" it and compiles all subsequent code with the __future__ options in effect. It will do this by examining the co_flags field of any code object it returns, which in turn means writing and maintaining a Python version of the function PyEval_MergeCompilerFlags found in Python/ceval.c.
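The Compile class sketched in the paragraph above did eventually land in the standard library, so its "remembering" behaviour can be observed directly in a modern interpreter. A sketch, using the annotations future (the only optional feature left today) as a stand-in for the generators/division futures being discussed here; SomeHint is a made-up name and the .flags attribute is an internal detail of codeop:

```python
import codeop
import __future__

compiler = codeop.Compile()

# Compiling a __future__ statement makes the option "sticky": the
# instance notices the feature bit in the returned code object's
# co_flags and merges it into its own compile flags ...
compiler("from __future__ import annotations", "<input>", "exec")
assert compiler.flags & __future__.annotations.compiler_flag

# ... so all subsequent compilations see the feature in effect too.
code = compiler("def f(x: SomeHint) -> SomeHint: pass", "<input>", "exec")
assert code.co_flags & __future__.annotations.compiler_flag
```

This is exactly the Python-level counterpart of PyEval_MergeCompilerFlags that the draft calls for.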
Objects of the other class added to codeop - probably called CommandCompiler or somesuch - will do the job of the existing codeop.compile_command function, but in a __future__-aware way. Finally, I propose to modify the class InteractiveInterpreter in the standard library module code to use a CommandCompiler to emulate still more closely the behaviour of the default Python shell. Backward Compatibility Should be very few or none; the changes to compile will make no difference to existing code, nor will adding new functions or classes to codeop. Existing code using code.InteractiveInterpreter may change in behaviour, but only for the better in that the "real" Python shell will be being better impersonated. Forward Compatibility codeop will require very mild tweaking as each new __future__ statement is added. Such events will hopefully be very rare, so such a burden is unlikely to cause significant pain. Implementation None yet; none of the above should be at all hard. If this draft is well received, I'll upload a patch to sf "soon" and point to it here. Copyright This document has been placed in the public domain. -- ARTHUR: The ravenous bugblatter beast of Traal ... is it safe? FORD: Oh yes, it's perfectly safe ... it's just us who are in trouble. -- The Hitch-Hikers Guide to the Galaxy, Episode 6 From paulp@ActiveState.com Mon Jul 30 19:53:24 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Mon, 30 Jul 2001 11:53:24 -0700 Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12 References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> <2mn15pwg56.fsf@starship.python.net> <200107281357.JAA30859@cj20424-a.reston1.va.home.com> <2m3d7emged.fsf@starship.python.net> Message-ID: <3B65AD24.84DD88A2@ActiveState.com> Michael Hudson wrote: > >... > I propose adding a fourth, optional, "flags" argument to the > builtin "compile" function. If this argument is omitted, there > will be no change in behaviour from that of Python 2.1.
> > If it is present it is expected to be an integer, representing > various possible compile time options as a bitfield. Nit: What is the virtue of using a C-style bitfield? The efficiency isn't much of an issue. I'd prefer either keyword arguments or a list of strings. -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From Samuele Pedroni Mon Jul 30 20:00:12 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Mon, 30 Jul 2001 21:00:12 +0200 (MET DST) Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12 Message-ID: <200107301900.VAA09388@core.inf.ethz.ch> Hi. [Michael Hudson] > One - probably called Compile - will sport a __call__ method which > will act much like the builtin "compile" of 2.1 with the > difference that after it has compiled a __future__ statement, it > "remembers" it and compiles all subsequent code with the > __future__ options in effect. > > It will do this by examining the co_flags field of any code object > it returns, which in turn means writing and maintaining a Python > version of the function PyEval_MergeCompilerFlags found in > Python/ceval.c. FYI, in Jython (internally) we have a series of compile_flags functions that take an "opaque" object CompilerFlags that is passed to the function, and compilation actually changes the object in order to reflect future statements encountered during compilation... Not elegant but avoids code duplication. Of course we can change that. Samuele Pedroni.
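The flag plumbing discussed in this thread did end up exposed to pure Python, so the merge a simulated shell must perform can be sketched with a modern interpreter. The annotations future stands in for the future statements of 2001, and SomeHint is a made-up name; this is an illustration of the mechanism, not the 2001 patch:

```python
import __future__

# Compiling a __future__ statement records the feature in the
# resulting code object's co_flags ...
code = compile("from __future__ import annotations", "<input>", "exec")
sticky = code.co_flags & __future__.annotations.compiler_flag
assert sticky

# ... and feeding that bit back into compile()'s fourth ("flags")
# argument keeps the feature in effect for later input, which is the
# merge step PyEval_MergeCompilerFlags (or Jython's CompilerFlags
# object) performs internally.
later = compile("def f(x: SomeHint) -> SomeHint: pass", "<input>", "exec", sticky)
assert later.co_flags & __future__.annotations.compiler_flag
```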
From mwh@python.net Mon Jul 30 20:11:16 2001 From: mwh@python.net (Michael Hudson) Date: 30 Jul 2001 15:11:16 -0400 Subject: [Python-Dev] Simulating shells (was Re: Changing the Division Operator -- PEP 238, rev 1.12) In-Reply-To: Paul Prescod's message of "Mon, 30 Jul 2001 11:53:24 -0700" References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> <2mn15pwg56.fsf@starship.python.net> <200107281357.JAA30859@cj20424-a.reston1.va.home.com> <2m3d7emged.fsf@starship.python.net> <3B65AD24.84DD88A2@ActiveState.com> Message-ID: <2mitgafd0r.fsf_-_@starship.python.net> Paul Prescod writes: > Michael Hudson wrote: > > > >... > > I propose adding a fourth, optional, "flags" argument to the > > builtin "compile" function. If this argument is omitted, there > > will be no change in behaviour from that of Python 2.1. > > > > If it is present it is expected to be an integer, representing > > various possible compile time options as a bitfield. > > Nit: What is the virtue of using a C-style bitfield? The efficiency > isn't much of an issue. I'd prefer either keyword arguments or a list of > strings. Err, hadn't really occurred to me to do anything else, to be honest! At one point I was going to use the same bits as are used in the code.co_flags field, which was probably where the bitfield idea originated. By "keyword arguments" do you mean e.g: compile(source, file, start_symbol, generators=1, division=0) ? I think that would be mildly painful for the one use I had in mind (the additions to codeop), and also mildly painful to implement. compile(source, file, start_symbol,{'generators':1, 'division':0}) would be better from my point of view. I think this is a bit of a propeller-heads-only feature, to be honest, so I'm not that inclined to worry about the API. Cheers, M. -- 3. Syntactic sugar causes cancer of the semicolon.
-- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From guido@zope.com Mon Jul 30 20:11:17 2001 From: guido@zope.com (Guido van Rossum) Date: Mon, 30 Jul 2001 15:11:17 -0400 Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12 In-Reply-To: Your message of "Mon, 30 Jul 2001 21:00:12 +0200." <200107301900.VAA09388@core.inf.ethz.ch> References: <200107301900.VAA09388@core.inf.ethz.ch> Message-ID: <200107301911.f6UJBHQ03472@odiug.digicool.com> > [Michael Hudson] > > One - probably called Compile - will sport a __call__ method which > > will act much like the builtin "compile" of 2.1 with the > > difference that after it has compiled a __future__ statement, it > > "remembers" it and compiles all subsequent code with the > > __future__ options in effect. > > > > It will do this by examining the co_flags field of any code object > > it returns, which in turn means writing and maintaining a Python > > version of the function PyEval_MergeCompilerFlags found in > > Python/ceval.c. > FYI, in Jython (internally) we have a series of compile_flags functions > that take a "opaque" object CompilerFlags that is passed to the function > and compilation actually change the object in order to reflect future > statements encoutered during compilation... > Not elegant but avoids code duplication. > > Of course we can change that. Does codeop currently work in Jython? The solution should continue to work in Jython then. Does Jython support the same flag bit values as CPython? If not, Paul Prescod's suggestion to use keyword arguments becomes very relevant. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From bckfnn@worldonline.dk Mon Jul 30 20:16:36 2001 From: bckfnn@worldonline.dk (Finn Bock) Date: Mon, 30 Jul 2001 19:16:36 GMT Subject: [Python-Dev] zipfiles on sys.path In-Reply-To: <20010725215830.2F49D14A25D@oratrix.oratrix.nl> References: <20010725215830.2F49D14A25D@oratrix.oratrix.nl> Message-ID: <3b65932d.2748051@mail.wanadoo.dk> Thanks for the feedback. >> - The __path__ vrbl in a package 'foo.bar' loaded from zipfile.zip >> will have the value ['zipfile.zip!foo/bar'] and this same syntax can >> also be used when adding entries to sys.path and __path__. > >__path__ is set to the package name. I'm not sure of the exact >rationale for this (Just did the package support) but it seems to work >fine. I think the result of the Mac implementation is that the package hierarchy and the folder structure in the archive must match. Normally this is the case, but changes to __path__ can cause sub-modules to be loaded from somewhere else. I'm guessing such changes to __path__ aren't considered on Mac when importing from an archive. [Just] >I don't know the rationale either (or at least: not anymore ;-), I just copied >the behavior of frozen packages (as in freeze.py) from import.c. >PyImport_ImportFrozenModule() contains this snippet: Dynamic changes to __path__ are probably not needed for frozen packages. They may not even be needed for imports from zipfiles. My first attempt at adding this feature did not support changes to __path__. regards, finn From guido@zope.com Mon Jul 30 20:18:55 2001 From: guido@zope.com (Guido van Rossum) Date: Mon, 30 Jul 2001 15:18:55 -0400 Subject: [Python-Dev] Parrot -- should life imitate satire? In-Reply-To: Your message of "Mon, 30 Jul 2001 01:48:59 EDT." <20010730014859.A15971@thyrsus.com> References: <20010730014859.A15971@thyrsus.com> Message-ID: <200107301918.f6UJIt003517@odiug.digicool.com> Obviously, just as the new design is aiming at Perl 6, it would be aiming at Python 3.
Nothing's impossible these days, so I am keeping an open mind. I expect that in addition to the bytecode, the entire runtime architecture would have to be shared though for this to make sense, and I'm not sure how easy that would be, even if Perl is willing to be flexible. Most of Python's run-time semantics are very carefully defined and shouldn't be changed in order to fit in the common runtime. I'm looking forward to Eric's comparison of the two run-time systems. (Eric, be sure to use a copy of 2.2a1 or the descr-branch -- *don't* use the CVS trunk.) --Guido van Rossum (home page: http://www.python.org/~guido/) From paulp@ActiveState.com Mon Jul 30 20:20:27 2001 From: paulp@ActiveState.com (Paul Prescod) Date: Mon, 30 Jul 2001 12:20:27 -0700 Subject: [Python-Dev] Re: Simulating shells (was Re: Changing the Division Operator -- PEP 238, rev 1.12) References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> <2mn15pwg56.fsf@starship.python.net> <200107281357.JAA30859@cj20424-a.reston1.va.home.com> <2m3d7emged.fsf@starship.python.net> <3B65AD24.84DD88A2@ActiveState.com> <2mitgafd0r.fsf_-_@starship.python.net> Message-ID: <3B65B37B.E3E05945@ActiveState.com> Michael Hudson wrote: > >... > > At one point I was going to use the same bits as are used in the > code.co_flags field, which was probably where the bitfield idea > originated. > > By "keyword arguments" do you mean e.g: > > compile(source, file, start_symbol, generators=1, division=0) > > ? I think that would be mildly painful for the one use I had in mind > (the additions to codeop), and also mildly painful to implement. Sorry, could you elaborate on why this is painful to use and implement? Considering the availability of **args, the code above looks to me like syntactic sugar for the code below: > compile(source, file, start_symbol,{'generators':1, 'division':0}) > > would be better from my point of view. 
> I think this is a bit of a > propeller-heads-only feature, to be honest, so I'm not that inclined > to worry about the API. I would just like to see an end to the convention of using bitfields in Python everywhere. You're just my latest target. Python is not a really great bit-manipulation language! -- Take a recipe. Leave a recipe. Python Cookbook! http://www.ActiveState.com/pythoncookbook From jeremy@zope.com Mon Jul 30 20:23:27 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 30 Jul 2001 15:23:27 -0400 (EDT) Subject: [Python-Dev] Parrot -- should life imitate satire? In-Reply-To: <20010730014859.A15971@thyrsus.com> References: <20010730014859.A15971@thyrsus.com> Message-ID: <15205.46127.377897.520922@slothrop.digicool.com> >>>>> "ESR" == Eric S Raymond writes: ESR> Following my conversation with Guido, I've put doing an ESR> architectural comparison of the existing Python and Perl ESR> bytecodes at the top of my priority list. I'm flying to Taipei ESR> tomorrow and will have a lot of hours on airplanes with my ESR> laptop to do this. Eric, This is a good project. It's really difficult to evaluate the Parrot proposal otherwise. I know quite a bit about Python's VM and runtime, but next to nothing about Perl's. If you're feeling particularly energetic, you might look at some other VM's -- Ocaml, Java, and Ruby come to mind. It is probably a much harder fit for the first two, because they are statically typed. But I'd be quite interested to see a survey of language VM techniques.
Jeremy From Samuele Pedroni Mon Jul 30 20:27:55 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Mon, 30 Jul 2001 21:27:55 +0200 (MET DST) Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12 Message-ID: <200107301928.VAA10577@core.inf.ethz.ch> [GvR] > > > [Michael Hudson] > > > One - probably called Compile - will sport a __call__ method which > > > will act much like the builtin "compile" of 2.1 with the > > > difference that after it has compiled a __future__ statement, it > > > "remembers" it and compiles all subsequent code with the > > > __future__ options in effect. > > > > > > It will do this by examining the co_flags field of any code object > > > it returns, which in turn means writing and maintaining a Python > > > version of the function PyEval_MergeCompilerFlags found in > > > Python/ceval.c. > > > FYI, in Jython (internally) we have a series of compile_flags functions > > that take a "opaque" object CompilerFlags that is passed to the function > > and compilation actually change the object in order to reflect future > > statements encoutered during compilation... > > Not elegant but avoids code duplication. > > > > Of course we can change that. > > Does codeop currently work in Jython? The solution should continue to > work in Jython then. We have our interface compatible version of codeop that works. > Does Jython support the same flag bit values as > CPython? If not, Paul Prescod's suggestion to use keyword arguments > becomes very relevant. we support a subset of the co_flags, CO_NESTED e.g. is there with the same value. But the embedding API is very different, my implementation of nested scopes does not define any Py_CF... flags, we have an internal CompilerFlags object but is more similar to PyFutureFeatures ... Samuele. From esr@thyrsus.com Mon Jul 30 08:35:17 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Mon, 30 Jul 2001 03:35:17 -0400 Subject: [Python-Dev] Parrot -- should life imitate satire? 
In-Reply-To: <200107301918.f6UJIt003517@odiug.digicool.com>; from guido@zope.com on Mon, Jul 30, 2001 at 03:18:55PM -0400 References: <20010730014859.A15971@thyrsus.com> <200107301918.f6UJIt003517@odiug.digicool.com> Message-ID: <20010730033517.A17356@thyrsus.com> Guido van Rossum : > I'm looking forward to Eric's comparison of the two run-time systems. > (Eric, be sure to use a copy of 2.2a1 or the descr-branch -- *don't* > use the CVS trunk.) What would the CVS magic invocation for that be? And...um...why? Has the bytecode changed significantly recently? -- Eric S. Raymond The spirit of resistance to government is so valuable on certain occasions, that I wish it always to be kept alive. It will often be exercised when wrong, but better so than not to be exercised at all. I like a little rebellion now and then. -- Thomas Jefferson, letter to Abigail Adams, 1787 From mwh@python.net Mon Jul 30 20:35:11 2001 From: mwh@python.net (Michael Hudson) Date: 30 Jul 2001 15:35:11 -0400 Subject: [Python-Dev] Re: Simulating shells (was Re: Changing the Division Operator -- PEP 238, rev 1.12) In-Reply-To: Paul Prescod's message of "Mon, 30 Jul 2001 12:20:27 -0700" References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> <2mn15pwg56.fsf@starship.python.net> <200107281357.JAA30859@cj20424-a.reston1.va.home.com> <2m3d7emged.fsf@starship.python.net> <3B65AD24.84DD88A2@ActiveState.com> <2mitgafd0r.fsf_-_@starship.python.net> <3B65B37B.E3E05945@ActiveState.com> Message-ID: <2mpuaijjm8.fsf@starship.python.net> Paul Prescod writes: > Michael Hudson wrote: > > > >... > > > > At one point I was going to use the same bits as are used in the > > code.co_flags field, which was probably where the bitfield idea > > originated. > > > > By "keyword arguments" do you mean e.g: > > > > compile(source, file, start_symbol, generators=1, division=0) > > > > ? 
I think that would be mildly painful for the one use I had in mind > > (the additions to codeop), and also mildly painful to implement. > > Sorry, could you elaborate on why this is painful to use and implement? Well, I don't know in detail how keyword arguments work from the C side. Your suggestion turns a roughly 4 line change I knew exactly how to do into a 20-30 line change I'd have to work on. I only said "mildly painful". The awkwardness of use would just mean using **, yes. > Considering the availability of **args, the code above looks to me like > syntactic sugar for the code below: > > > compile(source, file, start_symbol, {'generators':1, 'division':0}) Well yes, but I think the latter is closer to what one means, which is to say passing a (i.e. one) set of options. > > would be better from my point of view. I think this is a bit of a > > propeller-heads-only feature, to be honest, so I'm not that inclined > > to worry aobut the API. > > I would just like to see an end to the convention of using bitfields in > Python everywhere. You're just my latest target. Fair enough. I've probably been corrupted by C on this one. > Python is not a really great bit-manipulation language! At any rate, the fact that I'd temporarily forgotten about the existence of Jython is the more serious blunder... Cheers, M. -- . <- the point your article -> . 
|------------------------- a long way ------------------------| -- Cristophe Rhodes, ucam.chat From mwh@python.net Mon Jul 30 20:46:10 2001 From: mwh@python.net (Michael Hudson) Date: 30 Jul 2001 15:46:10 -0400 Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12 In-Reply-To: Samuele Pedroni's message of "Mon, 30 Jul 2001 21:27:55 +0200 (MET DST)" References: <200107301928.VAA10577@core.inf.ethz.ch> Message-ID: <2mn15mjj3x.fsf@starship.python.net> Samuele Pedroni writes: > [GvR] > > > > > [Michael Hudson] > > > > One - probably called Compile - will sport a __call__ method which > > > > will act much like the builtin "compile" of 2.1 with the > > > > difference that after it has compiled a __future__ statement, it > > > > "remembers" it and compiles all subsequent code with the > > > > __future__ options in effect. > > > > > > > > It will do this by examining the co_flags field of any code object > > > > it returns, which in turn means writing and maintaining a Python > > > > version of the function PyEval_MergeCompilerFlags found in > > > > Python/ceval.c. > > > > > FYI, in Jython (internally) we have a series of compile_flags functions > > > that take a "opaque" object CompilerFlags that is passed to the function > > > and compilation actually change the object in order to reflect future > > > statements encoutered during compilation... > > > Not elegant but avoids code duplication. > > > > > > Of course we can change that. > > > > Does codeop currently work in Jython? The solution should continue to > > work in Jython then. > We have our interface compatible version of codeop that works. Would implementing the new interfaces I sketched out for codeop.py be possible in Jython? That's the bit I care about, not so much the interface to __builtin__.compile. > > Does Jython support the same flag bit values as > > CPython? If not, Paul Prescod's suggestion to use keyword arguments > > becomes very relevant. 
> we support a subset of the co_flags, CO_NESTED e.g. is there with the same > value. > > But the embedding API is very different, my implementation of nested > scopes does not define any Py_CF... flags, we have an internal CompilerFlags > object but is more similar to PyFutureFeatures ... Is this object exposed to Python code at all? One approach would be PyObject-izing PyFutureFlags and making *that* the fourth argument to compile... class Compiler: def __init__(self): self.ff = ff.new() # or whatever def __call__(self, source, filename, start_symbol): code = compile(source, filename, start_symbol, self.ff) self.ff.merge(code.co_flags) return code Cheers, M. -- Like most people, I don't always agree with the BDFL (especially when he wants to change things I've just written about in very large books), ... -- Mark Lutz, http://python.oreilly.com/news/python_0501.html From tim@digicool.com Mon Jul 30 20:47:57 2001 From: tim@digicool.com (Tim Peters) Date: Mon, 30 Jul 2001 15:47:57 -0400 Subject: [Python-Dev] Parrot -- should life imitate satire? In-Reply-To: <20010730014859.A15971@thyrsus.com> Message-ID: [Eric S. Raymond] > ... > (and, incidentally, argued that the bytecode should emulate a stack > rather than a register machine because the cost/speed disparities that > justify register architectures in hardware don't exist in a software > VM). Don't get too married to that! My bet is that if anyone had time for it, we'd switch the Python VM today to a register model; Skip Montanaro's Rattlesnake project was aiming at that, but fizzled out due to lack of time. The per-opcode fetch-decode-dispatch overhead is very high in SW too, so a register VM can win simply by cutting the number of opcodes needed to accomplish a given bit of useful work. 
Indeed, eliding SET_LINENO opcodes is the primary reason Python -O runs faster, yet all it saves is one trip around the eval loop per source-code line (the *body* of SET_LINENO is just a test, branch, and store -- it's trivial compared to the overhead of getting to it). Variants of forth-like threading are alternatives to both. From Samuele Pedroni Mon Jul 30 20:59:35 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Mon, 30 Jul 2001 21:59:35 +0200 (MET DST) Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12 Message-ID: <200107301959.VAA11733@core.inf.ethz.ch> ... > > > > > > Does codeop currently work in Jython? The solution should continue to > > > work in Jython then. > > We have our interface compatible version of codeop that works. > > Would implementing the new interfaces I sketched out for codeop.py be > possible in Jython? That's the bit I care about, not so much the > interface to __builtin__.compile. Yes, it's of course possible. > > > Does Jython support the same flag bit values as > > > CPython? If not, Paul Prescod's suggestion to use keyword arguments > > > becomes very relevant. > > we support a subset of the co_flags, CO_NESTED e.g. is there with the same > > value. > > > > But the embedding API is very different, my implementation of nested > > scopes does not define any Py_CF... flags, we have an internal CompilerFlags > > object but is more similar to PyFutureFeatures ... > > Is this object exposed to Python code at all? Not publicly, but in Jython the separating line is a bit different, because public java classes are always accessible from jython, even most of the internals. That does not mean that every use of them is welcome and supported. > One approach would be > PyObject-izing PyFutureFlags and making *that* the fourth argument to > compile...
> > class Compiler: > def __init__(self): > self.ff = ff.new() # or whatever > def __call__(self, source, filename, start_symbol): > code = compile(source, filename, start_symbol, self.ff) > self.ff.merge(code.co_flags) > return code I see; "internally" we already have a compiler_flags function that does the same as: > code = compile(source, filename, start_symbol, self.ff) > self.ff.merge(code.co_flags) where self.ff is a CompilerFlags object. I can re-arrange things for any interface, I was only trying to explain our approach and situation and a possible way to avoid duplicating some internal code in Python. Samuele. From esr@thyrsus.com Mon Jul 30 09:08:17 2001 From: esr@thyrsus.com (Eric S. Raymond) Date: Mon, 30 Jul 2001 04:08:17 -0400 Subject: [Python-Dev] Parrot -- should life imitate satire? In-Reply-To: ; from tim@digicool.com on Mon, Jul 30, 2001 at 03:47:57PM -0400 References: <20010730014859.A15971@thyrsus.com> Message-ID: <20010730040817.A18034@thyrsus.com> Tim Peters : > The per-opcode fetch-decode-dispatch overhead is very high in SW too, so a > register VM can win simply by cutting the number of opcodes needed to > accomplish a given bit of useful work. That's an interesting idea. OK, so possibly I was wrong -- I hadn't considered that stack-push/stack-pop operations might introduce overhead comparable to the order-of-magnitude speed difference between registers and main memory in hardware. I'm still skeptical, but my mind is open. -- Eric S. Raymond You know why there's a Second Amendment? In case the government fails to follow the first one. -- Rush Limbaugh, in a moment of unaccustomed profundity 17 Aug 1993 From guido@zope.com Mon Jul 30 21:15:09 2001 From: guido@zope.com (Guido van Rossum) Date: Mon, 30 Jul 2001 16:15:09 -0400 Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12 In-Reply-To: Your message of "Mon, 30 Jul 2001 21:27:55 +0200."
<200107301928.VAA10577@core.inf.ethz.ch> References: <200107301928.VAA10577@core.inf.ethz.ch> Message-ID: <200107302015.f6UKF9j03661@odiug.digicool.com> > > Does codeop currently work in Jython? The solution should continue to > > work in Jython then. > We have our interface compatible version of codeop that works. Ah, good. > > Does Jython support the same flag bit values as > > CPython? If not, Paul Prescod's suggestion to use keyword arguments > > becomes very relevant. > we support a subset of the co_flags, CO_NESTED e.g. is there with the same > value. Cool. > But the embedding API is very different, my implementation of nested > scopes does not define any Py_CF... flags, we have an internal CompilerFlags > object but is more similar to PyFutureFeatures ... That's fine. We may end up rearchitecting that (rather baroque IMO) part of the CPython compiler anyway -- if we can get away with changing the C API. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@zope.com Mon Jul 30 21:16:49 2001 From: guido@zope.com (Guido van Rossum) Date: Mon, 30 Jul 2001 16:16:49 -0400 Subject: [Python-Dev] Parrot -- should life imitate satire? In-Reply-To: Your message of "Mon, 30 Jul 2001 03:35:17 EDT." <20010730033517.A17356@thyrsus.com> References: <20010730014859.A15971@thyrsus.com> <200107301918.f6UJIt003517@odiug.digicool.com> <20010730033517.A17356@thyrsus.com> Message-ID: <200107302016.f6UKGoG03676@odiug.digicool.com> > What would the CVS magic invocation for that be? cvs update -r descr-branch or cvs checkout -r descr-branch python/dist/src Or just download 2.2a1. > And...um...why? Has the bytecode changed significantly recently? Not the bytecode, but the rest of the runtime has changed tremendously, and as I tried to explain over the phone, that has a big impact on reusability of the runtime. The bytecode engine cannot be considered independent from the rest of the runtime. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Mon Jul 30 21:29:01 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Mon, 30 Jul 2001 16:29:01 -0400 Subject: [Python-Dev] Parrot -- should life imitate satire? In-Reply-To: <200107302016.f6UKGoG03676@odiug.digicool.com>; from guido@zope.com on Mon, Jul 30, 2001 at 04:16:49PM -0400 References: <20010730014859.A15971@thyrsus.com> <200107301918.f6UJIt003517@odiug.digicool.com> <20010730033517.A17356@thyrsus.com> <200107302016.f6UKGoG03676@odiug.digicool.com> Message-ID: <20010730162901.F9578@ute.cnri.reston.va.us> On Mon, Jul 30, 2001 at 04:16:49PM -0400, Guido van Rossum wrote: >impact on reusability of the runtime. The bytecode engine cannot be >considered independent from the rest of the runtime. If you must have a portable bytecode format, why not use the JVM? Perhaps it's not optimal, but it works reasonably well, has a few reasonably complete free implementations that are mostly struggling due to lack of manpower, has some support in GCC 3.0, and is actually deployed in browsers and on people's systems *right now*. I fail to see why we should run after some mythical Perl/Python bytecode that would have to be 1) designed 2) implemented 3) debugged 4) actually made available to users 5) actually downloaded by users. (Much the same objections apply to .NET for Unix.) There's also the cultural difference between Python's "write it clearly and then optimize it" and Perl's "let's write clever optimized code right from the start". Perhaps this can be bridged, perhaps not. --amk From guido@zope.com Mon Jul 30 21:42:05 2001 From: guido@zope.com (Guido van Rossum) Date: Mon, 30 Jul 2001 16:42:05 -0400 Subject: [Python-Dev] Revised decimal type PEP In-Reply-To: Your message of "Mon, 30 Jul 2001 11:06:52 EDT."
<0107301106520A.02216@fermi.eeel.nist.gov> References: <0107301106520A.02216@fermi.eeel.nist.gov> Message-ID: <200107302042.f6UKg5H03826@odiug.digicool.com> Michael's PEP touches upon the one difficult area of decimal semantics: what to do when a decimal and a binary float meet? We discussed this briefly over lunch here and Tim pointed out that the default should probably be an error: code expecting to work with exact decimals should not be allowed to continue after contamination with an inexact binary float. But in other contexts it would make more sense to turn mixed operands into inexact, like what currently happens when int/long meets float. In the IBM model that Aahz is implementing, decimal numbers are not necessarily exact, but (if I understand correctly) you can set a context flag that causes an exception to be raised when the result of an operation on two exact inputs is inexact. This can happen when e.g. a multiplication result exceeds the number of significant digits specified in the context -- then truncation is applied like for binary floats. Could the numeric tower look like this?

    int < long < decimal < rational < float < complex
    *******************************   ***************
                 exact                     inexact

A numeric context could contain a flag that decides what happens when exact and inexact are mixed. --Guido van Rossum (home page: http://www.python.org/~guido/) From jack@oratrix.nl Mon Jul 30 22:27:36 2001 From: jack@oratrix.nl (Jack Jansen) Date: Mon, 30 Jul 2001 23:27:36 +0200 Subject: [Python-Dev] Mac toolbox modules for MacOSX unix-Python Message-ID: <20010730212741.D2F37162A2A@oratrix.oratrix.nl> I now have a whole stack of modules that interface to MacOS toolboxes that compile for unix-Python on MacOSX, but I'm a bit unsure about how I should add these to the standard build.
So far what I've checked in (in configure) is only a bit of glue that allows the toolbox modules to be loaded, but not yet the changes to setup.py that will actually compile and link the modules. I can do this in two ways:

1) Keep everything as-is and just check in the mods to setup.py.

2) Make the MacOS toolbox modules dependent on a configure switch. The toolbox glue would then also become dependent on this switch.

The first option seems to be the standard nowadays: setup.py simply builds everything it can find and for which the prerequisite headers/libs are found. The second option seems a bit more friendly to Pythoneers who view MacOSX as simply unix-with-a-pretty-face and use Python only for command-line scripts and cgi and such. Also, the toolbox modules will be less stable than average modules for some time to come: as they're shared between unix-Python and MacPython and generated on the latter, the repository version might not build for a few days while I get my act together. On the other hand: a failing compile of an extension module shouldn't bother them overmuch, and one can always comment out the setup.py lines. A problem with the second option is that I have absolutely no idea how to test for configure flags in setup.py. To complicate matters more I'm thinking of turning Python into a framework, which would give OSX-Python a lot of the niceties that MacPython users are used to (applets and building standalone applications without a C compiler, to name two). In that case many users will probably choose either to go the whole way (install Python as a framework and include the toolbox modules) or forget about the macos stuff altogether. What do people think about this?
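[Archive note: on the "no idea how to test for configure flags in setup.py" question, one possible approach — sketched here with an invented variable name, since configure would have to be taught to write it into the generated Makefile — is to consult the Makefile variables that the build already exposes to Python. The modern spelling uses the sysconfig module; in 2001 this lived in distutils.sysconfig.]

```python
import sysconfig  # distutils.sysconfig in 2001-era Python

def want_toolbox_modules():
    # USE_TOOLBOX_GLUE is a hypothetical configure-written variable.
    # get_config_var() returns None when the generated Makefile doesn't
    # define it, so a missing flag simply disables the toolbox build.
    return bool(sysconfig.get_config_var("USE_TOOLBOX_GLUE"))

print(want_toolbox_modules())
```

setup.py could then guard the toolbox Extension entries with this test, keeping option 2's behaviour while staying inside the normal distutils build flow.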
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | ++++ see http://www.xs4all.nl/~tank/ ++++ From skip@pobox.com (Skip Montanaro) Mon Jul 30 22:38:53 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 30 Jul 2001 16:38:53 -0500 Subject: [Python-Dev] Parrot -- should life imitate satire? In-Reply-To: <20010730040817.A18034@thyrsus.com> References: <20010730014859.A15971@thyrsus.com> <20010730040817.A18034@thyrsus.com> Message-ID: <15205.54253.976241.842131@beluga.mojam.com> >>>>> "Eric" == Eric S Raymond writes: Eric> Tim Peters : >> The per-opcode fetch-decode-dispatch overhead is very high in SW too, >> so a register VM can win simply by cutting the number of opcodes >> needed to accomplish a given bit of useful work. Eric> That's an interesting idea. OK, so possibly I was wrong -- I Eric> hadn't considered that stack-push/stack-pop operations might Eric> introduce overhead comparable to the order-of-magnitude speed Eric> difference between registers and main memory in hardware. I'm Eric> still skeptical, but my mind is open. Order of magnitude increases? Maybe, maybe not. Still, something like

    ADD a1,a2,a3

is going to be faster than

    PUSH a1
    PUSH a2
    ADD
    POP a3

My original aim in considering a register-based VM was that it is easier to track data flow and thus optimize out or rearrange operations to reduce the operation count. Translating Python's stack-oriented VM into a register-oriented one was fairly straightforward (at least it was back when I was fiddling with it - pre-1.5). The main stumbling block was that pesky "from module import *" statement. It could push an unknown quantity of stuff onto the stack, thus killing my attempts to track the location of objects on the stack at compile time.
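[Archive note: the stack traffic Skip describes is easy to see with the stdlib dis module. Even a one-line addition compiles to two loads, an operator dispatch, and a store on CPython's stack machine; exact opcode names vary by interpreter version, so no specific listing is shown here.]

```python
import dis

# A register VM could express this as one three-address instruction
# (ADD a1,a2,a3); the stack VM needs separate load/operate/store steps.
code = compile("a3 = a1 + a2", "<example>", "exec")
ops = [ins.opname for ins in dis.get_instructions(code)]
print(ops)
```

Counting the instructions that actually do the work (two LOADs, one BINARY operation, one STORE) against the single register-machine instruction gives a rough feel for the fetch-decode-dispatch overhead Tim mentions.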
Skip From jeremy@zope.com Mon Jul 30 22:44:52 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 30 Jul 2001 17:44:52 -0400 (EDT) Subject: [Python-Dev] Parrot -- should life imitate satire? In-Reply-To: <20010730162901.F9578@ute.cnri.reston.va.us> References: <20010730014859.A15971@thyrsus.com> <200107301918.f6UJIt003517@odiug.digicool.com> <20010730033517.A17356@thyrsus.com> <200107302016.f6UKGoG03676@odiug.digicool.com> <20010730162901.F9578@ute.cnri.reston.va.us> Message-ID: <15205.54612.424694.5559@slothrop.digicool.com> >>>>> "AMK" == Andrew Kuchling writes: AMK> On Mon, Jul 30, 2001 at 04:16:49PM -0400, Guido van Rossum AMK> wrote: >> impact on reusability of the runtime. The bytecode engine cannot >> be considered independent from the rest of the runtime. AMK> If you must have a portable bytecode format, why not use the AMK> JVM? Perhaps it's not optimal, but it works reasonably well, AMK> has a few reasonably complete free implementations that are AMK> mostly strangling due to lack of manpower, has some support in AMK> GCC 3.0, and is actually deployed in browsers and on people's AMK> systems *right now*. I'm not sure I understand the suggestion. The JVM defines an instruction set, but it also defines an entire runtime, right? You've got to live with the JVM's implementation of threads, garbage collection, etc. For the case of Python, that sounds a lot like abandoning CPython and using JPython instead. Or would you suggest using the instruction set but nothing else from the JVM? I'm not sure that there would be much advantage there. If we had a JVM implementation designed to support Python, there would be no need to implement most of the opcodes. We'd only need getstatic and invokevirtual <0.2 wink>. The typed opcodes (int, float, etc.) would never be used. The problem seems to be that the VM ties up a bunch of other issues with the bytecode. 
Python's VM is intimately tied up with:

- reference counting: each opcode knows when to INCREF and when to DECREF
- threads: the global interpreter lock is managed outside the bytecode by the "Do periodic things" code.
- object model: BINARY_ADD knows how to special case ints and what method to call to dispatch on all other objects

Unless the bytecode is very low level, you buy a lot more than some instructions when you buy an instruction set. Jeremy From greg@cosc.canterbury.ac.nz Mon Jul 30 23:05:22 2001 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 31 Jul 2001 10:05:22 +1200 (NZST) Subject: [Python-Dev] Iterator addition? In-Reply-To: <200107301242.IAA09350@cj20424-a.reston1.va.home.com> Message-ID: <200107302205.KAA00567@s454.cosc.canterbury.ac.nz> Guido: > The *only* thing that iterators and sequences have in common is that > they can be iterated over. So they are substitutable in all context > where that's all you do -- including sequence (not tuple!) unpacking. > And not in any other contexts. I agree. The more special cases we add to try to make iterators look like sequences, the harder it's going to be to remember what you can and can't do with an iterator. Let's keep it as simple as possible. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+
In-Reply-To: <200107302016.f6UKGoG03676@odiug.digicool.com>; from guido@zope.com on Mon, Jul 30, 2001 at 04:16:49PM -0400
References: <20010730014859.A15971@thyrsus.com>
 <200107301918.f6UJIt003517@odiug.digicool.com>
 <20010730033517.A17356@thyrsus.com>
 <200107302016.f6UKGoG03676@odiug.digicool.com>
Message-ID: <20010730051831.B1122@thyrsus.com>

Guido van Rossum :

> Or just download 2.2a1.  It's cool.

My local installation is from 2.2.a0.  I'll update.

> > And...um...why?  Has the bytecode changed significantly recently?
>
> Not the bytecode, but the rest of the runtime has changed
> tremendously, and as I tried to explain over the phone, that has a big
> impact on reusability of the runtime.  The bytecode engine cannot be
> considered independent from the rest of the runtime.

OK, let's try to factor this design problem.

Let's suppose, for the sake of the design discussion, that we can make
the type ontologies of the Perl and Python bytecode match up.

(Note: making the type ontologies of the two bytecodes match is not
the same problem as making the type ontologies of the *languages*
match up.  It should be rather simpler because a lot of the
differences between, e.g., class semantics can probably be compiled
away.  Not a trivial problem, but humor me.)

Let's further suppose that we have a callout mechanism from the Parrot
interpreter core to the Perl or Python runtime's C level that can pass
out Python/Perl types and return them.

Given these two premises, what other problems are there?

I can see one: garbage collection.  What others are there?
--
Eric S. Raymond

An armed society is a polite society.  Manners are good when one
may have to back up his acts with his life.
        -- Robert A. Heinlein, "Beyond This Horizon", 1942

From martin@loewis.home.cs.tu-berlin.de  Mon Jul 30 23:22:05 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v.
Loewis)
Date: Tue, 31 Jul 2001 00:22:05 +0200
Subject: [Python-Dev] Python API version & optional features
Message-ID: <200107302222.f6UMM5105688@mira.informatik.hu-berlin.de>

>> I guess one could argue that extension writers should check
>> for narrow/wide builds in their extensions before using Unicode.
>>
>> Since the number of Unicode extension writers is much smaller
>> than the number of users, I think that this approach would be
>> reasonable, provided that we document the problem clearly in the
>> NEWS file.

> OK. I approve.

I'm not sure I can follow. What did you approve? That extension
writers should check whether their Unicode build matches the one they
get at run-time? How are they going to do that?

Regards,
Martin

From gmcm@hypernet.com  Mon Jul 30 23:40:21 2001
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 30 Jul 2001 18:40:21 -0400
Subject: [Python-Dev] zipfiles on sys.path
In-Reply-To: <3b65932d.2748051@mail.wanadoo.dk>
References: <20010725215830.2F49D14A25D@oratrix.oratrix.nl>
Message-ID: <3B65AA15.27947.E9B214D@localhost>

Finn Bock wrote:

[mac puts package name in __path__ when importing from elsewhere]

> Dynamic changes to __path__ is probably not needed for frozen
> packages.
>
> It may not even be needed for imports from zipfile.  My first
> attempt of adding this feature did not support changes to
> __path__.

I know of at least one package that requires an extensible __path__,
even when frozen.  It's a Mark Hammond Special, so you needn't worry
about that one, but it's my observation that package authors are
enamored of import hacks, so be wary.

- Gordon

From gmcm@hypernet.com  Mon Jul 30 23:40:21 2001
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 30 Jul 2001 18:40:21 -0400
Subject: [Python-Dev] zipfiles on sys.path
In-Reply-To: <3b5f2b11.50733180@mail.wanadoo.dk>
Message-ID: <3B65AA15.19868.E9B2049@localhost>

Finn Bock wrote:

> We have recently added support for .zip files on sys.path to
> Jython.
> Now, after the fact, I wondered what prior art exists for
> such a feature and the semantic that is used.  We came up with a
> solution where:

Prior art should include imputil.py (especially since it's at least
partly blessed).

With imputil, an importer object is on sys.path.  The default
implementation will give you a __path__ consisting of the package name
(I think), but you're free to override that in an importer subclass.

I believe Thomas Heller uses zipfiles with imputil in py2exe.  I use
archives (3 different formats), but not zipfiles.

- Gordon

From greg@cosc.canterbury.ac.nz  Mon Jul 30 23:51:35 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2001 10:51:35 +1200 (NZST)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15205.54253.976241.842131@beluga.mojam.com>
Message-ID: <200107302251.KAA00585@s454.cosc.canterbury.ac.nz>

Skip Montanaro :

> The main stumbling block was that pesky "from module import *"
> statement.  It could push an unknown quantity of stuff onto the
> stack

Are you *sure* about that?  I'm pretty certain it can't
be true, since the compiler has to know at all times
how much is on the stack, so it can decide how much
stack space is needed.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From jeremy@zope.com  Mon Jul 30 23:55:37 2001
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 30 Jul 2001 18:55:37 -0400 (EDT)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730051831.B1122@thyrsus.com>
References: <20010730014859.A15971@thyrsus.com>
 <200107301918.f6UJIt003517@odiug.digicool.com>
 <20010730033517.A17356@thyrsus.com>
 <200107302016.f6UKGoG03676@odiug.digicool.com>
 <20010730051831.B1122@thyrsus.com>
Message-ID: <15205.58857.375440.347263@slothrop.digicool.com>

>>>>> "ESR" == Eric S Raymond writes:

  ESR> Let's suppose, for the sake of the design discussion, that we
  ESR> can make the type ontologies of the Perl and Python bytecode
  ESR> match up.

What is a type ontology?  The definition of ontology I'm familiar with
is too broad to be useful in understanding what you're getting at.
I've never heard the technical term "type ontology".

  ESR> (Note: making the type ontologies of the two bytecodes match is
  ESR> not the same problem as making the type ontologies of the
  ESR> *languages* match up.  It should be rather simpler because a
  ESR> lot of the differences between, e.g., class semantics can
  ESR> probably be compiled away.  Not a trivial problem, but humor
  ESR> me.)

If I guess at what you mean -- a fuzzy notion that the underlying type
system can support both languages -- then I submit that most of the
hard problems are indeed here.

  ESR> Let's further suppose that we have a callout mechanism from the
  ESR> Parrot interpreter core to the Perl or Python runtime's C level
  ESR> that can pass out Python/Perl types and return them.

Not quite sure what you mean here.

  ESR> Given these two premises, what other problems are there?

  ESR> I can see one: garbage collection.  What others are there?

I think you mean memory management in general, not just GC.  Others:
thread model, interpreter management (such as creating embedded
interpreter objects).
Jeremy

From greg@cosc.canterbury.ac.nz  Mon Jul 30 23:58:59 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2001 10:58:59 +1200 (NZST)
Subject: [Python-Dev] Nostalgic Versions
In-Reply-To: <15205.37144.824975.214559@beluga.mojam.com>
Message-ID: <200107302258.KAA00589@s454.cosc.canterbury.ac.nz>

Skip Montanaro :

> I'll wager Moshe is planning on "fixing" division. ;-)

Guido, why don't you just lend Moshe your time machine?
Then he can go back and fix division in 0.1 and the
whole problem will disappear from the timeline.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From aahz@rahul.net  Tue Jul 31 00:01:06 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Mon, 30 Jul 2001 16:01:06 -0700 (PDT)
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <200107302042.f6UKg5H03826@odiug.digicool.com> from "Guido van Rossum" at Jul 30, 2001 04:42:05 PM
Message-ID: <20010730230106.0E7E799CA4@waltz.rahul.net>

Guido van Rossum wrote:
>
> In the IBM model that Aahz is implementing, decimal numbers are not
> necessarily exact, but (if I understand correctly) you can set a
> context flag that causes an exception to be raised when the result of
> an operation on two exact inputs is inexact.  This can happen when
> e.g. a multiplication result exceeds the number of significant digits
> specified in the context -- then truncation is applied like for binary
> floats.

Rounding, actually (unless the specified context (which is not the
default) requests truncation), but yes.
--
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>   http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind
someone else having the last self-righteous whine.
From thomas@xs4all.net  Tue Jul 31 00:02:56 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 01:02:56 +0200
Subject: [Python-Dev] pep-discuss
In-Reply-To: <200107301740.f6UHe6K03226@odiug.digicool.com>
Message-ID: <20010731010256.F20676@xs4all.nl>

On Mon, Jul 30, 2001 at 01:40:06PM -0400, Guido van Rossum wrote:

[ Where to discuss PEPs ]

[Aahz]
> > While what you say makes sense, overall, there are a lot of people (me
> > included) who prefer discussion on newsgroups, and I can't quite see
> > creating a newsgroup for PEP discussions yet.  Call me -0.25 for kicking
> > discussion off c.l.py and +0.25 for getting it off python-dev.

> For me personally, it would just be another list to follow, no matter
> where it happens, so consider me -0.  I won't object if a majority on
> python-dev wants this though.

I'd like to second that, with one minor addition: no crossposting,
*please*.  I got kind of fed up with the iterators discussions when I got
in after a long weekend, and had to read through three copies of several
threads (the iterators list, python-list and python-dev) and all were
sufficiently long (and only quickly skimmed by me in all cases) that I
couldn't remember whether I'd already seen that message... I ended up
skipping whole discussions, which defeated the point of being on the list.

If people keep up the crossposting, I'd rather have the discussions take
place on python-dev or python-list to start with.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From sdm7g@Virginia.EDU  Tue Jul 31 00:24:34 2001
From: sdm7g@Virginia.EDU (Steven D. Majewski)
Date: Mon, 30 Jul 2001 19:24:34 -0400 (EDT)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730051831.B1122@thyrsus.com>
Message-ID: 

Since python does so much by name (I think locals are the only place
where name lookup is compiled away), dictionary lookup is pretty
fundamental to the runtime, and is used by a number of opcodes.  No
reason that dictionary lookup couldn't be part of the common runtime,
or else tucked behind an abstract interface common to Python and Perl
lookups, but it's something to keep in mind if the VM is going to
operate on a similar level to the current one.

But then, I've always thought that one of the problems with trying to
optimize Python was that the VM was too high level.  If some sort of
Forth-like extensible threaded code were used, you could build the
current opcodes from lower level primitives.

Re: stack vs. register machines:

Some Forth implementations cache some of the top of stack in
registers, but the more you try to cache, the hairier it gets.  (But
you can figure out the bookkeeping once and automatically generate the
code variations.)  You might take a look at Anton Ertl's VMGEN:

| Vmgen generates much of the code for efficient virtual machine (VM)
| interpreters from simple descriptions of the VM instructions.

[ One of the nice/useful features of the Forth VM is the PFA/CFA
pairing: PFA is "parameter field address" and points to the code (VM
or native) to be executed.  CFA is "code field address" and points to
the code to interpret what's in the parameter field.  For threaded
code, it points to the threaded code interpreter; for native code, it
points to the PFA -- i.e. native code is 'self interpreting'.

BUILDS/DOES in Forth creates a data type (BUILDS) and defines code to
address the data type (DOES) that is pointed to by the CFA -- an early
but very primitive object-orientation, but with only one method (later
Forths added QUADS and other methods to have separate ACCESS/UPDATE
(read/write) methods.
]

Re: Other VM implementations:

I'm not very familiar with the internals of Squeak, but I suspect that
it's worth looking at.  They are, in any case, interested in some of
the same sort of things.  (There was a recent thread about MIT's
StarLogo -- which was originally written for the Mac using (I think)
Lisp, and then a portable version was done using Java, but they were
disappointed in performance, and I think they are looking at using
Squeak now.)

Scheme48 is probably considered the best portable byte-code Scheme
implementation.  (Don't know anything about its internals myself.)

A lot of other people who have tried using the Java VM for other
languages have had complaints about various things that are difficult
or impossible.  (Scheme folks couldn't have full call/cc, and there
were two different attempts to add generics to Java -- one involved
adding special bytecode support, and the other (Pizza -- now GJ --
Generic Java) tried to stick with a standard Java VM.)

-- Steve majewski

From thomas@xs4all.net  Tue Jul 31 00:24:32 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 01:24:32 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730051831.B1122@thyrsus.com>
Message-ID: <20010731012432.G20676@xs4all.nl>

On Mon, Jul 30, 2001 at 05:18:31AM -0400, Eric S. Raymond wrote:

> Let's suppose, for the sake of the design discussion, that we can make
> the type ontologies of the Perl and Python bytecode match up.

I'm afraid I'll have to side with Jeremy when I say, "What?"

> Let's further suppose that we have a callout mechanism from the Parrot
> interpreter core to the Perl or Python runtime's C level that can pass out
> Python/Perl types and return them.

> Given these two premises, what other problems are there?

> I can see one: garbage collection.  What others are there?

As a midnight braindump: What about Perl's 'dynamic' (or 'really JIT')
compilation ?
The incessant weak typing -- would this be part of the Perl side of
Parrot, or part of the Parrot types ?  The differences in the regex
engine; in Python, regular expressions are optional.  Also, the Perl
engine has some features SRE hasn't, yet, and vice versa (last I
checked, Perl's regexps didn't do unicode or named groups.)  And what
about Perl's 'Taint' mode ?  I don't see how you can emulate that on
top of the Parrot runtime, as it's a tag that gets carried into
operations.  And I won't even start with Perl's more archaic features,
that change the whole working of the interpreter.

You mentioned regular expressions as an upside for Python, from this
'merger'.  Why is that ?  We have a good regex engine, and it's tuned
to Python's needs.  Do we need 'regex literals' ?  Why ?  And why
would we need a merger with Perl for that, anyway -- I've seen some
arbitrary-type-literals suggestions come by in the last couple of days
that would make it possible :-)

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From Mark.Favas@per.dem.csiro.au  Tue Jul 31 00:27:40 2001
From: Mark.Favas@per.dem.csiro.au (Favas, Mark (EM, Floreat))
Date: Tue, 31 Jul 2001 07:27:40 +0800
Subject: [Python-Dev] Picking on platform fmod
Message-ID: <51716131991ED5118CDE00B0D02351865F94@MOORT>

[Tim asks for platforms from Mars, unaccountably including Tru64 Unix
in there]

No problems on Tru64 (v4.0F), no problems on FreeBSD 4.3-RELEASE, no
problems on Solaris 8.  More specifically, OK on:

OSF1 erebus V4.0 1229 alpha
FreeBSD teche 4.3-RELEASE FreeBSD 4.3-RELEASE (Intel)
SunOS asafoetida 5.8 Generic_108528-09 sun4u sparc SUNW,Ultra-60

Mark C Favas
CSIRO Exploration & Mining
Private Bag No. 5
Wembley, Western Australia 6913
Phone - +61 8 93336268

From thomas@xs4all.net  Tue Jul 31 00:34:48 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 01:34:48 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107302251.KAA00585@s454.cosc.canterbury.ac.nz>
References: <200107302251.KAA00585@s454.cosc.canterbury.ac.nz>
Message-ID: <20010731013448.H20676@xs4all.nl>

On Tue, Jul 31, 2001 at 10:51:35AM +1200, Greg Ewing wrote:

> Skip Montanaro :
> > The main stumbling block was that pesky "from module import *"
> > statement.  It could push an unknown quantity of stuff onto the
> > stack

> Are you *sure* about that?  I'm pretty certain it can't
> be true, since the compiler has to know at all times
> how much is on the stack, so it can decide how much
> stack space is needed.

I think Skip meant it does an arbitrary number of load-onto-stack
store-into-namespace operations.  Skip, you'll be glad to know that's no
longer true :)  Since 2.0 (or when was it that we introduced 'import as' ?)
import-* is not a special case of 'IMPORT_FROM', but rather a separate
opcode that doesn't touch the stack.  'IMPORT_FROM' is now only used to
push a given name from TOS onto the stack:

>>> def eggs():
...     from stat import a, b
>>> dis.dis(eggs)
...
          9 IMPORT_NAME              0 (stat)
         12 IMPORT_FROM              1 (a)
         15 STORE_FAST               1 (a)
         18 IMPORT_FROM              2 (b)
         21 STORE_FAST               0 (b)
         24 POP_TOP
...
>>> def spam():
...     from stat import *
>>> dis.dis(spam)
...
          6 LOAD_CONST               1 (('*',))
          9 IMPORT_NAME              0 (stat)
         12 IMPORT_STAR
...

Bloody hell, what's that LOAD_CONST doing there ?  I think I found a bug ;P
Sigh... Sleep first, fix later.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From sdm7g@Virginia.EDU  Tue Jul 31 00:39:47 2001
From: sdm7g@Virginia.EDU (Steven D. Majewski)
Date: Mon, 30 Jul 2001 19:39:47 -0400 (EDT)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730162901.F9578@ute.cnri.reston.va.us>
Message-ID: 

On Mon, 30 Jul 2001, Andrew Kuchling wrote:

> If you must have a portable bytecode format, why not use the JVM?
> Perhaps it's not optimal, but it works reasonably well, has a few
> reasonably complete free implementations that are mostly strangling
> due to lack of manpower, has some support in GCC 3.0, and is actually
> deployed in browsers and on people's systems *right now*.  I fail to
> see why we should run after some mythical Perl/Python bytecode that
> would have to be 1) designed 2) implemented 3) debugged 4) actually
> made available to users 5) actually downloaded by users.  (Much the
> same objections apply to .NET for Unix.)

Some of the folks who have done other languages on the JVM have
complained about limitations of the Java VM when it comes to
supporting features of other languages.

Supposedly, Microsoft considered some of those critiques when
designing the C# runtime & VM.

If, in fact, they have done a better job of generic VM design, then
.NET may be worth looking at.  (Especially as there is now Miguel de
Icaza's Mono project.)

(Of course, politically, that may be inviting a lot of arguments --
see the slashdot threads about whether Mono is a good idea, or is just
open source getting suckered by MS!)

-- Steve Majewski

From skip@pobox.com (Skip Montanaro)  Tue Jul 31 00:46:25 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 30 Jul 2001 18:46:25 -0500
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15205.54612.424694.5559@slothrop.digicool.com>
References: <20010730014859.A15971@thyrsus.com>
 <200107301918.f6UJIt003517@odiug.digicool.com>
 <20010730033517.A17356@thyrsus.com>
 <200107302016.f6UKGoG03676@odiug.digicool.com>
 <20010730162901.F9578@ute.cnri.reston.va.us>
 <15205.54612.424694.5559@slothrop.digicool.com>
Message-ID: <15205.61905.569827.464400@beluga.mojam.com>

    Jeremy> If we had a JVM implementation designed to support Python, there
    Jeremy> would be no need to implement most of the opcodes.  We'd only
    Jeremy> need getstatic and invokevirtual <0.2 wink>.  The typed opcodes
    Jeremy> (int, float, etc.) would never be used.

Perhaps Armin Rigo's Psyco stuff could make use of them if he chose the JVM
as his "other VM".

Skip

From pedroni@inf.ethz.ch  Tue Jul 31 00:59:22 2001
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Tue, 31 Jul 2001 01:59:22 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <20010730014859.A15971@thyrsus.com>
 <200107301918.f6UJIt003517@odiug.digicool.com>
 <20010730033517.A17356@thyrsus.com>
 <200107302016.f6UKGoG03676@odiug.digicool.com>
 <20010730162901.F9578@ute.cnri.reston.va.us>
 <15205.54612.424694.5559@slothrop.digicool.com>
 <15205.61905.569827.464400@beluga.mojam.com>
Message-ID: <006b01c11953$b94d10a0$8a73fea9@newmexico>

Hi

[Skip Montanaro]
>
>     Jeremy> If we had a JVM implementation designed to support Python, there
>     Jeremy> would be no need to implement most of the opcodes.  We'd only
>     Jeremy> need getstatic and invokevirtual <0.2 wink>.  The typed opcodes
>     Jeremy> (int, float, etc.) would never be used.
>
> Perhaps Armin Rigo's Psyco stuff could make use of them if he chose the JVM
> as his "other VM".
>

Yes, but feeding the JVM with bytecodes costs more than feeding a real
CPU or a VM written to deal quickly with little chunks of code.  JVM
dynamic loading has a verification phase, accepts only full class
definitions, and then you enter the interpretation/hotspot collecting
phase and then dynamic compilation stuff...

Samuele.

From pedroni@inf.ethz.ch  Tue Jul 31 01:04:49 2001
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Tue, 31 Jul 2001 02:04:49 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: 
Message-ID: <007701c11954$6b0017c0$8a73fea9@newmexico>

>
> Some of the folks who have done other languages on the JVM have
> complained about limitations of the Java VM when it comes to supporting
> features of other languages.
>
> Supposedly, Microsoft considered some of those critiques when designing
> the C# runtime & VM.
>
> If, in fact, they have done a better job of generic VM design, then
> .NET may be worth looking at.  (Especially as there is now
> Miguel de Icaza's Mono project.)

A question: are there already some data about what would be the actual
performance of Python.NET vs. CPython ?

> (Of course, politically, that may be inviting a lot of arguments --
> see the slashdot threads about whether Mono is a good idea, or is
> just open source getting suckered by MS!)
>

Samuele Pedroni.

From akuchlin@mems-exchange.org  Tue Jul 31 01:56:57 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 30 Jul 2001 20:56:57 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731012432.G20676@xs4all.nl>; from thomas@xs4all.net on Tue, Jul 31, 2001 at 01:24:32AM +0200
References: <20010730051831.B1122@thyrsus.com>
 <20010731012432.G20676@xs4all.nl>
Message-ID: <20010730205657.A2298@ute.cnri.reston.va.us>

On Tue, Jul 31, 2001 at 01:24:32AM +0200, Thomas Wouters wrote:

>Parrot types ? The differences in the regex engine; in Python, regular
>expressions are optional. Also, the Perl engine has some features SRE

If regex opcodes form part of the basic VM, would the main loop end up
looking like the union of ceval.c and pypcre.c/_sre.c?  The thought is
too ghastly to contemplate, though a little part of me [*] would like
to see it.

--amk

[*] "Taking it in its deepest sense, the shadow is the invisible
saurian tail that man still drags behind him.  Carefully amputated, it
becomes the healing serpent of the mysteries.  Only monkeys parade
with it."  C.G. Jung, in _The Integration of the Personality_. (1939)

From esr@thyrsus.com  Mon Jul 30 14:06:24 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 30 Jul 2001 09:06:24 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15205.58857.375440.347263@slothrop.digicool.com>; from jeremy@zope.com on Mon, Jul 30, 2001 at 06:55:37PM -0400
References: <20010730014859.A15971@thyrsus.com>
 <200107301918.f6UJIt003517@odiug.digicool.com>
 <20010730033517.A17356@thyrsus.com>
 <200107302016.f6UKGoG03676@odiug.digicool.com>
 <20010730051831.B1122@thyrsus.com>
 <15205.58857.375440.347263@slothrop.digicool.com>
Message-ID: <20010730090624.A3944@thyrsus.com>

Jeremy Hylton :

> What is a type ontology?  The definition of ontology I'm familiar with
> is too broad to be useful in understanding what you're getting at.
> I've never heard the technical term "type ontology".

I first heard it in connection with cross-language RPC.  The "type
ontology" of a language or protocol is its implicit theory of what
kinds of things there are in the universe.  It's actually a pretty
reasonable specialization of the term "ontology" in philosophy.
--
Eric S. Raymond

A man who has nothing which he is willing to fight for, nothing which
he cares about more than he does about his personal safety, is a
miserable creature who has no chance of being free, unless made and
kept so by the exertions of better men than himself.
        -- John Stuart Mill, writing on the U.S. Civil War in 1862

From esr@thyrsus.com  Mon Jul 30 14:12:08 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 30 Jul 2001 09:12:08 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731012432.G20676@xs4all.nl>; from thomas@xs4all.net on Tue, Jul 31, 2001 at 01:24:32AM +0200
References: <20010730051831.B1122@thyrsus.com>
 <20010731012432.G20676@xs4all.nl>
Message-ID: <20010730091208.C3944@thyrsus.com>

Thomas Wouters :

> > Let's suppose, for the sake of the design discussion, that we can make
> > the type ontologies of the Perl and Python bytecode match up.
>
> I'm afraid I'll have to side with Jeremy when I say, "What?"

Explained in public reply to Jeremy.
> You mentioned regular expressions as an upside for Python, from this
> 'merger'.  Why is that ?

No.  I was referring to the fact that we have *already* coopted Perl's
regexp design.
--
Eric S. Raymond

The end move in politics is always to pick up a gun.
        -- R. Buckminster Fuller

From guido@zope.com  Tue Jul 31 03:02:06 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 22:02:06 -0400
Subject: [Python-Dev] Python API version & optional features
In-Reply-To: Your message of "Tue, 31 Jul 2001 00:22:05 +0200." <200107302222.f6UMM5105688@mira.informatik.hu-berlin.de>
References: <200107302222.f6UMM5105688@mira.informatik.hu-berlin.de>
Message-ID: <200107310202.WAA10380@cj20424-a.reston1.va.home.com>

> >> I guess one could argue that extension writers should check
> >> for narrow/wide builds in their extensions before using Unicode.
> >>
> >> Since the number of Unicode extension writers is much smaller
> >> than the number of users, I think that this approach would be
> >> reasonable, provided that we document the problem clearly in the
> >> NEWS file.
>
> > OK. I approve.
>
> I'm not sure I can follow. What did you approve? That extension
> writers should check whether their Unicode build matches the one they
> get at run-time? How are they going to do that?

With an explicit call.  They know their compile-time unicode width,
they can pass that to a function defined in the main Python/C API
which asserts that the argument is the same as *its* compile-time
unicode width.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From ping@lfw.org  Tue Jul 31 03:43:45 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 30 Jul 2001 19:43:45 -0700 (PDT)
Subject: [Python-Dev] cgitb.py for Python 2.2
Message-ID: 

Hi guys.  Sorry i've been fairly quiet recently -- at least life
isn't dull.  I wanted to put in a few words for cgitb.py for your
consideration.
I think you all saw it at IPC 9 -- if you missed the presentation,
there are examples at http://www.lfw.org/python to check out.

What i'm proposing is that we toss cgitb.py into the standard library
(pretty small at about 100 lines, since all the heavy lifting is in
pydoc and inspect).  Then we can add this to site.py:

    if os.environ.has_key("GATEWAY_INTERFACE"):
        import sys, cgitb
        sys.excepthook = cgitb.excepthook

I think this is pretty safe, since GATEWAY_INTERFACE is guaranteed to
exist under the CGI specification and should never appear in any other
context.  cgitb.py is written in paranoid fashion -- if anything goes
wrong during generation of the HTML traceback, sys.stderr still goes
to the browser; and if for some reason the page gets dumped to a shell
somewhere, the original traceback is still visible in a comment at the
end of the page.

The upside is that we *automagically* get pretty tracebacks for all
the Python CGI scripts there, with zero effort from the CGI script
writers.  I think this is a really strong hook for people getting
started with Python.  No more "internal server error" messages
followed by the annoying task of inserting
"print 'Content-Type: text/html\n\n'" into all your scripts!  As for
me, i've probably done this hundreds of times now, and would love to
stop doing it.
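
[Editor's note: the mechanism being proposed is just a replacement for
sys.excepthook.  A minimal stand-in sketch of the idea follows -- this is
not cgitb's actual code; html_excepthook is an invented name, and the
snippet is written in modern Python 3 idiom rather than the 2.2-era
has_key() style quoted above.]

    import os
    import sys
    import traceback

    def html_excepthook(etype, value, tb):
        # Emit the CGI header first, so the browser renders our page
        # instead of the server's "internal server error" message.
        print("Content-Type: text/html")
        print()
        detail = "".join(traceback.format_exception(etype, value, tb))
        print("<pre>%s</pre>" % detail)

    # Install the hook only when actually running under a CGI gateway.
    if "GATEWAY_INTERFACE" in os.environ:
        sys.excepthook = html_excepthook

Any uncaught exception in the script then reaches the browser as a
readable page instead of a server error.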

I anticipate a possible security concern (as this shows bits of your
source code to strangers when problems happen).  So i have tried to
address that by providing a SECRET flag in cgitb that causes the
tracebacks to get written to files instead of the Web browser.

Opinions and suggestions are welcomed!  (I'm looking at the good
stuff that the WebWare people have done with it, and i plan to
merge in their improvements.  For the HTML-heads out there in
particular, i'm looking for your thoughts on the reset() routine.)


-- ?!ng



From barry@zope.com  Tue Jul 31 04:04:03 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 30 Jul 2001 23:04:03 -0400
Subject: [Python-Dev] cgitb.py for Python 2.2
References: 
Message-ID: <15206.8227.652539.471067@anthem.wooz.org>

>>>>> "KY" == Ka-Ping Yee  writes:

    KY> What i'm proposing is that we toss cgitb.py into the standard
    KY> library (pretty small at about 100 lines, since all the heavy
    KY> lifting is in pydoc and inspect).  Then we can add this to
    KY> site.py:

No time right now to look at it, but I remember it looked pretty cool
at IPC9.  I'd like to merge in some of the ideas I've developed in
Mailman's driver script, which prints out the environment and some
other sys information.  driver always prints to a log file and
optionally to stdout (it has a STEALTH_MODE variable that's probably
equivalent to your SECRET).
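
[Editor's note: a rough sketch of the always-log/optionally-echo
behaviour Barry describes.  This is not Mailman's actual driver code;
LOG_FILE, STEALTH_MODE, and log_excepthook are invented names, and the
snippet uses current Python 3 idiom.]

    import sys
    import traceback

    LOG_FILE = "error.log"   # hypothetical log location
    STEALTH_MODE = True      # if True, log only; say nothing on stdout

    def log_excepthook(etype, value, tb):
        # Always append the full traceback to the log file...
        text = "".join(traceback.format_exception(etype, value, tb))
        with open(LOG_FILE, "a") as log:
            log.write(text)
        # ...and optionally echo it to stdout as well.
        if not STEALTH_MODE:
            sys.stdout.write(text)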

One thing I tried very hard to do was to make driver bulletproof, so
that it only imported a very minimal amount of stuff, and that /any/
exception along the way would get caught and not allowed to percolate
up out of the top frame (which would cause a non-zero exit status and
unhelpful message in the browser).  About the only thing that isn't
caught are exceptions importing sys, but if that happens you have
bigger problems! :)

I'll take a closer look at cgitb.py when I get a chance, but I'm
generally +1 on the idea.

-Barry


From gnat@oreilly.com  Tue Jul 31 04:08:47 2001
From: gnat@oreilly.com (Nathan Torkington)
Date: Mon, 30 Jul 2001 20:08:47 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730205657.A2298@ute.cnri.reston.va.us>
References: <20010730051831.B1122@thyrsus.com>
 <20010731012432.G20676@xs4all.nl>
 <20010730205657.A2298@ute.cnri.reston.va.us>
Message-ID: <15206.8511.147000.832644@gargle.gargle.HOWL>

Andrew Kuchling writes:
> If regex opcodes form part of the basic VM, would the main loop end up
> looking like the union of ceval.c and pypcre.c/_sre.c?  The thought is
> too ghastly to contemplate, though a little part of me [*] would like
> to see it.

(perl guy speaking alert) The plan for perl6 is to implement the
regular expression engine as opcodes.  We feel this would be cleaner
and faster than having the essentially separate module that we have
right now.  I think our current perl5 project manager was the one who
said that we have no idea how inefficient our current RE engine is,
because it's been "optimized" to the point where it's impossible to
read.

The core loop would just be the usual opcode dispatch loop ("call the
function for the current operation, which returns the next
operation").  The only difference is that some of the opcodes would be
specific to RE matches.  (I'm unclear on how much special logic RE
opcodes involve--it may be possible to implement REs with the
operations that regular language features like loops and tests
require).
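(Editorial aside: the dispatch style Nat describes -- "call the function for
the current operation, which returns the next operation" -- can be sketched in
a few lines of Python. The opcode names and VM layout below are purely
illustrative, not taken from Perl 6 or CPython.)

```python
# Minimal sketch of a "handler returns the next opcode" dispatch loop.
# All names here are hypothetical.

class VM:
    def __init__(self, code, consts):
        self.code, self.consts = code, consts
        self.pc, self.stack = 0, []

    def run(self):
        op = self.code[self.pc]
        while op is not None:          # each handler returns the next opcode
            op = OPS[op](self)
        return self.stack.pop()

def _next(vm):
    # Fetch the opcode at the current program counter, or None at the end.
    return vm.code[vm.pc] if vm.pc < len(vm.code) else None

def op_load(vm):
    vm.pc += 1                          # step over the operand...
    vm.stack.append(vm.consts[vm.code[vm.pc]])
    vm.pc += 1
    return _next(vm)

def op_add(vm):
    b, a = vm.stack.pop(), vm.stack.pop()
    vm.stack.append(a + b)
    vm.pc += 1
    return _next(vm)

OPS = {"LOAD": op_load, "ADD": op_add}

program = ["LOAD", 0, "LOAD", 1, "ADD"]   # push 2, push 3, add
result = VM(program, consts=[2, 3]).run()
print(result)
```

RE-specific opcodes would simply be more entries in the OPS table, sharing the
same loop with the general-purpose ones.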

Nat



From gnat@oreilly.com  Tue Jul 31 04:12:04 2001
From: gnat@oreilly.com (Nathan Torkington)
Date: Mon, 30 Jul 2001 20:12:04 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731012432.G20676@xs4all.nl>
References: <20010730051831.B1122@thyrsus.com>
 <20010731012432.G20676@xs4all.nl>
Message-ID: <15206.8708.811000.468489@gargle.gargle.HOWL>

Thomas Wouters writes:
> Also, the Perl engine has some features SRE hasn't, yet, and vice
> versa (last I checked, Perl's regexps didn't do unicode or named
> groups.)

Perl's REs now do Unicode.  Perl 6's REs will do named groups.

> And I won't even start with Perl's more archaic features, that
> change the whole working of the interpreter.

Those are going away.  Perl people hate them as much as you do--the
only time they're used now is to make deliberately hideous code, and
hardly anyone will seriously lament the passing of that ability.  No
more "change the starting position for subscripts", no more "change
all RE matches globally", and so on.

Nat



From gnat@oreilly.com  Tue Jul 31 04:15:34 2001
From: gnat@oreilly.com (Nathan Torkington)
Date: Mon, 30 Jul 2001 20:15:34 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730162901.F9578@ute.cnri.reston.va.us>
References: <20010730014859.A15971@thyrsus.com>
 <200107301918.f6UJIt003517@odiug.digicool.com>
 <20010730033517.A17356@thyrsus.com>
 <200107302016.f6UKGoG03676@odiug.digicool.com>
 <20010730162901.F9578@ute.cnri.reston.va.us>
Message-ID: <15206.8918.603000.448728@gargle.gargle.HOWL>

Andrew Kuchling writes:
> There's also the cultural difference between Python's "write it
> clearly and then optimize it" and Perl's "let's write clever optimized
> code right from the start".  Perhaps this can be bridged, perhaps not.

The people designing and implementing perl6 have already agreed on a
"do it clean, then make it faster" approach.  We can all see the
problems with the current Perl internals, and have no desire to repeat
the mistakes of the past.

There may or may not be an impedance mismatch between the two languages
(Perl's flexitypes might be one of the sticking points) but this won't
be one of them.

Nat



From esr@thyrsus.com  Mon Jul 30 16:51:03 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 30 Jul 2001 11:51:03 -0400
Subject: [Python-Dev] cgitb.py for Python 2.2
In-Reply-To: ; from ping@lfw.org on Mon, Jul 30, 2001 at 07:43:45PM -0700
References: 
Message-ID: <20010730115103.A2052@thyrsus.com>

Ka-Ping Yee :
> The upside is that we *automagically* get pretty tracebacks for all
> the Python CGI scripts there, with zero effort from the CGI script
> writers.  I think this is a really strong hook for people getting
> started with Python.

I've been to look at the cgitb page.  My jaw dropped open.

+1
-- 
		Eric S. Raymond

The abortion rights and gun control debates are twin aspects of a deeper
question --- does an individual ever have the right to make decisions
that are literally life-or-death?  And if not the individual, who does?


From paulp@ActiveState.com  Tue Jul 31 04:49:50 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 30 Jul 2001 20:49:50 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <20010730051831.B1122@thyrsus.com> <20010731012432.G20676@xs4all.nl> <20010730205657.A2298@ute.cnri.reston.va.us>
Message-ID: <3B662ADD.9E701795@ActiveState.com>

Andrew Kuchling wrote:
> 
>...
> 
> If regex opcodes form part of the basic VM, would the main loop end up
> looking like the union of ceval.c and pypcre.c/_sre.c?  The thought is
> too ghastly to contemplate, though a little part of me [*] would like
> to see it.

Welcome to Perl. :)

I don't really understand it but here are references that might help:

http://aspn.activestate.com/ASPN/Mail/Message/638953
http://aspn.activestate.com/ASPN/Mail/Message/639000
http://aspn.activestate.com/ASPN/Mail/Message/639048

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From paulp@ActiveState.com  Tue Jul 31 05:17:04 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 30 Jul 2001 21:17:04 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References:  <007701c11954$6b0017c0$8a73fea9@newmexico>
Message-ID: <3B663140.41CB9DD7@ActiveState.com>

Samuele Pedroni wrote:
> 
>...
> A question: are there already some data about
> what would be the actual performance of Python.NET vs. CPython ?

I think it is safe to say that the current version of Python.NET is
slower than Jython. It hasn't been optimized as much as Jython, though, so
we might be able to get it as fast. But I don't think that there
is anything in the .NET runtime that makes it a great deal better than
the JVM for dynamic languages. The only difference is that Microsoft
seems more aware of the problem and may move to correct it whereas I
have a feeling that explicit support for our languages would dilute
Sun's 100% Java marketing campaign. Also, the .NET CLR is standardized
at ECMA so we could (at least in theory!) go to the meetings and try to
influence version 2.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From m@moshez.org  Tue Jul 31 05:15:46 2001
From: m@moshez.org (Moshe Zadka)
Date: Tue, 31 Jul 2001 07:15:46 +0300
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730051831.B1122@thyrsus.com>
References: <20010730051831.B1122@thyrsus.com>, <20010730014859.A15971@thyrsus.com> <200107301918.f6UJIt003517@odiug.digicool.com> <20010730033517.A17356@thyrsus.com> <200107302016.f6UKGoG03676@odiug.digicool.com>
Message-ID: 

On Mon, 30 Jul 2001, "Eric S. Raymond"  wrote:

> Let's further suppose that we have a callout mechanism from the Parrot 
> interpreter core to the Perl or Python runtime's C level that can pass out 
> Python/Perl types and return them.
> 
> Given these two premises, what other problems are there?

This solution sounds like just taking two VM interpreters and forcing
them together by having the first byte of the instruction be "Python opcode"
or "Perl opcode". You get none of the wins you were aiming for.

> I can see one: garbage collection.

How is GC a problem? Python never promised a specific GC mechanism,
so as long as you have something which collects garbage, Python is
fine.
-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE
Learn Python! http://www.ibiblio.org/obp/thinkCSpy


From m@moshez.org  Tue Jul 31 05:18:10 2001
From: m@moshez.org (Moshe Zadka)
Date: Tue, 31 Jul 2001 07:18:10 +0300
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: 
References: 
Message-ID: 

On Mon, 30 Jul 2001, "Steven D. Majewski"  wrote:

> Scheme48 is probably considered the best portable byte-code Scheme
> implementation. ( Don't know anything about its internals myself )

Last I heard (admittedly, >1 yr. ago), it didn't support 64 bit
architectures.
-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE
Learn Python! http://www.ibiblio.org/obp/thinkCSpy


From greg@cosc.canterbury.ac.nz  Tue Jul 31 06:00:45 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2001 17:00:45 +1200 (NZST)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: 
Message-ID: <200107310500.RAA00648@s454.cosc.canterbury.ac.nz>

"Steven D. Majewski" :

> But then, I've always thought that one of the problems with
> trying to optimize Python was that the VM was too high level. 

No, the problem is that Python is just too darn dynamic!
This is a feature of the language, not just the VM.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@zope.com  Tue Jul 31 07:22:13 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 02:22:13 -0400
Subject: [Python-Dev] cgitb.py for Python 2.2
In-Reply-To: Your message of "Mon, 30 Jul 2001 19:43:45 PDT."
 
References: 
Message-ID: <200107310622.CAA11742@cj20424-a.reston1.va.home.com>

> Sorry i've been fairly quiet recently -- at least life isn't dull.

You still have a few SF bugs and patches assigned!  How about
addressing those?!

> I wanted to put in a few words for cgitb.py for your consideration.
> 
> I think you all saw it at IPC 9 -- if you missed the presentation,
> there are examples at http://www.lfw.org/python to check out.

Yeah, it's cool.

> What i'm proposing is that we toss cgitb.py into the standard library
> (pretty small at about 100 lines, since all the heavy lifting is in
> pydoc and inspect).  Then we can add this to site.py:
> 
>     if os.environ.has_key("GATEWAY_INTERFACE"):
>         import sys, cgitb
>         sys.excepthook = cgitb.excepthook

Why not add this to cgi.py instead?  The site.py initialization is
accumulating a lot of cruft, and I don't like new additions that are
irrelevant for most apps (CGI is a tiny niche for Python IMO).  (I
also think all the stuff that's only for interactive mode should be
moved off to another module that is only run in interactive mode.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Tue Jul 31 07:29:36 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 02:29:36 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Your message of "Mon, 30 Jul 2001 21:17:04 PDT."
 <3B663140.41CB9DD7@ActiveState.com>
References:  <007701c11954$6b0017c0$8a73fea9@newmexico>
 <3B663140.41CB9DD7@ActiveState.com>
Message-ID: <200107310629.CAA11818@cj20424-a.reston1.va.home.com>

> Also, the .NET CLR is standardized at ECMA so we could (at least in
> theory!) go to the meetings and try to influence version 2.

Notice the addition "in theory".  In practice, this is BS.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Tue Jul 31 08:37:27 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 09:37:27 +0200
Subject: [Python-Dev] Python API version & optional features
References: <200107302222.f6UMM5105688@mira.informatik.hu-berlin.de>
Message-ID: <3B666037.6A813780@lemburg.com>

"Martin v. Loewis" wrote:
> 
> >> I guess one could argue that extension writers should check
> >> for narrow/wide builds in their extensions before using Unicode.
> >>
> >> Since the number of Unicode extension writers is much smaller
> >> than the number of users, I think that this approach would be
> >> reasonable, provided that we document the problem clearly in the
> >> NEWS file.
> 
> > OK.  I approve.
> 
> I'm not sure I can follow. What did you approve? 

To use macros in unicodeobject.h which then map all interface names
to either PyUnicodeUC2_* or PyUnicodeUCS4_*. The linker will then
report the mismatch in interfaces.

> That extension
> writers should check whether their Unicode build matches the one they
> get at run-time? How are they going to do that?

They would have to use at least one of the PyUnicode_* APIs in
their code. 

I think it would also be a good idea to provide
a non-mangled PyUnicode_UnicodeSize() API which would then return
the number of bytes occupied by Py_UNICODE in the Python build.
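(Editorial aside: at the Python level, the build variant being distinguished
here is visible without any C API call -- sys.maxunicode reports 0xFFFF on a
narrow build and 0x10FFFF on a wide build. A minimal check:)

```python
import sys

# Narrow (UCS-2) builds cap code points at U+FFFF; wide (UCS-4) builds
# allow the full range up to U+10FFFF.
if sys.maxunicode == 0xFFFF:
    build = "narrow (UCS-2)"
else:
    build = "wide (UCS-4)"
print("This interpreter is a %s build" % build)
```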

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal@lemburg.com  Tue Jul 31 09:14:53 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 10:14:53 +0200
Subject: [Python-Dev] Python API version & optional features
References: <3B655980.948BCDEF@lemburg.com> <15205.25545.353887.299167@cj42289-a.reston1.va.home.com> <3B6567A3.E386EAB9@lemburg.com> <200107301427.f6UERW802779@odiug.digicool.com>
 <3B65765A.9706A4A2@lemburg.com> <200107301547.f6UFlhB02991@odiug.digicool.com>
Message-ID: <3B6668FD.DA986A28@lemburg.com>

Guido van Rossum wrote:
> 
> > > Hm, the "u" argument parser is a nasty one to catch.  How likely is
> > > this to be the *only* reference to Unicode in a particular extension?
> >
> > It is not very likely but IMHO possible for e.g. extensions
> > which rely on the fact that wchar_t == Py_UNICODE and then do
> > direct interfacing to some other third party code.
> >
> > I guess one could argue that extension writers should check
> > for narrow/wide builds in their extensions before using Unicode.
> >
> > Since the number of Unicode extension writers is much smaller
> > than the number of users, I think that this approach would be
> > reasonable, provided that we document the problem clearly in the
> > NEWS file.
> 
> OK.  I approve.

Great ! I'll go ahead and fix unicodeobject.h.
 
> > Hmm, that would probably not make UCS-4 builds very popular ;-)
> 
> Do you have any reason to assume that it would be popular otherwise?
> :-) :-) :-)

Oh, I do hope that people try out the UCS-4 builds. They may not
be all that interesting yet, but I believe that for Asian users
they do have some advantages.
 
> > > These warnings should use the warnings framework, by the way, to make
> > > it easier to ignore a specific warning.  Currently it's a hard write
> > > to stderr.
> >
> > Using the warnings framework would indeed be a good idea (many older
> > extensions work just fine even with later API levels; the warnings
> > are annoying, though) !
> 
> Exactly.
> 
> I'm not going to make the change, but it should be a two-liner in
> Python/modsupport.c:Py_InitModule4().

I'll look into this as well.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal@lemburg.com  Tue Jul 31 09:30:20 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 10:30:20 +0200
Subject: [Python-Dev] Revised decimal type PEP
References: <0107301106520A.02216@fermi.eeel.nist.gov>
Message-ID: <3B666C9C.4400BD9C@lemburg.com>

Michael McLay wrote:
> 
> PEP: 2XX
> Title: Adding a Decimal type to Python
> Version: $Revision:$
> Author: mclay@nist.gov 
> Status: Draft
> Type: ??
> Created: 25-Jul-2001
> Python-Version: 2.2
> 
> Introduction
> 
>     This PEP describes the addition of a decimal number type to Python.
> 
>     ...
>
> Implementation
> 
>     The tokenizer will be modified to recognize number literals with
>     a 'd' suffix and a decimal() function will be added to __builtins__.

How will you be able to define the precision of decimals ? Implicit
by providing a decimal string with enough 0s to let the parser
deduce the precision ? Explicit like so: decimal(12, 5) ?

Also, what happens to the precision of the decimal object resulting
from numeric operations ?
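(Editorial aside: the decimal type that eventually landed in Python 2.4 as
the decimal module, via PEP 327, answers this question with a context object:
precision is a property of the arithmetic context, not of individual
literals, and it governs the results of operations.)

```python
from decimal import Decimal, getcontext

getcontext().prec = 5          # precision set on the context, not the literal
exact = Decimal("12.3") + Decimal("12.2")
rounded = Decimal(1) / Decimal(3)
print(exact)    # exact results keep their natural width
print(rounded)  # inexact results round to the context precision (5 digits)
```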

>     A decimal number can be used to represent integers and floating point
>     numbers and decimal numbers can also be displayed using scientific
>     notation. Examples of decimal numbers include:
>     
>     ...
>
>     This proposal will also add an optional  'b' suffix to the
>     representation of binary float type literals and binary int type
>     literals.

Hmm, I don't quite grasp the need for the 'b'... numbers without
any modifier will work the same way as they do now, right ?
 
>     ...
>
>     Expressions that mix binary floats with decimals introduce the
>     possibility of unexpected results because the two number types use
>     different internal representations for the same numerical value. 

I'd rather have this explicit in the sense that you define which
assumptions will be made and what issues arise (rounding, truncation,
loss of precision, etc.).

>     The
>     severity of this problem is dependent on the application domain.  For
>     applications that normally use binary numbers the error may not be
>     important and the conversion should be done silently.  For newbie
>     programmers a warning should be issued so the newbie will be able to
>     locate the source of a discrepancy between the expected results and
>     the results that were achieved.  For financial applications the mixing
>     of floating point with binary numbers should raise an exception.
> 
>     To accommodate the three possible usage models the python interpreter
>     command line options will be used to set the level for warning and
>     error messages. The three levels are:
> 
>     promiscuous mode,   -f or  --promiscuous
>     safe mode           -s or --safe
>     pedantic mode       -p or --pedantic

How about a generic option:

	--numerics:[loose|safe|pedantic] or -n:[l|s|p]

>     The default setting will be set to the safe setting. In safe mode
>     mixing decimal and binary floats in a calculation will trigger a warning
>     message.
> 
>     >>> type(12.3d + 12.2b)
>     Warning: the calculation mixes decimal numbers with binary floats
>     
> 
>     In promiscuous mode warnings will be turned off.
> 
>     >>> type(12.3d + 12.2b)
>     
> 
>     In pedantic mode warning from safe mode will be turned into exceptions.
> 
>     >>> type(12.3d + 12.2b)
>     Traceback (innermost last):
>       File "", line 1, in ?
>     TypeError: the calculation mixes decimal numbers with binary floats
> 
> Semantics of Decimal Numbers
> 
>     ??

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal@lemburg.com  Tue Jul 31 09:05:14 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 10:05:14 +0200
Subject: [Python-Dev] pep-discuss
References: <20010730154936.AE36899C94@waltz.rahul.net>
Message-ID: <3B6666BA.7F774C46@lemburg.com>

Aahz Maruch wrote:
> 
> Paul Prescod wrote:
> >
> > We've talked about having a mailing list for general PEP-related
> > discussions. Two things make me think that revisiting this would be a
> > good idea right now.
> >
> > First, the recent loosening up of the python-dev rules threatens the
> > quality of discussion about bread and butter issues such as patch
> > discussions and process issues.
> >
> > Second, the flamewar on python-list basically drowned out the usual
> > newbie questions and would give a person coming new to Python a very
> > negative opinion about the language's future and the friendliness of the
> > community. I would rather redirect as much as possible of that to a list
> > that only interested participants would have to endure.
> 
> While what you say makes sense, overall, there are a lot of people (me
> included) who prefer discussion on newsgroups, and I can't quite see
> creating a newsgroup for PEP discussions yet.  Call me -0.25 for kicking
> discussion off c.l.py and +0.25 for getting it off python-dev.

I don't really mind having PEP discussions on both c.l.p (to get
user feedback) and python-dev (for the purpose of reaching 
consensus). After all, python-dev is about developing Python,
so PEP discussion is very much on topic.

Note that a filter on "python-dev" in the List-ID field and
"PEP" in the subject should pretty much filter out all
PEP discussions from python-dev if you don't want to participate
in them.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From paulp@ActiveState.com  Tue Jul 31 09:47:03 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 31 Jul 2001 01:47:03 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References:  <007701c11954$6b0017c0$8a73fea9@newmexico>
 <3B663140.41CB9DD7@ActiveState.com> <200107310629.CAA11818@cj20424-a.reston1.va.home.com>
Message-ID: <3B667087.EBBE8938@ActiveState.com>

Guido van Rossum wrote:
> 
> > Also, the .NET CLR is standardized at ECMA so we could (at least in
> > theory!) go to the meetings and try to influence version 2.
> 
> Notice the addition "in theory".  In practice, this is BS.

It depends on the rules and politics of each particular standards group.
It is fundamentally a social activity. It also depends how much effort
you are willing to put into promoting your cause. Sam Ruby is chair of
the ECMA CLI group. He is a big scripting language fan. 

http://www2.hursley.ibm.com/tc39/

Also note the presence of Mike Cowlishaw of REXX fame and Dave Raggett
of the W3C.

Working within a standards body is a gamble. It can pay off big or it
can completely fail. We might find Microsoft our strongest ally -- they
have always been interested in having the scripting languages work well
on their platforms. They would hate to give programmers an excuse to
stick to Unix or the JVM.

I don't personally know enough about this particular circumstance to
know whether there is any possibility of significantly influencing
version 2 or not. Maybe the gamble isn't worth the effort. But I
wouldn't dismiss it out of hand.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From mwh@python.net  Tue Jul 31 10:23:48 2001
From: mwh@python.net (Michael Hudson)
Date: 31 Jul 2001 05:23:48 -0400
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
In-Reply-To: Samuele Pedroni's message of "Mon, 30 Jul 2001 21:59:35 +0200 (MET DST)"
References: <200107301959.VAA11733@core.inf.ethz.ch>
Message-ID: <2m4rrt78pn.fsf@starship.python.net>

Samuele Pedroni  writes:

> ...
> > > > 
> > > > Does codeop currently work in Jython?  The solution should continue to
> > > > work in Jython then. 
> > > We have our interface compatible version of codeop that works.
> > 
> > Would implementing the new interfaces I sketched out for codeop.py be
> > possible in Jython?  That's the bit I care about, not so much the
> > interface to __builtin__.compile.
> Yes, it's possible.

Good; hopefully we can get somewhere then.

> > > > Does Jython support the same flag bit values as
> > > > CPython?  If not, Paul Prescod's suggestion to use keyword arguments
> > > > becomes very relevant.
> > > we support a subset of the co_flags, CO_NESTED e.g. is there with the same
> > > value.
> > > 
> > > But the embedding API is very different, my implementation of nested
> > > scopes does not define any Py_CF... flags, we have an internal CompilerFlags
> > > object but is more similar to PyFutureFeatures ...
> > 
> > Is this object exposed to Python code at all?
> Not publicily, but in Jython the separating line is a bit different,
> because public java classes are always accessible from jython,
> even most of the internals. That does not mean and every use of that
> is welcome and supported.

Ah, of course.  I'd forgotten how cool Jython was in some ways.

> >  One approach would be
> > PyObject-izing PyFutureFlags and making *that* the fourth argument to
> > compile...
> > 
> > class Compiler:
> >     def __init__(self):
> >         self.ff = ff.new() # or whatever
> >     def __call__(self, source, filename, start_symbol):
> >         code = compile(source, filename, start_symbol, self.ff)
> >         self.ff.merge(code.co_flags)
> >         return code
> I see, "internally" we already have a compiler_flags function
> that does the same as:
> >         code = compile(source, filename, start_symbol, self.ff)
> >         self.ff.merge(code.co_flags)
> 
> where self.ff is a CompilerFlags object.
> 
> I can re-arrange things for any interface, 

Well, I don't want to make more work for you - I imagine Guido's doing
enough of that for two!

> I was only trying to explain our approach and situation and a
> possible way to avoid duplicating some internal code in Python.

Can you point me to the code in CVS that implements this sort of
thing?  I don't really know Java but I can probably muddle through to
some extent.  We might as well have CPython copy Jython for once...

Cheers,
M.

-- 
  On the other hand, the following areas are subject to boycott
  in reaction to the rampant impurity of design or execution, as
  determined after a period of study, in no particular order:
    ...                              http://www.naggum.no/profile.html


From thomas@xs4all.net  Tue Jul 31 10:55:22 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 11:55:22 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15206.8708.811000.468489@gargle.gargle.HOWL>
References: <20010730051831.B1122@thyrsus.com> <20010731012432.G20676@xs4all.nl> <15206.8708.811000.468489@gargle.gargle.HOWL>
Message-ID: <20010731115521.I20676@xs4all.nl>

On Mon, Jul 30, 2001 at 08:12:04PM -0700, Nathan Torkington wrote:

> > And I won't even start with Perl's more archaic features, that
> > change the whole working of the interpreter.
> 
> Those are going away.

Yeah, I thought as much, which is why I wasn't going to start on them :)

> Perl people hate them as much as you do--the only time they're used now is
> to make deliberately hideous code, and hardly anyone will seriously lament
> the passing of that ability.  No more "change the starting position for
> subscripts", no more "change all RE matches globally", and so on.

I don't really hate the features, I just don't use them, and wouldn't want
them in Python :-) I do actually program Perl, and will do a lot more of it
in the next couple of months at least (I switched projects at work, to one
that will entail Perl programming roughly 80% of the time) -- I just like
Python a lot more.

Your comments do lead me to ask this question, though (and forgive me if it
comes over as the arrogant ranting of a Python bigot; it's definitely not
intended as such, even though I only have a Python-implementors point of
view.)

What's going to be the difference between Perl6 and Python ? The variable
typing-naming ($var, %var, etc) I presume, and the curly bracket vs.
indentation blocking issue. Regex-literals, 'unless', the '<statement>
if/unless/while <condition>' shortcut, I guess ? Those are basically all
parser/compiler issues, so shouldn't be a real problem. The transmorphic
typing is trickier, as is taint mode and Perl's scoping rules.... Though the
latter could be done if we refactor the namespace-creation that is currently
done implicitly on function-creation, and allow it to be done explicitly.
The same goes for the variable-filling-assignment (which is quite different
from the name-binding assignment Python has.)
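(Editorial aside: the distinction Thomas draws shows up in two lines of
Python -- assignment binds a name to an object rather than filling a
variable with a copy of a value.)

```python
a = [1, 2, 3]
b = a              # binds the name b to the same list object, no copy
b.append(4)
print(a)           # both names refer to one object, so a sees the change
print(a is b)      # same identity
```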

I don't really doubt that Perl and Python could use the same VM.... I'm not
entirely certain how much of the shared VM the two implementations would
actually be using. Is it worth it if the overlap is a mere, say, 25% ? (I
think it's more, but it depends entirely on how different Perl6 is from
Perl5, and how much Python is willing to change.... Lurkers here know I'm
aggressively against gratuitous breakage :)

-- 
Thomas Wouters 

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From paulp@ActiveState.com  Tue Jul 31 11:18:48 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 31 Jul 2001 03:18:48 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <20010730051831.B1122@thyrsus.com> <20010731012432.G20676@xs4all.nl> <15206.8708.811000.468489@gargle.gargle.HOWL> <20010731115521.I20676@xs4all.nl>
Message-ID: <3B668608.F68B5953@ActiveState.com>

One of the things I picked up from the Perl conference is that Perl
users *seem* (to me) to have a higher tolerance for code breakage than
Python users. (and Python users have a higher tolerance than (let's say)
Java users) Even if we put aside Perl 6, Perlers talk pretty glibly
about ripping little used features out in Perl 5.8.0 and Perl 5.10 and
so forth. 

e.g. Damian said that Autoload is going away (or pseudo hashes or
something like that). Whether or not he was right, nobody in the room
threw tomatoes as I'm sure they would if Guido tried to kill
__getattr__.

Admittedly, I never know when I hear stuff like "tr///CU is dead"  or
"package; is dead" whether each was a feature that has been in for three
years or was added to an experimental release and removed from the next
experimental release.

I'm not criticizing the Perl community. Acceptance of change is a good
thing! But I think they should know how conservative the Python world
is. Last week there were storm troopers heading for Guido's house when he
announced that the division operator is going to change its behaviour in
two or three years. That means it would take a major PR effort to
convince the Python community that even minor language changes would be
worth the benefit of sharing a VM.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From sjoerd.mullender@oratrix.com  Tue Jul 31 11:23:27 2001
From: sjoerd.mullender@oratrix.com (Sjoerd Mullender)
Date: Tue, 31 Jul 2001 12:23:27 +0200
Subject: [Python-Dev] Picking on platform fmod
In-Reply-To: Your message of Sat, 28 Jul 2001 16:13:53 -0400.
 
References: 
Message-ID: <20010731102328.1260D301CF7@bireme.oratrix.nl>

Success on SGI O2 running IRIX6.5.12m with native compiler version
7.2.1.3m and compiled without -O.

On Sat, Jul 28 2001 "Tim Peters" wrote:

> Here's your chance to prove your favorite platform isn't a worthless pile of
> monkey crap .  Please run the attached.  If it prints anything other
> than
> 
> 0 failures in 10000 tries
> 
> it will probably print a lot.  In that case I'd like to know which flavor of
> C+libc+libm you're using, and the OS; a few of the failures it prints may be
> helpful too.

-- Sjoerd Mullender 


From barry@zope.com  Tue Jul 31 11:54:54 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 31 Jul 2001 06:54:54 -0400
Subject: [Python-Dev] cgitb.py for Python 2.2
References: 
 <200107310622.CAA11742@cj20424-a.reston1.va.home.com>
Message-ID: <15206.36478.421953.437702@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum  writes:

    >> What i'm proposing is that we toss cgitb.py into the standard
    >> library (pretty small at about 100 lines, since all the heavy
    >> lifting is in pydoc and inspect).  Then we can add this to
    >> site.py: if os.environ.has_key("GATEWAY_INTERFACE"): import
    >> sys, cgitb sys.excepthook = cgitb.excepthook

    GvR> Why not add this to cgi.py instead?  The site.py
    GvR> initialization is accumulating a lot of cruft, and I don't
    GvR> like new additions that are irrelevant for most apps (CGI is
    GvR> a tiny niche for Python IMO).  (I also think all the stuff
    GvR> that's only for interactive mode should be moved off to
    GvR> another module that is only run in interactive mode.)

I'm at best +0 on adding it to site.py too.  E.g. for performance
reasons Mailman's cgi wrappers invoke Python with -S to avoid the
expensive overhead of importing site.py for each cgi hit.

-Barry


From barry@zope.com  Tue Jul 31 12:01:03 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 31 Jul 2001 07:01:03 -0400
Subject: [Python-Dev] pep-discuss
References: <3B62EB05.396DF4D7@ActiveState.com>
Message-ID: <15206.36847.621663.568615@anthem.wooz.org>

>>>>> "PP" == Paul Prescod  writes:

    PP> We've talked about having a mailing list for general
    PP> PEP-related discussions. Two things make me think that
    PP> revisiting this would be a good idea right now.

    PP> First, the recent loosening up of the python-dev rules
    PP> threatens the quality of discussion about bread and butter
    PP> issues such as patch discussions and process issues.

I'm not worrying about that until it becomes a problem. :)

    PP> Second, the flamewar on python-list basically drowned out the
    PP> usual newbie questions and would give a person coming new to
    PP> Python a very negative opinion about the language's future and
    PP> the friendliness of the community. I would rather redirect as
    PP> much as possible of that to a list that only interested
    PP> participants would have to endure.

For me too, it'd be just another list to subscribe to and follow, so
I'm generally against a separate pep list too.

One thing I'll note: in Mailman 2.1 we will be able to define "topics"
and you will be able to filter on specific topics.  E.g. if we defined
a pep topic, you could filter out all pep messages, receive only pep
messages, or do mail client filtering on the X-Topics: header.  (This
only works for regular delivery, not digest delivery.)

just-dont-ask-when-MM2.1-will-be-ready-ly y'rs,
-Barry


From guido@zope.com  Tue Jul 31 12:31:21 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 07:31:21 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Your message of "Tue, 31 Jul 2001 01:47:03 PDT."
 <3B667087.EBBE8938@ActiveState.com>
References:  <007701c11954$6b0017c0$8a73fea9@newmexico> <3B663140.41CB9DD7@ActiveState.com> <200107310629.CAA11818@cj20424-a.reston1.va.home.com>
 <3B667087.EBBE8938@ActiveState.com>
Message-ID: <200107311131.HAA15851@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> > 
> > > Also, the .NET CLR is standardized at ECMA so we could (at least in
> > > theory!) go to the meetings and try to influence version 2.
> > 
> > Notice the addition "in theory".  In practice, this is BS.
> 
> It depends on the rules and politics of each particular standards group.
> It is fundamentally a social activity. It also depends how much effort
> you are willing to put into promoting your cause. Sam Ruby is chair of
> the ECMA CLI group. He is a big scripting language fan. 
> 
> http://www2.hursley.ibm.com/tc39/
> 
> Also note the presence of Mike Cowlishaw of REXX fame and Dave Raggett
> of the W3C.
> 
> Working within a standards body is a gamble. It can pay off big or it
> can completely fail. We might find Microsoft our strongest ally -- they
> have always been interested in having the scripting languages work well
> on their platforms. They would hate to give programmers an excuse to
> stick to Unix or the JVM.

So it boils down to us vs. MS.  Guess who wins whenever there's a
disagreement.  I still maintain that it's a waste of our time.

> I don't personally know enough about this particular circumstance to
> know whether there is any possibility of significantly influencing
> version 2 or not. Maybe the gamble isn't worth the effort. But I
> wouldn't dismiss it out of hand.

Well, your boss has a pact with MS, so AS might pull it off. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Tue Jul 31 12:52:58 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 31 Jul 2001 07:52:58 -0400
Subject: [Python-Dev] cgitb.py for Python 2.2
In-Reply-To: <15206.8227.652539.471067@anthem.wooz.org>; from barry@zope.com on Mon, Jul 30, 2001 at 11:04:03PM -0400
References:  <15206.8227.652539.471067@anthem.wooz.org>
Message-ID: <20010731075258.A2757@ute.cnri.reston.va.us>

On Mon, Jul 30, 2001 at 11:04:03PM -0400, Barry A. Warsaw wrote:
>I'll take a closer look at cgitb.py when I get a chance, but I'm
>generally +1 on the idea.

+0 from me, though I also think it would be better in cgi.py and not
in site.py.  It would also be useful if it could mail tracebacks and
return a non-committal but secure error message to the browser; I'll
contribute that as a patch if cgitb.py goes in.  (Or should that be
cgi/tb.py?  Hmm...)

--amk


From akuchlin@mems-exchange.org  Tue Jul 31 13:01:28 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 31 Jul 2001 08:01:28 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15206.8511.147000.832644@gargle.gargle.HOWL>; from gnat@oreilly.com on Mon, Jul 30, 2001 at 08:08:47PM -0700
References: <20010730051831.B1122@thyrsus.com> <20010731012432.G20676@xs4all.nl> <20010730205657.A2298@ute.cnri.reston.va.us> <15206.8511.147000.832644@gargle.gargle.HOWL>
Message-ID: <20010731080128.B2757@ute.cnri.reston.va.us>

On Mon, Jul 30, 2001 at 08:08:47PM -0700, Nathan Torkington wrote:
>Andrew Kuchling writes:
>The core loop would just be the usual opcode dispatch loop ("call the
>function for the current operation, which returns the next
>operation").  The only difference is that some of the opcodes would be
>specific to RE matches.  (I'm unclear on how much special logic RE

The big difference I see between regex opcodes and language opcodes is
that regexes need to backtrack and language ones don't.  Unless the
idea is to compile a regex to actual VM code similar to that generated
by Python/Perl code, but then wouldn't that sacrifice efficiency?

--amk


From paulp@ActiveState.com  Tue Jul 31 13:39:53 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 31 Jul 2001 05:39:53 -0700
Subject: [Python-Dev] Frank Willison
Message-ID: <3B66A719.4252CAAC@ActiveState.com>

The Python world has lost a great friend in Frank Willison. Frank died
yesterday of a massive heart attack.

I've searched in vain for a biography of Frank for those that didn't
know him but perhaps he was too modest to put his biography on the Web.
Suffice to say that before there were 30 or 10 or 5 Python books, before
acquisitions editors started cold-calling Python programmers, Frank had
a sense that this little language could become something.

In Frank's words:

"This is my third Python Conference. At the first one, a loyal 70 or so
Python loyalists debated potential new features of the language. At the
second, 120 or so Python programmers split their time between a review
of language features and the discussion of interesting Python
applications. 

At this conference, the third, we moved onto a completely different
level. Presentations and demonstrations at this conference of nearly 250
attendees have covered applications built on Python. Companies are
demonstrating their Python-based products. There is venture capital
here. There are people here because they want to learn about Python.
This year, mark my words: Python is here to stay."

	http://www.oreilly.com/frank/pythonconf_0100.html

The O'Reilly books that Frank edited helped to give Python the
legitimacy it needed to get over the hump. I carefully put in the word
"helped" because Frank requires honesty and modesty:

"O'Reilly doesn't legitimize. If we did, lots of technology creators who
enjoy their status as bastards would shun us. We try to find the
technologies that are interesting and powerful, that solve the problems
people really have. Then we take pleasure in publishing an interesting
book on that subject. 

I'd like to put another issue to rest: the Camel book did not legitimize
Perl. It may have accelerated Perl's adoption by making information
about Perl more readily available. But the truth is that Perl would have
succeeded without an O'Reilly book (as would Python and Zope), and that
we're very pleased to have been smart enough to recognize Perl's
potential before other publishers did."

	http://www.oreilly.com/frank/legitimacy_1199.html

Frank was also a Perl guy. He was big enough for both worlds. To me he
was a Perl guy but *the* Python guy. Frank was the guy who got Python
books into print. He and his protege Laura Llewin were constantly on the
lookout for opportunities to write about Python.

Much more important than anything he did with or for Python: Frank was a
really great guy with an excellent sense of humor and a way of
connecting with people. I know all of that after only meeting him two or
three times because it was just so obvious what kind of person he was
that it didn't take you any time to figure it out.

You can find more of Frank's writings here:

	http://www.oreilly.com/frank/

 Paul Prescod


From mal@lemburg.com  Tue Jul 31 14:28:39 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 15:28:39 +0200
Subject: [Python-Dev] PyOS_snprintf() / PyOS_vsnprintf()
Message-ID: <3B66B287.5D319774@lemburg.com>

Just to let you know and to initiate some cross-platform
testing:

While working on the warning patch for modsupport.c,
I've added two new APIs which hopefully make it easier for Python
to switch to buffer overflow safe [v]snprintf() APIs for error
reporting et al. 

The two new APIs are PyOS_snprintf() and 
PyOS_vsnprintf() and work just like the standard ones in many
C libs. On platforms which have snprintf(), the native APIs are used,
on all others, an emulation tries to do its best.

Please try them out on your platform. If all goes well, I think
we should replace all sprintf() (without the n in the name)
with these new safer APIs.

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From skip@pobox.com (Skip Montanaro)  Tue Jul 31 15:07:26 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 31 Jul 2001 09:07:26 -0500
Subject: [Python-Dev] zipfiles on sys.path
In-Reply-To: <3B65AA15.27947.E9B214D@localhost>
References: <20010725215830.2F49D14A25D@oratrix.oratrix.nl>
 <3B65AA15.27947.E9B214D@localhost>
Message-ID: <15206.48030.99097.902155@beluga.mojam.com>

    Gordon> ... but it's my observation that package authors are enamored of
    Gordon> import hacks, so be wary.

One for amk's quotes file? ;-)

Skip


From skip@pobox.com (Skip Montanaro)  Tue Jul 31 15:51:22 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 31 Jul 2001 09:51:22 -0500
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731013448.H20676@xs4all.nl>
References: <200107302251.KAA00585@s454.cosc.canterbury.ac.nz>
 <20010731013448.H20676@xs4all.nl>
Message-ID: <15206.50666.998086.720321@beluga.mojam.com>

    Skip> The main stumbling block was that pesky "from module import *"
    Skip> statement.  It could push an unknown quantity of stuff onto the
    Skip> stack

    Greg> Are you *sure* about that? I'm pretty certain it can't be true,
    Greg> since the compiler has to know at all times how much is on the
    Greg> stack, so it can decide how much stack space is needed.

    Thomas> I think Skip meant it does an arbitrary number of 

    Thomas> load-onto-stack
    Thomas> store-into-namespace

    Thomas> operations. Skip, you'll be glad to know that's no longer true
    Thomas> :) Since 2.0 (or when was it that we introduced 'import as' ?)
    Thomas> import-* is not a special case of 'IMPORT_FROM', but rather a
    Thomas> separate opcode that doesn't touch the stack.

I'm not sure what I meant any more.  (They say eye witness testimony in a
courtroom is quite unreliable.)  I'm pretty sure Greg's analysis is at least
partly correct (in that that couldn't have been why I failed to implement a
converter for IMPORT_FROM).  I went back and looked briefly at my old code
last night (which was broken when I put it aside - don't *ever* do that!)
and could find nothing that would indicate why I didn't like
"from-import-*".  The instruction set converter would refuse to try
converting any code that contained these opcodes: {LOAD,STORE,DELETE}_NAME,
SETUP_{FINALLY,EXCEPT}, or IMPORT_FROM.  At this point in time I'm not sure
which of those six opcodes were just ones I hadn't gotten around to writing
converters for and which were showstoppers.
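
(For reference, the opcode stream for a star-import is easy to inspect
with the dis module; a minimal sketch, with an arbitrary module name:)

```python
import dis

# Star-imports are only legal at module scope, so compile a module body.
code = compile("from os import *", "<example>", "exec")
ops = [ins.opname for ins in dis.get_instructions(code)]
print(ops)  # includes IMPORT_NAME followed by IMPORT_STAR
```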

wish-i-had-more-time-for-this-ly y'rs,

Skip


From skip@pobox.com (Skip Montanaro)  Tue Jul 31 16:02:25 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 31 Jul 2001 10:02:25 -0500
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: 
References: <20010730051831.B1122@thyrsus.com>
 
Message-ID: <15206.51329.561652.565480@beluga.mojam.com>

I was thinking a little about a Python/Perl VM merge.  One problem I imagine
would be difficult to reconcile is the subtle difference in semantics of
various basic types.  Consider the various bits of Python's (proposed)
number system that Perl might not have (or want): rationals, automatic
promotion from machine ints to longs, complex numbers.  These may not work
well with Perl's semantics.  What about exceptions?  Do Python and Perl have
similar notions of what exceptional conditions exist?

Skip


From pedroni@inf.ethz.ch  Tue Jul 31 16:06:57 2001
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Tue, 31 Jul 2001 17:06:57 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
References:  <007701c11954$6b0017c0$8a73fea9@newmexico> <3B663140.41CB9DD7@ActiveState.com>
Message-ID: <003801c119d2$724781c0$8a73fea9@newmexico>

Thanks for the answer.

> Samuele Pedroni wrote:
> >
> >...
> > A question: are there already some data about
> > what would be the actual performance of Python.NET vs. CPython ?
>
> I think it is safe to say that the current version of Python.NET is
> slower than Jython. Now it hasn't been optimized as much as Jython so we
> might be able to get it as fast as Jython.
This may surprise you, but Jython is not that heavily optimized; it's
mostly a straightforward OO design. But I think that's the only way to
avoid specializing for some particular development state of the JVMs.
For example we have changed nothing, but it seems (it seems) that under
Java 1.4, asymptotically (meaning you need a long-running process to
exploit the HotSpot technology), Jython is a bit faster than CPython, at
least for non-I/O-intensive stuff. It seems they optimized reflection.

> But I don't think that there
> is anything in the .NET runtime that makes it a great deal better than
> the JVM for dynamic languages.
I have the same impression, unless one can do something really clever
with boxing/unboxing without losing too many cycles or getting in the
way of the compiler.

> The only difference is that Microsoft
> seems more aware of the problem and may move to correct it whereas I
> have a feeling that explicit support for our languages would dilute
> Sun's 100% Java marketing campaign.
But will Sun be such a passive actor, even if MS gains a market
advantage by especially supporting scripting languages?

There is much hype in both camps, but Unix/C seems to show that to have
a good platform you need a good system language and the possibility of
writing scripting languages on top of it.

> Also, the .NET CLR is standardized
> at ECMA so we could (at least in theory!) go to the meetings and try to
> influence version 2.
I imagine you can go the same way entering the JCP. ASF is in for example.

Samuele Pedroni.



From mclay@nist.gov  Tue Jul 31 04:11:52 2001
From: mclay@nist.gov (Michael McLay)
Date: Mon, 30 Jul 2001 23:11:52 -0400
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <3B666C9C.4400BD9C@lemburg.com>
References: <0107301106520A.02216@fermi.eeel.nist.gov> <3B666C9C.4400BD9C@lemburg.com>
Message-ID: <01073023115207.02466@fermi.eeel.nist.gov>

On Tuesday 31 July 2001 04:30 am, M.-A. Lemburg wrote:
> How will you be able to define the precision of decimals ? Implicit
> by providing a decimal string with enough 0s to let the parser
> deduce the precision ? Explicit like so: decimal(12, 5) ?

Would the following work?  For literal type definitions the precision would 
be implicit.  For values set using the decimal() function the precision 
would be implicit unless an explicit precision is given.  The 
following would all define the same value and precision.

   3.40d
   decimal("3.40")
   decimal(3.4, 2)

Those were easy.  How would the following be interpreted?

   decimal(3.404, 2)
   decimal(3.405, 2)
   decimal(3.39999, 2)

> Also, what happens to the precision of the decimal object resulting
> from numeric operations ?

Good question.  I'm not the right person to answer this, but here's is a 
first stab at what I would expect.

For addition, subtraction, and multiplication the results would be exact, 
with no rounding.  For calculations that include division, the number of 
digits in a non-terminating result will have to be explicitly set.  Would it 
make sense for this to be defined by the numbers used in the calculation?  
Could this be set in the module, or could it be global for the application?

What do you suggest?  
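
(With hindsight: the decimal module that eventually shipped in Python 2.4
answers these ambiguous cases with an explicit quantize() step and a
rounding mode, rather than guessing from the arguments. A small sketch:)

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Round each value to two decimal places; the tie case 3.405 uses
# banker's rounding (round-half-even) by default.
print(Decimal("3.404").quantize(Decimal("0.01")))    # 3.40
print(Decimal("3.405").quantize(Decimal("0.01"),
                                rounding=ROUND_HALF_EVEN))  # 3.40
print(Decimal("3.39999").quantize(Decimal("0.01")))  # 3.40
```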

>
> >     A decimal number can be used to represent integers and floating point
> >     numbers and decimal numbers can also be displayed using scientific
> >     notation. Examples of decimal numbers include:
> >
> >     ...
> >
> >     This proposal will also add an optional  'b' suffix to the
> >     representation of binary float type literals and binary int type
> >     literals.
>
> Hmm, I don't quite grasp the need for the 'b'... numbers without
> any modifier will work the same way as they do now, right ?

I made a change to the parsenumber() function in compile.c so that the type 
of the number is determined by the suffix attached to the number.  To retain 
backward compatibility the tokenizer automatically attaches the 'b' suffix to 
float and int types if they do not have a suffix in the literal definition.

My original PEP included the definition of a .dp and a dpython mode for the 
interpreter in which the default number type is decimal instead of binary.  
When the mode is switched, the language becomes easier to use for developing 
applications that use decimal numbers.

> >     Expressions that mix binary floats with decimals introduce the
> >     possibility of unexpected results because the two number types use
> >     different internal representations for the same numerical value.
>
> I'd rather have this explicit in the sense that you define which
> assumptions will be made and what issues arise (rounding, truncation,
> loss of precision, etc.).

Can you give an example of how this might be implemented?

> >     To accommodate the three possible usage models the python interpreter
> >     command line options will be used to set the level for warning and
> >     error messages. The three levels are:
> >
> >     promiscuous mode,   -f or  --promiscuous
>     safe mode           -s or --safe
> >     pedantic mode       -p or --pedantic
>
> How about a generic option:
>
> 	--numerics:[loose|safe|pedantic] or -n:[l|s|p]

Thanks for the suggestion. I'll change it. 



From aahz@rahul.net  Tue Jul 31 17:37:02 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Tue, 31 Jul 2001 09:37:02 -0700 (PDT)
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <01073023115207.02466@fermi.eeel.nist.gov> from "Michael McLay" at Jul 30, 2001 11:11:52 PM
Message-ID: <20010731163703.2F86E99C85@waltz.rahul.net>

Michael McLay wrote:
> 
> Those were easy.  How would the following be interpreted?
> 
>    decimal(3.404, 2)
>    decimal(3.405, 2)
>    decimal(3.39999, 2)
> 
>  [...]
> 
> For addition, subtraction, and multiplication the results would be
> exact with no rounding of the results.  Calculations that include
> division the number of digits in a non-terminating result will have to
> be explicitly set.  Would it make sense for this to be defined by the
> numbers used in the calculation?  Could this be set in the module or
> could it be global for the application?

This is why Cowlishaw et al require a full context for all operations.
At one point I tried implementing things with the context being
contained in the number rather than "global" (which actually means
thread-global, but I'm probably punting on *that* bit for the moment),
but Tim Peters persuaded me that sticking with the spec was the Right
Thing until *after* the spec was fully implemented.

After seeing the mess generated by PEP-238, I'm fervently in favor of
sticking with external specs whenever possible.
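
(The context idea from the spec later appeared in Python's decimal
module; a small sketch of how precision lives in a thread-local context
rather than in the individual numbers:)

```python
from decimal import Decimal, getcontext, localcontext

getcontext().prec = 6              # thread-global working precision
print(Decimal(1) / Decimal(7))     # 0.142857

with localcontext() as ctx:        # temporary, scoped override
    ctx.prec = 2
    print(Decimal(1) / Decimal(7)) # 0.14
```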
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From mal@lemburg.com  Tue Jul 31 17:36:28 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 18:36:28 +0200
Subject: [Python-Dev] Revised decimal type PEP
References: <0107301106520A.02216@fermi.eeel.nist.gov> <3B666C9C.4400BD9C@lemburg.com> <01073023115207.02466@fermi.eeel.nist.gov>
Message-ID: <3B66DE8C.C9C62012@lemburg.com>

Michael McLay wrote:
> 
> On Tuesday 31 July 2001 04:30 am, M.-A. Lemburg wrote:
> > How will you be able to define the precision of decimals ? Implicit
> > by providing a decimal string with enough 0s to let the parser
> > deduce the precision ? Explicit like so: decimal(12, 5) ?
> 
> Would the following work?  For literal type definitions the precision would
> be implicit.  For values set using the decimal() function the definition
> would be implicit unless an explicit precision definition is set.  The
> following would all define the same value and precision.
> 
>    3.40d
>    decimal("3.40")
>    decimal(3.4, 2)
> 
> Those were easy.  How would the following be interpreted?
> 
>    decimal(3.404, 2)
>    decimal(3.405, 2)
>    decimal(3.39999, 2)

I'd suggest following the rules for the SQL definitions
of DECIMAL(,).
 
> > Also, what happens to the precision of the decimal object resulting
> > from numeric operations ?
> 
> Good question.  I'm not the right person to answer this, but here's is a
> first stab at what I would expect.
> 
> For addition, subtraction, and multiplication the results would be exact with
> no rounding of the results.  Calculations that include division the number of
> digits in a non-terminating result will have to be explicitly set.  Would it
> make sense for this to be defined by the numbers used in the calculation?
> Could this be set in the module or could it be global for the application?
> 
> What do you suggestion?

Well, there are several options. I suppose that the IBM paper
on decimal types has good hints as to what the type should do.
Again, SQL is probably a good source for inspiration too, since
it deals with decimals a lot.
 
> >
> > >     A decimal number can be used to represent integers and floating point
> > >     numbers and decimal numbers can also be displayed using scientific
> > >     notation. Examples of decimal numbers include:
> > >
> > >     ...
> > >
> > >     This proposal will also add an optional  'b' suffix to the
> > >     representation of binary float type literals and binary int type
> > >     literals.
> >
> > Hmm, I don't quite grasp the need for the 'b'... numbers without
> > any modifier will work the same way as they do now, right ?
> 
> I made a change to the parsenumber() function in compile.c so that the type
> of the number is determined by the suffix attached to the number.  To retain
> backward compatibility the tokenizer automatically attaches the 'b' suffix to
> float and int types if they do not have a suffix in the literal definition.
> 
> My original PEP included the definition of a .dp and a dpython mode for the
> interpreter in which the default number type is decimal instead of binary.
> When the mode is switch the language becomes easier to use for developing
> applications that use decimal numbers.

I see, the small 'b' still looks funny to me though. Wouldn't
1.23f and 25i be more intuitive ?

> > >     Expressions that mix binary floats with decimals introduce the
> > >     possibility of unexpected results because the two number types use
> > >     different internal representations for the same numerical value.
> >
> > I'd rather have this explicit in the sense that you define which
> > assumptions will be made and what issues arise (rounding, truncation,
> > loss of precision, etc.).
> 
> Can you give an example of how this might be implemented.

You would typically first coerce the types to the "larger"
type, e.g. float + decimal -> float + float -> float, so
you'd only have to document how the conversion is done and
which accuracy to expect.
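
(A toy sketch of that coercion rule; the Dec class and its behaviour are
invented for illustration and are not the proposed implementation:)

```python
class Dec:
    """Toy decimal stand-in, invented for illustration only."""

    def __init__(self, text):
        self.text = text  # keep the literal text, so the value stays exact

    def __add__(self, other):
        if isinstance(other, float):
            # decimal + float: coerce to the "larger" type (float), so the
            # result carries float's documented accuracy.
            return float(self.text) + other
        return Dec(str(float(self.text) + float(other.text)))

    __radd__ = __add__

print(type(Dec("3.40") + 0.1).__name__)  # float
```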
 
> > >     To accommodate the three possible usage models the python interpreter
> > >     command line options will be used to set the level for warning and
> > >     error messages. The three levels are:
> > >
> > >     promiscuous mode,   -f or  --promiscuous
> >     safe mode           -s or --safe
> > >     pedantic mode       -p or --pedantic
> >
> > How about a generic option:
> >
> >       --numerics:[loose|safe|pedantic] or -n:[l|s|p]
> 
> Thanks for the suggestion. I'll change it.

Great.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From guido@zope.com  Tue Jul 31 17:56:51 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 12:56:51 -0400
Subject: [Python-Dev] PyOS_snprintf() / PyOS_vsnprintf()
In-Reply-To: Your message of "Tue, 31 Jul 2001 15:28:39 +0200."
 <3B66B287.5D319774@lemburg.com>
References: <3B66B287.5D319774@lemburg.com>
Message-ID: <200107311656.MAA16366@cj20424-a.reston1.va.home.com>

> While working on the warning patch for modsupport.c,
> I've added two new APIs which hopefully make it easier for Python
> to switch to buffer overflow safe [v]snprintf() APIs for error
> reporting et al. 
> 
> The two new APIs are PyOS_snprintf() and 
> PyOS_vsnprintf() and work just like the standard ones in many
> C libs. On platforms which have snprintf(), the native APIs are used,
> on all others, an emulation tries to do its best.
> 
> Please try them out on your platform. If all goes well, I think
> we should replace all sprintf() (without the n in the name)
> with these new safer APIs.

It would be easier to test out the fallback implementation if there
was a config option to enable it even on platforms that do have the
native version.

Or maybe (following the getopt example) we might consider always using
our own code -- so it gets the maximum testing.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Tue Jul 31 18:08:47 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 13:08:47 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Your message of "Tue, 31 Jul 2001 10:02:25 CDT."
 <15206.51329.561652.565480@beluga.mojam.com>
References: <20010730051831.B1122@thyrsus.com> 
 <15206.51329.561652.565480@beluga.mojam.com>
Message-ID: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>

> I was thinking a little about a Python/Perl VM merge.  One problem I imagine
> would be difficult to reconcile is the subtle difference in semantics of
> various basic types.  Consider the various bits of Python's (proposed)
> number system that Perl might not have (or want): rationals, automatic
> promotion from machine ints to longs, complex numbers.  These may not work
> well with Perl's semantics.  What about exceptions?  Do Python and Perl have
> similar notions of what exceptional conditions exist?

Actually, this may not be as big a deal as I thought before.  The PVM
doesn't have a lot of knowledge about types built into its instruction
set.  It knows a bit about classes, lists, dicts, but not e.g. about
ints and strings.  The opcodes are mostly very abstract: BINARY_ADD etc.
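
(This is easy to see with the dis module; a quick sketch. Interpreters
of this era emit BINARY_ADD; CPython 3.11+ folds the arithmetic opcodes
into a generic BINARY_OP, but either way the opcode knows nothing about
the operand types:)

```python
import dis

def add(a, b):
    return a + b  # same opcode whether a and b are ints, strings, lists...

ops = [ins.opname for ins in dis.get_instructions(add)]
print(ops)
```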

--Guido van Rossum (home page: http://www.python.org/~guido/)



From DavidA@ActiveState.com  Tue Jul 31 18:21:59 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 31 Jul 2001 10:21:59 -0700
Subject: [Python-Dev] Frank Willison
Message-ID: <3B66E937.D2390F90@ActiveState.com>

As Paul mentioned on python-list, Frank Willison died of a heart attack
yesterday.  I'm sad.

--david


From mal@lemburg.com  Tue Jul 31 18:22:52 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 19:22:52 +0200
Subject: [Python-Dev] PyOS_snprintf() / PyOS_vsnprintf()
References: <3B66B287.5D319774@lemburg.com> <200107311656.MAA16366@cj20424-a.reston1.va.home.com>
Message-ID: <3B66E96C.FBAB8A62@lemburg.com>

Guido van Rossum wrote:
> 
> > While working on the warning patch for modsupport.c,
> > I've added two new APIs which hopefully make it easier for Python
> > to switch to buffer overflow safe [v]snprintf() APIs for error
> > reporting et al.
> >
> > The two new APIs are PyOS_snprintf() and
> > PyOS_vsnprintf() and work just like the standard ones in many
> > C libs. On platforms which have snprintf(), the native APIs are used,
> > on all others, an emulation tries to do its best.
> >
> > Please try them out on your platform. If all goes well, I think
> > we should replace all sprintf() (without the n in the name)
> > with these new safer APIs.
> 
> It would be easier to test out the fallback implementation if there
> was a config option to enable it even on platforms that do have the
> native version.
>
> Or maybe (following the getopt example) we might consider always using
> our own code -- so it gets the maximum testing.

How about always enabling our version in the alpha cycle and then
reverting back to the native one in the betas ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From DavidA@ActiveState.com  Tue Jul 31 18:40:16 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 31 Jul 2001 10:40:16 -0700
Subject: [Python-Dev] pep-discuss
References: <3B62EB05.396DF4D7@ActiveState.com> <15206.36847.621663.568615@anthem.wooz.org>
Message-ID: <3B66ED80.61B7E4C6@ActiveState.com>

"Barry A. Warsaw" wrote:

>     PP> Second, the flamewar on python-list basically drowned out the
>     PP> usual newbie questions and would give a person coming new to
>     PP> Python a very negative opinion about the language's future and
>     PP> the friendliness of the community. I would rather redirect as
>     PP> much as possible of that to a list that only interested
>     PP> participants would have to endure.
> 
> For me too, it'd be just another list to subscribe to and follow, so
> I'm generally against a separate pep list too.
> 
> One thing I'll note: in Mailman 2.1 we will be able to define "topics"
> and you will be able to filter on specific topics.  E.g. if we defined
> a pep topic, you could filter out all pep messages, receive only pep
> messages, or do mail client filtering on the X-Topics: header.  (This
> only works for regular delivery, not digest delivery.)

But that doesn't really solve the problem for newbies who aren't going
to set up filters just for this Python list they just got onto.

IMO, having 100 or so people add a new list is cheaper than having tens
of thousands of people setting up filters.

But whatever. =)

--david


From skip@pobox.com  Tue Jul 31 18:48:32 2001
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 31 Jul 2001 12:48:32 -0500
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
References: <20010730051831.B1122@thyrsus.com>
 
 <15206.51329.561652.565480@beluga.mojam.com>
 <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
Message-ID: <15206.61296.958360.72700@beluga.mojam.com>

    Guido> The PVM doesn't have a lot of knowledge about types built into
    Guido> its instruction set....  The opcodes are mostly very abstract:
    Guido> BINARY_ADD etc.

Yeah, but the runtime behind the virtual machine knows a hell of a lot about
the types.  A stream of opcodes doesn't mean anything without the semantics
of the functions the interpreter loop calls to do its work.  I thought the
aim of Eric's Parrot idea was that Perl and Python might be able to share a
virtual machine.  If both can generate something like today's BINARY_ADD
opcode, the underlying types of both Python and Perl better have the same
semantics.

Skip



From nascheme@mems-exchange.org  Tue Jul 31 18:54:57 2001
From: nascheme@mems-exchange.org (Neil Schemenauer)
Date: Tue, 31 Jul 2001 13:54:57 -0400
Subject: [Python-Dev] Good news about ExtensionClass and Python 2.2a1
Message-ID: <20010731135457.A15139@mems-exchange.org>

After a few tweaks to ExtensionClass and a few small fixes to some of
our introspection code I'm happy to say that Python 2.2a1 passes our
unit test suite.  This is significant since there are about 45000 lines
of code (counted by "wc -l") tested by 3569 test cases.  Since we use
ZODB, ExtensionClasses are quite widely used.  Merging descr_branch into
HEAD sounds like a good idea to me.  Well done Guido.

I'm going to spend a bit of time trying to rewrite the ZODB Persistent
class as a type.  Attached is a diff of the changes I made to
ExtensionClass.

  Neil

--- ExtensionClass.h.dist	Tue Jul 31 11:50:39 2001
+++ ExtensionClass.h	Tue Jul 31 12:15:21 2001
@@ -143,12 +143,48 @@
 	reprfunc tp_str;
 	getattrofunc tp_getattro;
 	setattrofunc tp_setattro;
-	/* Space for future expansion */
-	long tp_xxx3;
-	long tp_xxx4;
+
+	/* Functions to access object as input/output buffer */
+	PyBufferProcs *tp_as_buffer;
+
+	/* Flags to define presence of optional/expanded features */
+	long tp_flags;
 
 	char *tp_doc; /* Documentation string */
 
+	/* call function for all accessible objects */
+	traverseproc tp_traverse;
+	
+	/* delete references to contained objects */
+	inquiry tp_clear;
+
+	/* rich comparisons */
+	richcmpfunc tp_richcompare;
+
+	/* weak reference enabler */
+	long tp_weaklistoffset;
+
+	/* Iterators */
+	getiterfunc tp_iter;
+	iternextfunc tp_iternext;
+
+	/* Attribute descriptor and subclassing stuff */
+	struct PyMethodDef *tp_methods;
+	struct memberlist *tp_members;
+	struct getsetlist *tp_getset;
+	struct _typeobject *tp_base;
+	PyObject *tp_dict;
+	descrgetfunc tp_descr_get;
+	descrsetfunc tp_descr_set;
+	long tp_dictoffset;
+	initproc tp_init;
+	allocfunc tp_alloc;
+	newfunc tp_new;
+	destructor tp_free; /* Low-level free-memory routine */
+	PyObject *tp_bases;
+	PyObject *tp_mro; /* method resolution order */
+	PyObject *tp_defined;
+
 #ifdef COUNT_ALLOCS
 	/* these must be last */
 	int tp_alloc;
@@ -302,7 +338,9 @@
    { PyExtensionClassCAPI->Export(D,N,&T); }
 
 /* Convert a method list to a method chain.  */
-#define METHOD_CHAIN(DEF) { DEF, NULL }
+#define METHOD_CHAIN(DEF) \
+	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, \
+	{ DEF, NULL }
 
 /* The following macro checks whether a type is an extension class: */
 #define PyExtensionClass_Check(TYPE) \
@@ -336,7 +374,9 @@
 #define PURE_MIXIN_CLASS(NAME,DOC,METHODS) \
 static PyExtensionClass NAME ## Type = { PyObject_HEAD_INIT(NULL) \
 	0, # NAME, sizeof(PyPureMixinObject), 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
-	0, 0, 0, 0, 0, 0, 0, DOC, {METHODS, NULL}, \
+	0, 0, 0, 0, 0, 0, 0, DOC, \
+	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, \
+	{METHODS, NULL}, \
         EXTENSIONCLASS_BASICNEW_FLAG}
 
 /* The following macros provide limited access to extension-class
--- ExtensionClass.c.dist	Tue Jul 31 11:01:20 2001
+++ ExtensionClass.c	Tue Jul 31 12:15:24 2001
@@ -119,7 +119,7 @@
 static PyObject *subclass_watcher=0;  /* Object that subclass events */
 
 static void
-init_py_names()
+init_py_names(void)
 {
 #define INIT_PY_NAME(N) py ## N = PyString_FromString(#N)
   INIT_PY_NAME(__add__);
@@ -1800,8 +1800,8 @@
 
   if (PyFunction_Check(r) || NeedsToBeBound(r))
     ASSIGN(r,newPMethod(self,r));
-  else if (PyMethod_Check(r) && ! PyMethod_Self(r))
-    ASSIGN(r,newPMethod(self, PyMethod_Function(r)));
+  else if (PyMethod_Check(r) && ! PyMethod_GET_SELF(r))
+    ASSIGN(r,newPMethod(self, PyMethod_GET_FUNCTION(r)));
 
   return r;
 }
@@ -3527,7 +3527,7 @@
 };
 
 void
-initExtensionClass()
+initExtensionClass(void)
 {
   PyObject *m, *d;
   char *rev="$Revision: 1.1 $";


From DavidA@ActiveState.com  Tue Jul 31 18:57:35 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 31 Jul 2001 10:57:35 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <20010730051831.B1122@thyrsus.com>
 
 <15206.51329.561652.565480@beluga.mojam.com>
 <200107311708.NAA16497@cj20424-a.reston1.va.home.com> <15206.61296.958360.72700@beluga.mojam.com>
Message-ID: <3B66F18F.3EE81628@ActiveState.com>

Skip Montanaro wrote:
> 
>     Guido> The PVM doesn't have a lot of knowledge about types built into
>     Guido> its instruction set....  The opcodes are mostly very abstract:
>     Guido> BINARY_ADD etc.
> 
> Yeah, but the runtime behind the virtual machine knows a hell of a lot about
> the types.  A stream of opcodes doesn't mean anything without the semantics
> of the functions the interpreter loop calls to do its work.  I thought the
> aim of Eric's Parrot idea was that Perl and Python might be able to share a
> virtual machine.  If both can generate something like today's BINARY_ADD
> opcode, the underlying types of both Python and Perl better have the same
> semantics.

I don't think that needs to be true _in toto_.  In other words, some
opcodes can be used by both languages, some can be language-specific. 
The implementation of the VM for a given opcode can be shared per
language, or even just partially shared.  BINARY_ADD can do the same
thing in most languages for 'native' types, and defer to per-language
codepaths for objects, for example.

One problem with a hybrid approach might be that optimizations become
really hard to do if you can't assume much about the semantics, or if
you can only assume the union of the various semantics.  But the idea is
intriguing anyway =).

--david


From guido@zope.com  Tue Jul 31 20:00:01 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 15:00:01 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Your message of "Tue, 31 Jul 2001 12:48:32 CDT."
 <15206.61296.958360.72700@beluga.mojam.com>
References: <20010730051831.B1122@thyrsus.com>  <15206.51329.561652.565480@beluga.mojam.com> <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
 <15206.61296.958360.72700@beluga.mojam.com>
Message-ID: <200107311900.PAA17062@cj20424-a.reston1.va.home.com>

>     Guido> The PVM doesn't have a lot of knowledge about types built into
>     Guido> its instruction set....  The opcodes are mostly very abstract:
>     Guido> BINARY_ADD etc.
> 
> Yeah, but the runtime behind the virtual machine knows a hell of a lot about
> the types.  A stream of opcodes doesn't mean anything without the semantics
> of the functions the interpreter loop calls to do its work.  I thought the
> aim of Eric's Parrot idea was that Perl and Python might be able to share a
> virtual machine.  If both can generate something like today's BINARY_ADD
> opcode, the underlying types of both Python and Perl better have the same
> semantics.

Yeah, but the runtime could offer a choice of data types -- for Python
code the constants table would contain Python ints and strings etc., for
Perl code it would contain Perl string-number objects.  Maybe.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Tue Jul 31 20:11:13 2001
From: mwh@python.net (Michael Hudson)
Date: 31 Jul 2001 15:11:13 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Guido van Rossum's message of "Tue, 31 Jul 2001 15:00:01 -0400"
References: <20010730051831.B1122@thyrsus.com>  <15206.51329.561652.565480@beluga.mojam.com> <200107311708.NAA16497@cj20424-a.reston1.va.home.com> <15206.61296.958360.72700@beluga.mojam.com> <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
Message-ID: <2mr8uwsylq.fsf@starship.python.net>

Guido van Rossum  writes:

> >     Guido> The PVM doesn't have a lot of knowledge about types built into
> >     Guido> its instruction set....  The opcodes are mostly very abstract:
> >     Guido> BINARY_ADD etc.
> > 
> > Yeah, but the runtime behind the virtual machine knows a hell of a lot about
> > the types.  A stream of opcodes doesn't mean anything without the semantics
> > of the functions the interpreter loop calls to do its work.  I thought the
> > aim of Eric's Parrot idea was that Perl and Python might be able to share a
> > virtual machine.  If both can generate something like today's BINARY_ADD
> > opcode, the underlying types of both Python and Perl better have the same
> > semantics.
> 
> Yeah, but the runtime could offer a choice of data types -- for Python
> code the constants table would contain Python ints and strings etc., for
> Perl code it would contain Perl string-number objects.  Maybe.

And the point of this would be?  I don't see much more benefit than
just arranging for the numbers in Include/opcode.h to match perl's
equivalents (i.e. none), but I may be missing something...

Cheers,
M.

-- 
  I've even been known to get Marmite *near* my mouth -- but never
  actually in it yet.  Vegamite is right out.
 UnicodeError: ASCII unpalatable error: vegamite found, ham expected
                                       -- Tim Peters, comp.lang.python


From skip@pobox.com  Tue Jul 31 20:22:25 2001
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 31 Jul 2001 14:22:25 -0500
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
References: <20010730051831.B1122@thyrsus.com>
 
 <15206.51329.561652.565480@beluga.mojam.com>
 <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
 <15206.61296.958360.72700@beluga.mojam.com>
 <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
Message-ID: <15207.1393.232974.785433@beluga.mojam.com>

    Guido> Yeah, but the runtime could offer a choice of data types -- for
    Guido> Python code the constants table would contain Python ints and
    Guido> strings etc., for Perl code it would contain Perl string-number
    Guido> objects.  Maybe.

So I could give a code object generated by the Python compiler to the Perl
runtime and get different results than if it was executed by the Python
environment?

Perhaps it's time for Eric to chime in again and tell us what he really has
in mind.  I can't see the utility in having the same set of opcodes for the
two languages if the semantics of running them under either environment
aren't going to be the same.  It seems like it would artificially constrain
people working on the internals of both languages.

Skip



From gnat@oreilly.com  Tue Jul 31 20:31:01 2001
From: gnat@oreilly.com (Nathan Torkington)
Date: Tue, 31 Jul 2001 13:31:01 -0600
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
References: <20010730051831.B1122@thyrsus.com>
 
 <15206.51329.561652.565480@beluga.mojam.com>
 <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
 <15206.61296.958360.72700@beluga.mojam.com>
 <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
Message-ID: <15207.1909.395000.123189@gargle.gargle.HOWL>

Guido van Rossum writes:
> Yeah, but the runtime could offer a choice of data types -- for Python
> code the constants table would contain Python ints and strings etc., for
> Perl code it would contain Perl string-number objects.  Maybe.

A perl6 value has a vtable, essentially an array of function pointers
which comprises the standard operations on that value.  I talked to
Dan (the perl6 internals guy, dan@sidhe.org) about an impedance
mismatch between Perl and Python data types, and he pointed out that
you can have Perl values and Python values, each with their own
semantics, simply by having separate vtables (and thus separate
functions to implement the behaviour of those types).  Code can work
with either type because the type carries around (in its vtable) the
knowledge of how it should behave.
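The dispatch scheme Nat describes can be sketched in a few lines of Python; the Value class and vtable dicts below are invented purely to illustrate the idea, not anything from the perl6 internals:

```python
# Illustrative sketch of the vtable idea: each value carries a table of
# operations, so one generic "add" dispatches to per-language behaviour.
# All names here are hypothetical.
class Value:
    def __init__(self, data, vtable):
        self.data, self.vtable = data, vtable

    def add(self, other):
        # The value itself knows how addition should behave.
        return self.vtable["add"](self, other)

# Python-style strings: '+' concatenates.
python_str_vtable = {"add": lambda a, b: Value(a.data + b.data, a.vtable)}
# Perl-style scalars: '+' imposes numeric context.
perl_scalar_vtable = {"add": lambda a, b: Value(int(a.data) + int(b.data),
                                                a.vtable)}

py = Value("1", python_str_vtable).add(Value("2", python_str_vtable))
pl = Value("1", perl_scalar_vtable).add(Value("2", perl_scalar_vtable))
assert py.data == "12"   # Python semantics
assert pl.data == 3      # Perl semantics, same generic operation
```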

Feel free to grill Dan about these things if you want.

Nat




From esr@thyrsus.com  Tue Jul 31 09:14:43 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 31 Jul 2001 04:14:43 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15207.1393.232974.785433@beluga.mojam.com>; from skip@pobox.com on Tue, Jul 31, 2001 at 02:22:25PM -0500
References: <20010730051831.B1122@thyrsus.com>  <15206.51329.561652.565480@beluga.mojam.com> <200107311708.NAA16497@cj20424-a.reston1.va.home.com> <15206.61296.958360.72700@beluga.mojam.com> <200107311900.PAA17062@cj20424-a.reston1.va.home.com> <15207.1393.232974.785433@beluga.mojam.com>
Message-ID: <20010731041443.A26075@thyrsus.com>

Skip Montanaro :
> 
>     Guido> Yeah, but the runtime could offer a choice of data types -- for
>     Guido> Python code the constants table would contain Python ints and
>     Guido> strings etc., for Perl code it would contain Perl string-number
>     Guido> objects.  Maybe.
> 
> So I could give a code object generated by the Python compiler to the Perl
> runtime and get different results than if it was executed by the Python
> environment?

No, I don't think that's what Guido is saying.  He and I are both imagining
a *single* runtime, but with some type-specific opcodes that are generated
only by Perl and some only generated by Python.

> Perhaps it's time for Eric to chime in again and tell us what he really has
> in mind.  I can't see the utility in having the same set of opcodes for the
> two languages if the semantics of running them under either environment
> aren't going to be the same.  It seems like it would artificially constrain
> people working on the internals of both languages.

You're right.

What I have in mind starts with a common opcode interpreter, perhaps
based on the Python VM but with extended opcodes where Perl type
semantics don't match, and a common callout mechanism to C-level
runtime libraries linked to the opcode interpreter.

In the conservative version of this vision, Perl and Python have
different runtimes dynamically linked to an instance of the same
opcode interpreter.  Memory allocation/GC and scheduling/threading are
handled inside the opcode interpreter but the OS and environment
binding is (mostly) in the libraries.

Things Python would bring to this party: our serious-cool GC, our 
C extension/embedding system (*much* nicer than XS).  Things Perl would
bring: blazingly fast regexps, taint, flexitypes, references. 

In the radical version, the Perl and Python runtimes merge and the 
differences in semantics are implemented by compiling different wrapper
sequences of opcodes around the library callouts.  At this point we're
doing something competitive with Microsoft's CLR.

My proposed work plan is:

1. Separate the Python VM from the Python compiler.  Initially it's
   OK if they still communicate by hard linkage but that will change
   later.

2. Build the Parrot VM out from the Python VM by adding the minimum
   number of Perliferous opcodes.

3. Start building the Perl runtime on top of that, re-using as much
   of the Python runtime as possible to save effort.
-- 
		Eric S. Raymond

Every election is a sort of advance auction sale of stolen goods. 
	-- H.L. Mencken 


From m@moshez.org  Tue Jul 31 21:10:50 2001
From: m@moshez.org (Moshe Zadka)
Date: Tue, 31 Jul 2001 23:10:50 +0300
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
References: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>, <20010730051831.B1122@thyrsus.com> 
 <15206.51329.561652.565480@beluga.mojam.com>
Message-ID: 

On Tue, 31 Jul 2001, Guido van Rossum  wrote:

> Actually, this may not be as big a deal as I thought before.  The PVM
> doesn't have a lot of knowledge about types built into its instruction
> set.  It knows a bit about classes, lists, dicts, but not e.g. about
> ints and strings.  The opcodes are mostly very abstract: BINARY_ADD etc.

PUSH "1"
PUSH "2"
BINARY_ADD

In Python that gives "12". In Perl that gives 3.
Unless you suggest a PERL_BINARY_ADD and a PYTHON_BINARY_ADD, I 
don't see how you can get around these things.
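For what it's worth, the Python side of this is easy to check with the stdlib dis module: the add opcode is generic, and the concatenation behaviour comes entirely from the operands (the opcode is spelled BINARY_ADD in interpreters of this era; newer CPythons spell it BINARY_OP):

```python
import dis

# One generic add opcode; the semantics live in the operand types,
# not in the instruction stream.
dis.dis(compile('a + b', '<example>', 'eval'))

# At run time the operand types decide the result:
assert "1" + "2" == "12"   # string operands: concatenation
assert 1 + 2 == 3          # numeric operands: addition
```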

-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE
Learn Python! http://www.ibiblio.org/obp/thinkCSpy


From mclay@nist.gov  Tue Jul 31 08:27:11 2001
From: mclay@nist.gov (Michael McLay)
Date: Tue, 31 Jul 2001 03:27:11 -0400
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <3B66DE8C.C9C62012@lemburg.com>
References: <0107301106520A.02216@fermi.eeel.nist.gov> <01073023115207.02466@fermi.eeel.nist.gov> <3B66DE8C.C9C62012@lemburg.com>
Message-ID: <01073103271101.02004@fermi.eeel.nist.gov>

On Tuesday 31 July 2001 12:36 pm, M.-A. Lemburg wrote:

> I'd suggest to follow the rules for the SQL definitions
> of DECIMAL(,).

> Well, there are several options. I support that the IBM paper
> on decimal types has good hints as to what the type should do.
> Again, SQL is probably a good source for inspiration too, since
> it deals with decimals a lot.

Ok, I know about the IBM paper.  Is there an online document on the SQL 
semantics that can be referenced in the PEP?

> I see, the small 'b' still looks funny to me though. Wouldn't
> 1.23f and 25i be more intuitive ?

I originally used 'f' for both the integer and the float.  The use of 'b' was 
suggested by Guido. There were two reasons not to use 'i' for integers.  The 
first has to do with how the tokenizer works.  It doesn't distinguish 
between float and int when the token string is passed to parsenumber().  Both 
float and int are processed by the same function.  I could have got around 
this problem by having the switch statement in parsenumber recognize both 'i' 
and 'f', but there is another problem with using 'i'.  25i would be 
confusing for someone trying to use imaginary numbers.  If they 
accidentally typed 25i instead of 25j they would get an integer instead of an 
imaginary number.  The error might not be detected since 3.0 + 4i would 
evaluate properly.
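The typo scenario is easy to demonstrate with the existing 'j' suffix (the 25i literal is of course hypothetical):

```python
# The 'j' suffix makes an imaginary literal; a hypothetical 'i' suffix
# for integers would silently change the meaning of a typo like 25i vs 25j.
assert 25j == complex(0, 25)
assert 3.0 + 4j == complex(3.0, 4.0)

# If 25i had produced the plain integer 25, the mistyped expression would
# still evaluate "properly", masking the error:
assert 3.0 + 4 == 7.0
```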

> > > I'd rather have this explicit in the sense that you define which
> > > assumptions will be made and what issues arise (rounding, truncation,
> > > loss of precision, etc.).
> >
> > Can you give an example of how this might be implemented.
>
> You would typically first coerce the types to the "larger"
> type, e.g. float + decimal -> float + float -> float, so
> you'd only have to document how the conversion is done and
> which accuracy to expect.

I would be concerned about the float + decimal automatically generating a 
float.  Would it generate an error message if the pedantic flag was set?  
Would it generate a warning in safe mode?

Also, why do you consider a float to be a "larger" value type than decimal?  
Do you mean that a float is less precise?


From gmcm@hypernet.com  Tue Jul 31 21:27:29 2001
From: gmcm@hypernet.com (Gordon McMillan)
Date: Tue, 31 Jul 2001 16:27:29 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: 
References: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
Message-ID: <3B66DC71.25881.1347D90B@localhost>

Moshe Zadka wrote:

> PUSH "1"
> PUSH "2"
> BINARY_ADD

But you get a pair of LOAD_CONSTs and a BINARY_ADD. 
Presumably a Perl "1" is a different object than a Python "1".

- Gordon


From thomas@xs4all.net  Tue Jul 31 21:32:59 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 22:32:59 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: 
Message-ID: <20010731223259.A626@xs4all.nl>

On Tue, Jul 31, 2001 at 11:10:50PM +0300, Moshe Zadka wrote:
> On Tue, 31 Jul 2001, Guido van Rossum  wrote:

> > Actually, this may not be as big a deal as I thought before.  The PVM
> > doesn't have a lot of knowledge about types built into its instruction
> > set.  It knows a bit about classes, lists, dicts, but not e.g. about
> > ints and strings.  The opcodes are mostly very abstract: BINARY_ADD etc.

> PUSH "1"
> PUSH "2"
> BINARY_ADD

> In Python that gives "12". In Perl that gives 3.
> Unless you suggest a PERL_BINARY_ADD and a PYTHON_BINARY_ADD, I 
> don't see how you can get around these things.

The Perl version of the compiled code could of course be

PUSH "1"
COERCE_INT
PUSH "2"
COERCE_INT
BINARY_ADD

for Perl's 

"1" + "2"

and 

PUSH "1"
PUSH "2"
BINARY_ADD

for its 

"1" . "2"

(or, in the case of variables instead of literals, an explicit
'COERCE_STRING' or whatever.)
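A toy stack machine makes the coercion idea concrete; the opcode names and dispatch below are illustrative only, not Parrot's actual design:

```python
# Toy stack interpreter for the COERCE_* sketch above.
def run(program):
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "COERCE_INT":       # Perl-style numeric context
            stack.append(int(stack.pop()))
        elif op == "BINARY_ADD":       # generic add: semantics from operands
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack.pop()

# Perl's  "1" + "2"  compiled with explicit coercions:
perl_add = [("PUSH", "1"), ("COERCE_INT",),
            ("PUSH", "2"), ("COERCE_INT",), ("BINARY_ADD",)]
# Perl's  "1" . "2"  (and Python's  "1" + "2"), no coercions:
concat = [("PUSH", "1"), ("PUSH", "2"), ("BINARY_ADD",)]

assert run(perl_add) == 3
assert run(concat) == "12"
```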

-- 
Thomas Wouters 

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mclay@nist.gov  Tue Jul 31 08:40:21 2001
From: mclay@nist.gov (Michael McLay)
Date: Tue, 31 Jul 2001 03:40:21 -0400
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <20010731163703.2F86E99C85@waltz.rahul.net>
References: <20010731163703.2F86E99C85@waltz.rahul.net>
Message-ID: <01073103402102.02004@fermi.eeel.nist.gov>

On Tuesday 31 July 2001 12:37 pm, Aahz Maruch wrote:
> Michael McLay wrote:

> > For addition, subtraction, and multiplication the results would be
> > exact with no rounding of the results.  For calculations that include
> > division, the number of digits in a non-terminating result will have to
> > be explicitly set.  Would it make sense for this to be defined by the
> > numbers used in the calculation?  Could this be set in the module or
> > could it be global for the application?
>
> This is why Cowlishaw et al require a full context for all operations.
> At one point I tried implementing things with the context being
> contained in the number rather than "global" (which actually means
> thread-global, but I'm probably punting on *that* bit for the moment),
> but Tim Peters persuaded me that sticking with the spec was the Right
> Thing until *after* the spec was fully implemented.
>
> After seeing the mess generated by PEP-238, I'm fervently in favor of
> sticking with external specs whenever possible.

I had originally expected the context for decimal calculations to be the 
module in which a statement is defined.  If a function defined in another 
module is called, the rules of that other module would be applied to that part 
of the calculation.  My expectations of how Python would work with decimal 
numbers don't seem to match what Guido said about his conversation with 
Tim, and what you said in this message.  

How can the rules for using decimals be stated so that a newbie can 
understand what they should expect to happen?  We could set a default 
precision of 17 digits and all calculations that were not exact would be 
rounded to 17 digits.  This would match how their calculator works.  I would 
think this would be the model with the least surprises.  For someone needing 
to be more precise, or less precise, how would this rule be modified?
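Python's later decimal module (which postdates this thread) adopted exactly this context model, so it can serve as a sketch of how a "default precision" rule would look to a user:

```python
from decimal import Decimal, getcontext

# Precision belongs to a context, not to the individual numbers.
getcontext().prec = 17            # the calculator-like default discussed
assert str(Decimal(1) / Decimal(7)) == "0.14285714285714286"

# More or less precision is a single assignment on the context:
getcontext().prec = 5
assert str(Decimal(1) / Decimal(7)) == "0.14286"
```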



From paulp@ActiveState.com  Tue Jul 31 21:48:45 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 31 Jul 2001 13:48:45 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>, <20010730051831.B1122@thyrsus.com> 
 <15206.51329.561652.565480@beluga.mojam.com> 
Message-ID: <3B6719AD.EAC715FA@ActiveState.com>

Moshe Zadka wrote:
> 
>...
> 
> PUSH "1"
> PUSH "2"
> BINARY_ADD
> 
> In Python that gives "12". In Perl that gives 3.
> Unless you suggest a PERL_BINARY_ADD and a PYTHON_BINARY_ADD, I
> don't see how you can get around these things.

I'm not endorsing the approach but I think the answer is:

PUSH PyString("1")
PUSH PyString("2")
BINARY_ADD

versus

PUSH PlString("1")
PUSH PlString("2")
BINARY_ADD

i.e. the operators are generic but the operand types vary across
languages. So you can completely unify the bytecodes or the types, but
trying to unify both seems impossible without changing the semantics of
one language or the other quite a bit.
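Paul's split can be sketched directly: one shared, type-agnostic add operation, with the divergence living in the operand classes (PyString and PlString here are illustrative stand-ins, not real APIs):

```python
# One generic opcode, two operand types with different semantics.
class PyString(str):
    pass  # inherits str.__add__: concatenation

class PlString(str):
    def __add__(self, other):      # Perl scalar '+': numeric context
        return int(self) + int(other)

def binary_add(a, b):
    # The shared, type-agnostic "BINARY_ADD" implementation.
    return a + b

assert binary_add(PyString("1"), PyString("2")) == "12"
assert binary_add(PlString("1"), PlString("2")) == 3
```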

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From dan@sidhe.org  Tue Jul 31 21:51:30 2001
From: dan@sidhe.org (Dan Sugalski)
Date: Tue, 31 Jul 2001 16:51:30 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731041443.A26075@thyrsus.com>
References: <15207.1393.232974.785433@beluga.mojam.com>
 <20010730051831.B1122@thyrsus.com>
 
 <15206.51329.561652.565480@beluga.mojam.com>
 <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
 <15206.61296.958360.72700@beluga.mojam.com>
 <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
 <15207.1393.232974.785433@beluga.mojam.com>
Message-ID: <5.1.0.14.0.20010731161946.02753210@24.8.96.48>

[Eric, could you forward this to python-dev if it doesn't show of its own 
accord? I'm not yet subscribed, so I don't know if it'll make it]

I should start with an apology for not being on python-dev when this 
started. Do please Cc me on anything, as I've not gotten on yet. (My 
subscription's caught in the mail, I guess... :)

At 04:14 AM 7/31/2001 -0400, Eric S. Raymond wrote:
>Skip Montanaro :
> >
> >     Guido> Yeah, but the runtime could offer a choice of data types -- for
> >     Guido> Python code the constants table would contain Python ints and
> >     Guido> strings etc., for Perl code it would contain Perl string-number
> >     Guido> objects.  Maybe.
> >
> > So I could give a code object generated by the Python compiler to the Perl
> > runtime and get different results than if it was executed by the Python
> > environment?
>
>No, I don't think that's what Guido is saying.  He and I are both imagining
>a *single* runtime, but with some type-specific opcodes that are generated
>only by Perl and some only generated by Python.

Odds are there won't even be a different set of opcodes. (Barring the 
possibility of the optimizer being able to *know* that an operation is 
guaranteed to be integer or float, and thus using special-purpose opcodes. 
And that's really an optimization, not a set of language-specific opcodes) 
The behaviour of data is governed by the data itself, so Python variables 
would have Python vtables attached to them guaranteeing Python behaviour, 
while perl ones would have perl vtables guaranteeing perl behaviour.

This was covered, more or less, by the chunks of the internals talk I 
didn't get to. Slides, for the interested, are at 
http://dev.perl.org/perl6/talks/. I'm not sure if there's enough info on 
the slides themselves to be clear--they were written to be talked around.

> > Perhaps it's time for Eric to chime in again and tell us what he really has
> > in mind.  I can't see the utility in having the same set of opcodes for the
> > two languages if the semantics of running them under either environment
> > aren't going to be the same.  It seems like it would artificially constrain
> > people working on the internals of both languages.
>
>You're right.
>
>What I have in mind starts with a common opcode interpreter, perhaps
>based on the Python VM but with extended opcodes where Perl type
>semantics don't match, and a common callout mechanism to C-level
>runtime libraries linked to the opcode interpreter.

I've snipped the rest here.

I don't think Parrot will be built off the Python interpreter. This isn't 
out of any NIH feelings or anything--I'm obligated to make it work for 
Perl, as that's the primary point. If we can make Python a primary point 
too that's keen, and something I *want*, but I do need to keep focused on perl.

Having said that, what I'm doing is stepping back from perl and trying, 
wherever possible, to make the runtime generic. If there's no reason to be 
perl specific I'm not, and so far that's not been a problem. (It actually 
makes life easier in a lot of ways, since we can then delegate the decision 
on how things are done to the variables involved, providing a default set 
of behaviours which the parser will end up determining anyway)

On some things I think I'm being a bit more vicious than, say, Python is by 
default. (For example, if extension code wants to hold on to a variable 
across a GC boundary it had darned well better register that fact with the 
interpreter, or it's going to find itself with trash) I'm not sure about 
the extension mechanism in general--I've not had a chance to look too 
closely at what Python does now, but I don't doubt that, at the C level, 
the differences between the languages will be pretty trivial and easily 
abstractable. Seeing what you folks have is on the list 'o things to do--I 
may well steal from it wholesale. :)

I expect there's a bunch of stuff I'm missing here, so if anyone wants to 
peg me with questions, go for it. (Cc me if they're going to the dev list 
please, at least until I'm sure I'm on) I really would like to see Parrot 
as a viable back end for Python--I think the joint development resources we 
could muster (possibly with the Ruby folks as well) could get us a VM for 
dynamically typed languages to rival the JVM/.NET for statically typed ones.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk



From thomas@xs4all.net  Tue Jul 31 21:54:45 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 22:54:45 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15207.1909.395000.123189@gargle.gargle.HOWL>
References: <20010730051831.B1122@thyrsus.com>  <15206.51329.561652.565480@beluga.mojam.com> <200107311708.NAA16497@cj20424-a.reston1.va.home.com> <15206.61296.958360.72700@beluga.mojam.com> <200107311900.PAA17062@cj20424-a.reston1.va.home.com> <15207.1909.395000.123189@gargle.gargle.HOWL>
Message-ID: <20010731225445.B626@xs4all.nl>

On Tue, Jul 31, 2001 at 01:31:01PM -0600, Nathan Torkington wrote:
> Guido van Rossum writes:
> > Yeah, but the runtime could offer a choice of data types -- for Python
> > code the constants table would contain Python ints and strings etc., for
> > Perl code it would contain Perl string-number objects.  Maybe.

> A perl6 value has a vtable, essentially an array of function pointers
> which comprises the standard operations on that value.  I talked to
> Dan (the perl6 internals guy, dan@sidhe.org) about an impedance
> mismatch between Perl and Python data types, and he pointed out that
> you can have Perl values and Python values, each with their own
> semantics, simply by having separate vtables (and thus separate
> functions to implement the behaviour of those types).  Code can work
> with either type because the type carries around (in its vtable) the
> knowledge of how it should behave.

Python objects all have vtables too (though they're structs, not arrays...
I'm not sure why you'd use arrays; check the way Python uses them, you can
do just about anything you want with them, including growing them without
breaking binary compatibility, due to the fact Python never memmoves/copies)
but that wouldn't solve the problem. The problem isn't that the VM wouldn't
know what to do with the various types -- it's absolutely no problem to make a
Python object that behaves like a Perl scalar, or a Perl hash, including the
auto-converting bits...

The problem is that we'd end up with two different sets of types...
Dicts/hashes could be merged, though Perl6 will have to decide if it still
wants to auto-stringify the keys (Python dicts can hold any hashable object
as key) and arrays could possibly be too, but scalars are a different type.
You basically lose the interchangeability benefit if Perl6 code all works
with the 'Scalar' type, but Python code just uses the distinct
int/string/class-instance...

But now that I think about it, this might not be a big problem after all. I
assume Perl6 will always convert to fit the operation, like Perl5 does.
It'll just have to learn to handle a few more objects, and most notably
user-defined types and extension types. Python C code already does things
like 'PyObject_ToInt' to convert a Python value to a C value it can work
with, or just uses the PyObject_ (or PyMapping_, etc)
API to manipulate objects. Python code wouldn't notice the difference unless
it did type checks, and the Perl6 types could be made siblings of the Python
types to make it pass those, too. We already have the 8-bit and 16-bit
strings.

About the only *real* problem I see with that is getting the whole farm of
mexican jumping beans to figure-skate in unison... It'll be an interesting
experience, with a lot of slippery falls and just-in-time recovering... not
to mention quite a bit of ego-massaging :-) But I think it's just a matter
of typing code and taking the time, and forgetting about optimizing the code
for the first couple of years.

-- 
Thomas Wouters 

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From aahz@rahul.net  Tue Jul 31 22:07:02 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Tue, 31 Jul 2001 14:07:02 -0700 (PDT)
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <01073103402102.02004@fermi.eeel.nist.gov> from "Michael McLay" at Jul 31, 2001 03:40:21 AM
Message-ID: <20010731210702.A778D99C82@waltz.rahul.net>

Michael McLay wrote:
> 
> I had originally expected the context for decimal calculations to be
> the module in which a statement is defined.  If a function defined
> in another module is called, the rules of that other module would be
> applied to that part of the calculation.  My expectations of how
> Python would work with decimal numbers don't seem to match what
> Guido said about his conversation with Tim, and what you said in this
> message.
>
> How can the rules for using decimals be stated so that a newbie can
> understand what they should expect to happen?  We could set a default
> precision of 17 digits and all calculations that were not exact would
> be rounded to 17 digits.  This would match how their calculator works.
> I would think this would be the model with the least surprises.  For
> someone needing to be more precise, or less precise, how would this
> rule be modified?

I intend to have more discussions with Cowlishaw once I finish
implementing his spec, but I suspect his answer will be that whoever
calls the module should set the precision.
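
For concreteness, "whoever calls the module sets the precision" is roughly
the context model Cowlishaw's spec describes, and it is how the decimal
module that eventually grew out of this work behaves (sketched here with
that later API; the names are the stdlib's, not part of this discussion):

```python
from decimal import Decimal, getcontext, localcontext

# The caller, not the library author, decides how much precision is kept.
getcontext().prec = 17                    # program-wide default
third = Decimal(1) / Decimal(3)           # rounded to 17 significant digits

# A caller can temporarily use a different precision for one calculation.
with localcontext() as ctx:
    ctx.prec = 5
    rough = Decimal(1) / Decimal(3)       # rounded to 5 significant digits
```

The point is that precision is a property of the calling context, not of the
numbers themselves.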
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From niemeyer@conectiva.com  Tue Jul 31 22:09:54 2001
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Tue, 31 Jul 2001 18:09:54 -0300
Subject: [Python-Dev] Info documentation
Message-ID: <20010731180954.J19610@tux.distro.conectiva>

Hello!

I've taken the info files somebody sent to the python-list and
included them in Conectiva Linux's Python package. People found the
documentation very practical to use in this format. Would it be
possible to have this format built just like the others for
version 2.2?

Thanks!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]


From esr@thyrsus.com  Tue Jul 31 10:18:08 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 31 Jul 2001 05:18:08 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731225445.B626@xs4all.nl>; from thomas@xs4all.net on Tue, Jul 31, 2001 at 10:54:45PM +0200
References: <20010730051831.B1122@thyrsus.com>  <15206.51329.561652.565480@beluga.mojam.com> <200107311708.NAA16497@cj20424-a.reston1.va.home.com> <15206.61296.958360.72700@beluga.mojam.com> <200107311900.PAA17062@cj20424-a.reston1.va.home.com> <15207.1909.395000.123189@gargle.gargle.HOWL> <20010731225445.B626@xs4all.nl>
Message-ID: <20010731051808.A27187@thyrsus.com>

Thomas Wouters :
> About the only *real* problem I see with that is getting the whole farm of
> mexican jumping beans to figure-skate in unison... It'll be an interesting
> experience, with a lot of slippery falls and just-in-time recovering... not
> to mention quite a bit of ego-massaging :-) But I think it's just a manner
> of typing code and taking the time, and forget about optimizing the code the
> first couple of years.

This is just about exactly how I see it, too.  The big problem isn't
any of the technical challenges -- the discussion so far indicates
these are surmountable, and in fact may be less daunting than many
of us originally assumed.  The big problem will be summoning the
political will to make the right commitments and the right
compromises.

Making this work is going to take strong leadership from Larry and
Guido.  We're laying some of the technical groundwork now.  More will
have to be done.  But I think the key moment, if it happens, will be
the one at which Guido and Larry, each flanked by their three or four
chief lieutenants, shake hands for the cameras and issue a joint ukase
to their tribes.

Tim, hosting that meeting will be your job, of course :-).
-- 
		Eric S. Raymond

"Those who make peaceful revolution impossible 
will make violent revolution inevitable."
	-- John F. Kennedy


From tim.one@home.com  Tue Jul 31 23:19:39 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 31 Jul 2001 18:19:39 -0400
Subject: [Python-Dev] Plan to merge descr-branch into trunk
Message-ID: 

Unless somebody raises a killer objection over the next ~24 hours, I plan to
merge the descr-branch back into the trunk Wednesday PM (EDT), thus ending
descr-branch as a distinct line of Python development.

Since it would be intractably hard to roll back the code changes, this
represents a solid commitment to Guido's type/class work for 2.2 final.
There may be objections on those grounds.  If so, good luck selling them to
Guido .

I don't have any worries about the mechanics of the merge, so you shouldn't
either.  We've been very conscientious over the last month+ about merging
trunk changes into descr-branch frequently, and of course I'll do that one
last time before going the other direction.

all's-well-that-ends-ly y'rs  - tim



From esr@thyrsus.com  Tue Jul 31 14:59:41 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 31 Jul 2001 09:59:41 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: ; from ping@lfw.org on Tue, Jul 31, 2001 at 06:12:55PM -0700
References: <20010731041443.A26075@thyrsus.com> 
Message-ID: <20010731095941.E1708@thyrsus.com>

Ka-Ping Yee :
> On Tue, 31 Jul 2001, Eric S. Raymond wrote:
> > Things Python would bring to this party: our serious-cool GC, our
> > C extension/embedding system (*much* nicer than XS).  Things Perl would
> > bring: blazingly fast regexps, taint, flexitypes, references.
> 
> I don't really understand the motivation.  Do we want any of those things?

No, but we want to be able to interoperate with Perl and, if possible,
have just one back end on which efforts to do things like native-code
compilation can be concentrated.
-- 
		Eric S. Raymond

The common argument that crime is caused by poverty is a kind of
slander on the poor.
	-- H. L. Mencken


From tim.one at home.com  Sun Jul  1 03:58:29 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 30 Jun 2001 21:58:29 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3E4487.40054EAE@ActiveState.com>
Message-ID: 

[Paul Prescod]
> "The Energy is the mass of the object times the speed of light times
> two."

[David Ascher]
> Actually, it's "squared", not times two.  At least in my universe =)

This is something for Guido to Pronounce on, then.  Who's going to write the
PEP?  The threat of nuclear war seems almost laughable in Paul's universe,
so it's certainly got attractions.  OTOH, it's got to be a lot colder too.

energy-will-do-what-guido-tells-it-to-do-ly y'rs  - tim




From paulp at ActiveState.com  Sun Jul  1 05:59:02 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sat, 30 Jun 2001 20:59:02 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com>
Message-ID: <3B3EA006.14882609@ActiveState.com>

David Ascher wrote:
> 
> > "The Energy is the mass of the object times the speed of light times
> > two."
> 
> Actually, it's "squared", not times two.  At least in my universe =)

Pedant. Next you're going to claim that these silly equations effect my
life somehow.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From paulp at ActiveState.com  Sun Jul  1 06:04:49 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sat, 30 Jun 2001 21:04:49 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com>
Message-ID: <3B3EA161.1375F74C@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
> 
> The term "character" in Python should really only be used for
> the 8-bit strings. 

Are we going to change chr() and unichr() to one_element_string() and
unicode_one_element_string()?

u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
character. No Python user will find that confusing no matter how Unicode
knuckle-dragging, mouth-breathing, wife-by-hair-dragging they are.

> In Unicode a "character" can mean any of:

Mark Davis said that "people" can use the word to mean any of those
things. He did not say that it was imprecisely defined in Unicode.
Nevertheless I'm not using the Unicode definition any more than our
standard library uses an ancient Greek definition of integer. Python has
a concept of integer and a concept of character.

> >     It has been proposed that there should be a module for working
> >     with UTF-16 strings in narrow Python builds through some sort of
> >     abstraction that handles surrogates for you. If someone wants
> >     to implement that, it will be another PEP.
> 
> Uhm, narrow builds don't support UTF-16... it's UCS-2 which
> is supported (basically: store everything in range(0x10000));
> the codecs can map code points to surrogates, but it is solely
> their responsibility and the responsibility of the application
> using them to take care of dealing with surrogates.

The user can view the data as UCS-2, UTF-16, Base64, ROT-13, XML, ....
Just as we have a base64 module, we could have a UTF-16 module that
interprets the data in the string as UTF-16 and does surrogate
manipulation for you.
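
For concreteness, the surrogate arithmetic such a module would wrap is
tiny (a sketch with invented names, not a proposal for the actual API):

```python
def split_surrogates(cp):
    """Split a non-BMP code point into a UTF-16 surrogate pair."""
    assert 0x10000 <= cp <= 0x10FFFF
    cp -= 0x10000
    return 0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)

def join_surrogates(high, low):
    """Combine a UTF-16 surrogate pair back into one code point."""
    assert 0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

Everything else such a module might offer (character-based indexing,
iteration, slicing) builds on these two operations.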

Anyhow, if any of those is the "real" encoding of the data, it is
UTF-16. After all, if the codec reads in four non-BMP characters in,
let's say, UTF-8, we represent them as 8 narrow-build Python characters.
That's the definition of UTF-16! But it's easy enough for me to take
that word out so I will.

>...
> Also, the module will be useful for both narrow and wide builds,
> since the notion of an encoded character can involve multiple code
> points. In that sense Unicode is always a variable length
> encoding for characters and that's the application field of
> this module.

I wouldn't advise that you do all different types of normalization in a
single module but I'll wait for your PEP.

> Here's the adjusted text:
> 
>      It has been proposed that there should be a module for working
>      with Unicode objects using character-, word- and line- based
>      indexing. The details of the implementation is left to
>      another PEP.
 
     It has been proposed that there should be a module that handles
     surrogates in narrow Python builds for programmers. If someone 
     wants to implement that, it will be another PEP. It might also be 
     combined with features that allow other kinds of character-, 
     word- and line- based indexing.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From DavidA at ActiveState.com  Sun Jul  1 08:09:40 2001
From: DavidA at ActiveState.com (David Ascher)
Date: Sat, 30 Jun 2001 23:09:40 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com>
Message-ID: <3B3EBEA4.3EC84EAF@ActiveState.com>

Paul Prescod wrote:
> 
> David Ascher wrote:
> >
> > > "The Energy is the mass of the object times the speed of light times
> > > two."
> >
> > Actually, it's "squared", not times two.  At least in my universe =)
> 
> Pedant. Next you're going to claim that these silly equations effect my
> life somehow.

Although one stretch the argument to say that the equations _effect_
your life, I'd limit the claim to stating that they _affect_ your life. 

pedantly y'rs,

--dr david



From paulp at ActiveState.com  Sun Jul  1 08:15:46 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sat, 30 Jun 2001 23:15:46 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com>
Message-ID: <3B3EC012.A3A05E64@ActiveState.com>

David Ascher wrote:
> 
> Paul Prescod wrote:
> >
> > David Ascher wrote:
> > >
> > > > "The Energy is the mass of the object times the speed of light times
> > > > two."
> > >
> > > Actually, it's "squared", not times two.  At least in my universe =)
> >
> > Pedant. Next you're going to claim that these silly equations effect my
> > life somehow.
> 
> Although one stretch the argument to say that the equations _effect_
              ^               
might    -----

> your life, I'd limit the claim to stating that they _affect_ your life.

And you just bought such a shiny, new glass, house. Pity.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From nhodgson at bigpond.net.au  Sun Jul  1 15:00:15 2001
From: nhodgson at bigpond.net.au (Neil Hodgson)
Date: Sun, 1 Jul 2001 23:00:15 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>
Message-ID: <00dd01c1022d$c61e4160$0acc8490@neil>

Paul Prescod:


   The problem I have with this PEP is that it is a compile time option
which makes it hard to work with both 32 bit and 16 bit strings in one
program. Can not the 32 bit string type be introduced as an additional type?

> Are we going to change chr() and unichr() to one_element_string() and
> unicode_one_element_string()
>
> u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> character.

   This wasn't usefully true in the past for DBCS strings and is not the
right way to think of either narrow or wide strings now. The idea that
strings are arrays of characters gets in the way of dealing with many
encodings and is the primary difficulty in localising software for Japanese.
Iteration through the code units in a string is a problem waiting to bite
you and string APIs should encourage behaviour which is correct when faced
with variable width characters, both DBCS and UTF style. Iteration over
variable width characters should be performed in a way that preserves the
integrity of the characters. M.-A. Lemburg's proposed set of iterators could
be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to
provide for the various intended string uses such as "for c in
s.inVisualOrder()" reversing the receipt of right-to-left substrings.

   Neil





From guido at digicool.com  Sun Jul  1 15:44:29 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sun, 01 Jul 2001 09:44:29 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Sun, 01 Jul 2001 23:00:15 +1000."
             <00dd01c1022d$c61e4160$0acc8490@neil> 
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>  
            <00dd01c1022d$c61e4160$0acc8490@neil> 
Message-ID: <200107011344.f61DiTM03548@odiug.digicool.com>

> 
> 
>    The problem I have with this PEP is that it is a compile time option
> which makes it hard to work with both 32 bit and 16 bit strings in one
> program. Can not the 32 bit string type be introduced as an additional type?

Not without an outrageous amount of additional coding (every place in
the code that currently uses PyUnicode_Check() would have to be
bifurcated in a 16-bit and a 32-bit variant).

I doubt that the desire to work with both 16- and 32-bit characters in
one program is typical for folks using Unicode -- that's mostly
limited to folks writing conversion tools.  Python will offer the
necessary codecs so you shouldn't have this need very often.

You can use the array module to manipulate 16- and 32-bit arrays, and
you can use the various Unicode encodings to do the necessary
encodings.
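
In later Python spellings, that array-plus-codec route looks like this
(the method names are from modern Python, not what was available in 2001;
the idea, not the exact API, is the point):

```python
from array import array

# Four UTF-16 code units -- one surrogate pair plus two BMP characters --
# manipulated as raw 16-bit values, then decoded via the codec machinery.
units = array('H', [0xD801, 0xDC00, 0x0041, 0x0042])
text = units.tobytes().decode('utf-16-le')   # three characters: U+10400, A, B
```

The array module gives you raw code-unit access; the codec turns the
units back into a string with the surrogates resolved.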

> > u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> > character.
> 
>    This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea that
> strings are arrays of characters gets in the way of dealing with many
> encodings and is the primary difficulty in localising software for Japanese.

Can you explain the kind of problems encountered in some more detail?

> Iteration through the code units in a string is a problem waiting to bite
> you and string APIs should encourage behaviour which is correct when faced
> with variable width characters, both DBCS and UTF style.

But this is not the Unicode philosophy.  All the variable-length
character manipulation is supposed to be taken care of by the codecs,
and then the application can deal in arrays of characters.
Alternatively, the application can deal in opaque objects representing
variable-length encodings, but then it should be very careful with
concatenation and even more so with slicing.

> Iteration over
> variable width characters should be performed in a way that preserves the
> integrity of the characters. M.-A. Lemburg's proposed set of iterators could
> be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to
> provide for the various intended string uses such as "for c in
> s.inVisualOrder()" reversing the receipt of right-to-left substrings.

I think it's a good idea to provide a set of higher-level tools as
well.  However nobody seems to know what these higher-level tools
should do yet.  PEP 261 is specifically focused on getting the
lower-level foundations right (i.e. the objects that represent arrays
of code units), so that the authors of higher level tools will have a
solid base.  If you want to help author a PEP for such higher-level
tools, you're welcome!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From loewis at informatik.hu-berlin.de  Sun Jul  1 15:52:58 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 1 Jul 2001 15:52:58 +0200 (MEST)
Subject: [Python-Dev] Support for "wide" Unicode characters
Message-ID: <200107011352.PAA27645@pandora.informatik.hu-berlin.de>

> The problem I have with this PEP is that it is a compile time option
> which makes it hard to work with both 32 bit and 16 bit strings in
> one program.

Can you elaborate why you think this is a problem?

> Can not the 32 bit string type be introduced as an additional type?

Yes, but not just "like that". You'd have to define an API for
creating values of this type, you'd have to teach all functions which
ought to accept it to process it, you'd have to define conversion
operations and all that: In short, you'd have to go through all the
trouble that introduction of the Unicode type gave us once again.
Also, I cannot see any advantages in introducing yet another type.

Implementing this PEP is straightforward, with almost no visible
effect on Python programs.

People have suggested making it a run-time decision, having the
internal representation switch on demand, but that would create an API
nightmare for C code that has to access such values.

> u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> character.

>  This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea
> that strings are arrays of characters gets in the way of dealing
> with many encodings and is the primary difficulty in localising
> software for Japanese.

While I don't know much about localising software for Japanese (*), I
agree that 'u[i] is a character' isn't useful to say in many cases. If
this is the old Python string type, I'd much prefer calling u[i] a
'byte'.

Regards,
Martin

(*) Methinks that the primary difficulty still is translating all the
documentation, and messages. Actually, keeping the translations
up-to-date is even more challenging.



From aahz at rahul.net  Sun Jul  1 16:19:41 2001
From: aahz at rahul.net (Aahz Maruch)
Date: Sun, 1 Jul 2001 07:19:41 -0700 (PDT)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3EC012.A3A05E64@ActiveState.com> from "Paul Prescod" at Jun 30, 2001 11:15:46 PM
Message-ID: <20010701141941.A323099C80@waltz.rahul.net>

Paul Prescod wrote:
> David Ascher wrote:
>> Paul Prescod wrote:
>>> David Ascher wrote:
>>>>>
>>>>> "The Energy is the mass of the object times the speed of light times
>>>>> two."
>>>>
>>>> Actually, it's "squared", not times two.  At least in my universe =)
>>>
>>> Pedant. Next you're going to claim that these silly equations effect my
>>> life somehow.
>> 
>> Although one stretch the argument to say that the equations _effect_
>               ^               
> might    -----
> 
>> your life, I'd limit the claim to stating that they _affect_ your life.
> 
> And you just bought such a shiny, new glass, house. Pity.

All speeling falmes contain at least one erorr.
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.



From just at letterror.com  Sun Jul  1 16:43:08 2001
From: just at letterror.com (Just van Rossum)
Date: Sun,  1 Jul 2001 16:43:08 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <200107011344.f61DiTM03548@odiug.digicool.com>
Message-ID: <20010701164315-r01010600-c2d5b07d@213.84.27.177>

Guido van Rossum wrote:

> > 
> > 
> >    The problem I have with this PEP is that it is a compile time option
> > which makes it hard to work with both 32 bit and 16 bit strings in one
> > program. Can not the 32 bit string type be introduced as an additional type?
> 
> Not without an outrageous amount of additional coding (every place in
> the code that currently uses PyUnicode_Check() would have to be
> bifurcated in a 16-bit and a 32-bit variant).

Alternatively, a Unicode object could *internally* be either 8, 16 or 32 bits
wide (to be clear: not per character, but per string). Also a lot of work, but
it'll be a lot less wasteful.

> I doubt that the desire to work with both 16- and 32-bit characters in
> one program is typical for folks using Unicode -- that's mostly
> limited to folks writing conversion tools.  Python will offer the
> necessary codecs so you shouldn't have this need very often.

Not a lot of people will want to work with 16 or 32 bit chars directly, but I
think a less wasteful solution to the surrogate pair problem *will* be desired
by people. Why use 32 bits for all strings in a program when only a tiny
percentage actually *needs* more than 16? (Or even 8...)
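
A sketch of how the narrowest sufficient storage might be picked per
string (illustrative only; the function name is invented):

```python
def storage_width(s):
    """Pick the narrowest per-string storage: 1, 2 or 4 bytes per character."""
    biggest = max(map(ord, s), default=0)
    if biggest < 0x100:       # pure Latin-1: 8 bits is enough
        return 1
    if biggest < 0x10000:     # BMP only: 16 bits is enough
        return 2
    return 4                  # non-BMP characters present: need 32 bits
```

Strings that never leave the BMP pay nothing for wide-character support;
only the tiny minority containing non-BMP characters use 4 bytes each.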

> > Iteration through the code units in a string is a problem waiting to bite
> > you and string APIs should encourage behaviour which is correct when faced
> > with variable width characters, both DBCS and UTF style.
> 
> But this is not the Unicode philosophy.  All the variable-length
> character manipulation is supposed to be taken care of by the codecs,
> and then the application can deal in arrays of characteres.

Right: this is the way it should be.

My difficulty with PEP 261 is that I'm afraid few people will actually enable
32-bit support (*what*?! all unicode strings become 32 bits wide? no way!),
therefore making programs non-portable in very subtle ways.

Just



From DavidA at ActiveState.com  Sun Jul  1 19:13:30 2001
From: DavidA at ActiveState.com (David Ascher)
Date: Sun, 01 Jul 2001 10:13:30 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com> <3B3EC012.A3A05E64@ActiveState.com>
Message-ID: <3B3F5A3A.A88B54B2@ActiveState.com>

Paul: 
> And you just bought such a shiny, new glass, house. Pity.

What kind of comma placement is that?

--david



From paulp at ActiveState.com  Sun Jul  1 20:08:10 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 11:08:10 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> <00dd01c1022d$c61e4160$0acc8490@neil>
Message-ID: <3B3F670A.B5396D61@ActiveState.com>

Neil Hodgson wrote:
> 
> Paul Prescod:
> 
> 
>    The problem I have with this PEP is that it is a compile time option
> which makes it hard to work with both 32 bit and 16 bit strings in one
> program. Can not the 32 bit string type be introduced as an additional type?

The two solutions are not mutually exclusive. If you (or someone)
supplies a 32-bit type and Guido accepts it, then the compile option
might fall into disuse. But this solution was chosen because it is much
less work. Really though, I think that having 16-bit and 32-bit types is
extra confusion for very little gain. I would much rather have a single
space-efficient type that hid the details of its implementation. But
nobody has volunteered to code it and Guido might not accept it even if
someone did.

>...
>    This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea that
> strings are arrays of characters gets in the way of dealing with many
> encodings and is the primary difficulty in localising software for Japanese.

The whole benefit of moving to 32-bit character strings is to allow
people to think of strings as arrays of characters. Forcing them to
consider variable-length encodings is precisely what we are trying to
avoid.

> Iteration through the code units in a string is a problem waiting to bite
> you and string APIs should encourage behaviour which is correct when faced
> with variable width characters, both DBCS and UTF style. Iteration over
> variable width characters should be performed in a way that preserves the
> integrity of the characters. 

On wide Python builds there is no such thing as variable width Unicode
characters. It doesn't make sense to combine two 32-bit characters to
get a 64-bit one. On narrow Python builds you might want to treat a
surrogate pair as a single character but I would strongly advise against
it. If you want wide characters, move to a wide build. Even if a narrow
build is more space efficient, you'll lose a ton of performance
emulating wide characters in Python code.

> ... M.-A. Lemburg's proposed set of iterators could
> be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to
> provide for the various intended string uses such as "for c in
> s.inVisualOrder()" reversing the receipt of right-to-left substrings.

A floor wax and a dessert topping. <0.5 wink>

I don't think that the average Python programmer would want
s.asCharacters('utf-8') when they already have s.decode('utf-8'). We
decided a long time ago that the model for standard users would be
fixed-length (1!), abstract characters. That's the way Python's Unicode
subsystem has always worked.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From paulp at ActiveState.com  Sun Jul  1 20:19:17 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 11:19:17 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010701164315-r01010600-c2d5b07d@213.84.27.177>
Message-ID: <3B3F69A5.D7CE539D@ActiveState.com>

Just van Rossum wrote:
> 
> Guido van Rossum wrote:
> 
> > > 
> > >
> > >    The problem I have with this PEP is that it is a compile time option
> > > which makes it hard to work with both 32 bit and 16 bit strings in one
> > > program. Can not the 32 bit string type be introduced as an additional type?
> >
> > Not without an outrageous amount of additional coding (every place in
> > the code that currently uses PyUnicode_Check() would have to be
> > bifurcated in a 16-bit and a 32-bit variant).
> 
> Alternatively, a Unicode object could *internally* be either 8, 16 or 32 bits
> wide (to be clear: not per character, but per string). Also a lot of work, but
> it'll be a lot less wasteful.

I hope this is where we end up one day. But the compile-time option is
better than where we are today. Even though PEP 261 is not my favorite
solution, it buys us a couple of years of wait-and-see time.

Consider that computer memory is growing much faster than textual data.
People's text processing techniques get more and more "wasteful" because
it is now almost always possible to load the entire "text" into memory
at once. I remember how some text editors used to boast that they only
loaded your text "on demand". 

Maybe so much data will be passed to us from UCS-4 APIs that trying to
"compress it" will actually be inefficient.

Maybe two years from now Guido will make UCS-4 the default and only a
tiny minority will notice or care.

> ...
> My difficulty with PEP 261 is that I'm afraid few people will actually enable
> 32-bit support (*what*?! all unicode strings become 32 bits wide? no way!),
> therefore making programs non-portable in very subtle ways.

It really depends on what the default build option is.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From paulp at ActiveState.com  Sun Jul  1 20:22:01 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 11:22:01 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com> <3B3EC012.A3A05E64@ActiveState.com> <3B3F5A3A.A88B54B2@ActiveState.com>
Message-ID: <3B3F6A49.6E82B7DE@ActiveState.com>

David Ascher wrote:
> 
> Paul:
> > And you just bought such a shiny, new glass, house. Pity.
> 
> What kind of comma placement is that?

I had to leave you something to complain about;
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From guido at digicool.com  Sun Jul  1 20:37:48 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sun, 01 Jul 2001 14:37:48 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Sun, 01 Jul 2001 16:43:08 +0200."
             <20010701164315-r01010600-c2d5b07d@213.84.27.177> 
References: <20010701164315-r01010600-c2d5b07d@213.84.27.177> 
Message-ID: <200107011837.f61IbmZ03645@odiug.digicool.com>

> Alternatively, a Unicode object could *internally* be either 8, 16
> or 32 bits wide (to be clear: not per character, but per
> string). Also a lot of work, but it'll be a lot less wasteful.

Depending on what you prefer to waste: developers' time or computer
resources.  I bet that if you try to measure the wasted space you'll
find that it wastes very little compared to all the other overheads
in a typical Python program: CPU time compared to writing your code in
C, memory overhead for integers, etc.

It so happened that the Unicode support was written to make it very
easy to change the compile-time code unit size; but making this a
per-string (or even global) run-time variable is much harder without
touching almost every place that uses Unicode (not to mention slowing
down the common case).

Nobody was enthusiastic about fixing this, so our choice was really
between staying with 16 bits or making 32 bits an option for those who
need it.

> Not a lot of people will want to work with 16 or 32 bit chars
> directly,

How do you know?  There are more Chinese than Americans and Europeans
together, and they will soon all have computers. :-)

> but I think a less wasteful solution to the surrogate pair
> problem *will* be desired by people. Why use 32 bits for all strings
> in a program when only a tiny percentage actually *needs* more than
> 16? (Or even 8...)

So work in UTF-8 -- a lot of work can be done in UTF-8.

> > But this is not the Unicode philosophy.  All the variable-length
> > character manipulation is supposed to be taken care of by the codecs,
> > and then the application can deal in arrays of characteres.
> 
> Right: this is the way it should be.
> 
> My difficulty with PEP 261 is that I'm afraid few people will
> actually enable 32-bit support (*what*?! all unicode strings become
> 32 bits wide? no way!), therefore making programs non-portable in
> very subtle ways.

My hope and expectation is that those folks who need 32-bit support
will enable it.  If this solution is not sufficient, we may have to
provide something else in the future, but given that the
implementation effort for PEP 261 was very minimal (certainly less
than the time expended in discussing it) I am very happy with it.

It will take quite a while until lots of folks will need the 32-bit
support (there aren't that many characters defined outside the basic
plane yet).  In the meantime, those that need 32-bit support
should be happy that we allow them to rebuild Python with 32-bit
support.  In the next 5-10 years, the 32-bit support requirement will
become more common -- as will be the memory upgrades to make it
painless.

It's not like Python is making this decision in a vacuum either: Linux
already has 32-bit wchar_t.  32-bit characters will eventually be
common (even in Windows, which probably has the largest investment in
16-bit Unicode at the moment of any system).  Like IPv6, we're trying
to enable uncommon uses of Python without breaking things for the
not-so-early adopters.

Again, don't see PEP 261 as the ultimate answer to all your 32-bit
Unicode questions.  Just consider that realistically we have two
choices: stick with 16-bit support only or make 32-bit support an
option.  Other approaches (more surrogate support, run-time choices,
transparent variable-length encodings) simply aren't realistic --
no-one has the time to code them.

It should be easy to write portable Python programs that work
correctly with 16-bit Unicode characters on a "narrow" interpreter and
also work correctly with 21-bit Unicode on a "wide" interpreter:
just avoid using surrogates.  If you *need* to work with surrogates,
try to limit yourself to very simple operations like concatenations of
valid strings, and splitting strings at known delimiters only.
There's a lot you can do with this.
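A rough sketch of why those operations stay portable (shown with Python 3's str, where all builds are "wide"; a narrow build's 16-bit code units are simulated here via UTF-16 encoding):

```python
import struct

def utf16_code_units(s):
    """List the 16-bit code units a narrow build would store for s."""
    return [u for (u,) in struct.iter_unpack(">H", s.encode("utf-16-be"))]

text = u"a\U00010000b"          # a non-BMP char between known delimiters
units = utf16_code_units(text)
# 'a', high surrogate, low surrogate, 'b':
assert units == [0x61, 0xD800, 0xDC00, 0x62]
# Concatenation and splitting at known delimiters never cut a pair in
# half, so the same code is correct on narrow and wide builds:
left, sep, right = text.partition(u"b")
assert left + sep + right == text
```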

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Sun Jul  1 20:52:36 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 1 Jul 2001 14:52:36 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3F69A5.D7CE539D@ActiveState.com>
Message-ID: 

[Paul Prescod]
> ...
> Consider that computer memory is growing much faster than textual data.
> People's text processing techniques get more and more "wasteful" because
> it is now almost always possible to load the entire "text" into memory
> at once.

Indeed, the entire text of the Bible fits in a corner of my year-old box's
RAM, even at 32 bits per character.

> I remember how some text editors used to boast that they only loaded
> your text "on demand".

Well, they still do -- fancy editors use fancy data structures, so that,
e.g., inserting characters at the start of the file doesn't cause a 50Mb
memmove each time.  Response time is still important, but I'd wager
relatively insensitive to basic character size (you need tricks that cut
factors of 1000s off potential worst cases to give the appearance of
instantaneous results; a factor of 2 or 4 is in the noise compared to what's
needed regardless).




From aahz at rahul.net  Sun Jul  1 21:21:26 2001
From: aahz at rahul.net (Aahz Maruch)
Date: Sun, 1 Jul 2001 12:21:26 -0700 (PDT)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3F670A.B5396D61@ActiveState.com> from "Paul Prescod" at Jul 01, 2001 11:08:10 AM
Message-ID: <20010701192126.9EB8299C80@waltz.rahul.net>

Paul Prescod wrote:
> 
> On wide Python builds there is no such thing as variable width Unicode
> characters. It doesn't make sense to combine two 32-bit characters to
> get a 64-bit one. On narrow Python builds you might want to treat a
> surrogate pair as a single character but I would strongly advise against
> it. If you want wide characters, move to a wide build. Even if a narrow
> build is more space efficient, you'll lose a ton of performance
> emulating wide characters in Python code.

This needn't go into the PEP, I think, but I'd like you to say something
about what you expect the end result of this PEP to look like under
Windows, where "rebuild" isn't really a valid option for most Python
users.  Are we simply committing to make two builds available?  If so,
what happens the next time we run into a situation like this?
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.



From paulp at ActiveState.com  Sun Jul  1 21:21:09 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 12:21:09 -0700
Subject: [Python-Dev] Text editors
References: 
Message-ID: <3B3F7825.CA3D1B5B@ActiveState.com>

Tim Peters wrote:
> 
>...
> 
> > I remember how some text editors used to boast that they only loaded
> > your text "on demand".
> 
> Well, they still do -- fancy editors use fancy data structures, so that,
> e.g., inserting characters at the start of the file doesn't cause a 50Mb
> memmove each time.  

Yes, but most modern text editors take O(n) time to open the file. There
was a time when the more advanced ones did not. Or maybe that was just
SGML editors...I can't remember.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From guido at digicool.com  Sun Jul  1 21:32:52 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sun, 01 Jul 2001 15:32:52 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Sun, 01 Jul 2001 12:21:26 PDT."
             <20010701192126.9EB8299C80@waltz.rahul.net> 
References: <20010701192126.9EB8299C80@waltz.rahul.net> 
Message-ID: <200107011932.f61JWq803843@odiug.digicool.com>

> This needn't go into the PEP, I think, but I'd like you to say something
> about what you expect the end result of this PEP to look like under
> Windows, where "rebuild" isn't really a valid option for most Python
> users.  Are we simply committing to make two builds available?  If so,
> what happens the next time we run into a situation like this?

I imagine that we will pick a choice (I expect it'll be UCS2) and
make only that build available, until there are loud enough cries from
folks who have a reasonable excuse not to have a copy of VCC around.

Given that the rest of Windows uses 16-bit Unicode, I think we'll be
able to get away with this for quite a while.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From paulp at ActiveState.com  Sun Jul  1 21:33:20 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 12:33:20 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010701192126.9EB8299C80@waltz.rahul.net>
Message-ID: <3B3F7B00.29D6832@ActiveState.com>

Aahz Maruch wrote:
> 
>...
> 
> This needn't go into the PEP, I think, but I'd like you to say something
> about what you expect the end result of this PEP to look like under
> Windows, where "rebuild" isn't really a valid option for most Python
> users.  Are we simply committing to make two builds available?  If so,
> what happens the next time we run into a situation like this?

Windows itself is strongly biased towards 16-bit characters. Therefore I
expect that to be the default for a while. Then I expect Guido to
announce that 32-bit characters are the new default with version 3000
(perhaps right after Windows 3000 ships) and we'll all change. So most
Windows users will not be able to work with 32-bit characters for a
while. But since Windows itself doesn't like those characters, they
probably won't run into them much.

I strongly doubt that we'll ever make two builds available because it
would cause a mess of extension module incompatibilities.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From paulp at ActiveState.com  Sun Jul  1 21:57:09 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 12:57:09 -0700
Subject: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" Unicode characters
Message-ID: <3B3F8095.8D58631D@ActiveState.com>

PEP: 261
Title: Support for "wide" Unicode characters
Version: $Revision: 1.3 $
Author: paulp at activestate.com (Paul Prescod)
Status: Draft
Type: Standards Track
Created: 27-Jun-2001
Python-Version: 2.2
Post-History: 27-Jun-2001


Abstract

    Python 2.1 unicode characters can have ordinals only up to
    2**16 - 1.
    This range corresponds to a range in Unicode known as the Basic
    Multilingual Plane. There are now characters in Unicode that live
    on other "planes". The largest addressable character in Unicode
    has the ordinal 17 * 2**16 - 1 (0x10ffff). For readability, we
    will call this TOPCHAR and call characters in this range "wide 
    characters".


Glossary

    Character 
        
        Used by itself, means the addressable units of a Python 
        Unicode string.

    Code point

        A code point is an integer between 0 and TOPCHAR.
        If you imagine Unicode as a mapping from integers to
        characters, each integer is a code point. But the 
        integers between 0 and TOPCHAR that do not map to
        characters are also code points. Some will someday 
        be used for characters. Some are guaranteed never 
        to be used for characters.

    Codec

        A set of functions for translating between physical
        encodings (e.g. on disk or coming in from a network)
        into logical Python objects.

    Encoding

        Mechanism for representing abstract characters in terms of
        physical bits and bytes. Encodings allow us to store
        Unicode characters on disk and transmit them over networks
        in a manner that is compatible with other Unicode software.

    Surrogate pair

        Two physical characters that represent a single logical
        character. Part of a convention for representing 32-bit
        code points in terms of two 16-bit code points.

    Unicode string

          A Python type representing a sequence of code points with
          "string semantics" (e.g. case conversions, regular
          expression compatibility, etc.) Constructed with the 
          unicode() function.
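    A hedged illustration of the surrogate-pair convention defined
    above (the arithmetic is standard UTF-16; the helper name is
    invented for illustration):

```python
def to_surrogate_pair(cp):
    """Map a code point above 0xFFFF to two 16-bit code points."""
    assert 0x10000 <= cp <= 0x10FFFF
    cp -= 0x10000
    high = 0xD800 + (cp >> 10)    # high (leading) surrogate
    low = 0xDC00 + (cp & 0x3FF)   # low (trailing) surrogate
    return high, low

assert to_surrogate_pair(0x10000) == (0xD800, 0xDC00)
assert to_surrogate_pair(0x10FFFF) == (0xDBFF, 0xDFFF)
```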


Proposed Solution

    One solution would be to merely increase the maximum ordinal 
    to a larger value. Unfortunately the only straightforward
    implementation of this idea is to use 4 bytes per character.
    This has the effect of doubling the size of most Unicode 
    strings. In order to avoid imposing this cost on every
    user, Python 2.2 will allow the 4-byte implementation as a
    build-time option. Users can choose whether they care about
    wide characters or prefer to preserve memory.

    The 4-byte option is called "wide Py_UNICODE". The 2-byte option
    is called "narrow Py_UNICODE".

    Most things will behave identically in the wide and narrow worlds.

    * unichr(i) for 0 <= i < 2**16 (0x10000) always returns a
      length-one string.

    * unichr(i) for 2**16 <= i <= TOPCHAR will return a
      length-one string on wide Python builds. On narrow builds it will 
      raise ValueError.

        ISSUE 

            Python currently allows \U literals that cannot be
            represented as a single Python character. It generates two
            Python characters known as a "surrogate pair". Should this
            be disallowed on future narrow Python builds?

        Pro:

            Python already allows the construction of a surrogate pair
            for a large unicode literal character escape sequence.
            This is basically designed as a simple way to construct
            "wide characters" even in a narrow Python build. It is also
            somewhat logical considering that the Unicode-literal syntax
            is basically a short-form way of invoking the unicode-escape
            codec.

        Con:

            Surrogates could be easily created this way but the user
            still needs to be careful about slicing, indexing, printing 
            etc. Therefore some have suggested that Unicode
            literals should not support surrogates.


        ISSUE 

            Should Python allow the construction of characters that do
            not correspond to Unicode code points?  Unassigned Unicode 
            code points should obviously be legal (because they could 
            be assigned at any time). But code points above TOPCHAR are 
            guaranteed never to be used by Unicode. Should we allow
            access to them anyhow?

        Pro:

            If a Python user thinks they know what they're doing why
            should we try to prevent them from violating the Unicode
            spec? After all, we don't stop 8-bit strings from
            containing non-ASCII characters.

        Con:

            Codecs and other Unicode-consuming code will have to be
            careful of these characters which are disallowed by the
            Unicode specification.

    * ord() is always the inverse of unichr()

    * There is an integer value in the sys module that describes the
      largest ordinal for a character in a Unicode string on the current
      interpreter. sys.maxunicode is 2**16-1 (0xffff) on narrow builds
      of Python and TOPCHAR on wide builds.

        ISSUE: Should there be distinct constants for accessing
               TOPCHAR and the real upper bound for the domain of 
               unichr (if they differ)? There has also been a
               suggestion of sys.unicodewidth which can take the 
               values 'wide' and 'narrow'.
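    A minimal sketch of run-time width detection via the proposed
    sys.maxunicode (0xffff on narrow builds, TOPCHAR = 0x10ffff on
    wide builds; Python 3 interpreters report the wide value):

```python
import sys

TOPCHAR = 0x10FFFF
# One of the two values the PEP proposes, depending on build width:
build = "narrow" if sys.maxunicode == 0xFFFF else "wide"
assert sys.maxunicode in (0xFFFF, TOPCHAR)
```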

    * every Python Unicode character represents exactly one Unicode
      code point (i.e. Python Unicode Character = Abstract Unicode
      character).

    * codecs will be upgraded to support "wide characters"
      (represented directly in UCS-4, and as variable-length sequences
      in UTF-8 and UTF-16). This is the main part of the implementation 
      left to be done.

    * There is a convention in the Unicode world for encoding a 32-bit
      code point in terms of two 16-bit code points. These are known
      as "surrogate pairs". Python's codecs will adopt this convention
      and encode 32-bit code points as surrogate pairs on narrow Python
      builds. 

        ISSUE 

            Should there be a way to tell codecs not to generate
            surrogates and instead treat wide characters as 
            errors?

        Pro:

            I might want to write code that works only with
            fixed-width characters and does not have to worry about
            surrogates.


        Con:

            No clear proposal of how to communicate this to codecs.

    * there are no restrictions on constructing strings that use 
      code points "reserved for surrogates" improperly. These are
      called "isolated surrogates". The codecs should disallow reading
      these from files, but you could construct them using string 
      literals or unichr().
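    A hedged sketch of the isolated-surrogate behavior described
    above (shown with Python 3's chr; the PEP's unichr on a wide
    build behaves analogously):

```python
lone = chr(0xD800)            # constructing an isolated surrogate is allowed
try:
    lone.encode("utf-8")      # but a codec must reject it
    rejected = False
except UnicodeEncodeError:
    rejected = True
assert rejected
```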


Implementation

    There is a new (experimental) define:

        #define PY_UNICODE_SIZE 2

    There is a new configure option:

        --enable-unicode=ucs2 configures a narrow Py_UNICODE, and uses
                              wchar_t if it fits
        --enable-unicode=ucs4 configures a wide Py_UNICODE, and uses
                              wchar_t if it fits
        --enable-unicode      same as "=ucs2"

    The intention is that --disable-unicode, or --enable-unicode=no
    removes the Unicode type altogether; this is not yet implemented.

    It is also proposed that one day --enable-unicode will just
    default to the width of your platform's wchar_t.
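    For the configure options above, a build session might look like
    this (a sketch; the commands assume a Unix source tree and the
    configure machinery described in this PEP):

```shell
# Hypothetical build of a wide-Unicode Python using the option above
./configure --enable-unicode=ucs4
make
# On the resulting wide build, sys.maxunicode reports TOPCHAR:
./python -c "import sys; print(hex(sys.maxunicode))"
```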

    Windows builds will be narrow for a while based on the fact that
    there have been few requests for wide characters, those requests
    are mostly from hard-core programmers with the ability to build
    their own Python, and Windows itself is strongly biased towards
    16-bit characters.


Notes

    This PEP does NOT imply that people using Unicode need to use a
    4-byte encoding for their files on disk or sent over the network. 
    It only allows them to do so. For example, ASCII is still a 
    legitimate (7-bit) Unicode-encoding.

    It has been proposed that there should be a module that handles
    surrogates in narrow Python builds for programmers. If someone 
    wants to implement that, it will be another PEP. It might also be 
    combined with features that allow other kinds of character-, 
    word- and line- based indexing.


Rejected Suggestions

    More or less the status-quo

        We could officially say that Python characters are 16-bit and
        require programmers to implement wide characters in their
        application logic by combining surrogate pairs. This is a heavy 
        burden because emulating 32-bit characters is likely to be
        very inefficient if it is coded entirely in Python. Plus these
        abstracted pseudo-strings would not be legal as input to the
        regular expression engine.

    "Space-efficient Unicode" type

        Another class of solution is to use some efficient storage
        internally but present an abstraction of wide characters to
        the programmer. Any of these would require a much more complex
        implementation than the accepted solution. For instance consider
        the impact on the regular expression engine. In theory, we could
        move to this implementation in the future without breaking
        Python code. A future Python could "emulate" wide Python
        semantics on
        narrow Python. Guido is not willing to undertake the
        implementation right now.

    Two types

        We could introduce a 32-bit Unicode type alongside the 16-bit
        type. There is a lot of code that expects there to be only a 
        single Unicode type.

    This PEP represents the least-effort solution. Over the next
    several years, 32-bit Unicode characters will become more common
    and that may either convince us that we need a more sophisticated 
    solution or (on the other hand) convince us that simply 
    mandating wide Unicode characters is an appropriate solution.
    Right now the two options on the table are do nothing or do
    this.


References

    Unicode Glossary: http://www.unicode.org/glossary/


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From thomas at xs4all.net  Mon Jul  2 00:12:48 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 2 Jul 2001 00:12:48 +0200
Subject: [Python-Dev] Python 2.1.1 release 'schedule'
Message-ID: <20010702001248.H8098@xs4all.nl>

This is just a heads-up to everyone. I plan to release Python 2.1.1c1
(release candidate 1) somewhere on Friday the 13th (of July) and, barring
any serious problems, the full release the Friday following that, July 20.

The python 2.1.1 CVS branch (tagged 'release21-maint') should be stable, and
should contain most bugfixes that will be in 2.1.1. If you care about
2.1.1's stability and portability, or you found bugs in 2.1 and aren't sure
they are fixed, and you can check things out of CVS, please give the CVS
branch a try: just 'checkout' python with

cvs co -rrelease21-maint python

(with the -d option from the SourceForge CVS page that applies to you) and
follow the normal compile procedure. Binaries for Windows as well as source
tarballs will be provided for the release candidate and the final release
(obviously) but the more bugs people point out before the final release, the
more bugs will be fixed in 2.1.1 :-)

Python 2.1.1 (as well as the CVS branch) will fall under the new
GPL-compatible PSF licence, just like Python 2.0.1. The only notable thing
missing from the CVS branch is an updated NEWS file -- I'm working on it.
I'm also not done searching the open bugs for ones that might need to be
addressed in 2.1.1, but feel free to point me to bugs you think are
important!

2.1.1-Patch-Czar-ly y'rs,
-- 
Thomas Wouters 

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From greg at cosc.canterbury.ac.nz  Mon Jul  2 04:06:50 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:06:50 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3EBEA4.3EC84EAF@ActiveState.com>
Message-ID: <200107020206.OAA00427@s454.cosc.canterbury.ac.nz>

David Ascher :

> I'd limit the claim to stating that they _affect_ your life.

If matter didn't have any rest energy, everything
would fly about at the speed of light, which would
make life very hectic.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Mon Jul  2 04:36:39 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:36:39 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <20010701164315-r01010600-c2d5b07d@213.84.27.177>
Message-ID: <200107020236.OAA00432@s454.cosc.canterbury.ac.nz>

Just van Rossum :

> My difficulty with PEP 261 is that I'm afraid few people will actually enable
> 32-bit support (*what*?! all unicode strings become 32 bits wide? no way!),
> therefore making programs non-portable in very subtle ways.

I agree. This can only be a stopgap measure. Ultimately the
Unicode type needs to be made smarter.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Mon Jul  2 04:42:12 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:42:12 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3F5A3A.A88B54B2@ActiveState.com>
Message-ID: <200107020242.OAA00436@s454.cosc.canterbury.ac.nz>

David Ascher :
> > And you just bought such a shiny, new glass, house. Pity.
>
> What kind of comma placement is that?

Obviously it's only the glass that is new, not the
whole house. :-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From nhodgson at bigpond.net.au  Mon Jul  2 04:42:11 2001
From: nhodgson at bigpond.net.au (Neil Hodgson)
Date: Mon, 2 Jul 2001 12:42:11 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107011352.PAA27645@pandora.informatik.hu-berlin.de>
Message-ID: <01d601c102a0$98671580$0acc8490@neil>

Martin von Loewis:


> > The problem I have with this PEP is that it is a compile time option
> > which makes it hard to work with both 32 bit and 16 bit strings in
> > one program.
>
> Can you elaborate why you think this is a problem?

   A common role for Python is to act as glue between various modules. If
Paul produces some interesting code that depends on 32 bit strings and I
want to use that in conjunction with some Win32 specific or COM dependent
code that wants 16 bit strings then it may not be possible or may require
difficult workarounds.

> (*) Methinks that the primary difficulty still is translating all the
> documentation, and messages. Actually, keeping the translations
> up-to-date is even more challenging.

   Translation of documentation and strings can be performed by almost
anyone who writes both languages ("even managers") and can be budgeted by
working out the amount of text and applying a conversion rate. Code requires
careful thought and can lead to the typical buggy software schedule
blowouts.

   Neil





From greg at cosc.canterbury.ac.nz  Mon Jul  2 04:49:56 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:49:56 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <200107011837.f61IbmZ03645@odiug.digicool.com>
Message-ID: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>

> It so happened that the Unicode support was written to make it very
> easy to change the compile-time code unit size

What about extension modules that deal with Unicode strings?
Will they have to be recompiled too? If so, is there anything
to detect an attempt to import an extension module with an
incompatible Unicode character width?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From nhodgson at bigpond.net.au  Mon Jul  2 04:52:45 2001
From: nhodgson at bigpond.net.au (Neil Hodgson)
Date: Mon, 2 Jul 2001 12:52:45 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>              <00dd01c1022d$c61e4160$0acc8490@neil>  <200107011344.f61DiTM03548@odiug.digicool.com>
Message-ID: <01ea01c102a2$128491c0$0acc8490@neil>

Guido van Rossum:

> >    This wasn't usefully true in the past for DBCS strings and is
> > not the right way to think of either narrow or wide strings
> > now. The idea that strings are arrays of characters gets in
> > the way of dealing with many encodings and is the primary
> > difficulty in localising software for Japanese.
>
> Can you explain the kind of problems encountered in some more detail?

   Programmers used to working with character == indexable code unit will
often split double wide characters when performing an action. For example
searching for a particular double byte character "bc" may match "abcd"
incorrectly where "ab" and "cd" are the characters. DBCS is not normally
self synchronising although UTF-8 is. Another common problem is counting
characters, for example when filling a line, hitting the line width and
forcing half a character onto the next line.
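   A small illustration of the false-match problem described above,
using a hypothetical double-byte charset (byte values are made up for
the example):

```python
# Two double-byte characters, encoded one after the other:
chars = [b"ab", b"cd"]
data = b"".join(chars)        # the encoded text: b"abcd"
needle = b"bc"                # a third double-byte character
# A naive byte-level search finds "bc" straddling the "ab"/"cd"
# boundary -- a false positive for a character-level search:
assert needle in data
# UTF-8 avoids this: lead bytes and continuation bytes occupy disjoint
# ranges, so a match can never begin in the middle of a character.
```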

> I think it's a good idea to provide a set of higher-level tools as
> well.  However nobody seems to know what these higher-level tools
> should do yet.  PEP 261 is specifically focused on getting the
> lower-level foundations right (i.e. the objects that represent arrays
> of code units), so that the authors of higher level tools will have a
> solid base.  If you want to help author a PEP for such higher-level
> tools, you're welcome!

   It's more likely I'll publish some of the low level pieces of
Scintilla/SinkWorld as a Python extension providing some of these facilities
in an editable-text class. Then we can see if anyone else finds the code
worthwhile.

   Neil





From nhodgson at bigpond.net.au  Mon Jul  2 05:00:41 2001
From: nhodgson at bigpond.net.au (Neil Hodgson)
Date: Mon, 2 Jul 2001 13:00:41 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: 
Message-ID: <020b01c102a3$2dd23440$0acc8490@neil>

Tim Peters:

> Well, they still do -- fancy editors use fancy data structures, so that,
> e.g., inserting characters at the start of the file doesn't cause a 50Mb
> memmove each time.  Response time is still important, but I'd wager
> relatively insensitive to basic character size (you need tricks that cut
> factors of 1000s off potential worst cases to give the appearance of
> instantaneous results; a factor of 2 or 4 is in the noise compared to
> what's needed regardless).

   I actually have some numbers here. Early versions of some new editor
buffer code used UCS-2 on .NET and the JVM. Moving to an 8 bit buffer saved
10-20% of execution time on the insert string, delete string and global
replace benchmarks using strings that fit into ASCII. These buffers did have
some other overhead for line management and other features but I expect
these did not affect the proportions much.

   Neil






From tim.one at home.com  Mon Jul  2 06:36:20 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 2 Jul 2001 00:36:20 -0400
Subject: [Python-Dev] RE: Python 2.1.1 release 'schedule'
In-Reply-To: <20010702001248.H8098@xs4all.nl>
Message-ID: 

Woo hoo!

[Thomas Wouters]
> ...
> Binaries for Windows as well as source tarballs will be provided ...

Building a Windows installer isn't straightforward, so you'd better let us
do that part (e.g., you need the Wise installer program, Fred needs to
supply appropriate HTML docs for the Windows installer to zip up, Tcl/Tk has
to get unpacked and rearranged, etc).  I just checked in 2.1.1c1 changes to
the Windows part of the release21-maint tree, but the rest of it isn't in
CVS.




From thomas at xs4all.net  Mon Jul  2 08:27:24 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 2 Jul 2001 08:27:24 +0200
Subject: [Python-Dev] Re: Python 2.1.1 release 'schedule'
In-Reply-To: 
References: 
Message-ID: <20010702082724.K32419@xs4all.nl>

On Mon, Jul 02, 2001 at 12:36:20AM -0400, Tim Peters wrote:

> [Thomas Wouters]
> > ...
> > Binaries for Windows as well as source tarballs will be provided ...

> Building a Windows installer isn't straightforward, so you'd better let us
> do that part (e.g., you need the Wise installer program, Fred needs to
> supply appropriate HTML docs for the Windows installer to zip up, Tcl/Tk has
> to get unpacked and rearranged, etc).  I just checked in 2.1.1c1 changes to
> the Windows part of the release21-maint tree, but the rest of it isn't in
> CVS.

Oh yeah, I was entirely going to let you guys do it, or at least find
another set of wintendows-weenies to do it :) That's part of why I posted
the tentative release dates.

-- 
Thomas Wouters 

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From loewis at informatik.hu-berlin.de  Mon Jul  2 09:25:18 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 2 Jul 2001 09:25:18 +0200 (MEST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <01d601c102a0$98671580$0acc8490@neil> (nhodgson@bigpond.net.au)
References: <200107011352.PAA27645@pandora.informatik.hu-berlin.de> <01d601c102a0$98671580$0acc8490@neil>
Message-ID: <200107020725.JAA25925@pandora.informatik.hu-berlin.de>

> > > The problem I have with this PEP is that it is a compile time option
> > > which makes it hard to work with both 32 bit and 16 bit strings in
> > > one program.
> >
> > Can you elaborate why you think this is a problem?
> 
>    A common role for Python is to act as glue between various modules. If
> Paul produces some interesting code that depends on 32 bit strings and I
> want to use that in conjunction with some Win32 specific or COM dependent
> code that wants 16 bit strings then it may not be possible or may require
> difficult workarounds.

Neither. All it will require is for you to recompile your Python
installation to use wide Unicode.

On Win32 APIs, this will mean that you cannot directly interpret
PyUnicode object representations as wchar_t pointers. This is no
problem, as you can transparently copy unicode objects into wchar_t
strings; it's a matter of coming up with a good C API for doing so
conveniently.

Regards,
Martin



From fredrik at pythonware.com  Mon Jul  2 10:20:09 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 10:20:09 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020236.OAA00432@s454.cosc.canterbury.ac.nz>
Message-ID: <03b301c102cf$e0e3dd00$0900a8c0@spiff>

greg wrote:

> I agree. This can only be a stopgap measure. Ultimately the
> Unicode type needs to be made smarter.

PIL uses 8 bits per pixel to store bilevel images, and 32 bits
per pixel to store 16- and 24-bit images.

back in 1995, some people claimed that the image type had
to be made smarter to be usable.  these days, nobody ever
notices...







From fredrik at pythonware.com  Mon Jul  2 10:08:10 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 10:08:10 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> <00dd01c1022d$c61e4160$0acc8490@neil>
Message-ID: <03b201c102cf$e0dab540$0900a8c0@spiff>

Neil Hodgson wrote:
> > u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> > character.
>
>    This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea that
> strings are arrays of characters gets in the way

if you stop confusing binary buffers with text strings, all such
problems will go away.







From mal at egenix.com  Mon Jul  2 11:39:55 2001
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 11:39:55 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>
Message-ID: <3B40416B.6438D1F7@egenix.com>

Greg Ewing wrote:
> 
> > It so happened that the Unicode support was written to make it very
> > easy to change the compile-time code unit size
> 
> What about extension modules that deal with Unicode strings?
> Will they have to be recompiled too? If so, is there anything
> to detect an attempt to import an extension module with an
> incompatible Unicode character width?

That's a good question ! 

The answer is: yes, extensions which use Unicode will have to
be recompiled for narrow and wide builds of Python. The question
is however, how to detect cases where the user imports an
extension built for narrow Python into a wide build and
vice versa.

The standard way of looking at the API level won't help. We'd
need some form of introspection API at the C level... hmm,
perhaps looking at the sys module will do the trick for us ?!

In any case, this is certainly going to cause trouble one
of these days...

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jul  2 12:13:59 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:13:59 +0200
Subject: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" Unicode 
 characters
References: <3B3F8095.8D58631D@ActiveState.com>
Message-ID: <3B404967.14FE180F@lemburg.com>

Paul Prescod wrote:
> 
> PEP: 261
> Title: Support for "wide" Unicode characters
> Version: $Revision: 1.3 $
> Author: paulp at activestate.com (Paul Prescod)
> Status: Draft
> Type: Standards Track
> Created: 27-Jun-2001
> Python-Version: 2.2
> Post-History: 27-Jun-2001
> 
> Abstract
> 
>     Python 2.1 unicode characters can have ordinals only up to 2**16 - 1.
>     This range corresponds to a range in Unicode known as the Basic
>     Multilingual Plane. There are now characters in Unicode that live
>     on other "planes". The largest addressable character in Unicode
>     has the ordinal 17 * 2**16 - 1 (0x10ffff). For readability, we
>     will call this TOPCHAR and call characters in this range "wide
>     characters".
> 
> Glossary
> 
>     Character
> 
>         Used by itself, means the addressable units of a Python
>         Unicode string.

Please add: also known as "code unit".
 
>     Code point
> 
>         A code point is an integer between 0 and TOPCHAR.
>         If you imagine Unicode as a mapping from integers to
>         characters, each integer is a code point. But the
>         integers between 0 and TOPCHAR that do not map to
>         characters are also code points. Some will someday
>         be used for characters. Some are guaranteed never
>         to be used for characters.
> 
>     Codec
> 
>         A set of functions for translating between physical
>         encodings (e.g. on disk or coming in from a network)
>         and logical Python objects.
> 
>     Encoding
> 
>         Mechanism for representing abstract characters in terms of
>         physical bits and bytes. Encodings allow us to store
>         Unicode characters on disk and transmit them over networks
>         in a manner that is compatible with other Unicode software.
> 
>     Surrogate pair
> 
>         Two physical characters that represent a single logical

Eeek... two code units (or have you ever seen a physical character
walking around ;-)

>         character. Part of a convention for representing 32-bit
>         code points in terms of two 16-bit code points.
> 
>     Unicode string
> 
>           A Python type representing a sequence of code points with
>           "string semantics" (e.g. case conversions, regular
>           expression compatibility, etc.) Constructed with the
>           unicode() function.
> 
> Proposed Solution
> 
>     One solution would be to merely increase the maximum ordinal
>     to a larger value. Unfortunately the only straightforward
>     implementation of this idea is to use 4 bytes per character.
>     This has the effect of doubling the size of most Unicode
>     strings. In order to avoid imposing this cost on every
>     user, Python 2.2 will allow the 4-byte implementation as a
>     build-time option. Users can choose whether they care about
>     wide characters or prefer to preserve memory.
> 
>     The 4-byte option is called "wide Py_UNICODE". The 2-byte option
>     is called "narrow Py_UNICODE".
> 
>     Most things will behave identically in the wide and narrow worlds.
> 
>     * unichr(i) for 0 <= i < 2**16 (0x10000) always returns a
>       length-one string.
> 
>     * unichr(i) for 2**16 <= i <= TOPCHAR will return a
>       length-one string on wide Python builds. On narrow builds it will
>       raise ValueError.
> 
>         ISSUE
> 
>             Python currently allows \U literals that cannot be
>             represented as a single Python character. It generates two
>             Python characters known as a "surrogate pair". Should this
>             be disallowed on future narrow Python builds?
> 
>         Pro:
> 
>             Python already allows the construction of a surrogate pair
>             for a large unicode literal character escape sequence.
>             This is basically designed as a simple way to construct
>             "wide characters" even in a narrow Python build. It is also
>             somewhat logical considering that the Unicode-literal syntax
>             is basically a short-form way of invoking the unicode-escape
>             codec.
> 
>         Con:
> 
>             Surrogates could be easily created this way but the user
>             still needs to be careful about slicing, indexing, printing
>             etc. Therefore some have suggested that Unicode
>             literals should not support surrogates.
> 
>         ISSUE
> 
>             Should Python allow the construction of characters that do
>             not correspond to Unicode code points?  Unassigned Unicode
>             code points should obviously be legal (because they could
>             be assigned at any time). But code points above TOPCHAR are
>             guaranteed never to be used by Unicode. Should we allow
>             access to them anyhow?
> 
>         Pro:
> 
>             If a Python user thinks they know what they're doing why
>             should we try to prevent them from violating the Unicode
>             spec? After all, we don't stop 8-bit strings from
>             containing non-ASCII characters.
> 
>         Con:
> 
>             Codecs and other Unicode-consuming code will have to be
>             careful of these characters which are disallowed by the
>             Unicode specification.
> 
>     * ord() is always the inverse of unichr()
> 
>     * There is an integer value in the sys module that describes the
>       largest ordinal for a character in a Unicode string on the current
>       interpreter. sys.maxunicode is 2**16-1 (0xffff) on narrow builds
>       of Python and TOPCHAR on wide builds.
> 
>         ISSUE: Should there be distinct constants for accessing
>                TOPCHAR and the real upper bound for the domain of
>                unichr (if they differ)? There has also been a
>                suggestion of sys.unicodewidth which can take the
>                values 'wide' and 'narrow'.
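[Editorial note: the build-width check described in the bullet above can be
sketched in a few lines of present-day Python. sys.maxunicode is the constant
the PEP proposes; unicode_width() and the 'wide'/'narrow' strings are only the
suggested spelling from the ISSUE, not an adopted API.]

```python
import sys

# sys.maxunicode is 0xFFFF on a narrow (UCS-2) build and
# 0x10FFFF (TOPCHAR) on a wide (UCS-4) build.
TOPCHAR = 0x10FFFF

def unicode_width():
    """Return 'wide' or 'narrow' depending on the interpreter build."""
    if sys.maxunicode == TOPCHAR:
        return "wide"
    return "narrow"
```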
> 
>     * every Python Unicode character represents exactly one Unicode code
>       point (i.e. Python Unicode Character = Abstract Unicode character).
> 
>     * codecs will be upgraded to support "wide characters"
>       (represented directly in UCS-4, and as variable-length sequences
>       in UTF-8 and UTF-16). This is the main part of the implementation
>       left to be done.
> 
>     * There is a convention in the Unicode world for encoding a 32-bit
>       code point in terms of two 16-bit code points. These are known
>       as "surrogate pairs". Python's codecs will adopt this convention
>       and encode 32-bit code points as surrogate pairs on narrow Python
>       builds.
> 
>         ISSUE
> 
>             Should there be a way to tell codecs not to generate
>             surrogates and instead treat wide characters as
>             errors?
> 
>         Pro:
> 
>             I might want to write code that works only with
>             fixed-width characters and does not have to worry about
>             surrogates.
> 
>         Con:
> 
>             No clear proposal of how to communicate this to codecs.

No need to pass this information to the codec: simply write
a new one and give it a clear name, e.g. "ucs-2" will generate
errors while "utf-16-le" converts them to surrogates.
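[Editorial note: the surrogate-pair convention referred to above is fixed
arithmetic in the Unicode standard; a minimal sketch, with illustrative
function names:]

```python
def encode_surrogate_pair(cp):
    """Split a code point above the BMP into a UTF-16 high/low surrogate pair."""
    assert 0x10000 <= cp <= 0x10FFFF
    cp -= 0x10000
    # High surrogates live in 0xD800-0xDBFF, low surrogates in 0xDC00-0xDFFF.
    return 0xD800 | (cp >> 10), 0xDC00 | (cp & 0x3FF)

def decode_surrogate_pair(high, low):
    """Reassemble a code point from a high/low surrogate pair."""
    assert 0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```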
 
>     * there are no restrictions on constructing strings that use
>       code points "reserved for surrogates" improperly. These are
>       called "isolated surrogates". The codecs should disallow reading
>       these from files, but you could construct them using string
>       literals or unichr().
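[Editorial note: spotting the "isolated surrogates" mentioned above in a
sequence of code units is straightforward; a sketch of the check a codec
might perform (the function name is illustrative, not part of the PEP):]

```python
def find_isolated_surrogates(units):
    """Yield indexes of surrogate code units that are not part of a
    well-formed high+low pair."""
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:            # high surrogate
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2                        # proper pair, skip both
                continue
            yield i                           # high surrogate with no low
        elif 0xDC00 <= u <= 0xDFFF:           # low surrogate with no high
            yield i
        i += 1
```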
> 
> Implementation
> 
>     There is a new (experimental) define:
> 
>         #define PY_UNICODE_SIZE 2
> 
>     There is a new configure option:
> 
>         --enable-unicode=ucs2 configures a narrow Py_UNICODE, and uses
>                               wchar_t if it fits
>         --enable-unicode=ucs4 configures a wide Py_UNICODE, and uses
>                               wchar_t if it fits
>         --enable-unicode      same as "=ucs2"
> 
>     The intention is that --disable-unicode, or --enable-unicode=no
>     removes the Unicode type altogether; this is not yet implemented.
> 
>     It is also proposed that one day --enable-unicode will just
>     default to the width of your platform's wchar_t.
> 
>     Windows builds will be narrow for a while based on the fact that
>     there have been few requests for wide characters, those requests
>     are mostly from hard-core programmers with the ability to buy
>     their own Python and Windows itself is strongly biased towards
>     16-bit characters.
> 
> Notes
> 
>     This PEP does NOT imply that people using Unicode need to use a
>     4-byte encoding for their files on disk or sent over the network.
>     It only allows them to do so. For example, ASCII is still a
>     legitimate (7-bit) Unicode-encoding.
> 
>     It has been proposed that there should be a module that handles
>     surrogates in narrow Python builds for programmers. If someone
>     wants to implement that, it will be another PEP. It might also be
>     combined with features that allow other kinds of character-,
>     word- and line- based indexing.
> 
> Rejected Suggestions
> 
>     More or less the status-quo
> 
>         We could officially say that Python characters are 16-bit and
>         require programmers to implement wide characters in their
>         application logic by combining surrogate pairs. This is a heavy
>         burden because emulating 32-bit characters is likely to be
>         very inefficient if it is coded entirely in Python. Plus these
>         abstracted pseudo-strings would not be legal as input to the
>         regular expression engine.
> 
>     "Space-efficient Unicode" type
> 
>         Another class of solution is to use some efficient storage
>         internally but present an abstraction of wide characters to
>         the programmer. Any of these would require a much more complex
>         implementation than the accepted solution. For instance consider
>         the impact on the regular expression engine. In theory, we could
>         move to this implementation in the future without breaking Python
>         code. A future Python could "emulate" wide Python semantics on
>         narrow Python. Guido is not willing to undertake the
>         implementation right now.
> 
>     Two types
> 
>         We could introduce a 32-bit Unicode type alongside the 16-bit
>         type. There is a lot of code that expects there to be only a
>         single Unicode type.
> 
>     This PEP represents the least-effort solution. Over the next
>     several years, 32-bit Unicode characters will become more common
>     and that may either convince us that we need a more sophisticated
>     solution or (on the other hand) convince us that simply
>     mandating wide Unicode characters is an appropriate solution.
>     Right now the two options on the table are do nothing or do
>     this.
> 
> References
> 
>     Unicode Glossary: http://www.unicode.org/glossary/

Plus perhaps the Mark Davis paper at:

http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/
 
> Copyright
> 
>     This document has been placed in the public domain.

Good work, Paul !

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal at lemburg.com  Mon Jul  2 12:08:53 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:08:53 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>
Message-ID: <3B404835.4CE77C60@lemburg.com>

Paul Prescod wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> >...
> >
> > The term "character" in Python should really only be used for
> > the 8-bit strings.
> 
> Are we going to change chr() and unichr() to one_element_string() and
> unicode_one_element_string()

No. I am just suggesting to make use of the crystal-clear
definitions which the Unicode Consortium has developed for us.
 
> u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> character. No Python user will find that confusing no matter how Unicode
> knuckle-dragging, mouth-breathing, wife-by-hair-dragging they are.

Except that u[i] maps to a code unit which may or may not be
a code point. Whether a code point matches a grapheme (this
is what users tend to regard as character) is yet another
story due to combining code points.
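[Editorial note: the code-point-vs-grapheme distinction is easy to
demonstrate with the standard library's unicodedata module; a sketch in
present-day Python syntax:]

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
combining = "e\u0301"    # 'e' followed by COMBINING ACUTE ACCENT

# One user-perceived character (grapheme), but different code-point counts.
assert len(precomposed) == 1
assert len(combining) == 2

# Normalization maps one spelling onto the other.
assert unicodedata.normalize("NFC", combining) == precomposed
```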

> > In Unicode a "character" can mean any of:
> 
> Mark Davis said that "people" can use the word to mean any of those
> things. He did not say that it was imprecisely defined in Unicode.
> Nevertheless I'm not using the Unicode definition anymore than our
> standard library uses an ancient Greek definition of integer. Python has
> a concept of integer and a concept of character.

Ok, I'll stop whining. Just as final remark, let me say that
our little discussion is a perfect example of how people can
misunderstand each other by using the terms in different ways
(Kant tried to solve this for Philosophy and did not succeed;
so I guess the Unicode Consortium doesn't stand a chance 
either ;-)
 
> > >     It has been proposed that there should be a module for working
> > >     with UTF-16 strings in narrow Python builds through some sort of
> > >     abstraction that handles surrogates for you. If someone wants
> > >     to implement that, it will be another PEP.
> >
> > Uhm, narrow builds don't support UTF-16... it's UCS-2 which
> > is supported (basically: store everything in range(0x10000));
> > the codecs can map code points to surrogates, but it is solely
> > their responsibility and the responsibility of the application
> > using them to take care of dealing with surrogates.
> 
> The user can view the data as UCS-2, UTF-16, Base64, ROT-13, XML, ....
> Just as we have a base64 module, we could have a UTF-16 module that
> interprets the data in the string as UTF-16 and does surrogate
> manipulation for you.
> 
> Anyhow, if any of those is the "real" encoding of the data, it is
> UTF-16. After all, if the codec reads in four non-BMP characters in,
> let's say, UTF-8, we represent them as 8 narrow-build Python characters.
> That's the definition of UTF-16! But it's easy enough for me to take
> that word out so I will.

u[i] gives you a code unit and whether this maps to a code point
or not is dependent on the implementation which in turn depends
on the narrow/wide choice.

In UCS-2, I believe, surrogates are regarded as two code points;
in UTF-16 they always have to come in pairs. There's a semantic
difference here which is for the codecs and these additional
tools to be aware of -- not the Unicode type implementation.
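[Editorial note: a minimal sketch of what such a surrogate-aware helper
module could offer, namely counting code points rather than code units;
utf16_len is a hypothetical name, not a proposed API:]

```python
def utf16_len(units):
    """Count code points in a sequence of 16-bit code units,
    treating each well-formed surrogate pair as one code point."""
    count = i = 0
    while i < len(units):
        if (0xD800 <= units[i] <= 0xDBFF
                and i + 1 < len(units)
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            i += 2    # surrogate pair: one logical character
        else:
            i += 1
        count += 1
    return count
```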

> >...
> > Also, the module will be useful for both narrow and wide builds,
> > since the notion of an encoded character can involve multiple code
> > points. In that sense Unicode is always a variable length
> > encoding for characters and that's the application field of
> > this module.
> 
> I wouldn't advise that you do all different types of normalization in a
> single module but I'll wait for your PEP.

I'll see if I find some time at the Bordeaux Python Meeting
next week.
 
> > Here's the adjusted text:
> >
> >      It has been proposed that there should be a module for working
> >      with Unicode objects using character-, word- and line- based
> >      indexing. The details of the implementation is left to
> >      another PEP.
> 
>      It has been proposed that there should be a module that handles
>      surrogates in narrow Python builds for programmers. If someone
>      wants to implement that, it will be another PEP. It might also be
>      combined with features that allow other kinds of character-,
>      word- and line- based indexing.

Hmm, I liked my version better, but what the heck ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/





From mal at lemburg.com  Mon Jul  2 12:43:38 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:43:38 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com>  
	            <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>
Message-ID: <3B40505A.2F03EEC4@lemburg.com>

Guido van Rossum wrote:
> 
> Hi Marc-Andre,
> 
> I'm dropping the i18n-sig from the distribution list.
> 
> I hear you:
> 
> > You didn't get my point. I feel responsible for the Unicode
> > implementation design and would like to see it become a continued
> > success.
> 
> I'm sure we all share this goal!
> 
> > In that sense and taking into account that I am the
> > maintainer of all this stuff, I think it is very reasonable to
> > ask me before making any significant changes to the implementation
> > and also respect any comments I put forward.
> 
> I understand you feel that we've rushed this in without waiting for
> your comments.
> 
> Given how close your implementation was, I still feel that the changes
> weren't that significant, but I understand that you get nervous.  If
> Christian were to check in his speed hack changes to the guts of
> ceval.c I would be nervous too!  (Heck, I got nervous when Eric
> checked in his library-wide string method changes without asking.)
> 
> Next time I'll try to be more sensitive to situations that require
> your review before going forward.

Good.
 
> > Currently, I have to watch the checkins list very closely
> > to find out who changed what in the implementation and then to
> > take actions only after the fact. Since I'm not supporting Unicode
> > as my full-time job this is simply impossible. We have the SF manager
> > and there is really no need to rush anything around here.
> 
> Hm, apart from the fact that you ought to be left in charge, I think
> that in this case the live checkins were a big win over the usual SF
> process.  At least two people were making changes, sometimes to each
> other's code, and many others on at least three continents were
> checking out the changes on many different platforms and immediately
> reporting problems.  We would definitely not have a patch as solid as
> the code that's now checked in, after two days of using SF!  (We
> could've used a branch, but I've found that getting people to actually
> check out the branch is not easy.)

True, but I was thinking of the concept and design questions
which should be resolved *before* taking the direct checkin 
approach.
 
> So I think that the net result was favorable.  Sometimes you just have
> to let people work in the spur of the moment to get the results of
> their best thinking, otherwise they lose interest or their train of
> thought.

Understood, but then I'd like to at least receive a summary
of the changes in some way, so that I continue to understand
how the implementation works after the checkins and which
corners to keep in mind for future additions, changes, etc.
 
> > If I am offline or too busy with other things for a day or two,
> > then I want to see patches on SF and not find new versions of
> > the implementation already checked in.
> 
> That's still the general rule, but in our enthusiasm (and mine was
> definitely part of this!) we didn't want to wait.  Also, I have to
> admit that I mistook your silence for consent -- I didn't think the
> main proposed changes (making the size of Py_UNICODE a config choice)
> were controversial at all, so I didn't realize you would have a problem
> with it.

I don't have a problem with it; I was just seeing things
slip my fingers and getting worried about this.
 
> > This has worked just fine during the last year, so I can only explain
> > the latest actions in this direction with an urge to bypass my comments
> > and any discussion this might cause.
> 
> I think you're projecting your own stuff here. 

Not really. I have processed many patches on SF, gave comments
etc. and did the final checkin. This has worked great over
the last months and I intend to keep working this way since
it is by far the best way to both manage and document the
issues and questions which arise during the process.

E.g. I'm currently processing a patch by Walter Dörwald
which adds support for callback error handlers. He has done
some great work there which was the result of many lively
discussions. Working like this is fun while staying
manageable at the same time... and again, there's really no
need to rush things !

> I honestly didn't
> think there was much disagreement on your part and thought we were
> doing you a favor by implementing the consensus.  IMO, Martin and
> Fredrik are familiar enough with both the code and the issues to do a
> good job.

Well, the above was my interpretation of how things went. 
I may have been wrong (and honestly do hope that I am wrong),
but my gut feeling simply said: hey, what are these guys doing
there... is this some kind of 
 
> > Needless to say that
> > quality control is not possible anymore.
> 
> Unclear.  Lots of other people looked over the changes in your
> absence.  And CVS makes code review after it's checked in easy enough.
> (Hey, in many other open source projects that's the normal procedure
> once the rough characteristics of a feature have been agreed upon:
> check in first and review later!)

That was not my point: quality control also includes checking
the design approach. This is something which should normally
be done in design/implementation/design/... phases -- just like 
I worked with you on the Unicode implementation late in 1999.
 
> > Conclusion:
> > I am not going to continue this work if this does not change.
> 
> That would be sad, and I hope you will stay with us.  We certainly
> don't plan to ignore your comments!
> 
> > Another other problem for me is the continued hostility I feel on i18n
> > against parts of the design and some of my decisions. I am
> > not talking about your feedback and the feedback from many other
> > people on the list which was excellent and to high standards.
> > But reading the postings of the last few months you will
> > find notices of what I am referring to here (no, I don't want
> > to be specific).
> 
> I don't know what to say about this, and obviously nobody has the time
> to go back and read the archives.  I'm sure it's not you as a person
> that was attacked.  If the design isn't perfect -- and hey, since
> Python is the 80 percent language, few things in it are quite perfect!
> -- then (positive) criticism is an attempt to help, to move it closer
> to perfection.
> 
> If people have at times said "the Unicode support sucks", well, that
> may hurt.  You can't always stay friends with everybody.  I get flames
> occasionally for features in Python that folks don't like.  I get used
> to them, and it doesn't affect my confidence any more.  Be the same!

I'll try.
 
> But sometimes, after saying "it sucks", people make specific
> suggestions for improvements, and it's important to be open for those
> even from sources that use offending language.  (Within reason, of
> course.  I don't ask you to listen to somebody who is persistently
> hostile to you as a person.)

Ok.
 
> > If people don't respect my comments or decision, then how can
> > I defend the design and how can I stop endless discussions which
> > simply don't lead anywhere ? So either I am missing something
> > or there is a need for a clear statement from you about
> > my status in all this.
> 
> Do you really *want* to be the Unicode BDFL?  Being something's BDFL is a
> full-time job, and you've indicated you're too busy.  (Or is that
> temporary?)

I am currently doing a lot of consulting work, so things sometimes
tighten up and are less work intense at other times. Given
this setup, I think that I will be able to play the BD (without
the FL) for Unicode for some time. I will certainly pass on the
flag to someone else if I find myself not spending enough
time on it.

The only thing I'm asking for, is some more professional
work mentality at times. If people make it hard for me to follow
the development, then I cannot manage this task in a satisfying
way.

> I see you as the original coder, which means that you know that
> section of the code better than anyone, and whenever there's a
> question that others can't answer about its design, implementation, or
> restrictions, I refer to you.  But given that you've said you wouldn't
> be able to work much on it, I welcome contributions by others as long
> as they seem knowledgeable.

Same here.
 
> > If I don't have the right to comment on proposals and patches,
> > possibly even rejecting them, then I simply don't see any
> > ground for keeping the implementation in a state which I can
> > maintain.
> 
> Nobody said you couldn't comment, and you know that.

If I don't get a chance to comment on a summary of changes
(be it before or after a batch of checkins), how am I
supposed to follow up on them ? Keeping a close eye
on the checkin mailing list doesn't help: it simply doesn't
always give you the big picture.

We are all professional quality programmers and I respect
Fredrik and Martin for their coding quality and ideas. What
I am asking for is some more teamwork.

> When it comes to rejecting or accepting, I feel that I am still the
> final arbiter, even for Unicode, until I get hit by a bus.  Since I
> don't always understand the implementation or the issues, I'll of
> course defer to you in cases where I think I can't make the decision,
> but I do reserve the right to be convinced by others to override your
> judgement, occasionally, if there's a good reason.  And when you're
> not responsive, I may try to channel you.  (I'll try to be more
> explicit about that.)

That's perfectly OK (and indeed can be very useful at times).
 
> > And last but not least: The fun-factor has faded which was
> > the main motor driving me into working on Unicode in the first
> > place. Nothing much you can do about this, though :-/
> 
> Yes, that happens to all of us at times.  The fun factor goes up and
> down, and sometimes we must look for fun elsewhere for a while.  Then
> the fun may come back where it appeared lost.  Go on vacation, read a
> book, tackle a new project in a totally different area!  Then come
> back and see if you can find some fun in the old stuff again.

I'll visit the Bordeaux Python conference later this week. That should
give me some time to breathe (and hopefully to write some more
PEPs :=).
 
> > > Paul Prescod offered to write a PEP on this issue.  My cynical half
> > > believes that we'll never hear from him again, but my optimistic half
> > > hopes that he'll actually write one, so that we'll be able to discuss
> > > the various issues for the users with the users.  I encourage you to
> > > co-author the PEP, since you have a lot of background knowledge about
> > > the issues.
> >
> > I guess your optimistic half won :-) I think Paul already did all the
> > work, so I'll simply comment on what he wrote.
> 
> Your suggestions were very valuable.  My opinion of Paul also went up
> a notch!
> 
> > > BTW, I think that Misc/unicode.txt should be converted to a PEP, for
> > > the historic record.  It was very much a PEP before the PEP process
> > > was invented.  Barry, how much work would this be?  No editing needed,
> > > just formatting, and assignment of a PEP number (the lower the better).
> >
> > Thanks for converting the text to PEP format, Barry.
> >
> > Thanks for reading this far,
> 
> You're welcome, and likewise.
> 
> Just one more thing, Marc-Andre.  Please know that I respect your work
> very much even if we don't always agree.  We would get by without you,
> but Python would be hurt if you turned your back on us.

Thanks. Be assured that I'll stay around for quite some time --
you won't get by that easily ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal at lemburg.com  Mon Jul  2 12:56:00 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:56:00 +0200
Subject: [Python-Dev] Bordeaux Python Meeting 04.07.-07.07.
Message-ID: <3B405340.31C5AA11@lemburg.com>

Hi everybody,

I think nobody has posted an announcement for the conference
yet, so I'll at least provide a pointer:

	http://www.lsm.abul.org/program/topic19/

Marc Poinot, who also organized the "First Python Day" in France,
is chair of this subtopic at the "Debian One" conference in
Bordeaux:

	http://www.lsm.abul.org/

Cheers,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From fredrik at pythonware.com  Mon Jul  2 13:41:51 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 13:41:51 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com>              <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com> <3B40505A.2F03EEC4@lemburg.com>
Message-ID: <001e01c102eb$fe4995d0$4ffa42d5@hagrid>

mal wrote:

> The only thing I'm asking for is some more professional
> work mentality at times.

for the record, your recent posts under this subject don't strike
me as very professional.

think about it.






From paulp at ActiveState.com  Mon Jul  2 16:25:55 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 02 Jul 2001 07:25:55 -0700
Subject: [I18n-sig] Re: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" 
 Unicodecharacters
References: <3B3F8095.8D58631D@ActiveState.com> <3B404967.14FE180F@lemburg.com>
Message-ID: <3B408473.77AB6C8@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
> >     Character
> >
> >         Used by itself, means the addressable units of a Python
> >         Unicode string.
> 
> Please add: also known as "code unit".

I'm not entirely comfortable with that. As you yourself pointed out, the
same Python Unicode object can be interpreted as either a series of
single-width code points *or* as a UTF-16 string where the characters
are code units. You could also interpret it as a BASE64'd region or an
XML document... It all depends on how you look at it.

> ....
> >     Surrogate pair
> >
> >         Two physical characters that represent a single logical
> 
> Eeek... two code units (or have you ever seen a physical character
> walking around ;-)

No, that's sort of my point. The user can decide to adopt the convention
of looking at the two characters as code units or they can ignore that
interpretation and look at them as two code points. It's all relative,
man. Dig it? That's why I use the word "convention" below:

> >         character. Part of a convention for representing 32-bit
> >         code points in terms of two 16-bit code points.

"Surrogates are all in your head. Python doesn't know or care about
them!"

I'll change this to:

    Surrogate pair

        Two Python Unicode characters that represent a single logical
        Unicode code point. Part of a convention for representing
        32-bit code points in terms of two 16-bit code points. Python
        has limited support for reading, writing and constructing
        strings that use this convention (described below). Otherwise Python
        ignores the convention.
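The 16-bit-pair convention the definition refers to is plain arithmetic; a small sketch of the standard UTF-16 mapping (illustrative helpers, not part of the PEP):

```python
def to_surrogate_pair(code_point):
    # Map a code point above the BMP (0x10000..0x10FFFF) onto the
    # high/low surrogate pair used by the UTF-16 convention.
    assert 0x10000 <= code_point <= 0x10FFFF
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low

def from_surrogate_pair(high, low):
    # Inverse mapping: reassemble the original code point.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

For example, U+10400 maps to the pair (0xD801, 0xDC00) and back.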

> No need to pass this information to the codec: simply write
> a new one and give it a clear name, e.g. "ucs-2" will generate
> errors while "utf-16-le" converts them to surrogates.

That's a good point, but what if I want a UTF-8 codec that doesn't
generate surrogates? Or even a UCS4 one?

> Plus perhaps the Mark Davis paper at:
> 
> http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/

Okay.

> > Copyright
> >
> >     This document has been placed in the public domain.
> 
> Good work, Paul !

Thanks for your help. You did help me to clarify many things even though
I argued with you as I was doing it. 
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From guido at digicool.com  Mon Jul  2 17:23:56 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 02 Jul 2001 11:23:56 -0400
Subject: [Python-Dev] Unicode Maintenance
In-Reply-To: Your message of "Mon, 02 Jul 2001 12:43:38 +0200."
             <3B40505A.2F03EEC4@lemburg.com> 
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>  
            <3B40505A.2F03EEC4@lemburg.com> 
Message-ID: <200107021523.f62FNun01807@odiug.digicool.com>

Thanks for your response, Marc-Andre.  I'd like to close this topic
now.  I'm not sure how to get you a "summary of changes", but I think
you can ask Fredrik directly (Martin announced he's away on vacation).

One thing you can do is pipe the output of "cvs log" through
tools/scripts/logmerge.py -- this gives you the checkin messages in
(reverse?) chronological order.
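The merge step logmerge performs is essentially a chronological sort of per-file log entries; a toy illustration of that idea (made-up checkin data, not the real tools/scripts/logmerge.py):

```python
def merge_logs(per_file_entries):
    # per_file_entries: one list of (date, message) tuples per file,
    # as might be parsed from "cvs log" output for that file.
    merged = []
    for entries in per_file_entries:
        merged.extend(entries)
    # newest first, i.e. reverse chronological order
    merged.sort(reverse=True)
    return merged

# example with made-up checkin data
logs = [
    [("2001-06-28", "unicodeobject.c: wide build fixes")],
    [("2001-07-01", "configure.in: add --enable-unicode")],
]
```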

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jul  2 17:29:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 02 Jul 2001 11:29:39 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Mon, 02 Jul 2001 11:39:55 +0200."
             <3B40416B.6438D1F7@egenix.com> 
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>  
            <3B40416B.6438D1F7@egenix.com> 
Message-ID: <200107021529.f62FTdx01823@odiug.digicool.com>

> Greg Ewing wrote:
> > 
> > > It so happened that the Unicode support was written to make it very
> > > easy to change the compile-time code unit size
> > 
> > What about extension modules that deal with Unicode strings?
> > Will they have to be recompiled too? If so, is there anything
> > to detect an attempt to import an extension module with an
> > incompatible Unicode character width?
> 
> That's a good question ! 
> 
> The answer is: yes, extensions which use Unicode will have to
> be recompiled for narrow and wide builds of Python. The question
> is however, how to detect cases where the user imports an
> extension built for narrow Python into a wide build and
> vice versa.
> 
> The standard way of looking at the API level won't help. We'd
> need some form of introspection API at the C level... hmm,
> perhaps looking at the sys module will do the trick for us ?!
> 
> In any case, this is certainly going to cause trouble one
> of these days...

Here are some alternative ways to deal with this:

(1) Use the preprocessor to rename all the Unicode APIs to get "Wide"
    appended to their name in wide mode.  This makes any use of a
    Unicode API in an extension compiled for the wrong Py_UNICODE_SIZE
    fail with a link-time error.  (Which should cause an ImportError
    for shared libraries.)

(2) Ditto but only rename the PyModule_Init function.  This is much
    less work but more coarse: a module that doesn't use any Unicode
    APIs (and I expect these will be a large majority) still would not
    be accepted.

(3) Change the interpretation of PYTHON_API_VERSION so that a low bit
    of '1' means wide Unicode.  Then you only get a warning (followed
    by a core dump when actually trying to use Unicode).

I mentioned (1) and (3) in an earlier post.

--Guido van Rossum (home page: http://www.python.org/~guido/)
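On the Python side, the sys-module introspection MAL alluded to became possible once sys.maxunicode was added (see Fredrik's summary elsewhere in this thread); a sketch of such a runtime check, assuming that attribute is present:

```python
import sys

def unicode_build():
    # On a narrow (UCS-2) build sys.maxunicode is 0xFFFF; on a wide
    # (UCS-4) build it is 0x10FFFF.  An extension's Python-side wrapper
    # could refuse to load when this doesn't match what its binary
    # was compiled for.
    if sys.maxunicode > 0xFFFF:
        return "wide"
    return "narrow"
```

This only helps at the Python level, of course; it cannot catch a mismatched binary before it crashes, which is what options (1)-(3) address.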



From fdrake at beowolf.digicool.com  Mon Jul  2 17:37:45 2001
From: fdrake at beowolf.digicool.com (Fred Drake)
Date: Mon,  2 Jul 2001 11:37:45 -0400 (EDT)
Subject: [Python-Dev] [maintenance doc updates]
Message-ID: <20010702153745.B304B28929@beowolf.digicool.com>

The development version of the documentation has been updated:

	http://python.sourceforge.net/maint-docs/


Updated to reflect the current state of the Python 2.1.1 maintenance
release branch.




From mal at lemburg.com  Mon Jul  2 18:51:58 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 18:51:58 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>  
	            <3B40416B.6438D1F7@egenix.com> <200107021529.f62FTdx01823@odiug.digicool.com>
Message-ID: <3B40A6AE.EDE30857@lemburg.com>

Guido van Rossum wrote:
> 
> > Greg Ewing wrote:
> > >
> > > > It so happened that the Unicode support was written to make it very
> > > > easy to change the compile-time code unit size
> > >
> > > What about extension modules that deal with Unicode strings?
> > > Will they have to be recompiled too? If so, is there anything
> > > to detect an attempt to import an extension module with an
> > > incompatible Unicode character width?
> >
> > That's a good question !
> >
> > The answer is: yes, extensions which use Unicode will have to
> > be recompiled for narrow and wide builds of Python. The question
> > is however, how to detect cases where the user imports an
> > extension built for narrow Python into a wide build and
> > vice versa.
> >
> > The standard way of looking at the API level won't help. We'd
> > need some form of introspection API at the C level... hmm,
> > perhaps looking at the sys module will do the trick for us ?!
> >
> > In any case, this is certainly going to cause trouble one
> > of these days...
> 
> Here are some alternative ways to deal with this:
> 
> (1) Use the preprocessor to rename all the Unicode APIs to get "Wide"
>     appended to their name in wide mode.  This makes any use of a
>     Unicode API in an extension compiled for the wrong Py_UNICODE_SIZE
>     fail with a link-time error.  (Which should cause an ImportError
>     for shared libraries.)
>
> (2) Ditto but only rename the PyModule_Init function.  This is much
>     less work but more coarse: a module that doesn't use any Unicode
>     APIs (and I expect these will be a large majority) still would not
>     be accepted.
> 
> (3) Change the interpretation of PYTHON_API_VERSION so that a low bit
>     of '1' means wide Unicode.  Then you only get a warning (followed
>     by a core dump when actually trying to use Unicode).
>
> I mentioned (1) and (3) in an earlier post.

(4) Add a feature flag to PyModule_Init() which then looks up the
    features in the sys module and uses this as basis for
    processing the import request.

In this case, I think that (1) would be the best solution,
since old code will notice the change in width too.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/



From paulp at ActiveState.com  Mon Jul  2 20:15:41 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 02 Jul 2001 11:15:41 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>  
		            <3B40416B.6438D1F7@egenix.com> <200107021529.f62FTdx01823@odiug.digicool.com> <3B40A6AE.EDE30857@lemburg.com>
Message-ID: <3B40BA4D.9C85A202@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
> 
> (4) Add a feature flag to PyModule_Init() which then looks up the
>     features in the sys module and uses this as basis for
>     processing the import request.

Could an extension be carefully written so that a single binary could be
compatible with both types of Python build? I'm thinking that it would
pass data buffers with the "right width" based on checking a runtime
flag...
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From just at letterror.com  Mon Jul  2 20:20:38 2001
From: just at letterror.com (Just van Rossum)
Date: Mon,  2 Jul 2001 20:20:38 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B40BA4D.9C85A202@ActiveState.com>
Message-ID: <20010702202041-r01010600-d5c62b95@213.84.27.177>

Paul Prescod wrote:

> Could an extension be carefully written so that a single binary could be
> compatible with both types of Python build? I'm thinking that it would
> pass data buffers with the "right width" based on checking a runtime
> flag...

But then it would also be compatible with a unicode object using different
internal storage units per string, so I'm sure this is a dead end ;-)

Just



From mal at lemburg.com  Mon Jul  2 20:59:06 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 20:59:06 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010702202041-r01010600-d5c62b95@213.84.27.177>
Message-ID: <3B40C47A.94317663@lemburg.com>

Just van Rossum wrote:
> 
> Paul Prescod wrote:
> 
> > Could an extension be carefully written so that a single binary could be
> > compatible with both types of Python build? I'm thinking that it would
> > pass data buffers with the "right width" based on checking a runtime
> > flag...
> 
> But then it would also be compatible with a unicode object using different
> internal storage units per string, so I'm sure this is a dead end ;-)

Agreed :-)

Extension writers will have to provide two versions of the binary.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jul  2 21:12:45 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 21:12:45 +0200
Subject: [I18n-sig] Re: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" 
 Unicodecharacters
References: <3B3F8095.8D58631D@ActiveState.com> <3B404967.14FE180F@lemburg.com> <3B408473.77AB6C8@ActiveState.com>
Message-ID: <3B40C7AD.F2646D56@lemburg.com>

Paul Prescod wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> >...
> > >     Character
> > >
> > >         Used by itself, means the addressable units of a Python
> > >         Unicode string.
> >
> > Please add: also known as "code unit".
> 
> I'm not entirely comfortable with that. As you yourself pointed out, the
> same Python Unicode object can be interpreted as either a series of
> single-width code points *or* as a UTF-16 string where the characters
> are code units. You could also interpret it as a BASE64'd region or an
> XML document... It all depends on how you look at it.

Well, that's what code unit tries to capture too: it's the basic storage
unit used by the implementation for storing characters. Never mind, it's
just a detail...
 
> > ....
> > >     Surrogate pair
> > >
> > >         Two physical characters that represent a single logical
> >
> > Eeek... two code units (or have you ever seen a physical character
> > walking around ;-)
> 
> No, that's sort of my point. The user can decide to adopt the convention
> of looking at the two characters as code units or they can ignore that
> interpretation and look at them as two code points. It's all relative,
> man. Dig it? That's why I use the word "convention" below:

Ok.
 
> > >         character. Part of a convention for representing 32-bit
> > >         code points in terms of two 16-bit code points.
> 
> "Surrogates are all in your head. Python doesn't know or care about
> them!"
> 
> I'll change this to:
> 
>     Surrogate pair
> 
>         Two Python Unicode characters that represent a single logical
>         Unicode code point. Part of a convention for representing
>         32-bit code points in terms of two 16-bit code points. Python
>         has limited support for reading, writing and constructing
> strings
>         that use this convention (described below). Otherwise Python
>         ignores the convention.

Good.
 
> > No need to pass this information to the codec: simply write
> > a new one and give it a clear name, e.g. "ucs-2" will generate
> > errors while "utf-16-le" converts them to surrogates.
> 
> That's a good point, but what if I want a UTF-8 codec that doesn't
> generate surrogates? Or even a UCS4 one?

With Walter's patch for callback error handlers, you should be able to
provide handlers which implement whatever you see fit. 
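A codec variant of the kind Paul asks for can also be approximated with a pre-check before encoding; `encode_utf8_bmp_only` below is a hypothetical helper for illustration, not an existing codec:

```python
def encode_utf8_bmp_only(text):
    # Reject anything outside the Basic Multilingual Plane before
    # handing the string to the regular UTF-8 codec, so no
    # surrogate/non-BMP data slips through.
    for ch in text:
        if ord(ch) > 0xFFFF:
            raise ValueError("non-BMP code point U+%06X" % ord(ch))
    return text.encode("utf-8")
```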
 
I think that codecs should work the same on all platforms and always
apply the needed conversion for the platform in question; could be wrong
though... it's really only a minor issue.

> > Plus perhaps the Mark Davis paper at:
> >
> > http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/
> 
> Okay.
> 
> > > Copyright
> > >
> > >     This document has been placed in the public domain.
> >
> > Good work, Paul !
> 
> Thanks for your help. You did help me to clarify many things even though
> I argued with you as I was doing it.

Thank you for taking the suggestions into account.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/



From fredrik at pythonware.com  Mon Jul  2 21:41:33 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 21:41:33 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com>
Message-ID: <013101c1032f$022770d0$4ffa42d5@hagrid>

guido wrote: 
> I'm not sure how to get you a "summary of changes", but I think you
> can ask Fredrik directly (Martin announced he's away on vacation).

summary:

- portability: made unicode object behave properly also if
  sizeof(Py_UNICODE) > 2 and >= sizeof(long) (FL)
- same for unicode codecs and the unicode database (MvL)
- base unicode feature selection on unicode defines, not platform (FL)
- wrap surrogate handling in #ifdef Py_UNICODE_WIDE (MvL, FL)
- tweaked unit tests to work with wide unicode, by replacing explicit
  surrogates with \U escapes (MvL)
- configure options for narrow/wide unicode (MvL)
- removed bogus const and register from some scalars (GvR, FL)
- default unicode configuration for PC (Tim, FL)
- default unicode configuration for Mac (Jack)
- added sys.maxunicode (MvL)

most changes were really trivial (e.g. ~0xFC00 => 0x3FF). martin's
big patch was reviewed and tested by both me and him before checkin
(tim managed to check out and build before I'd gotten around to check
in my windows tweaks, but that's what makes distributed egoless
development so fun ;-)
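The "~0xFC00 => 0x3FF" tweak Fredrik mentions is easy to demonstrate: the two masks agree on values that fit in 16 bits, but once a code unit can carry bits above bit 15 the complement keeps those high bits (illustrative values, not the actual C code):

```python
# Extracting the low 10 bits of a low surrogate: on a 16-bit value
# both spellings of the mask agree.
narrow = 0xDC37
assert narrow & 0x3FF == narrow & ~0xFC00 == 0x037

# With a (hypothetical) wider code unit that has a bit above bit 15
# set, the complement ~0xFC00 no longer clears the high bits, so the
# explicit mask 0x3FF had to be used instead.
wide = 0x1DC37
assert wide & 0x3FF == 0x037      # still just the low 10 bits
assert wide & ~0xFC00 == 0x10037  # high bit leaks through -- wrong
```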






From greg at cosc.canterbury.ac.nz  Tue Jul  3 02:20:37 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 03 Jul 2001 12:20:37 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <03b301c102cf$e0e3dd00$0900a8c0@spiff>
Message-ID: <200107030020.MAA00584@s454.cosc.canterbury.ac.nz>

Fredrik Lundh <fredrik at pythonware.com>:

> back in 1995, some people claimed that the image type had
> to be made smarter to be usable.

But at least you can use more than one depth of
image in the same program...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From mal at lemburg.com  Tue Jul  3 10:31:50 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 03 Jul 2001 10:31:50 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid>
Message-ID: <3B4182F6.DAC4C1@lemburg.com>

Fredrik Lundh wrote:
> 
> guido wrote:
> > I'm not sure how to get you a "summary of changes", but I think you
> > can ask Fredrik directly (Martin announced he's away on vacation).
> 
> summary:
> 
> - portability: made unicode object behave properly also if
>   sizeof(Py_UNICODE) > 2 and >= sizeof(long) (FL)
> - same for unicode codecs and the unicode database (MvL)
> - base unicode feature selection on unicode defines, not platform (FL)
> - wrap surrogate handling in #ifdef Py_UNICODE_WIDE (MvL, FL)
> - tweaked unit tests to work with wide unicode, by replacing explicit
>   surrogates with \U escapes (MvL)
> - configure options for narrow/wide unicode (MvL)
> - removed bogus const and register from some scalars (GvR, FL)
> - default unicode configuration for PC (Tim, FL)
> - default unicode configuration for Mac (Jack)
> - added sys.maxunicode (MvL)

Thank you for the summary. 

Please let me suggest that for the next coding party you prepare a patch 
which spans all party checkins and upload that patch with a summary
like the above to SF. That way we can keep the documentation of the overall
changes in one place and make the process more transparent for everybody.

Now let's get on with business...

Thanks,
-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/





From fredrik at pythonware.com  Tue Jul  3 12:21:27 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 3 Jul 2001 12:21:27 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid> <3B4182F6.DAC4C1@lemburg.com>
Message-ID: <05aa01c103a9$ec29e710$0900a8c0@spiff>

mal wrote:

> Please let me suggest that for the next coding party you prepare a patch
> which spans all party checkins and upload that patch with a summary
> like the above to SF. That way we can keep the documentation of the overall
> changes in one place and make the process more transparent for everybody.

Sorry, but as long as Guido wants an open development approach
based on collective code ownership (aka "egoless programming"),
that's what he gets.

The current environment provides several tools to track changes
to the code base.  The python-checkins list provides instant info
on every single change to the code base; the investment to track
that list is a few minutes per day.  The CVS history is also easy to
access; you can reach it via the viewcvs interface, or from the
command line.

Using both CVS and SF's patch manager to track development history
is a waste of time.  A development project manned by volunteers
doesn't need bureaucrats; the version control system provides
all the accountability we'll ever need.

(commercial development projects don't need bureaucrats
either, and usually don't have them, but that's another story).

I'd also argue that using many incremental checkins improves
quality -- the smaller a change is, the easier it is to understand,
and the more likely it is that non-experts will also notice simple
mistakes or portability issues.  (I regularly comment on checkin
messages that look suspicious codewise, even if I don't know
anything about the problem area.  I'm even right, sometimes).
Reviewing big patches on SF is really hard, even for experts.

And every hour a patch sits on sourceforge instead of in the code
repository is ten hours less burn-in in a heterogeneous testing
environment.  That's worth a lot.

Finally, my experience from this and other projects is that the
"visible heartbeat" you get from a continuous flow of checkin
messages improves team productivity and team morale.  Nothing
is more inspiring than seeing others working for a common
goal.  It's the final product that matters, not who's in charge of
what part of it.  The end user couldn't care less.

I'd prefer if you didn't feel the need to play miniboss on the Python
project (I'm sure you have plenty of 'mx' projects where you can use
that approach, if you have to).  And I'd rather see you at the next
party than out there whining over how you missed the last one.

Cheers /F





From mal at lemburg.com  Tue Jul  3 13:30:05 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 03 Jul 2001 13:30:05 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid> <3B4182F6.DAC4C1@lemburg.com> <05aa01c103a9$ec29e710$0900a8c0@spiff>
Message-ID: <3B41ACBD.9FA8FB25@lemburg.com>

Fredrik Lundh wrote:
> 
> > Please let me suggest that for the next coding party you prepare a patch
> > which spans all party checkins and upload that patch with a summary
> > like the above to SF. That way we can keep the documentation of the overall
> > changes in one place and make the process more transparent for everybody.
> 
> Sorry, but as long as Guido wants an open development approach
> based on collective code ownership (aka "egoless programming"),
> that's what he gets.
> 
> The current environment provides several tools to track changes
> to the code base.  The python-checkins list provides instant info
> on every single change to the code base; the investment to track
> that list is a few minutes per day.  The CVS history is also easy to
> access; you can reach it via the viewcvs interface, or from the
> command line.

I think you misunderstood my suggestion: I didn't say you can't have
a coding party with lots of small checkins, I just suggested that *after*
the party someone does a diff before-and-after-the-party.diff and
uploads this diff to SF with a description of the overall changes.

You simply don't get the big picture from looking at various small 
checkin messages which are sometimes spread across multiple files/checkins.
 
> Using both CVS and SF's patch manager to track development history
> is a waste of time.  A development project manned by volunteers
> doesn't need bureaucrats; the version control system provides
> all the accountability we'll ever need.
> 
> (commercial development projects don't need bureaucrats
> either, and usually don't have them, but that's another story).

Wasn't talking about bureaucrats... 
 
> I'd also argue that using many incremental checkins improves
> quality -- the smaller a change is, the easier it is to understand,
> and the more likely it is that non-experts will also notice simple
> mistakes or portability issues.  (I regularly comment on checkin
> messages that look suspicious codewise, even if I don't know
> anything about the problem area.  I'm even right, sometimes).
> Reviewing big patches on SF is really hard, even for experts.

It's just for keeping a combined record of changes. Following up on
dozens of checkins spanning another dozen files using CVS is 
harder, IMHO, than looking at one single before/after diff.
 
> And every hour a patch sits on sourceforge instead of in the code
> repository is ten hours less burn-in in a heterogeneous testing
> environment.  That's worth a lot.

Agreed.
 
> Finally, my experience from this and other projects is that the
> "visible heartbeat" you get from a continuous flow of checkin
> messages improves team productivity and team morale.  Nothing
> is more inspiring than seeing others working for a common
> goal.  It's the final product that matters, not who's in charge of
> what part of it.  The end user couldn't care less.
> 
> I'd prefer if you didn't feel the need to play miniboss on the Python
> project (I'm sure you have plenty of 'mx' projects where you can use
> that approach, if you have to). 

I have no intention of playing "miniboss" (I have enough of that being
the boss of a small company), I'm just trying to keep the task of a code
maintainer manageable; that's all. 'nuff said.

> And I'd rather see you at the next
> party than out there whining over how you missed the last one.

Perhaps you can send around invitations first, before starting the party 
next time ?!

BTW, do you have plans to update the Unicode database to the 3.1
version ? If not, I'll look into this next week.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/




From thomas at xs4all.net  Tue Jul  3 13:41:51 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 3 Jul 2001 13:41:51 +0200
Subject: [Python-Dev] CVS
Message-ID: <20010703134151.P8098@xs4all.nl>

Slightly off-topic, but I've depleted all my other sources :) I'm trying to
get CVS to give me all log entries for all checkins in a specific branch (the
2.1.1 branch) so I can pipe it through logmerge. It seems the one thing I'm
missing now is a branchpoint tag (which should translate to a revision with
an even number of dots, apparently) but 'release21' and 'release21-maint'
both don't qualify. Even the usage logmerge suggests (cvs log -rrelease21)
doesn't work; it gives me a bunch of "no revision `release21'"
warnings and just all log entries for those files.

Am I missing something simple, here, or should I hack logmerge to parse the
symbolic names, figure out the even-dotted revision for each file from the
uneven-dotted branch-tag, and filter out stuff outside that range ? :P
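(A rough sketch of the filtering such a logmerge hack could do, assuming
CVS's magic-branch convention: a branch tag stored as revision 1.12.0.2
marks branch revisions numbered 1.12.2.N. The function names and the
revision list below are hypothetical, for illustration only.)

```python
def branch_prefix(tag_rev):
    # CVS stores a branch tag as a "magic branch" revision such as
    # 1.12.0.2; revisions on that branch are numbered 1.12.2.N,
    # i.e. drop the second-to-last "0" component.
    parts = tag_rev.split(".")
    if len(parts) >= 4 and parts[-2] == "0":
        parts = parts[:-2] + [parts[-1]]
    return ".".join(parts)

def on_branch(rev, prefix):
    # A revision belongs to the branch if it extends the branch
    # prefix by exactly one numeric component.
    return (rev.startswith(prefix + ".")
            and rev[len(prefix) + 1:].isdigit())

# Hypothetical per-file revision list pulled from 'cvs log' output.
revs = ["1.12", "1.12.2.1", "1.12.2.2", "1.12.4.1", "1.13"]
print([r for r in revs if on_branch(r, branch_prefix("1.12.0.2"))])
# -> ['1.12.2.1', '1.12.2.2']
```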

-- 
Thomas Wouters 

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From gregor at mediasupervision.de  Tue Jul  3 14:09:51 2001
From: gregor at mediasupervision.de (Gregor Hoffleit)
Date: Tue, 3 Jul 2001 14:09:51 +0200
Subject: [Python-Dev] PEP 250, site-python, site-packages
Message-ID: <20010703140951.A27647@mediasupervision.de>

PEP 250 talks about adopting site-packages for Windows systems. I'd like
to discuss the sitedirs as a whole.

Currently, site.py appends the following sitedirs to sys.path:

    * <prefix>/lib/python<version>/site-packages
    * <prefix>/lib/site-python

If exec-prefix is different from prefix, then also

    * <exec-prefix>/lib/python<version>/site-packages
    * <exec-prefix>/lib/site-python
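(A minimal sketch of how those directories get built, assuming the Unix
install layout; the real site.py does more, e.g. existence checks and
platform-specific casing. The function name is made up for illustration.)

```python
def sitedirs(prefix, exec_prefix, version):
    # Collect the distinct prefixes, then append the two
    # site directories under each one, in order.
    prefixes = [prefix]
    if exec_prefix != prefix:
        prefixes.append(exec_prefix)
    dirs = []
    for p in prefixes:
        dirs.append("%s/lib/python%s/site-packages" % (p, version))
        dirs.append("%s/lib/site-python" % p)
    return dirs

print(sitedirs("/usr", "/usr/local", "2.1"))
```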



From jepler at mail.inetnebr.com  Tue Jul  3 14:38:00 2001
From: jepler at mail.inetnebr.com (Jeff Epler)
Date: Tue, 3 Jul 2001 07:38:00 -0500
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703140951.A27647@mediasupervision.de>; from gregor@mediasupervision.de on Tue, Jul 03, 2001 at 02:09:51PM +0200
References: <20010703140951.A27647@mediasupervision.de>
Message-ID: <20010703073759.A4972@localhost.localdomain>

On Tue, Jul 03, 2001 at 02:09:51PM +0200, Gregor Hoffleit wrote:
> Due to Python's good tradition of compatibility, this is the vast
> majority of packages; only packages with binary modules need to
> be recompiled for each major new version.

Aren't there bytecode changes in 1.6, 2.0, and 2.1, compared to 1.5.2?  If
so, this either means that each version of Python does need a separate copy
(for the .pyc/.pyo file), or if all versions are compatible with 1.5.2
bytecodes (and I don't know that they are) then all packages would need to
be bytecompiled with 1.5.2.

For instance, it appears that between 1.5.2 and 2.1, the UNPACK_LIST
and UNPACK_TUPLE bytecode instructions were removed and replaced with
a single UNPACK_SEQUENCE opcode.

Information gathered by executing:
	python -c 'import dis
	for name in dis.opname:
	    if name[0] != "<": print name' | sort -u > opcodes-1.5.2
and similarly for python2.
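(The same comparison can be done in-process; the opcode sets below are
illustrative fragments, not the full 1.5.2/2.1 tables, and the helper
name is made up.)

```python
import dis

def opcode_diff(old, new):
    # Return (removed, added) opcode names between two releases.
    return sorted(old - new), sorted(new - old)

# Illustrative fragments of the 1.5.2 and 2.1 opcode tables.
ops_152 = {"UNPACK_LIST", "UNPACK_TUPLE", "BINARY_ADD"}
ops_21 = {"UNPACK_SEQUENCE", "BINARY_ADD"}

removed, added = opcode_diff(ops_152, ops_21)
print(removed, added)
# -> ['UNPACK_LIST', 'UNPACK_TUPLE'] ['UNPACK_SEQUENCE']

# The running interpreter's own table, as in the shell one-liner above:
current = {n for n in dis.opname if not n.startswith("<")}
```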

Jeff


