From Jack.Jansen@cwi.nl Tue Jan 1 22:54:10 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Tue, 1 Jan 2002 23:54:10 +0100 Subject: [Python-Dev] Unicode support in getargs.c Message-ID: <200201012254.g01MsAp28293@spruit.ins.cwi.nl> I posted a question on Unicode support in getargs.c last month (working on a different project), but now that I'm trying to support unicode-based APIs more seriously I find that it leaves even more to be desired. I'd like to help to fix this, but I need some direction on how things should be fixed. Here are some of the issues I ran in today: - Unicode objects have a companion string object, meaning that you can pass a unicode object to an "s" format and have the right thing happen. String objects have no such accompanying unicode object, and I think they should have. Right now you cannot pass a string object when the C routine expects a unicode object. - There is no unicode equivalent of "c", the single character. - "u#" does something useful, but something completely different from what "s#" does. More to the point, it probably does something dangerous, if I understand correctly. If I write a C routine with an "u#" format and the Python code passes a string object the string object will be used as a buffer object and its binary contents will be interpreted as unicode. If the argument in question is a filename this will produce very surprising results:-) I'd like unicode objects to be get a little more first class citizenship, especially in the light of operating systems that are primarily (or exclusively) unicode based, such as Mac OS X or Windows CE, to sum things up. From mal@lemburg.com Tue Jan 1 23:42:21 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 02 Jan 2002 00:42:21 +0100 Subject: [Python-Dev] Unicode support in getargs.c References: <200201012254.g01MsAp28293@spruit.ins.cwi.nl> Message-ID: <3C32495D.7020008@lemburg.com> Jack Jansen wrote: > I posted a question on Unicode support in getargs.c last month (working > on a different project), but now that I'm trying to support > unicode-based APIs more seriously I find that it leaves even more to be > desired. I'd like to help to fix this, but I need some direction on > how things should be fixed. > > Here are some of the issues I ran in today: > - Unicode objects have a companion string object, meaning that you can > pass a unicode object to an "s" format and have the right thing happen. > String objects have no such accompanying unicode object, and I think they > should have. Right now you cannot pass a string object when the C > routine expects a unicode object. You can: parse the object and then pass it to PyUnicode_FromObject(). > - There is no unicode equivalent of "c", the single character. > - "u#" does something useful, but something completely different from > what "s#" does. More to the point, it probably does something > dangerous, if I understand correctly. If I write a C routine with an > "u#" format and the Python code passes a string object the string object > will be used as a buffer object and its binary contents will be interpreted > as unicode. If the argument in question is a filename this will produce > very surprising results:-) True; "u#" does exactly the same as "s#" -- it interprets the input as binary buffer. > I'd like unicode objects to be get a little more first class citizenship, > especially in the light of operating systems that are primarily (or > exclusively) unicode based, such as Mac OS X or Windows CE, to sum things up. 
You would be far better off using the Unicode API on the objects which are passed into the function rather than relying on the getargs parser to try to apply some magic to the input objects. It might be worthwhile extending the parser markers a bit more or allowing e.g. introduce "us#" to return Unicode objects much like "es#" returns strings... I think we'd need some examples of use though before deciding what's the right way to do this ("es#" was implemented after an request by Mark Hammond to be able to handle Unicode file names for Win CE). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Tue Jan 1 23:58:04 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 2 Jan 2002 00:58:04 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: <200201012254.g01MsAp28293@spruit.ins.cwi.nl> (message from Jack Jansen on Tue, 1 Jan 2002 23:54:10 +0100) References: <200201012254.g01MsAp28293@spruit.ins.cwi.nl> Message-ID: <200201012358.g01Nw4r01273@mira.informatik.hu-berlin.de> > String objects have no such accompanying unicode object, and I > think they should have. No. That would either give you cyclic structures, or an ever growing chain of unicode->string->unicode->string objects that could easily result in unacceptable memory consumption. Furthermore, I consider the existance of the embedded string object in a Unicode object as a flaw in itself, as it relies on the default encoding. IMO, the default encoding shouldn't be used if possible, as it only serves the transition towards Unicode, and only in limited ways. > - There is no unicode equivalent of "c", the single character. Why do you need that? > - "u#" does something useful, but something completely different from > what "s#" does. More to the point, it probably does something > dangerous, if I understand correctly. If I write a C routine with an > "u#" format and the Python code passes a string object the string object > will be used as a buffer object and its binary contents will be interpreted > as unicode. That sounds like a bug to me. Passing a string to u# most certainly does not do the right thing; it is bad that does so silently. OTOH, why do you need u#? Normally, you use s# if a string can have embedded null bytes; you do that if the string is "binary". For Unicode, that is useless: A Unicode string typically won't have any embedded null bytes, and it definitely isn't "binary". Regards, Martin From martin@v.loewis.de Wed Jan 2 00:02:08 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 2 Jan 2002 01:02:08 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: <3C32495D.7020008@lemburg.com> (mal@lemburg.com) References: <200201012254.g01MsAp28293@spruit.ins.cwi.nl> <3C32495D.7020008@lemburg.com> Message-ID: <200201020002.g02028E01298@mira.informatik.hu-berlin.de> > True; "u#" does exactly the same as "s#" -- it interprets the > input as binary buffer. It doesn't do exactly the same. If s# is applied to a Unicode object, it transparently invokes the default encoding, which is sensible. If u# is applied to a byte string, it does not apply the default encoding. Instead, it interprets the string "as-is". I cannot see an application where this is useful, but I can see many applications where it is clearly wrong. IMO, u# cannot and should not be symmetric to s#. 
Instead, it should accept just Unicode objects, and raise TypeErrors for everything else. Regards, Martin From barry@zope.com Wed Jan 2 01:02:43 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 1 Jan 2002 20:02:43 -0500 Subject: [Python-Dev] Unicode support in getargs.c References: <200201012254.g01MsAp28293@spruit.ins.cwi.nl> Message-ID: <15410.23603.77221.292793@anthem.wooz.org> >>>>> "JJ" == Jack Jansen writes: JJ> I'd like unicode objects to be get a little more first class JJ> citizenship, especially in the light of operating systems that JJ> are primarily (or exclusively) unicode based, such as Mac OS X JJ> or Windows CE, to sum things up. string/unicode unification? -Barry From mhammond@skippinet.com.au Wed Jan 2 06:57:21 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 2 Jan 2002 17:57:21 +1100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: <3C32495D.7020008@lemburg.com> Message-ID: > It might be worthwhile extending the parser markers a bit > more or allowing e.g. introduce "us#" to return Unicode objects > much like "es#" returns strings... I think we'd need some examples > of use though before deciding what's the right way to do this > ("es#" was implemented after an request by Mark Hammond to > be able to handle Unicode file names for Win CE). Actually, it was for Windows itself, allowing the nt module to use Unicode objects correctly for the platform. Mark. From mal@lemburg.com Wed Jan 2 10:24:45 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 02 Jan 2002 11:24:45 +0100 Subject: [Python-Dev] Unicode support in getargs.c References: <200201012254.g01MsAp28293@spruit.ins.cwi.nl> <3C32495D.7020008@lemburg.com> <200201020002.g02028E01298@mira.informatik.hu-berlin.de> Message-ID: <3C32DFED.A81086CB@lemburg.com> "Martin v. Loewis" wrote: > > > True; "u#" does exactly the same as "s#" -- it interprets the > > input as binary buffer. > > It doesn't do exactly the same. If s# is applied to a Unicode object, > it transparently invokes the default encoding, which is sensible. If > u# is applied to a byte string, it does not apply the default encoding. That's because the buffer interface on Unicode objects doesn't return the raw binary buffer. If you pass in a memory mapped file or a buffer object wrapping some memory area, u# will take the input as raw binary stream. All this weird behaviour is needed to make Unicode objects behave well together with s#. The implementation of u# is completely symmetric to that of s# though. I agree, though, that it would make more sense to special case Unicode objects here and have u# return a pointer to the raw internal buffer of the Unicode object. Jack will probably also need a way to say "decode this encoded object into Unicode using the encoding xyz". Something like the Unicode version of "es#". How about "eu#" which then passes through Unicode as-is while decoding all other objects according to the given encoding ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Wed Jan 2 19:20:38 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Wed, 2 Jan 2002 20:20:38 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: <3C32DFED.A81086CB@lemburg.com> (mal@lemburg.com) References: <200201012254.g01MsAp28293@spruit.ins.cwi.nl> <3C32495D.7020008@lemburg.com> <200201020002.g02028E01298@mira.informatik.hu-berlin.de> <3C32DFED.A81086CB@lemburg.com> Message-ID: <200201021920.g02JKcN01520@mira.informatik.hu-berlin.de> > That's because the buffer interface on Unicode objects doesn't > return the raw binary buffer. If you pass in a memory mapped > file or a buffer object wrapping some memory area, u# will > take the input as raw binary stream. > > All this weird behaviour is needed to make Unicode objects > behave well together with s#. I don't believe this. Why would the implementation of u# have any effect on making s# work? > Jack will probably also need a way to say "decode this encoded > object into Unicode using the encoding xyz". Something like the > Unicode version of "es#". How about "eu#" which then passes through > Unicode as-is while decoding all other objects according to the > given encoding ?! I'd like to see the requirements, in terms of real-world problems, before considering any extensions. Regards, Martin From mal@lemburg.com Wed Jan 2 19:40:56 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 02 Jan 2002 20:40:56 +0100 Subject: [Python-Dev] Unicode support in getargs.c References: <200201012254.g01MsAp28293@spruit.ins.cwi.nl> <3C32495D.7020008@lemburg.com> <200201020002.g02028E01298@mira.informatik.hu-berlin.de> <3C32DFED.A81086CB@lemburg.com> <200201021920.g02JKcN01520@mira.informatik.hu-berlin.de> Message-ID: <3C336248.AEF30155@lemburg.com> "Martin v. Loewis" wrote: > > > That's because the buffer interface on Unicode objects doesn't > > return the raw binary buffer. If you pass in a memory mapped > > file or a buffer object wrapping some memory area, u# will > > take the input as raw binary stream. > > > > All this weird behaviour is needed to make Unicode objects > > behave well together with s#. > > I don't believe this. Why would the implementation of u# have any > effect on making s# work? To make s# work, we had to map the read buffer interface to the encoded version of Unicode -- not the binary version which would have been the "right" choice in terms of the buffer interface (s# maps to the read buffer interface, while t# maps to the character buffer interface). u# is simply a copy&paste implementation of s# interpreting the results of the read buffer interface as Py_UNICODE array. As I menioned in another mail, we should probably let u# pass through Unicode objects as-is without going through the read buffer interface. This functionality is clearly missing and should be added to make u# useful. > > Jack will probably also need a way to say "decode this encoded > > object into Unicode using the encoding xyz". Something like the > > Unicode version of "es#". How about "eu#" which then passes through > > Unicode as-is while decoding all other objects according to the > > given encoding ?! > > I'd like to see the requirements, in terms of real-world problems, > before considering any extensions. Agreed. Jack should post some examples of what he needs for his application. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Wed Jan 2 20:29:02 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Wed, 2 Jan 2002 21:29:02 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: <3C336248.AEF30155@lemburg.com> (mal@lemburg.com) References: <200201012254.g01MsAp28293@spruit.ins.cwi.nl> <3C32495D.7020008@lemburg.com> <200201020002.g02028E01298@mira.informatik.hu-berlin.de> <3C32DFED.A81086CB@lemburg.com> <200201021920.g02JKcN01520@mira.informatik.hu-berlin.de> <3C336248.AEF30155@lemburg.com> Message-ID: <200201022029.g02KT2T01686@mira.informatik.hu-berlin.de> > > > All this weird behaviour is needed to make Unicode objects > > > behave well together with s#. > > > > I don't believe this. Why would the implementation of u# have any > > effect on making s# work? [...] > u# is simply a copy&paste implementation of s# interpreting the > results of the read buffer interface as Py_UNICODE array. Ok. That explains its history, but it also clarifies that changing the u# implementation has *no* effect whatsoever proper operation of s#. Therefore, I still think that u# should reject string objects, instead of silently doing the wrong thing. > As I menioned in another mail, we should probably let u# pass > through Unicode objects as-is without going through the read buffer > interface. Yes, that would be nice. The only use of u# I can see is that it gives you the number of Py_UNICODE characters, so that the caller doesn't have to look for the terminating NUL. Regards, Martin From Jack.Jansen@cwi.nl Wed Jan 2 21:46:46 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Wed, 2 Jan 2002 22:46:46 +0100 (CET) Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: <200201021920.g02JKcN01520@mira.informatik.hu-berlin.de> Message-ID: On Wed, 2 Jan 2002, Martin v. Loewis wrote: > > Jack will probably also need a way to say "decode this encoded > > object into Unicode using the encoding xyz". Something like the > > Unicode version of "es#". How about "eu#" which then passes through > > Unicode as-is while decoding all other objects according to the > > given encoding ?! > > I'd like to see the requirements, in terms of real-world problems, > before considering any extensions. I have a number of MacOSX API's that expect Unicode buffers, passed as "long count, UniChar *buffer". I have the machinery in bgen to generate code for this, iff "u#" (or something else) would work the same as "s#", i.e. it returns you a pointer and a size, and it would work equally well for unicode objects as for classic strings (after conversion). The trick with O and using PyUnicode_FromObject() may do the trick for me, as my code is generated, so a little more glue call doesn't really matter. But as a general solution it doesn't look right: "How do I call a C routine with a string parameter?" "Use the "s" format and you get the string pointer to pass". "How do I call a C routine with a unicode string parameter?" "Use O and PyUnicode_FromObject() and PyUnicode_AsUnicode and make sure you get all your decrefs right and.....". The "es#" is a very strange beast, and a similar "eu#" would help me a little, but it has some serious drawbacks. Aside from it being completely different from the other converters (being a prefix operator in stead of a postfix one, and having a value-return argument) I would also have to pre-allocate the buffer in advance, and that sort of defeats the purpose. 
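A minimal C sketch of the "O" plus PyUnicode_FromObject() pattern under discussion, for comparison; the function name is hypothetical, the UniChar-based call is a placeholder, and error handling is abbreviated:

static PyObject *
call_unichar_api(PyObject *self, PyObject *args)
{
    PyObject *obj, *uni;
    Py_UNICODE *buf;
    int len;

    if (!PyArg_ParseTuple(args, "O", &obj))
        return NULL;
    uni = PyUnicode_FromObject(obj);   /* accepts unicode and classic strings */
    if (uni == NULL)
        return NULL;
    buf = PyUnicode_AS_UNICODE(uni);
    len = PyUnicode_GET_SIZE(uni);
    /* ... call the platform API with (len, buf) here ... */
    Py_DECREF(uni);                    /* the decref bookkeeping mentioned above */
    Py_INCREF(Py_None);
    return Py_None;
}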
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@cwi.nl | ++++ if you agree copy these lines to your sig ++++ http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From martin@v.loewis.de Wed Jan 2 22:51:17 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 2 Jan 2002 23:51:17 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: (message from Jack Jansen on Wed, 2 Jan 2002 22:46:46 +0100 (CET)) References: Message-ID: <200201022251.g02MpHU02332@mira.informatik.hu-berlin.de> > I have a number of MacOSX API's that expect Unicode buffers, passed as > "long count, UniChar *buffer". Well, my first question would be: Are you sure that UniChar has the same underlying integral type as Py_UNICODE? If not, you lose. So you may need to do even more conversion. > I have the machinery in bgen to generate code for this, iff "u#" (or > something else) would work the same as "s#", i.e. it returns you a > pointer and a size, and it would work equally well for unicode > objects as for classic strings (after conversion). I see. u# could be made work for Unicode objects alone, but it would have to reject string objects. > But as a general solution it doesn't look right: "How do I call a C > routine with a string parameter?" "Use the "s" format and you get the > string pointer to pass". "How do I call a C routine with a unicode string > parameter?" For that, the answer is u. But you want the length also. So for that, the answer is u#. But your question is "How do I call a C routine with either a Unicode object or a string object, getting a reasonable Py_UNICODE* and the length?". For that, I'd recommend to use O&, with a conversion function PyObject *Py_UnicodeOrString(PyObject *o, void *ignored)){ if (PyUnicode_Check(o)){ Py_INCREF(o);return o; } if (PyString_Check(o)){ return PyUnicode_FromObject(o); } PyErr_SetString(PyExc_TypeError,"unicode object expecpected"); return NULL; } > "Use O and PyUnicode_FromObject() and PyUnicode_AsUnicode and > make sure you get all your decrefs right and.....". With the function above, this becomes Use O&, passing a PyObject**, the function, and a NULL pointer, using PyUnicode_AS_UNICODE and PyUnicode_SIZE, performing a single DECREF at the end [allowing to specify an encoding is optional] In this scenario, somebody *has* to deallocate memory, you cannot get around this. It is your choice whether this is Py_DECREF or PyMem_Free that you have to call (as with the "esomething" conversions); the DECREF is more efficient as it will not copy a Unicode object. > The "es#" is a very strange beast, and a similar "eu#" would help me a > little, but it has some serious drawbacks. Aside from it being completely > different from the other converters (being a prefix operator in stead of a > postfix one, and having a value-return argument) I would also have to > pre-allocate the buffer in advance, and that sort of defeats the purpose. You don't. If you set the buffer to NULL before invoking getargs, you have to PyMem_Free it afterwards. Regards, Martin From mal@lemburg.com Thu Jan 3 10:34:17 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 03 Jan 2002 11:34:17 +0100 Subject: [Python-Dev] Unicode support in getargs.c References: <200201022251.g02MpHU02332@mira.informatik.hu-berlin.de> Message-ID: <3C3433A9.4AA1CB06@lemburg.com> "Martin v. Loewis" wrote: > > > I have a number of MacOSX API's that expect Unicode buffers, passed as > > "long count, UniChar *buffer". 
> > Well, my first question would be: Are you sure that UniChar has the > same underlying integral type as Py_UNICODE? If not, you lose. > > So you may need to do even more conversion. This should be the first thing to check. Also note that Python has two different flavors of Unicode support: UCS-2 and UCS-4, so you'll have to be careful about this too. > > I have the machinery in bgen to generate code for this, iff "u#" (or > > something else) would work the same as "s#", i.e. it returns you a > > pointer and a size, and it would work equally well for unicode > > objects as for classic strings (after conversion). > > I see. u# could be made work for Unicode objects alone, but it would > have to reject string objects. Martin, I don't agree here: string objects could hold binary UCS-2/UCS-4 data. Jack, u# cannot auto-convert strings to Unicode since this would require allocation of a temporary object and there's no logic there to free that object after usage. es# has logic in place which allows either copying the raw data to a buffer you provide or have it allocate a buffer of the right size for you. That's why I proposed to extend it support Unicode raw data as well. > > But as a general solution it doesn't look right: "How do I call a C > > routine with a string parameter?" "Use the "s" format and you get the > > string pointer to pass". "How do I call a C routine with a unicode string > > parameter?" > > For that, the answer is u. But you want the length also. So for that, > the answer is u#. But your question is "How do I call a C routine with > either a Unicode object or a string object, getting a reasonable > Py_UNICODE* and the length?". > > For that, I'd recommend to use O&, with a conversion function > > PyObject *Py_UnicodeOrString(PyObject *o, void *ignored)){ > if (PyUnicode_Check(o)){ > Py_INCREF(o);return o; > } > if (PyString_Check(o)){ > return PyUnicode_FromObject(o); > } > PyErr_SetString(PyExc_TypeError,"unicode object expecpected"); > return NULL; > } Martin, note that PyUnicode_FromObject() already does the Unicode pass-through (even more: it makes sure that you get a true Unicode object, not a subclass). > > "Use O and PyUnicode_FromObject() and PyUnicode_AsUnicode and > > make sure you get all your decrefs right and.....". > > With the function above, this becomes > > Use O&, passing a PyObject**, the function, and a NULL pointer, using > PyUnicode_AS_UNICODE and PyUnicode_SIZE, performing a single DECREF at > the end [allowing to specify an encoding is optional] > > In this scenario, somebody *has* to deallocate memory, you cannot get > around this. It is your choice whether this is Py_DECREF or PyMem_Free > that you have to call (as with the "esomething" conversions); the > DECREF is more efficient as it will not copy a Unicode object. > > > The "es#" is a very strange beast, and a similar "eu#" would help me a > > little, but it has some serious drawbacks. Aside from it being completely > > different from the other converters (being a prefix operator in stead of a > > postfix one, and having a value-return argument) I would also have to > > pre-allocate the buffer in advance, and that sort of defeats the purpose. > > You don't. If you set the buffer to NULL before invoking getargs, you > have to PyMem_Free it afterwards. Right. Let me see if I can summarize this: Jack wants to get string and Unicode objects converted to Unicode automagically and then receive a pointer to a Py_UNICODE buffer and a size. 
The current solution for this is to use the "O" parser, fetch the object, pass it through PyUnicode_FromObject(), then use PyUnicode_GET_SIZE() and PyUnicode_AS_UNICODE() to access the Py_UNICODE buffer and finally to Py_DECREF() the object returned by PyUnicode_FromObject(). What I proposed was to extend the "es#" parser marker with a new modifier: "eu#" which does all of the above except that it either copies the Py_UNICODE data to a buffer you provide or a newly allocated buffer which you then have to PyMem_Free() after usage. How does this sound ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From skip@pobox.com (Skip Montanaro) Thu Jan 3 15:11:01 2002 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 3 Jan 2002 09:11:01 -0600 Subject: [Python-Dev] Unicode strings as filenames Message-ID: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> What's the correct way to deal with filenames in a Unicode environment? Consider this: >>> import site >>> site.encoding 'latin-1' >>> a = "abc\xe4\xfc\xdf.txt" >>> u = unicode (a, "latin-1") >>> uu = u.encode ("utf-8") >>> open(a, "w") >>> open(u, "w") >>> open(uu, "w") If I change my site's default encoding back to ascii, the second open fails: >>> import site >>> site.encoding 'ascii' >>> a = "abc\xe4\xfc\xdf.txt" >>> u = unicode (a, "latin-1") >>> uu = u.encode ("utf-8") >>> open(a, "w") >>> open(u, "w") Traceback (most recent call last): File "<stdin>", line 1, in ? UnicodeError: ASCII encoding error: ordinal not in range(128) >>> open(uu, "w") as I expect it should. The third open is a problem as well, even though it succeeds with either encoding. (Why doesn't it fail when the default encoding is ascii?) My thought is that before using a plain string or a unicode string as a filename it should first be coerced to a unicode string with the default encoding, something like: if type(fname) == types.StringType: fname = unicode(fname, site.encoding) elif type(fname) == types.UnicodeType: fname = fname.encode(site.encoding) else: raise TypeError, ("unrecognized type for filename: %s"%type(fname)) Is that the correct approach? Apparently Python's file object doesn't do this under the covers. Should it? Thx, Skip From Samuele Pedroni" Hi, [Ok this is maybe more a comp.lang.python thing but ...] If I'm correct dictionaries are based on equality and so the "in" operator. AFAIK if I'm interested in a dictionary working on identity I should wrap my objects ... Now what is the fastest idiom equivalent to: obj in list when I'm interested in identity (is) and not equality? That was the comp.lang.python part, now my impression is that in any case when I'm interested in identity and not equality I have to workaround, that means I will never directly have the performance of the equality idioms. Although my experience says that the equality case is the most common, I wonder whether some direct support for the identity case isn't worth it, because it is rare but typically then you would like some speed. [Yes, I have some concrete context but this is long so unless strictly requested ...] Am I missing something? Opinions. regards. From Samuele Pedroni" Message-ID: <022f01c19476$dbabb680$6d94fea9@newmexico> PS: I know that equality for user classes defaults to identity.
But I'm obviously interested to the case when equality has been possibly redefined and I still need identity. Thanks. ----- Original Message ----- From: Samuele Pedroni To: Sent: Thursday, January 03, 2002 5:43 PM Subject: [Python-Dev] object equality vs identity, in and dicts idioms and speed > Hi, > > [Ok this is maybe more a comp.lang.python thing > but ...] > > If I'm correct > dictionaries are based on equality and so the "in" operator. > > AFAIK if I'm interested in a dictionary working on identity > I should wrap my objects ... > > Now what is the fastest idiom equivalent to: > > obj in list > > when I'm interested in identity (is) and not equality? > > That was the comp.lang.python part, now > my impression is that in any case when I'm interested > in identity and not equality I have to workaround, > that means I will never directly have the performace of the > equality idioms. Although my experience say that the > equality case is the most common, I wonder whether > some directy support for the identity case isn't worth, > because it is rare but typically then you would like some > speed. [Yes, I have some concrete context but this is long > so unless strictly requested ...] > > Am I missing something? Opinions. > > regards. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > From ping@lfw.org Thu Jan 3 17:16:44 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Thu, 3 Jan 2002 11:16:44 -0600 (CST) Subject: [Python-Dev] pydoc.org updated Message-ID: I would like to apologize for allowing pydoc.org to fall behind Python development for the past few months. At last count, it only gave documentation for Python 2.1b1 and 1.5.2. Today, pydoc.org has been updated to provide all the pydoc-generated documentation pages for Python 1.5.2, 1.6, 2.1, and 2.2 final. The search feature lets you search the names of all the modules, packages, functions, classes, and methods, and the text of all their docstrings. I hope it can be a useful resource for you. Any thoughts you have on making it better would be very welcome, of course. -- ?!ng From nhodgson@bigpond.net.au Thu Jan 3 21:20:24 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Fri, 4 Jan 2002 08:20:24 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> Message-ID: <006e01c1949c$7631d1b0$0acc8490@neil> Skip: > What's the correct way to deal with filenames in a > Unicode environment? > Consider this: [Attempts to use encoding] On Windows NT/2K/XP the right thing to do is to use the wide char open function such as _CRTIMP FILE * __cdecl _wfopen(const wchar_t *, const wchar_t *); _CRTIMP int __cdecl _wopen(const wchar_t *, int, ...); There may also be techniques for doing this on Windows 9x as the file system stores Unicode file names but I have never looked into this. Neil From skip@pobox.com (Skip Montanaro) Thu Jan 3 21:28:16 2002 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 3 Jan 2002 15:28:16 -0600 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <006e01c1949c$7631d1b0$0acc8490@neil> References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> Message-ID: <15412.52464.67681.653594@12-248-41-177.client.attbi.com> Skip> What's the correct way to deal with filenames in a Unicode Skip> environment? 
Consider this: Skip> [Attempts to use encoding] Neil> On Windows NT/2K/XP the right thing to do is to use the wide char Neil> open function such as Neil> _CRTIMP FILE * __cdecl _wfopen(const wchar_t *, const wchar_t *); Neil> _CRTIMP int __cdecl _wopen(const wchar_t *, int, ...); Neil> There may also be techniques for doing this on Windows 9x as the Neil> file system stores Unicode file names but I have never looked into Neil> this. How is this exposed (if at all) to Python programmers? I happen to be developing on Linux, but the eventual delivery platform will be Windows. Is there no way to handle this in a cross-platform way? Skip From martin@v.loewis.de Thu Jan 3 21:38:56 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 3 Jan 2002 22:38:56 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: <3C3433A9.4AA1CB06@lemburg.com> (mal@lemburg.com) References: <200201022251.g02MpHU02332@mira.informatik.hu-berlin.de> <3C3433A9.4AA1CB06@lemburg.com> Message-ID: <200201032138.g03Lcua01440@mira.informatik.hu-berlin.de> > > I see. u# could be made work for Unicode objects alone, but it would > > have to reject string objects. > > Martin, I don't agree here: string objects could hold binary UCS-2/UCS-4 > data. They could. Most likely, they don't. Explicit is better then implicit: Anybody wishing to pass UCS-2 binary data to a function expecting character strings should do function(unicode(data, "UCS-2BE")) # or LE if appropriate > es# has logic in place which allows either copying the raw data > to a buffer you provide or have it allocate a buffer of the > right size for you. That's why I proposed to extend it support > Unicode raw data as well. Even though es# is cleanly defined, it is still undesirable to use, IMO: it requires more copies of data than necessary. If explicit memory management is required, it should be exposed through Py_DECREF. That is easy to understand, and it allows to share immutable objects, thus avoiding copies. > > PyObject *Py_UnicodeOrString(PyObject *o, void *ignored)){ > > if (PyUnicode_Check(o)){ > > Py_INCREF(o);return o; > > } > > if (PyString_Check(o)){ > > return PyUnicode_FromObject(o); > > } > > PyErr_SetString(PyExc_TypeError,"unicode object expecpected"); > > return NULL; > > } > > Martin, note that PyUnicode_FromObject() already does the Unicode > pass-through (even more: it makes sure that you get a true Unicode > object, not a subclass). I noticed. However, I'd like Py_UnicodeOrString to fail if you are not passing a character string (and I'd see no problem in accepting Unicode subtypes without copying them). This is a minor point, though - I might have written PyObject *Py_UnicodeOrString(PyObject *p, void* ignored){ return PyObject_FromObject(o); } as well. > Jack wants to get string and Unicode objects converted to Unicode > automagically and then receive a pointer to a Py_UNICODE buffer and > a size. > > The current solution for this is to use the "O" parser, > fetch the object, pass it through PyUnicode_FromObject(), then > use PyUnicode_GET_SIZE() and PyUnicode_AS_UNICODE() to access > the Py_UNICODE buffer and finally to Py_DECREF() the object returned > by PyUnicode_FromObject(). That is the solution, although I would claim that using the O& parser is simpler, and more flexible. 
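One note on the mechanics of that suggestion: PyArg_ParseTuple's "O&" calls an int-returning converter with the object and the address supplied at the call site, and treats a return of 0 as failure, so the helper sketched above would in practice be shaped roughly like this (names hypothetical, error handling abbreviated):

static int
convert_to_unicode(PyObject *o, void *addr)
{
    /* PyUnicode_FromObject passes Unicode through and decodes classic
       strings with the default encoding; it returns a new reference. */
    PyObject *uni = PyUnicode_FromObject(o);
    if (uni == NULL)
        return 0;
    *(PyObject **)addr = uni;
    return 1;
}

static PyObject *
example(PyObject *self, PyObject *args)
{
    PyObject *uni;

    if (!PyArg_ParseTuple(args, "O&", convert_to_unicode, &uni))
        return NULL;
    /* ... use PyUnicode_AS_UNICODE(uni) and PyUnicode_GET_SIZE(uni) ... */
    Py_DECREF(uni);        /* the single DECREF at the end */
    Py_INCREF(Py_None);
    return Py_None;
}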
> What I proposed was to extend the "es#" parser marker with a new > modifier: "eu#" which does all of the above except that it either > copies the Py_UNICODE data to a buffer you provide or a newly > allocated buffer which you then have to PyMem_Free() after usage. > > How does this sound ? Terrible. It copies a Unicode object without any need. It also adds to the inflation of format specifiers for getargs; this inflation is terrible in itself. Regards, Martin From nhodgson@bigpond.net.au Thu Jan 3 21:43:11 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Fri, 4 Jan 2002 08:43:11 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <15412.52464.67681.653594@12-248-41-177.client.attbi.com> Message-ID: <00c701c1949f$a3cb38c0$0acc8490@neil> Skip: > How is this exposed (if at all) to Python programmers? Currently not exposed AFAICT except through calldll. > I happen to be > developing on Linux, but the eventual delivery platform will be Windows. Is > there no way to handle this in a cross-platform way? Cross-platform is tricky as the file systems used on Linux have narrow string file names. Some higher level software (such as the forthcoming version of GTK+/GNOME) assume file names are encoded in UTF-8 but this is a somewhat dangerous assumption. The problem on Windows is that there are files you can not open by performing encoding operations on the Unicode names. They do have narrow generated names, but these are mangled and look like Z8F22~1.HTM so are hard to discover. Neil From martin@v.loewis.de Thu Jan 3 21:52:19 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 3 Jan 2002 22:52:19 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> (message from Skip Montanaro on Thu, 3 Jan 2002 09:11:01 -0600) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> Message-ID: <200201032152.g03LqJY01468@mira.informatik.hu-berlin.de> > What's the correct way to deal with filenames in a Unicode environment? > Consider this: > > >>> import site > >>> site.encoding > 'latin-1' Setting site.encoding is certainly the wrong thing to do. How can you know all users of your system use latin-1? > If I change my site's default encoding back to ascii, the second open fails: > > >>> import site > >>> site.encoding > 'ascii' > >>> a = "abc\xe4\xfc\xdf.txt" > >>> u = unicode (a, "latin-1") On my system, the following works fine >>> import locale >>> locale.setlocale(locale.LC_ALL,"") 'LC_CTYPE=de_DE;LC_NUMERIC=de_DE;LC_TIME=de_DE;LC_COLLATE=C;LC_MONETARY=de_DE;LC_MESSAGES=de_DE;LC_PAPER=de_DE;LC_NAME=de_DE;LC_ADDRESS=de_DE;LC_TELEPHONE=de_DE;LC_MEASUREMENT=de_DE;LC_IDENTIFICATION=de_DE' >>> a = "abc\xe4\xfc\xdf.txt" >>> u = unicode (a, "latin-1") >>> open(u, "w") On Unix, your best bet for file names is to trust the user's locale settings. If you do that, open will accept Unicode objects. What is your locale?
On Windows, things are much better, since there a notion of Unicode file names in the system. Regards, Martin From martin@v.loewis.de Thu Jan 3 22:09:57 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 3 Jan 2002 23:09:57 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <006e01c1949c$7631d1b0$0acc8490@neil> (nhodgson@bigpond.net.au) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> Message-ID: <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> > On Windows NT/2K/XP the right thing to do is to use the wide char open > function such as > _CRTIMP FILE * __cdecl _wfopen(const wchar_t *, const wchar_t *); > _CRTIMP int __cdecl _wopen(const wchar_t *, int, ...); I agree. However: - Mark decided to take a different route, using fopen all the time, but encoding Unicode strings with the "mbcs" encoding, which calls MultiByteToWideCharCP with CP_ACP. AFAICT, this is correct as well (although it invokes an unneeded conversion of the string, since fopen, eventually, will convert the string back to Unicode - probably inside CreateFileExA - atleast on WinNT). In any case, passing Unicode objects to open() works just fine, atleast as long as they can be encoded in the ANSI code page. If you want to open a Chinese file name on a Russian Windows installation, you lose. - Skip was likely asking about a Unix installation, in which case all of this is irrelevant. > There may also be techniques for doing this on Windows 9x as the file > system stores Unicode file names but I have never looked into this. To my knowledge, VFAT32 doesn't - only NTFS does (which is not available on W9x). Regards, Martin From nhodgson@bigpond.net.au Thu Jan 3 22:51:03 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Fri, 4 Jan 2002 09:51:03 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> Message-ID: <01ab01c194a9$237b6dc0$0acc8490@neil> Martin: > In any case, passing Unicode objects to open() works just fine, atleast > as long as they can be encoded in the ANSI code page. If you want to > open a Chinese file name on a Russian Windows installation, you lose. I want to be able to open all files on my English W2K install and can with many applications even if some have Chinese names and some have Russian. The big advance W2K made over NT was to only have one real version of the OS instead of multiple language versions. There is a system default language as well as local defaults but with just a few clicks my machine can be used as a Japanese machine although as the keyboard keys don't grow Japanese characters, it is a bit harder to use. You do buy localised versions of W2K and XP but they differ in packagng and defaults - the underlying code is identical which was not the case for NT or 9x. Locales are a really poor choice for people who need to operate in multiple languages and much software is moving to allowing concurrent use of multiple languages through the use of Unicode. The term 'multinationalization' (m18n) is sometimes used in Japan to talk about systems that try to avoid restrictions on character set and language. > > There may also be techniques for doing this on Windows 9x as the file > > system stores Unicode file names but I have never looked into this. > > To my knowledge, VFAT32 doesn't - only NTFS does (which is not > available on W9x). 
I have a file called u"C:\\z\u0439\u0446.html" on my W2K FAT partition which displays correctly in the explorer and can be opened in, for example, notepad. This leads to the interesting situation of being able to see a file using glob but not then use it: >>> import glob >>> glob.glob("C:\\*.html") ['C:\\l2.html', 'C:\\list.html', 'C:\\m4.html', 'C:\\x.html', 'C:\\z??.html'] >>> for i in glob.glob("C:\\*.html"): ... f = open(i) ... Traceback (most recent call last): File "", line 2, in ? IOError: [Errno 22] Invalid argument: 'C:\\z??.html' Neil From martin@v.loewis.de Thu Jan 3 22:56:38 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 3 Jan 2002 23:56:38 +0100 Subject: [Python-Dev] object equality vs identity, in and dicts idioms and speed In-Reply-To: <021101c19475$d7251260$6d94fea9@newmexico> (pedronis@bluewin.ch) References: <021101c19475$d7251260$6d94fea9@newmexico> Message-ID: <200201032256.g03Muc001778@mira.informatik.hu-berlin.de> > Now what is the fastest idiom equivalent to: > > obj in list > > when I'm interested in identity (is) and not equality? It appears that doing a plain for loop is fastest, see the attached script below. On my system,it gives m1 0 0.00 5000 0.29 9999 0.60 1.0 0.61 m2 0 0.60 5000 0.61 9999 0.62 1.0 0.62 m3 0 1.81 5000 1.81 9999 1.81 1.0 1.83 m4 0 0.00 5000 1.54 9999 3.11 1.0 3.17 > Although my experience say that the equality case is the most > common, I wonder whether some directy support for the identity case > isn't worth, because it is rare but typically then you would like > some speed. In Smalltalk, such things would be done in specialized containers. E.g. the IdentityDictionary is a dictionary where keys are considered equal only if identical. Likewise, you could have a specialized list type. OTOH, if you need speed, just write an extension module - doing a identical_in function is straight-forward. I'd hesitate to add identical_in to the API, since it would mean that it needs to be supported for any container, the same sq_contains works now. Regards, Martin import time x = range(10000) rep = [None] * 100 values = x[0], x[5000], x[-1], 1.0 def m1(val, rep=rep, x=x): for r in rep: found = 0 for s in x: if s is val: found = 1 break def m2(val, rep=rep, x=x): for r in rep: found = [s for s in x if s is val] def m3(val, rep=rep, x=x): for r in rep: def identical(elem): return elem is val found = filter(identical, x) class Contains: def __init__(self, val): self.val = val def __eq__(self, other): return self.val is other def m4(val, rep=rep, x=x): for r in rep: found = Contains(val) in x for options in [m1, m2, m3, m4]: print options.__name__, for val in values: start = time.time() options(val) end = time.time() print "%9s %6.2f" % (val,end-start), print From martin@v.loewis.de Thu Jan 3 22:58:23 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 3 Jan 2002 23:58:23 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <15412.52464.67681.653594@12-248-41-177.client.attbi.com> (message from Skip Montanaro on Thu, 3 Jan 2002 15:28:16 -0600) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <15412.52464.67681.653594@12-248-41-177.client.attbi.com> Message-ID: <200201032258.g03MwNR01783@mira.informatik.hu-berlin.de> > How is this exposed (if at all) to Python programmers? I happen to be > developing on Linux, but the eventual delivery platform will be Windows. Is > there no way to handle this in a cross-platform way? Sure. Just pass Unicode strings to open(). 
Notice that this requires Python 2.2, and expect exceptions. If it fails, fallbacks are up to your application: it would be best to let the user know that the choice of file name was not sensible. Regards, Martin From Jack.Jansen@cwi.nl Thu Jan 3 23:09:45 2002 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 4 Jan 2002 00:09:45 +0100 (CET) Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: <200201032138.g03Lcua01440@mira.informatik.hu-berlin.de> Message-ID: I'm going to jump out of this discussion for a while. Martin and Mark have a completely different view on Unicode than I do, apparently, and I think I should first try and see if I can use the current implementation. For the record: my view of Unicode is really "ascii done right", i.e. a datatype that allows you to get richer characters than what 1960s ascii gives you. For this it should be as backward-compatible as possible, i.e. if some API expects a unicode filename and I pass "a.out" it should interpret it as u"a.out". All the converting to different charsets is icing on the cake, the number one priority should be that unicode is as compatible as possible with the 8-bit convention used on the platform (whatever it may be). No, make that the number 2 priority: the number one priority is compatibility with 7-bit ascii. Using Python StringObjects as binary buffers is also far less common than using StringObjects to store plain old strings, so if either of these uses bites the other it's the binary buffer that needs to suffer. UnicodeObjects and StringObjects should behave pretty orthogonal to how FloatObjects and IntObjects behave. -- -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@cwi.nl | ++++ if you agree copy these lines to your sig ++++ http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From skip@pobox.com (Skip Montanaro) Thu Jan 3 23:11:10 2002 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 3 Jan 2002 17:11:10 -0600 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <200201032152.g03LqJY01468@mira.informatik.hu-berlin.de> References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <200201032152.g03LqJY01468@mira.informatik.hu-berlin.de> Message-ID: <15412.58638.300600.849962@12-248-41-177.client.attbi.com> >>>>> "Martin" == Martin v Loewis writes: >> What's the correct way to deal with filenames in a Unicode >> environment? Consider this: >> >> >>> import site site.encoding >> 'latin-1' Martin> Setting site.encoding is certainly the wrong thing to do. How Martin> can you know all users of your system use latin-1? Why is setting site.encoding appropriate to your environment at the time you install Python wrong? I can't know that all users of my system (whatever the definition of "my system" is) will use latin-1. Somewhere along the way I have to make some assumptions, however. On any given computer I assume the people who install Python will set site.encoding appropriate to their environment. The example I used was latin-1 simply because the folks I'm working with are in Austria and they came up with the example. I assume the best default encoding for them is latin-1. The application writers themselves will have no problem restricting internal filenames to be ascii. I assume if users want to save files of their own, they will choose characters from the Unicode character set they use most frequently. So, my example used latin-1. I could just as easily have chosen something else.
Martin> On my system, the following works fine Martin> >>> import locale Martin> >>> locale.setlocale(locale.LC_ALL,"") Martin> 'LC_CTYPE=de_DE;LC_NUMERIC=de_DE;LC_TIME=de_DE;LC_COLLATE=C;LC_MONETARY=de_DE;LC_MESSAGES=de_DE;LC_PAPER=de_DE;LC_NAME=de_DE;LC_ADDRESS=de_DE;LC_TELEPHONE=de_DE;LC_MEASUREMENT=de_DE;LC_IDENTIFICATION=de_DE' Martin> >>> a = "abc\xe4\xfc\xdf.txt" u = unicode (a, "latin-1") open(u, "w") Martin> Martin> On Unix, your best bet for file names is to trust the user's Martin> locale settings. If you do that, open will accept Unicode Martin> objects. Martin> What is your locale? The above setlocale call prints 'LC_CTYPE=en_US;LC_NUMERIC=en_US;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en;LC_NAME=en;LC_ADDRESS=en;LC_TELEPHONE=en;LC_MEASUREMENT=en;LC_IDENTIFICATION=en' I can't get to the machines in Austria right now to see how their locales are set, though I suspect they haven't fiddled their LC_* environment, because they are having the problems I described. >> Is that the correct approach? Apparently Python's file object >> doesn't do this under the covers. Should it? Martin> No. There is no established convention, on Unix, how to do Martin> non-ASCII file names. If anything, following the user's locale Martin> setting is the most reasonable thing to do; this should be in Martin> synch of how the user's terminal displays characters. The Python Martin> installations' default encoding is almost useless, and shouldn't Martin> be changed. Martin> On Windows, things are much better, since there a notion of Martin> Unicode file names in the system. This suggests to me that the Python docs need some introductory material on this topic. It appears to me that there are two people in the Python community who live and breathe this stuff are you, Martin, and Marc-André. For most of the rest of us, especially if we've never consciously written code for consumption outside an ascii environment, the whole thing just looks like a quagmire. Skip From martin@v.loewis.de Thu Jan 3 23:16:29 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 00:16:29 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <01ab01c194a9$237b6dc0$0acc8490@neil> (nhodgson@bigpond.net.au) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> Message-ID: <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> > I want to be able to open all files on my English W2K install and can > with many applications even if some have Chinese names and some have > Russian. The big advance W2K made over NT was to only have one real version > of the OS instead of multiple language versions. I understand all that, but I can't agree with all your conclusions. > Locales are a really poor choice for people who need to operate in > multiple languages and much software is moving to allowing concurrent use of > multiple languages through the use of Unicode. On Windows, locales and Unicode don't contradict each other. You can create files through the locale's code page, and they still end up on disk correctly. This is a much better situation than you have on Unix. In any case, there is no alternative. Locales may be good or bad - you must follow system conventions, if you want to write usable software.
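A small Win32 C sketch of that point, with a made-up Latin-1 file name; on NT/2000 the narrow "A" call converts the code-page name to Unicode inside the system, so both calls can address the same file, provided the name is representable in the active code page at all:

#include <windows.h>

static void demo(void)
{
    /* create the file through the ANSI code page ... */
    HANDLE h = CreateFileA("gr\xfc\xdf.txt", GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h != INVALID_HANDLE_VALUE)
        CloseHandle(h);
    /* ... and reopen it by its Unicode name */
    h = CreateFileW(L"gr\xfc\xdf.txt", GENERIC_READ, 0, NULL,
                    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h != INVALID_HANDLE_VALUE)
        CloseHandle(h);
}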
> > To my knowledge, VFAT32 doesn't - only NTFS does (which is not > > available on W9x). > > I have a file called u"C:\\z\u0439\u0446.html" on my W2K FAT partition > which displays correctly in the explorer and can be opened in, for example, > notepad. Oops, you are right - the long file name is in Unicode. It is only when you do not have a long file name that the short one is interpreted in OEM encoding. > >>> import glob > >>> glob.glob("C:\\*.html") > ['C:\\l2.html', 'C:\\list.html', 'C:\\m4.html', 'C:\\x.html', > 'C:\\z??.html'] > >>> for i in glob.glob("C:\\*.html"): > ... f = open(i) > ... > Traceback (most recent call last): > File "", line 2, in ? > IOError: [Errno 22] Invalid argument: 'C:\\z??.html' I agree this is unfortunate; patches are welcome. Please notice that the strategy of using wchar_t API on Windows has explicitly been considered and rejected, for the complexity of the code changes involved. So anybody proposing a patch would need to make it both useful, and easy to maintain. With these constraints, the current implementation is the best thing Mark could come up with. Software always has limitations, which are removed only if somebody is bothered so much as to change the software. Regards, Martin From martin@v.loewis.de Thu Jan 3 23:23:35 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 00:23:35 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: (message from Jack Jansen on Fri, 4 Jan 2002 00:09:45 +0100 (CET)) References: Message-ID: <200201032323.g03NNZA02271@mira.informatik.hu-berlin.de> > For the record: my view of Unicode is really "ascii done right", i.e. a > datatype that allows you to get richer characters than what 1960s ascii > gives you. Exactly, with the stress on *ASCII*. Almost everybody could agree on ASCII; it is the 8-bit character sets where the troubles start. > For this it should be as backward-compatible as possible, i.e. if > some API expects a unicode filename and I pass "a.out" it should > interpret it as u"a.out". That works fine with the current API. > All the converting to different charsets is icing on the cake, the > number one priority should be that unicode is as compatible as > possible with the 8-bit convention used on the platform (whatever it > may be). The problem is that there are multiple conventions on many systems, and only the application can know which of these to apply. > Using Python StringObjects as binary buffers is also far less common > than using StringObjects to store plain old strings, so if either of > these uses bites the other it's the binary buffer that needs to > suffer. This is a conclusion I cannot agree with. Most strings are really binary, if you look at them closely enough :-) > UnicodeObjects and StringObjects should behave pretty orthogonal to > how FloatObjects and IntObjects behave. For the Python programmer: yes; For the C programmer: memory management makes that inherently difficult, which you don't have for int vs float. Regards, Martin From martin@v.loewis.de Thu Jan 3 23:34:25 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Fri, 4 Jan 2002 00:34:25 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <15412.58638.300600.849962@12-248-41-177.client.attbi.com> (message from Skip Montanaro on Thu, 3 Jan 2002 17:11:10 -0600) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <200201032152.g03LqJY01468@mira.informatik.hu-berlin.de> <15412.58638.300600.849962@12-248-41-177.client.attbi.com> Message-ID: <200201032334.g03NYPW02294@mira.informatik.hu-berlin.de> > >> What's the correct way to deal with filenames in a Unicode > >> environment? Consider this: > >> > >> >>> import site site.encoding > >> 'latin-1' > > Martin> Setting site.encoding is certainly the wrong thing to do. How > Martin> can you know all users of your system use latin-1? > > Why is setting site.encoding appropriate to your environment at the time you > install Python wrong? I can't know that all users of my system (whatever > the definition of "my system" is) will use latin-1. Somewhere along the way > I have to make some assumptions, however. Well, then accept the assumption that almost everybody will use an ASCII superset. That may be still wrong, for the case of EBCDIC users, but those are rare on Unix. However, on our typical Unix system, three different encodings are in use: ISO-8859-1 (for tradition), ISO-8859-15 (because it has the Euro), and UTF-8 (because it removes all the limitations). Notice that all of our users speak German, and we still could not set a meaningful site.encoding except for 'ascii'. > On any given computer I assume the people who install Python will set > site.encoding appropriate to their environment. That is probably wrong. Most users will install precompiled packages, and thus site.py will have the value that the package held, which will be 'ascii' for most packages. > The example I used was latin-1 simply because the folks I'm working with > are in Austria and they came up with the example. I assume the best > default encoding for them is latin-1. Well, latin-1 does not have a Euro sign, which may be more and more of a problem. > The application writers themselves will have no problem restricting > internal filenames to be ascii. I assume it users want to save files of > their own, they will choose characters from the Unicode character set > they use most frequently. That is a meaningful assumption. However, it is one that you have to make in your application, not one that you should users expect to make in their Python installations. > The above setlocale call prints > > 'LC_CTYPE=en_US;LC_NUMERIC=en_US;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en;LC_NAME=en;LC_ADDRESS=en;LC_TELEPHONE=en;LC_MEASUREMENT=en;LC_IDENTIFICATION=en' You may want to extend your system to support the same configuration that your users have, i.e. you might want to install an Austrian locale on your system, and set LANG to de_AT. If your system also sets all the LC_ variables for you, I recommend to unset them - setting LANG is enough (to override all other LC_ variables, setting LC_ALL to de_AT should also work). > I can't get to the machines in Austria right now to see how their locales > are set, though I suspect they haven't fiddled their LC_* environment, > because they are having the problems I described. If if they set the environment variables, they'd still have the problem because your application doesn't call setlocale. I do expect that they have set LANG to de_AT, or de_AT.ISO-8859-1. 
Perhaps they also have this problem because they use Python 2.1 or earlier. > This suggests to me that the Python docs need some introductory material on > this topic. It appears to me that there are two people in the Python > community who live and breathe this stuff are you, Martin, and Marc-André. > For most of the rest of us, especially if we've never conciously written > code for consumption outside an ascii environment, the whole thing just > looks like a quagmire. Well, I'd happily review any introductory material somebody else writes :-) Regards, Martin From nhodgson@bigpond.net.au Fri Jan 4 00:07:19 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Fri, 4 Jan 2002 11:07:19 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> Message-ID: <020601c194b3$c85a4320$0acc8490@neil> Martin: > I agree this is unfortunate; patches are welcome. Please notice that > the strategy of using wchar_t API on Windows has explicitly been > considered and rejected, for the complexity of the code changes > involved. So anybody proposing a patch would need to make it both > useful, and easy to maintain. With these constraints, the current > implementation is the best thing Mark could come up with. > > Software always has limitations, which are removed only if somebody is > bothered so much as to change the software. Sure, I'm just putting my point of view which appears to be different from most in that many developers just use a single locale. If I had a larger supply of time then I'd eventually work on this but there are other tasks that currently look like having more impact. The system provided scripting languages support wide character file names. in VBScript: Set fso = CreateObject("Scripting.FileSystemObject") crlf = chr(13) & chr(10) For Each f1 in fso.GetFolder("C:\").Files if instr(1, f1.name, ".htm") > 0 then s = s & f1.Path & crlf if left(f1.name, 1) = "z" then fo = fso.OpenTextFile(f1.Path).ReadAll() s = s & fo & crlf end if end if Next MsgBox s And Python with the win32 extensions can do the same using the FileSystemObject: # encode used here just to make things print as a quick demo import win32com fso = win32com.client.Dispatch("Scripting.FileSystemObject") s = "" fol = fso.GetFolder("C:\\") for f1 in fol.Files: if f1.name.find(".htm") > 0: s += f1.Path.encode("UTF-8") + "\r\n" if f1.name[0] == u"z": fo = fso.OpenTextFile(f1.Path).ReadAll() s += fo.encode("UTF-8") + "\r\n" print s Neil From Samuele Pedroni" <200201032256.g03Muc001778@mira.informatik.hu-berlin.de> Message-ID: <003d01c194b9$c9f016a0$47fdbac3@newmexico> [Martin v. Loewis] > > Now what is the fastest idiom equivalent to: > > > > obj in list > > > > when I'm interested in identity (is) and not equality? > > It appears that doing a plain for loop is fastest, see the attached > script below. On my system,it gives > > m1 0 0.00 5000 0.29 9999 0.60 1.0 0.61 > m2 0 0.60 5000 0.61 9999 0.62 1.0 0.62 > m3 0 1.81 5000 1.81 9999 1.81 1.0 1.83 > m4 0 0.00 5000 1.54 9999 3.11 1.0 3.17 Thanks, and, sorry I could have done the measuraments myself but I supposed that maybe someone should already know. The result makes also sense, is the version that does less consing and calling user python functions. But only profiling knows the truth . 
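For concreteness, the idioms being compared look roughly like this (a reconstruction for illustration, not Martin's actual benchmark script):

    def in_by_loop(obj, seq):
        # plain for loop: no temporary objects, no Python-level call per element
        for item in seq:
            if item is obj:
                return 1
        return 0

    def in_by_ids(obj, seq):
        # builds a throwaway list of ids on every call
        return id(obj) in map(id, seq)

    def in_by_filter(obj, seq):
        # calls a Python-level function once per element
        return len(filter(lambda x, o=obj: x is o, seq)) > 0

On ordinary lists the explicit loop avoids both the temporary id list and the per-element function calls, which is consistent with the timings quoted above.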
> > > Although my experience say that the equality case is the most > > common, I wonder whether some directy support for the identity case > > isn't worth, because it is rare but typically then you would like > > some speed. > > In Smalltalk, such things would be done in specialized > containers. E.g. the IdentityDictionary is a dictionary where keys are > considered equal only if identical. Likewise, you could have a > specialized list type. OTOH, if you need speed, just write an > extension module - doing a identical_in function is straight-forward. Is not really my code, but yes writing an extension (especially in jython) would be not too difficult but see below. > I'd hesitate to add identical_in to the API, since it would mean that > it needs to be supported for any container, the same sq_contains works > now. I see the problem. Implicitly I was asking whether adding builtin-in identity_list, identity_dict and corresponding weak versions for the dicts could make sense or is just code bloat. The context is anygui (www.anygui.org), I'm following it closely and I try to help with jython/swing issues. Anygui has code like this in the event handling logic: source_stack.append(id(source)) try: ... if not loop and not r.loop \ and id(obj) in source_stack: continue ... finally: source_stack.pop() Now this is a nice idiom and workarounds the identity-list problem, but mmh ... id is broken under jython (that means different objects can get the same id :( ) , also in 2.1 final. It is a long-standing bug and yes we are about to solve it but there is a trade-off and jython id will become precise but many times slower wrt to CPython version (we need to implement a weak-key-table :( ). An identity_list would make for a portable idiom with comparable overhead and will give to the identity case somehow the same speed of the equality case... And further anygui shows also a possible need for a WeakKeyIdentityDict... regards. From martin@v.loewis.de Fri Jan 4 01:07:04 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 02:07:04 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <020601c194b3$c85a4320$0acc8490@neil> (nhodgson@bigpond.net.au) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> Message-ID: <200201040107.g04174J01274@mira.informatik.hu-berlin.de> > The system provided scripting languages support wide character file > names. Please understand that Python also supports wide character file names. It just doesn't allow all the possible values that the system would allow. > For Each f1 in fso.GetFolder("C:\").Files That, of course, is another important difference: Here you get the directory contents as wide strings. Changing os.listdir to return Unicode objects would be possible, but would likely introduce a number of incompatibilities. Your script (e.g. the Python variant) is prepared that .Files returns Unicode objects. Making the same change in Python on all functions that return file names (i.e. listdir, glob, etc) is difficult - most likely, you'll have to make the return type a choice of the application. Regards, Martin From martin@v.loewis.de Fri Jan 4 01:39:23 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Fri, 4 Jan 2002 02:39:23 +0100 Subject: [Python-Dev] object equality vs identity, in and dicts idioms and speed In-Reply-To: <003d01c194b9$c9f016a0$47fdbac3@newmexico> (pedronis@bluewin.ch) References: <021101c19475$d7251260$6d94fea9@newmexico> <200201032256.g03Muc001778@mira.informatik.hu-berlin.de> <003d01c194b9$c9f016a0$47fdbac3@newmexico> Message-ID: <200201040139.g041dNj01414@mira.informatik.hu-berlin.de> > An identity_list would make for a portable idiom with comparable > overhead and will give to the identity case somehow the same speed > of the equality case... > > And further anygui shows also a possible need for a WeakKeyIdentityDict... Well, I'd say this is a clear indication that this has to go the path that all library extensions go (or should go): They are implemented in one project, then are used in other projects as well, until finally somebody submits the implementation to the Python core. In the case of anygui, I'd suggest to include different implementations of the identity_list, and any other specialised container you may have: - one implementation for C python that works across all Python versions (in C) - if useful, one implementation for Python 2.2 using type inheritance, in C, alternatively, one implementation in pure Python: class identity_list(list): def __contains__(self, elem): for i in self: if i is elem: return 1 return 0 # need to implement count, index, remove It turns out that this class is as fast in my benchmark than the Python loop over a builtin list, which is not surprising, as it is the same loop. - one implementation in Java for use with Jython. - one implementation in pure Python which works across all Python versions. The configuration mechanics of anygui should then select an appropriate version. Experience will tell which of those implementations are used in practice, and which are of use to other packages. That will eventually give a foundation for including one of them into the core libraries. People tend to invent new containers all the time (and new methods for existing containers), and I believe we have to resist the tempation of including them into the language at first sight. Just make sure that you do *not* put those containers into the location where the Python library will eventually put them, as well; instead if the core provides them, have the configuration mechanics figure to use the builtin type, instead of the anygui-included fallback implementation. Regards, Martin From Samuele Pedroni" <200201032256.g03Muc001778@mira.informatik.hu-berlin.de> <003d01c194b9$c9f016a0$47fdbac3@newmexico> <200201040139.g041dNj01414@mira.informatik.hu-berlin.de> Message-ID: <00eb01c194c5$0880a000$47fdbac3@newmexico> > > An identity_list would make for a portable idiom with comparable > > overhead and will give to the identity case somehow the same speed > > of the equality case... > > > > And further anygui shows also a possible need for a WeakKeyIdentityDict... > > Well, I'd say this is a clear indication that this has to go the path > that all library extensions go (or should go): They are implemented in > one project, then are used in other projects as well, until finally > somebody submits the implementation to the Python core. 
> > In the case of anygui, I'd suggest to include different > implementations of the identity_list, and any other specialised > container you may have: > > - one implementation for C python that works across all Python > versions (in C) > - if useful, one implementation for Python 2.2 using type inheritance, > in C, alternatively, one implementation in pure Python: > > class identity_list(list): > def __contains__(self, elem): > for i in self: > if i is elem: > return 1 > return 0 > > # need to implement count, index, remove > > It turns out that this class is as fast in my benchmark than the > Python loop over a builtin list, which is not surprising, as it is > the same loop. > > - one implementation in Java for use with Jython. > > - one implementation in pure Python which works across all Python > versions. > > The configuration mechanics of anygui should then select an > appropriate version. > > Experience will tell which of those implementations are used in > practice, and which are of use to other packages. That will eventually > give a foundation for including one of them into the core > libraries. People tend to invent new containers all the time (and new > methods for existing containers), and I believe we have to resist the > tempation of including them into the language at first sight. I won't argue about that. > Just make sure that you do *not* put those containers into the > location where the Python library will eventually put them, as well; > instead if the core provides them, have the configuration mechanics > figure to use the builtin type, instead of the anygui-included > fallback implementation. In this case the above "you" is fully undefined. I will archive this discussion for better times when I have spare-cycles. Anygui people is commited to ship just pure python code, and I'm not really a developer for the project, just a jython "consultant". So I will just workaround otherwise, I already knew that, this was mostly a survey, a valuable one and thanks for the answers. My band-width in the near future is for helping with Jython 2.2 and other personal stuff ... Thanks, Samuele. From tim.one@home.com Fri Jan 4 03:00:08 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 3 Jan 2002 22:00:08 -0500 Subject: [Python-Dev] object equality vs identity, in and dicts idioms and speed In-Reply-To: <003d01c194b9$c9f016a0$47fdbac3@newmexico> Message-ID: [Samuele Pedroni] > ... > but mmh ... id is broken under jython (that means > different objects can get the same id :( ) , also in 2.1 final. > It is a long-standing bug and yes we are about to solve it but > there is a trade-off and jython id will become precise but many times > slower wrt to CPython version (we need to implement a weak-key-table > :( ). Mapping what to what? A fine implementation of id() would be to hand each new object a unique Java int from a global counter, incremented once per Python object creation -- or a Java long if any JVM stays up long enough that 32 bits is an issue . From Samuele Pedroni" [Tim Peters] > Mapping what to what? A fine implementation of id() would be to hand each > new object a unique Java int from a global counter, incremented once per > Python object creation -- or a Java long if any JVM stays up long enough > that 32 bits is an issue . The problem are java class instances, sir, we use non-unique wrappers for them and identity is simulated. We could use a table to make the wrappers unique but we have potentially lots of them as you can imagine, jython people actually use java classes . 
So the workaround is to keep a table just for the java instances for which someone has asked the id. We cannot win-win so we try not to lose-lose. For simplicity we will extend the table approach to everything. If you have a win-win solution also in this case please ... Our goal is compatibility but we will suggest to avoid id as far as possible for production jython code ... regards From Samuele Pedroni" Message-ID: <00fe01c194d0$82fb4be0$47fdbac3@newmexico> I have been sloppy in the explanation (dangerous!) [me] > We could use a table to make the wrappers unique but we have > potentially lots of them as you can imagine, jython people > actually use java classes . > The point is that we have potentially many java class instances but not that much wrapper duplication for the same instance. So it is not worth to pay the overhead and the complication of making the wrappers unique. And it still not worth to pay it in order to implement a non-broken id. regards. From anthony@interlink.com.au Fri Jan 4 07:15:08 2002 From: anthony@interlink.com.au (Anthony Baxter) Date: Fri, 04 Jan 2002 18:15:08 +1100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... Message-ID: <200201040715.g047F8D08868@mbuna.arbhome.com.au> [resend: sorry if you see it twice, but I can't see that the original ever got through...] Ok, I'd like to make the 2.1.2 release some time in the first half of the week starting 7th Jan, assuming that's ok for the folks who'll need to do the work on the PC/Mac packaging. I notice that pep 101 is pretty strongly focussed on the major releases, not the minor ones. Is it worth making a modified version of this PEP with the minor release steps? the things to do: README file. NEWS file - should this have anything other than the socket.sendall() change? I don't have access to creosote.python.org, so someone else's going to need to do this. As far as 2.2.1 goes, I'm happy to keep on the patch czar role. Is trying for a release before the conference too aggressive a timeframe? There seem to be a number of niggles that'd be nice to have fixed... Anthony From barry@zope.com Fri Jan 4 07:53:11 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 4 Jan 2002 02:53:11 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> Message-ID: <15413.24423.772132.175722@anthem.wooz.org> >>>>> "AB" == Anthony Baxter writes: AB> [resend: sorry if you see it twice, but I can't see that the AB> original ever got through...] AB> Ok, I'd like to make the 2.1.2 release some time in the first AB> half of the week starting 7th Jan, assuming that's ok for the AB> folks who'll need to do the work on the PC/Mac packaging. One of the things I'd really like to be sure works in 2.1.2 is largefile support. I've had some trouble along these lines on filesystems that I know have largefile (because a Python 2.2 built on the same platform works fine). Do we expect that largefile support should work in Python 2.1.2? I'm okay that autoconf detection fails as long as the instructions in the posix module work: http://www.python.org/doc/current/lib/posix-large-files.html I've had some failures on 2.4.7 kernels w/ ext3 filesystems. AB> I notice that pep 101 is pretty strongly focussed on the major AB> releases, not the minor ones. Is it worth making a modified AB> version of this PEP with the minor release steps? I'd be more inclined to clone PEP 101 into a PEP 102 with micro release instructions. 
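Returning briefly to the large-file question raised above: a quick way to check whether a particular 2.1.x build really handles files past 2 GB is a small probe along the lines of what test_largefile exercises (a sketch only; the path and offset are arbitrary):

    import os

    name = '/tmp/lfs-probe'
    f = open(name, 'wb')
    try:
        f.seek(2147483649L)        # seek past the 2 GB mark
        f.write('x')
        f.flush()
        print f.tell()             # expect 2147483650 on an LFS-enabled build
    finally:
        f.close()
        os.unlink(name)

On a build without large-file support the seek() call should fail outright (typically with an OverflowError) rather than succeed.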
The nice thing about 101 is that you can just go down the list, checking things off in a linear fashion as you complete each item. I'd be loathe to break up the linearity of that. AB> I don't have access to creosote.python.org, so someone else's AB> going to need to do this. I can certainly help with any fiddling necessary on creosote. Then again... AB> As far as 2.2.1 goes, I'm happy to keep on the patch czar AB> role. ...if this is going to be a recurring role, we might just want to give you access to the web cvs tree and creosote. AB> Is trying for a release before the conference too AB> aggressive a timeframe? There seem to be a number of niggles AB> that'd be nice to have fixed... Hey, if you're up for it! dunno-about-you-but-i'm-planning-a-vacation-ly y'rs, -Barry From mal@lemburg.com Fri Jan 4 09:21:09 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 04 Jan 2002 10:21:09 +0100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> Message-ID: <3C357405.CC439CBE@lemburg.com> [Skip wants open() to handle Unicode on all platforms] As Martin and Neil have already explained, the handling of national characters in file names is not standardized at all across platforms (not even file systems on one platform, e.g. on Linux). The only option I see to make this situation less painful is to write a filename subsystem which implements two generic APIs: 1. file open using strings and Unicode 2. file listing using either Unicode or strings with a predefined encoding in the output list Since this subsystem would be fairly complicated, I'd suggest that someone writes a PEP on the topic and then the various experts try to come up with implementations which work on at least some systems and a fallback implementation which gets used if no other implementation fits. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Fri Jan 4 10:08:51 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 11:08:51 +0100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <200201040715.g047F8D08868@mbuna.arbhome.com.au> (message from Anthony Baxter on Fri, 04 Jan 2002 18:15:08 +1100) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> Message-ID: <200201041008.g04A8p701317@mira.informatik.hu-berlin.de> > I notice that pep 101 is pretty strongly focussed on the major releases, > not the minor ones. Is it worth making a modified version of this PEP with > the minor release steps? If you don't think you'd get it "right", adding a delta section might be reasonable. Specifically: Don't create a release branch. Instead, just call a code freeze on the maintainance branch, and release from the maintainance branch (just putting on the release tag, i.e. r212) As for the things still to be done, don't forget Include/patchlevel.h :-) > NEWS file - should this have anything other than the socket.sendall() > change? If you can, producing a list of bugs fixed would be nice. It does not need to be exhaustive. > As far as 2.2.1 goes, I'm happy to keep on the patch czar role. 
Is > trying for a release before the conference too aggressive a timeframe? I'd very much encourage a release in that time frame. Regards, Martin From martin@v.loewis.de Fri Jan 4 10:38:28 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 11:38:28 +0100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <15413.24423.772132.175722@anthem.wooz.org> (barry@zope.com) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> Message-ID: <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> > Do we expect that largefile support should work in Python 2.1.2? I'm > okay that autoconf detection fails as long as the instructions in the > posix module work: > > http://www.python.org/doc/current/lib/posix-large-files.html I don't think we can get autoconf detection to work on 2.1. The instructions are right. Unfortunately, the code is wrong: It prefers fgetpos in 2.1, but that returns not an integral type on some systems. I think the best approach is to copy the body of _portable_fseek and _portable_ftell from 2.2. With that, I get a setup that atleast looks right (patch attached) > I've had some failures on 2.4.7 kernels w/ ext3 filesystems. Were these compilation failures, or runtime failures? For the compilation failures, ext3 should be irrelevant, and 2.4.7 should be irrelevant as well - the glibc version would matter (which defines fpos_t to be a struct with an mbstate_t inside). Regards, Martin Index: fileobject.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Objects/fileobject.c,v retrieving revision 2.110 diff -u -r2.110 fileobject.c --- fileobject.c 2001/04/14 17:49:40 2.110 +++ fileobject.c 2002/01/04 10:31:39 @@ -225,20 +225,28 @@ static int _portable_fseek(FILE *fp, Py_off_t offset, int whence) { -#if defined(HAVE_FSEEKO) +#if !defined(HAVE_LARGEFILE_SUPPORT) + return fseek(fp, offset, whence); +#elif defined(HAVE_FSEEKO) && SIZEOF_OFF_T >= 8 return fseeko(fp, offset, whence); #elif defined(HAVE_FSEEK64) return fseek64(fp, offset, whence); #elif defined(__BEOS__) return _fseek(fp, offset, whence); -#elif defined(HAVE_LARGEFILE_SUPPORT) && SIZEOF_FPOS_T >= 8 +#elif SIZEOF_FPOS_T >= 8 /* lacking a 64-bit capable fseek(), use a 64-bit capable fsetpos() and fgetpos() to implement fseek()*/ fpos_t pos; switch (whence) { case SEEK_END: +#ifdef MS_WINDOWS + fflush(fp); + if (_lseeki64(fileno(fp), 0, 2) == -1) + return -1; +#else if (fseek(fp, 0, SEEK_END) != 0) return -1; +#endif /* fall through */ case SEEK_CUR: if (fgetpos(fp, &pos) != 0) @@ -249,7 +257,7 @@ } return fsetpos(fp, &offset); #else - return fseek(fp, offset, whence); +#error "Large file support, but no way to fseek." #endif } @@ -260,17 +268,19 @@ static Py_off_t _portable_ftell(FILE* fp) { -#if SIZEOF_FPOS_T >= 8 && defined(HAVE_LARGEFILE_SUPPORT) +#if !defined(HAVE_LARGEFILE_SUPPORT) + return ftell(fp); +#elif defined(HAVE_FTELLO) && SIZEOF_OFF_T >= 8 + return ftello(fp); +#elif defined(HAVE_FTELL64) + return ftell64(fp); +#elif SIZEOF_FPOS_T >= 8 fpos_t pos; if (fgetpos(fp, &pos) != 0) return -1; return pos; -#elif defined(HAVE_FTELLO) && defined(HAVE_LARGEFILE_SUPPORT) - return ftello(fp); -#elif defined(HAVE_FTELL64) && defined(HAVE_LARGEFILE_SUPPORT) - return ftell64(fp); #else - return ftell(fp); +#error "Large file support, but no way to ftell." #endif } From martin@v.loewis.de Fri Jan 4 10:46:20 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Fri, 4 Jan 2002 11:46:20 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <3C357405.CC439CBE@lemburg.com> (mal@lemburg.com) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> Message-ID: <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> > The only option I see to make this situation less painful is > to write a filename subsystem which implements two generic > APIs: > > 1. file open using strings and Unicode I think this "pretty much" works in Python 2.2 already. It uses the "mbcs" encoding on Windows, and the locale's encoding on Unix if locale.setlocale has been called (and the C library is good enough). That might be still wrong if the file system expects UTF-8, or a fixed encoding (e.g. on an NTFS or VFAT partition mounted on Linux), but I don't think there is anything that can be done about this: It would be a misconfigured system if then the user doesn't also use an UTF-8 locale. > 2. file listing using either Unicode or strings with a predefined > encoding in the output list That is something that certainly needs to be done. Having a PEP on that would be useful. Regards, Martin From nhodgson@bigpond.net.au Fri Jan 4 10:54:43 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Fri, 4 Jan 2002 21:54:43 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> Message-ID: <008c01c1950e$3775c3b0$0acc8490@neil> Marc-Andre Lemburg: > The only option I see to make this situation less painful is > to write a filename subsystem which implements two generic > APIs: > > 1. file open using strings and Unicode > > 2. file listing using either Unicode or strings with a predefined > encoding in the output list I started work on this in C++ for my SciTE editor a couple of months ago but the design started to include stuff like 'are these two paths pointing at one file', converting between OpenVMS and Unix paths, and handling URLs (at least using ftp and http). My brain threatened to explode if it got any more complex so it got moved to the 'future niceness' pile. Neil From mal@lemburg.com Fri Jan 4 11:11:12 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 04 Jan 2002 12:11:12 +0100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> Message-ID: <3C358DD0.815BC93D@lemburg.com> "Martin v. Loewis" wrote: > > > The only option I see to make this situation less painful is > > to write a filename subsystem which implements two generic > > APIs: > > > > 1. file open using strings and Unicode > > I think this "pretty much" works in Python 2.2 already. 
It uses the > "mbcs" encoding on Windows, and the locale's encoding on Unix if > locale.setlocale has been called (and the C library is good enough). > > That might be still wrong if the file system expects UTF-8, or a fixed > encoding (e.g. on an NTFS or VFAT partition mounted on Linux), but I > don't think there is anything that can be done about this: It would be > a misconfigured system if then the user doesn't also use an UTF-8 > locale. We'd still need to support other OSes as well, though, and I don't think that putting all this code into fileobject.c is a good idea -- after all opening files is needed by some other parts of Python as well and may also be useful for extensions. I'd suggest to implement something similiar to the DLL loading code which is also implemented as subsystem in Python. > > 2. file listing using either Unicode or strings with a predefined > > encoding in the output list > > That is something that certainly needs to be done. Having a PEP on > that would be useful. Yep. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mwh@python.net Fri Jan 4 11:14:21 2002 From: mwh@python.net (Michael Hudson) Date: 04 Jan 2002 11:14:21 +0000 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: Anthony Baxter's message of "Fri, 04 Jan 2002 18:15:08 +1100" References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> Message-ID: <2mheq2bc6q.fsf@starship.python.net> Anthony Baxter writes: > As far as 2.2.1 goes, I'm happy to keep on the patch czar role. Fine, so long as you get on with it :) I was going to merge this weeks bugfixes this morning... > Is trying for a release before the conference too aggressive a > timeframe? There seem to be a number of niggles that'd be nice to > have fixed... That's probably a reasonable timeframe, so long as the niggles actually do get fixed by then. Picklability of the struct_seq thingies is one that might be a bit of a pain. Cheers, M. -- 31. Simplicity does not precede complexity, but follows it. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From mal@lemburg.com Fri Jan 4 11:20:12 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 04 Jan 2002 12:20:12 +0100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <008c01c1950e$3775c3b0$0acc8490@neil> Message-ID: <3C358FEC.D4FEF710@lemburg.com> Neil Hodgson wrote: > > Marc-Andre Lemburg: > > > The only option I see to make this situation less painful is > > to write a filename subsystem which implements two generic > > APIs: > > > > 1. file open using strings and Unicode > > > > 2. file listing using either Unicode or strings with a predefined > > encoding in the output list > > I started work on this in C++ for my SciTE editor a couple of months ago > but the design started to include stuff like 'are these two paths pointing > at one file', converting between OpenVMS and Unix paths, and handling URLs > (at least using ftp and http). My brain threatened to explode if it got any > more complex so it got moved to the 'future niceness' pile. 
I believe that we could do well with the following assumptions: a) strings passed to open() use whatever encoding is needed by the file system b) Unicode passed to open() are converted to whatever the file system needs by then open() API. This doesn't cover all the possibilities, but goes a long way. Joining paths between file systems should really be left to the os.path APIs. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jack@oratrix.nl Fri Jan 4 12:22:47 2002 From: jack@oratrix.nl (Jack Jansen) Date: Fri, 04 Jan 2002 13:22:47 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: Message by "Martin v. Loewis" , Fri, 4 Jan 2002 00:23:35 +0100 , <200201032323.g03NNZA02271@mira.informatik.hu-berlin.de> Message-ID: <20020104122252.3A770E8452@oratrix.oratrix.nl> Sigh, I let myself be drawn in again, despite my previous assertion.... Recently, "Martin v. Loewis" said: > > For this it should be as backward-compatible as possible, i.e. if > > some API expects a unicode filename and I pass "a.out" it should > > interpret it as u"a.out". > > That works fine with the current API. No, it doesn't, that is the whole point of why I started this thread!!!! If the Python wrapper around the API uses PyArg_Parse("u") then it will barf on "a.out", if the wrapper uses "u#" it will not barf but in stead completely misinterpret the StringObject containing "a.out", interpreting it as the binary representation of 3 unicode characters or something far worse! Yes, there is a workaround with the "O" format and three more function calls, but I wouldn't call that "works fine"... > > Using Python StringObjects as binary buffers is also far less common > > than using StringObjects to store plain old strings, so if either of > > these uses bites the other it's the binary buffer that needs to > > suffer. > > This is a conclusion I cannot agree with. Most strings are really > binary, if you look at them closely enough :-) I'm not sure I understand this remark. If you made it just for the smiley: never mind. If you really don't agree: please explain why. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mwh@python.net Fri Jan 4 12:30:48 2002 From: mwh@python.net (Michael Hudson) Date: 04 Jan 2002 12:30:48 +0000 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects fileobject.c,2.141,2.142 In-Reply-To: Neal Norwitz's message of "Tue, 01 Jan 2002 11:07:15 -0800" References: Message-ID: <2mitai8fif.fsf@starship.python.net> Neal Norwitz writes: > Update of /cvsroot/python/python/dist/src/Objects > In directory usw-pr-cvs1:/tmp/cvs-serv2511/Objects > > Modified Files: > fileobject.c > Log Message: > SF Patch #494863, file.xreadlines() should raise ValueError if file is closed > > This makes xreadlines behave like all other file methods > (other than close() which just returns). Does this qualify as a bugfix? Cheers, M. From fdrake@acm.org Fri Jan 4 12:39:00 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Fri, 4 Jan 2002 07:39:00 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects fileobject.c,2.141,2.142 In-Reply-To: <2mitai8fif.fsf@starship.python.net> References: <2mitai8fif.fsf@starship.python.net> Message-ID: <15413.41572.173450.104663@cj42289-a.reston1.va.home.com> Michael Hudson writes: > > SF Patch #494863, file.xreadlines() should raise ValueError if file is closed > > > > This makes xreadlines behave like all other file methods > > (other than close() which just returns). > > Does this qualify as a bugfix? I think so. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From jack@oratrix.nl Fri Jan 4 12:47:45 2002 From: jack@oratrix.nl (Jack Jansen) Date: Fri, 04 Jan 2002 13:47:45 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: Message by "M.-A. Lemburg" , Fri, 04 Jan 2002 10:21:09 +0100 , <3C357405.CC439CBE@lemburg.com> Message-ID: <20020104124750.78D4AE8451@oratrix.oratrix.nl> Off on a slight tangent: On Mac OS X the default 8-bit encoding is UTF8. os.listdir() handles this fine and so does open(). The OS does all the hard work for you: it knows that some mounted disks may be in other 8-bit encodings (such as MacRoman or MacJapanese for old mac disks, or probably latin-1 for NFS filesystems, or god-knows-what for SMB mounted disks) and handles the conversion. But in Python (unix-Python we're talking here, not MacPython), unicode(filename) fails, because site.encoding is "ascii". Would it be safe to set site.encoding to utf8 on Mac OS X by default? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido@python.org Fri Jan 4 13:35:28 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 04 Jan 2002 08:35:28 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: Your message of "Fri, 04 Jan 2002 18:15:08 +1100." <200201040715.g047F8D08868@mbuna.arbhome.com.au> References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> Message-ID: <200201041335.IAA00823@cj20424-a.reston1.va.home.com> > Ok, I'd like to make the 2.1.2 release some time in the first > half of the week starting 7th Jan, assuming that's ok for the folks > who'll need to do the work on the PC/Mac packaging. Cool! I can't speak for the Mac folks -- they may still be exhausted from the 2.2 release effort -- but I can't imagine this would be much of a problem. > I notice that pep 101 is pretty strongly focussed on the major > releases, not the minor ones. Is it worth making a modified version > of this PEP with the minor release steps? Great idea! > the things to do: > > README file. > NEWS file - should this have anything other than the socket.sendall() > change? The 2.1.1 NEWS file had a SF reference of each and every bug that was fixed. Is this worth doing? (If it were me, the answer would be an emphatic "no".) > I don't have access to creosote.python.org, so someone else's going to > need to do this. Barry & I are at your service. I'm guessing you'll also need Fred's help to roll out the docs (are there going to be 2.1.2 docs?) and Tim's for the windows installer (which may be a bit of a pain since we've switched installer builders for 2.2). > As far as 2.2.1 goes, I'm happy to keep on the patch czar role. Is > trying for a release before the conference too aggressive a timeframe? > There seem to be a number of niggles that'd be nice to have fixed... 
That would be very cool! Should be plenty of time if we aim low. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Fri Jan 4 16:23:27 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 4 Jan 2002 11:23:27 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> Message-ID: <15413.55039.802450.257239@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> I don't think we can get autoconf detection to work on MvL> 2.1. I don't mind. MvL> The instructions are right. Unfortunately, the code is MvL> wrong: It prefers fgetpos in 2.1, but that returns not an MvL> integral type on some systems. Right. MvL> I think the best approach is to copy the body of MvL> _portable_fseek and _portable_ftell from 2.2. With that, I MvL> get a setup that atleast looks right (patch attached) Unfortunately that's not enough, I suspect. >> I've had some failures on 2.4.7 kernels w/ ext3 filesystems. MvL> Were these compilation failures, or runtime failures? For the MvL> compilation failures, ext3 should be irrelevant, and 2.4.7 MvL> should be irrelevant as well - the glibc version would matter MvL> (which defines fpos_t to be a struct with an mbstate_t MvL> inside). Vanilla release21-maint will give compilation failures, which go away with the patch (essentially what I tried on other systems). But even with these patches, test_largefile fails on the seek(2**31L). So something else too is going on. FTR: this is a stock Mandrake 8.1 system w/ glibc 2.2.4. I don't have much time to spend looking into this right now, but it would be good to fix for 2.1.2. -Barry From tim.one@home.com Fri Jan 4 16:41:26 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 4 Jan 2002 11:41:26 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <200201041335.IAA00823@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > I'm guessing you'll also need Fred's help to roll out the docs (are > there going to be 2.1.2 docs?) and Tim's for the windows installer > (which may be a bit of a pain since we've switched installer builders > for 2.2). Ouch. More than a bit -- I'd have to find the old Wise floppy first (it's not on my disk anymore). But the 16-bit installer is itself "a bug" (often doesn't work) on the recent MS high-end OSes (2000 and XP), so creating another of those is a dubious exercise. We were probably shipping different versions of expat and/or zlib on Windows for 2.1 too (but at least I can suck those binaries out of an installed 2.1 -- or was Windows 2.1 compiled with a binary-incompatible MSVC 5?). Etc. If I do this, it's going to consume at least a day to straighten out all the issues. From guido@python.org Fri Jan 4 16:58:47 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 04 Jan 2002 11:58:47 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: Your message of "Fri, 04 Jan 2002 11:41:26 EST." References: Message-ID: <200201041658.LAA27923@cj20424-a.reston1.va.home.com> > [Guido] > > I'm guessing you'll also need Fred's help to roll out the docs (are > > there going to be 2.1.2 docs?) and Tim's for the windows installer > > (which may be a bit of a pain since we've switched installer builders > > for 2.2). > [Tim] > Ouch. More than a bit -- I'd have to find the old Wise floppy first (it's > not on my disk anymore). 
But the 16-bit installer is itself "a bug" (often > doesn't work) on the recent MS high-end OSes (2000 and XP), so creating > another of those is a dubious exercise. We were probably shipping different > versions of expat and/or zlib on Windows for 2.1 too (but at least I can > suck those binaries out of an installed 2.1 -- or was Windows 2.1 compiled > with a binary-incompatible MSVC 5?). Etc. If I do this, it's going to > consume at least a day to straighten out all the issues. 2.1 was solidly MSVC 6, so I don't think there were any MSVC 5 issues. Would it be a problem if we used the new installer for 2.1.2? That would be much easier on Tim. There are still some issues (e.g. expat) but I'm not qualified to rule on those. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Fri Jan 4 17:06:53 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 04 Jan 2002 18:06:53 +0100 Subject: [Python-Dev] Unicode strings as filenames References: <20020104124750.78D4AE8451@oratrix.oratrix.nl> Message-ID: <3C35E12D.4A72AC02@lemburg.com> Jack Jansen wrote: > > Off on a slight tangent: > On Mac OS X the default 8-bit encoding is UTF8. os.listdir() handles > this fine and so does open(). The OS does all the hard work for you: > it knows that some mounted disks may be in other 8-bit encodings (such > as MacRoman or MacJapanese for old mac disks, or probably latin-1 for NFS > filesystems, or god-knows-what for SMB mounted disks) and handles the > conversion. That's good news. > But in Python (unix-Python we're talking here, not MacPython), > unicode(filename) fails, because site.encoding is "ascii". > > Would it be safe to set site.encoding to utf8 on Mac OS X by default? I'd rather suggest to use UTF-8 as default encoding in the subsystem layer I was talking about. Making UTF-8 the default Python system encoding would have many other consequences -- and you'd probably lose a great deal of portability since UTF-8 conversion (nearly) always will succeed while ASCII can easily fail on other systems which use e.g. Latin-1 as native encoding. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@home.com Fri Jan 4 17:23:47 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 4 Jan 2002 12:23:47 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <200201041658.LAA27923@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > 2.1 was solidly MSVC 6, so I don't think there were any MSVC 5 issues. That matches my recollection, but while I'd bet your life on it I wouldn't bet mine . > Would it be a problem if we used the new installer for 2.1.2? That > would be much easier on Tim. The real reason to use the new installer is that the old one is, and increasingly as 2000 and XP get more popular, itself "a bug". Getting a 32-bit installer is increasingly necessary, and the old installer can't deal with the Win2K privilege maze at all (usually spelling "insufficient privilege" as "corrupt installation detected" before its first dialog box even appears -- the old Wise 16-bit-installer builder was released when Win95 was brand new). > There are still some issues (e.g. expat) but I'm not qualified to rule > on those. I'll ask Fred about that one offline. It's all doable, it's just going to consume some time. From martin@v.loewis.de Fri Jan 4 18:17:03 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Fri, 4 Jan 2002 19:17:03 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <3C358DD0.815BC93D@lemburg.com> (mal@lemburg.com) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <3C358DD0.815BC93D@lemburg.com> Message-ID: <200201041817.g04IH3p01371@mira.informatik.hu-berlin.de> > We'd still need to support other OSes as well, though, and I > don't think that putting all this code into fileobject.c is > a good idea -- after all opening files is needed by some other > parts of Python as well and may also be useful for extensions. The stuff isn't in fileobject.c. Py_FileSystemDefaultEncoding is defined in bltinmodule.c. Also, on other OSes: You can pass Unicode object to open on all systems. If Py_FileSystemDefaultEncoding is NULL, it will fall back to site.encoding. Of course, if the system has an open function that expects wchar_t*, we might want to use that instead of going through a codec. Off hand, Win32 seems to be the only system where this might work, and even there, it won't work on Win95. > I'd suggest to implement something similiar to the DLL loading > code which is also implemented as subsystem in Python. I'd say this is over-designed. It is not that there are ten alternative approaches to doing encodings in file names, and we only support two of them, but it is rather that there are only two, and we support all three of them :-) Also, it is more difficult than threads: for threads, there is a fixed set of API features that need to be represented. Doing Py_UNICODE* opening alone is easy, but look at the number of posixmodule functions that all expect file names of some sort. Regards, Martin From chrism@zope.com Fri Jan 4 18:53:36 2002 From: chrism@zope.com (Chris McDonough) Date: Fri, 4 Jan 2002 13:53:36 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... References: <200201040715.g047F8D08868@mbuna.arbhome.com.au><15413.24423.772132.175722@anthem.wooz.org><200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> Message-ID: <01e801c19551$222145a0$c617a8c0@kurtz> Hi folks, I'm subscribed to the list, but I'm still not quite sure if I'm supposed to be posting here... I suppose I should go read the charter. Please flame me if this list is for the "in crowd" only. ;-) I tried to get the 21-maintbranch LFS working using the directions that are provided in the current docs (http://www.python.org/doc/current/lib/posix-large-files.html), but it fails to compile for me as a result. Someone has suggested that it's not the instructions that are broken, but the code. Can this be confirmed? Because ZC is forced to stick with Python 2.1.X (as opposed to 2.2.X) for the current crop of Zope releases, and because we often need large file support under Zope, it's pretty important for us to get a 2.1.X release under which LFS works. A workaround is fine as well. I don't think I have the knowhow to fix it, but if I can help in any way by testing under various Linuxii, please let me know. -C ----- Original Message ----- From: "Barry A. Warsaw" To: "Martin v. Loewis" Cc: ; Sent: Friday, January 04, 2002 11:23 AM Subject: Re: [Python-Dev] release for 2.1.2, plus 2.2.1... 
> > >>>>> "MvL" == Martin v Loewis writes: > > MvL> I don't think we can get autoconf detection to work on > MvL> 2.1. > > I don't mind. > > MvL> The instructions are right. Unfortunately, the code is > MvL> wrong: It prefers fgetpos in 2.1, but that returns not an > MvL> integral type on some systems. > > Right. > > MvL> I think the best approach is to copy the body of > MvL> _portable_fseek and _portable_ftell from 2.2. With that, I > MvL> get a setup that atleast looks right (patch attached) > > Unfortunately that's not enough, I suspect. > > >> I've had some failures on 2.4.7 kernels w/ ext3 filesystems. > > MvL> Were these compilation failures, or runtime failures? For the > MvL> compilation failures, ext3 should be irrelevant, and 2.4.7 > MvL> should be irrelevant as well - the glibc version would matter > MvL> (which defines fpos_t to be a struct with an mbstate_t > MvL> inside). > > Vanilla release21-maint will give compilation failures, which go away > with the patch (essentially what I tried on other systems). But even > with these patches, test_largefile fails on the seek(2**31L). > > So something else too is going on. > > FTR: this is a stock Mandrake 8.1 system w/ glibc 2.2.4. > > I don't have much time to spend looking into this right now, but it > would be good to fix for 2.1.2. > > -Barry > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > From martin@v.loewis.de Fri Jan 4 18:40:34 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 19:40:34 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: <20020104122252.3A770E8452@oratrix.oratrix.nl> (message from Jack Jansen on Fri, 04 Jan 2002 13:22:47 +0100) References: <20020104122252.3A770E8452@oratrix.oratrix.nl> Message-ID: <200201041840.g04IeY201401@mira.informatik.hu-berlin.de> > No, it doesn't, that is the whole point of why I started this > thread!!!! Oops, right. I was thinking the other way around: passing u"a.out" where "a.out" is expected works fine; for this case, the memory management issues come into play. > > > Using Python StringObjects as binary buffers is also far less common > > > than using StringObjects to store plain old strings, so if either of > > > these uses bites the other it's the binary buffer that needs to > > > suffer. > > > > This is a conclusion I cannot agree with. Most strings are really > > binary, if you look at them closely enough :-) > > I'm not sure I understand this remark. If you made it just for the > smiley: never mind. If you really don't agree: please explain why. When the discussion of tagging binary strings in source code came up, I started to look into the standard library which string literals would have to be tagged as byte strings, and which are really character strings. I found that the overwhelming majority of string literals in the standard Python library really denotes byte strings, if you ignore doc strings. Sometimes, it isn't obvious that they are binary strings, hence the smiley. Look at httplib.py: __all__ = ["HTTP", ... Not sure: Are Python function names byte strings or character strings? Probably doesn't matter either way. Python source code is definitely byte-oriented, explicitly wihthout any assumed encoding, so I'd lean towards byte strings here. _UNKNOWN = 'UNKNOWN' Looks like a character string. 
However, it is used in self.version = _UNKNOWN # HTTP-Version self.version is later sent on the byte-oriented HTTP protocol, so _UNKNOWN *is* a byte string. _CS_IDLE = 'Idle' These are enumerators, let's say they are character strings. self.fp = sock.makefile('rb', 0) Not sure. Could be a character string. print "reply:", repr(line) Definitely a character string. version = "HTTP/0.9" status = "200" reason = "" Protocol elements, thus byte string. So, I'm arguing that byte strings are far more common than you may think at first sight. In particular, everything passed to .read(), either of a file, or of a socket, is a byte string, since files and network connections are byte-oriented. In the particular case of network connections, applying system conventions for narrow strings would be foolish. Regards, Martin From barry@zope.com Fri Jan 4 19:02:03 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 4 Jan 2002 14:02:03 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> Message-ID: <15413.64555.942551.865775@anthem.wooz.org> >>>>> "CM" == Chris McDonough writes: CM> I'm subscribed to the list, but I'm still not quite sure if CM> I'm supposed to be posting here... I suppose I should go read CM> the charter. Please flame me if this list is for the "in CM> crowd" only. ;-) You did fine, Chris! :) Welcome. CM> I tried to get the 21-maintbranch LFS working using the CM> directions that are provided in the current docs CM> (http://www.python.org/doc/current/lib/posix-large-files.html), CM> but it fails to compile for me as a result. Someone has CM> suggested that it's not the instructions that are broken, but CM> the code. Can this be confirmed? Confirmed. The compilation errors can be fixed with the patch that Martin sent around earlier in this thread. So that probably ought to be added to Python 2.1.2. But the patch + the posix-large-file instructions still don't enable large file support for me on glibc 2.2.4. So something more is needed. CM> Because ZC is forced to stick with Python 2.1.X (as opposed to CM> 2.2.X) for the current crop of Zope releases, and because we CM> often need large file support under Zope, it's pretty CM> important for us to get a 2.1.X release under which LFS works. CM> A workaround is fine as well. CM> I don't think I have the knowhow to fix it, but if I can help CM> in any way by testing under various Linuxii, please let me CM> know. I do plan to get back to this if nobody else fixes it in the meantime, but I've got a couple of higher priority things to deal with right now. I'd say LFS in Python 2.1.2 should be a high priority. -Barry From guido@python.org Fri Jan 4 19:04:45 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 04 Jan 2002 14:04:45 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: Your message of "Fri, 04 Jan 2002 14:02:03 EST." <15413.64555.942551.865775@anthem.wooz.org> References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <15413.64555.942551.865775@anthem.wooz.org> Message-ID: <200201041904.OAA29642@cj20424-a.reston1.va.home.com> > Confirmed. 
The compilation errors can be fixed with the patch that > Martin sent around earlier in this thread. So that probably ought to > be added to Python 2.1.2. But the patch + the posix-large-file > instructions still don't enable large file support for me on glibc > 2.2.4. So something more is needed. Hm, is it possible that glibc 2.2.4 is too old to support large files? > I'd say LFS in Python 2.1.2 should be a high priority. Yes. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Fri Jan 4 19:14:29 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 4 Jan 2002 14:14:29 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <15413.64555.942551.865775@anthem.wooz.org> Message-ID: [Barry] > I'd say LFS in Python 2.1.2 should be a high priority. I'd say it's a show-stopper. Zope isn't the only client for large files; besides, we could just tell Zope customers to upgrade to Windows, where LFS has been part of the Win32 API since before Linus learned how to spell Perl . From Andreas Jung" <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <15413.64555.942551.865775@anthem.wooz.org> <200201041904.OAA29642@cj20424-a.reston1.va.home.com> Message-ID: <070701c19554$6a31bac0$9e17a8c0@suxlap> ----- Original Message ----- From: "Guido van Rossum" To: "Barry A. Warsaw" Cc: "Chris McDonough" ; "Martin v. Loewis" ; ; Sent: Friday, January 04, 2002 14:04 Subject: Re: [Python-Dev] release for 2.1.2, plus 2.2.1... > > Confirmed. The compilation errors can be fixed with the patch that > > Martin sent around earlier in this thread. So that probably ought to > > be added to Python 2.1.2. But the patch + the posix-large-file > > instructions still don't enable large file support for me on glibc > > 2.2.4. So something more is needed. > > Hm, is it possible that glibc 2.2.4 is too old to support large files? s I would be surprised if glibc 2.2.4 does not support LFS. Some months ago I installed Python 2.1 on a "older" RH 7.1 system with LFS support. The glibc version of RH7.1 is most likely older than 2.2.4. Andreas From barry@zope.com Fri Jan 4 19:29:43 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 4 Jan 2002 14:29:43 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <15413.64555.942551.865775@anthem.wooz.org> <200201041904.OAA29642@cj20424-a.reston1.va.home.com> Message-ID: <15414.679.382753.10285@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> Hm, is it possible that glibc 2.2.4 is too old to support GvR> large files? Doubtful. This is the stock glibc that comes with Mandrake 8.1, which is their latest offering. And besides, Python 2.2 on the same box supports LFS just fine! -Barry From martin@v.loewis.de Fri Jan 4 19:36:51 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 20:36:51 +0100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... 
In-Reply-To: <15413.55039.802450.257239@anthem.wooz.org> (barry@zope.com) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> Message-ID: <200201041936.g04Japl01600@mira.informatik.hu-berlin.de> > MvL> I think the best approach is to copy the body of > MvL> _portable_fseek and _portable_ftell from 2.2. With that, I > MvL> get a setup that atleast looks right (patch attached) > > Unfortunately that's not enough, I suspect. I can't see a problem. > Vanilla release21-maint will give compilation failures, which go away > with the patch (essentially what I tried on other systems). But even > with these patches, test_largefile fails on the seek(2**31L). Not for me (i.e. it passes just fine). How exactly does it fail? What version of the test? Can you produce an strace? > FTR: this is a stock Mandrake 8.1 system w/ glibc 2.2.4. That should be good enough. > I don't have much time to spend looking into this right now, but it > would be good to fix for 2.1.2. Somebody else should probably try this as well. I would not stop the release for that: if it compiles fine when following the instructions, and does the right thing for small files, I think the release should go. Regards, Martin From martin@v.loewis.de Fri Jan 4 19:27:59 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 20:27:59 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <20020104124750.78D4AE8451@oratrix.oratrix.nl> (message from Jack Jansen on Fri, 04 Jan 2002 13:47:45 +0100) References: <20020104124750.78D4AE8451@oratrix.oratrix.nl> Message-ID: <200201041927.g04JRxO01561@mira.informatik.hu-berlin.de> > Would it be safe to set site.encoding to utf8 on Mac OS X by default? As MAL explains, no. Instead, you should extend the fragment #if defined(MS_WIN32) && defined(HAVE_USABLE_WCHAR_T) const char *Py_FileSystemDefaultEncoding = "mbcs"; #else const char *Py_FileSystemDefaultEncoding = NULL; /* use default */ #endif to cover OSX as well, setting the string to "utf-8". Then, Unicode objects will be auto-converted to UTF-8 in open() and all posixmodule calls; not sure whether OSX uses posixmodule, though... Once you've done this, you should use es# specifiers with Py_FileSystemDefaultEncoding wherever you retrieve a file or path name from the application. Returning file names to the user is a different story, though: it may or may not be sensible to apply the file system encoding (if set) whenever file names are returned to the application (mostly in listdir). HTH, Martin From martin@v.loewis.de Fri Jan 4 19:21:07 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 20:21:07 +0100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <200201041658.LAA27923@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Fri, 04 Jan 2002 11:58:47 -0500) References: <200201041658.LAA27923@cj20424-a.reston1.va.home.com> Message-ID: <200201041921.g04JL7b01556@mira.informatik.hu-berlin.de> > Would it be a problem if we used the new installer for 2.1.2? That > would be much easier on Tim. There are still some issues (e.g. expat) > but I'm not qualified to rule on those. My guess is that 2.1.2 will compile fine with whatever expat installation Tim currently has, if it does, pyexpat will certainly work correctly (or: as good as it did in 2.1.1). 
Regards, Martin From martin@v.loewis.de Fri Jan 4 18:42:42 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 19:42:42 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects fileobject.c,2.141,2.142 In-Reply-To: <2mitai8fif.fsf@starship.python.net> (message from Michael Hudson on 04 Jan 2002 12:30:48 +0000) References: <2mitai8fif.fsf@starship.python.net> Message-ID: <200201041842.g04IggA01405@mira.informatik.hu-berlin.de> > > This makes xreadlines behave like all other file methods > > (other than close() which just returns). > > Does this qualify as a bugfix? Yes. But it also tightens the behaviour, so it should not be applied to maintainance branches: no correct program would work better with this patch, but currently broken programs may stop working. Regards, Martin From martin@v.loewis.de Fri Jan 4 19:49:24 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 20:49:24 +0100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <01e801c19551$222145a0$c617a8c0@kurtz> (chrism@zope.com) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au><15413.24423.772132.175722@anthem.wooz.org><200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> Message-ID: <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> > I'm subscribed to the list, but I'm still not quite sure if I'm supposed to > be posting here... I suppose I should go read the charter. Please flame me > if this list is for the "in crowd" only. ;-) This list is for development *of* Python. Anybody is free to post questions and comments on that topic, like you just did; I don't like it when people post questions of the "how do I ... in Python" kind that you typically see on python-list - this is not a list to get better help :-) > I tried to get the 21-maintbranch LFS working using the directions that are > provided in the current docs > (http://www.python.org/doc/current/lib/posix-large-files.html), but it fails > to compile for me as a result. Someone has suggested that it's not the > instructions that are broken, but the code. Can this be confirmed? Well, you did not describe exactly how it fails to compile for you. Assuming you got an error that something is not an integral type, then that is clearly an error in the code. You might want to investigate the error message you get more closely; please confirm that it refers to the return value of fgetpos. If you need further confimation, I recommend that you invoke the gcc line that fails adding --save-temps, and inspect the resulting fileobject.i. You will likely find that fpos_t is a structure, and that Python attempts to return it in a place where an integer is needed (or vice versa). > Because ZC is forced to stick with Python 2.1.X (as opposed to 2.2.X) for > the current crop of Zope releases, and because we often need large file > support under Zope, it's pretty important for us to get a 2.1.X release > under which LFS works. A workaround is fine as well. Please try the patch I posted, and report whether test_largefile passes or fails (or, if it fails to compile, what the exact error messages are). Regards, Martin From martin@v.loewis.de Fri Jan 4 19:52:50 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 20:52:50 +0100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... 
In-Reply-To: <15413.64555.942551.865775@anthem.wooz.org> (barry@zope.com) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <15413.64555.942551.865775@anthem.wooz.org> Message-ID: <200201041952.g04Jqoj01753@mira.informatik.hu-berlin.de> > Confirmed. The compilation errors can be fixed with the patch that > Martin sent around earlier in this thread. So that probably ought to > be added to Python 2.1.2. But the patch + the posix-large-file > instructions still don't enable large file support for me on glibc > 2.2.4. So something more is needed. One possible difference between your and my installation is that you probably followed the Linux instructions, whereas I followed the Solaris instructions (even though my system is Linux). I did so because of martin@mira:~/work/python/dist/src> getconf LFS_CFLAGS -D_FILE_OFFSET_BITS=64 So getconf works fine on Linux, as well, and DTRT. Could please recompile your installation using the getconf approach alone? > I do plan to get back to this if nobody else fixes it in the > meantime, but I've got a couple of higher priority things to deal with > right now. > > I'd say LFS in Python 2.1.2 should be a high priority. I'd like to see an independent confirmation first that there still is a problem to solve. Regards, Martin From martin@v.loewis.de Fri Jan 4 19:53:34 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 20:53:34 +0100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <200201041904.OAA29642@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Fri, 04 Jan 2002 14:04:45 -0500) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <15413.64555.942551.865775@anthem.wooz.org> <200201041904.OAA29642@cj20424-a.reston1.va.home.com> Message-ID: <200201041953.g04JrYg01762@mira.informatik.hu-berlin.de> > Hm, is it possible that glibc 2.2.4 is too old to support large files? No, it is the current release. Regards, Martin From tim.one@home.com Fri Jan 4 19:58:15 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 4 Jan 2002 14:58:15 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <200201041921.g04JL7b01556@mira.informatik.hu-berlin.de> Message-ID: [Martin v. Loewis] > My guess is that 2.1.2 will compile fine with whatever expat > installation Tim currently has, if it does, pyexpat will certainly > work correctly (or: as good as it did in 2.1.1). It changes the structure of the distribution, though: 2.1 Windows Python shipped with xmlparse.dll and xmltok.dll, 2.2 with neither of those but with a single expat.dll instead. Regardless of whether "it works" for Python, I don't think a bugfix release is the time to change the *set* of DLLs we ship. The MSVC project files on the 2.1 branch also have no idea what to do with the current expat setup, and last-second changes just multiply if I fight that too (the 2.1 PCbuild README would also need to be changed; etc). 2.2 is better here, but the old expat setup wasn't "a bug"; people who want the new setup should upgrade to 2.2, where it was first introduced. From martin@v.loewis.de Fri Jan 4 20:04:29 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Fri, 4 Jan 2002 21:04:29 +0100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: References: Message-ID: <200201042004.g04K4TA01835@mira.informatik.hu-berlin.de> > It changes the structure of the distribution, though: 2.1 Windows Python > shipped with xmlparse.dll and xmltok.dll, 2.2 with neither of those but with > a single expat.dll instead. Regardless of whether "it works" for Python, I > don't think a bugfix release is the time to change the *set* of DLLs we > ship. Right, I agree on all accounts. Do whatever is most convenient for you. On that topic, is there anybody else in this list who has the necessary software to build a Python Windows release? I feel quite uncomfortable thinking that, if your PC crashes, Windows people would be without Python (actually, *that* uncomfortable is that thought not :-) This would probably be the time to step forward offering to build the official 2.1.2 binary distribution. Regards, Martin From skip@pobox.com Fri Jan 4 20:07:26 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 4 Jan 2002 14:07:26 -0600 Subject: [Python-Dev] To post or not to post, that is the question... In-Reply-To: <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> Message-ID: <15414.2942.960382.24692@beluga.mojam.com> Martin> I don't like it when people post questions of the "how do I Martin> ... in Python" kind that you typically see on python-list - this Martin> is not a list to get better help :-) Somewhat au contraire from this neck of the woods... In my Unicode filename thread I decided it would be best to post here for a couple reasons: * I figured most good answers would come from Martin and Marc-André. * It's not clear that the "right way" to do this stuff appears to be settled, which I think has been proven out somewhat by the extended thread and the long thread Jack started about Unicode and getargs.c. Skip From fdrake@acm.org Fri Jan 4 20:08:57 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 4 Jan 2002 15:08:57 -0500 (EST) Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <200201041921.g04JL7b01556@mira.informatik.hu-berlin.de> References: <200201041658.LAA27923@cj20424-a.reston1.va.home.com> <200201041921.g04JL7b01556@mira.informatik.hu-berlin.de> Message-ID: <15414.3033.553111.989293@cj42289-a.reston1.va.home.com> Martin v. Loewis writes: > My guess is that 2.1.2 will compile fine with whatever expat > installation Tim currently has, if it does, pyexpat will certainly > work correctly (or: as good as it did in 2.1.1). I like that answer. ;-) The catch is that I think Python 2.1.1 includes Expat 1.2, and the Python API changes slightly based on the Expat version. So I think it best to use the Expat shipped with Python 2.1.1. The pyexpat extension should need no changes. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido@python.org Fri Jan 4 20:11:01 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 04 Jan 2002 15:11:01 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: Your message of "Fri, 04 Jan 2002 21:04:29 +0100." <200201042004.g04K4TA01835@mira.informatik.hu-berlin.de> References: <200201042004.g04K4TA01835@mira.informatik.hu-berlin.de> Message-ID: <200201042011.PAA29938@cj20424-a.reston1.va.home.com> > On that topic, is there anybody else in this list who has the > necessary software to build a Python Windows release?
I feel quite > uncomfortable thinking that, if your PC crashes, Windows people would > be without Python (actually, *that* uncomfortable is that thought not > :-) I have the whole suite working on my laptop too. > This would probably the time to step forward offering to build the > official 2.1.2 binary distribution. But I'm not volunteering. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Fri Jan 4 20:25:43 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 4 Jan 2002 15:25:43 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <200201042004.g04K4TA01835@mira.informatik.hu-berlin.de> Message-ID: [Martin v. Loewis] > Right, I agree on all accounts. Do whatever is most convenient for you. I don't care what's convenient, I want to do what's *right* for 2.1.2. I think this got confused because Guido sold "use the new installer-builder" on the grounds that it would be easier for me, but, while that's true, it's also true that the old installer-builder produces broken installers (useless for many Win2K/XP users). The latter is the real reason I want to use the new installer-builder; the old one is quite arguably "a bug". > On that topic, is there anybody else in this list who has the > necessary software to build a Python Windows release? I feel quite > uncomfortable thinking that, if your PC crashes, Windows people would > be without Python (actually, *that* uncomfortable is that thought not > :-) The only pieces you can't get for free over the Web are MSVC 6 and Wise 8.14; the MSVC and Wise project files are in CVS, so it's only the MS and Wise executables someone would have to obtain. PythonLabs has the physical CDs for those, so it doesn't matter much if my box crashes; I also have at least 3 copies of them on backup tapes, and two other copies on two other machines. Hmm. I probably violated the license agreements at least 4 times there . > This would probably the time to step forward offering to build the > official 2.1.2 binary distribution. Don't I wish. I would like to see us move to a free installer. I built (and checked in) an Inno Setup project file that does "almost all" the good stuff, and advertised on c.l.py for volunteers to take it over. Alas, nobody bit, and I can't justify spending more of my time on it (I could at the time because I wasn't making any progress then getting Wise to let us use a new version of their stuff, and Inno Setup was the only feasible alternative). From tim.one@home.com Fri Jan 4 20:26:51 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 4 Jan 2002 15:26:51 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <200201042011.PAA29938@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > I have the whole suite working on my laptop too. I wasn't going to admit that: that one's a serious license violation . From mal@lemburg.com Fri Jan 4 20:32:26 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 04 Jan 2002 21:32:26 +0100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... References: Message-ID: <3C36115A.3002B4DD@lemburg.com> Tim Peters wrote: > > [Guido] > > I have the whole suite working on my laptop too. > > I wasn't going to admit that: that one's a serious license violation > . Is it really ? Most desktop apps nowadays allow one additional laptop installation. 
Totally off-topic, of course, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@home.com Fri Jan 4 20:34:34 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 4 Jan 2002 15:34:34 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <3C36115A.3002B4DD@lemburg.com> Message-ID: [Guido] > I have the whole suite working on my laptop too. [Tim] > I wasn't going to admit that: that one's a serious license violation > . [MAL] > Is it really ? Most desktop apps nowadays allow one additional laptop > installation. Yes, and my laptop != Guido's laptop. Look at all the trouble you're getting us into here ... From aahz@rahul.net Fri Jan 4 21:00:24 2002 From: aahz@rahul.net (Aahz Maruch) Date: Fri, 4 Jan 2002 13:00:24 -0800 (PST) Subject: Re [Python-Dev] object equality vs identity, in and dicts idioms and speed In-Reply-To: <004101c194cd$701deb20$47fdbac3@newmexico> from "Samuele Pedroni" at Jan 04, 2002 04:11:00 AM Message-ID: <20020104210025.3163AE8C5@waltz.rahul.net> Samuele Pedroni wrote: > [Tim Peters] >> >> Mapping what to what? A fine implementation of id() would be to hand each >> new object a unique Java int from a global counter, incremented once per >> Python object creation -- or a Java long if any JVM stays up long enough >> that 32 bits is an issue . > > The problem are java class instances, sir, we use non-unique wrappers > for them and identity is simulated. We could use a table to make > the wrappers unique but we have potentially lots of them as you can > imagine, jython people actually use java classes . So the > workaround is to keep a table just for the java instances for which > someone has asked the id. I'm slightly confuzzled here (no surprise given how little Java I know). How does Jython know which Java class instance to refer to if there's no mapping? If there is a mapping, how does it slow things down to create an id every time a map gets created? (Yes, it'll chew up memory, but Java uses so much memory already... ;-) -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From martin@v.loewis.de Fri Jan 4 21:21:35 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 22:21:35 +0100 Subject: [Python-Dev] To post or not to post, that is the question... In-Reply-To: <15414.2942.960382.24692@beluga.mojam.com> (message from Skip Montanaro on Fri, 4 Jan 2002 14:07:26 -0600) References: <15414.2942.960382.24692@beluga.mojam.com> Message-ID: <200201042121.g04LLZq02166@mira.informatik.hu-berlin.de> > * I figured most good answers would come from Martin and Marc-André. That alone is not a good reason. python-dev is not a place to get free consulting (which, of course, is more bothersome to whoever gives the consulting, than to who receives it). > * It's not clear that the "right way" to do this stuff appears to be > settled, which I think has been proven out somewhat by the extended > thread and the long thread Jack started about Unicode and getargs.c. Well, I was mostly referring to things that are either documented, or can be found easily through source inspection. IOW, I expect that python-dev posters do their homework before posting.
Things like "is it really that you cannot do X, but that it should be possible to do so", or "what is the exact rationale for Y happening" are definitely python-dev issues. Regards, Martin From martin@v.loewis.de Fri Jan 4 21:39:49 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 4 Jan 2002 22:39:49 +0100 Subject: Re [Python-Dev] object equality vs identity, in and dicts idioms and speed In-Reply-To: <20020104210025.3163AE8C5@waltz.rahul.net> (aahz@rahul.net) References: <20020104210025.3163AE8C5@waltz.rahul.net> Message-ID: <200201042139.g04Ldn202275@mira.informatik.hu-berlin.de> > I'm slightly confuzzled here (no surprise given how little Java I know). > How does Jython know which Java class instance to refer to if there's > no mapping? I understand Samuele was talking about mapping in the Python sense (existance of dictionary-style containers); he also mentioned that Jython creates a Python wrapper object for each "foreign" Java object. Creating wrapper objects immediately raises the issue of identity: If you get the very same Java objects two times, do you want to use the same wrapper object? If yes, how do you find out that you already have a wrapper object. This is where the mapping comes into play. If no, how do you implement "is"? Well, that's easy: def is(o1, o2): if o1 instanceof wrapper: if not o2 instanceof wrapper: return false return o1.wrapped identical_to o2.wrapped return o1 identical_to o2 Now, how do you implement id()? More tricky: Tim suggests you bump a counter every time you create a Python object. Works fine for "true" python objects: def id(o): return o.countervalue Doesn't work as well for wrapper objects: When should you bump the counter? When you create the wrapper? But then there may be two wrappers with different ids refering to the very same object, so you'd have 'o1 is o2 and id(o1) <> id(o2)' which clearly is a no-no. > If there is a mapping, how does it slow things down to create an id > every time a map gets created? You can do it like this: map = {} def wrap(java.lang.Object o): try: return map[o] except KeyError: map[o] = res = wrapper(o, new_id()) return res That requires a map lookup every time a wrapper is created; clearly undesirable. I think Samuele had something in mind like: map = {} def wrap(java.lang.Object o): return wrapper(o, None) def id(o): if not o instanceof wrapper: return o.countervalue if o.countervalue: return o.countervalue try: o.countervalue = map[o.wrapped] except KeyError: o.countervalue = map[o.wrapped] = new_id() return o.countervalue So you'd take the cost of a map lookup only if somebody accesses the id of an object. That would still mean that all Java objects whose Python id() was ever computed would live in the dictionary forever; there you need a WeakkeyDictionary. Regards, Martin From Samuele Pedroni" <200201042139.g04Ldn202275@mira.informatik.hu-berlin.de> Message-ID: <003701c1956a$7e296a80$6d94fea9@newmexico> Martin explanation is correct. [Martin v. Loewis] > You can do it like this: > > map = {} > > def wrap(java.lang.Object o): > try: > return map[o] > except KeyError: > map[o] = res = wrapper(o, new_id()) > return res > > That requires a map lookup every time a wrapper is created; clearly > undesirable. I think Samuele had something in mind like: > With this approach you could use less memory if there is much wrapper duplication, but typically a Java object does not get many long-lived different wrappers. This "wrap" is quite a core operation, and the map need to be weak otherwise you leak badly. 
That means that you can implement it only with java >1.2 and anyway weak-dictionaries in java require dealing with polling queues of reset weak-refs. This means complication and slowdown where we would prefer to avoid it. regards. From bckfnn@worldonline.dk Fri Jan 4 22:11:29 2002 From: bckfnn@worldonline.dk (Finn Bock) Date: Fri, 04 Jan 2002 22:11:29 GMT Subject: Re [Python-Dev] object equality vs identity, in and dicts idioms and speed In-Reply-To: <200201042139.g04Ldn202275@mira.informatik.hu-berlin.de> References: <20020104210025.3163AE8C5@waltz.rahul.net> <200201042139.g04Ldn202275@mira.informatik.hu-berlin.de> Message-ID: <3c362501.49820277@mail.wanadoo.dk> [Martin v. Loewis] >I understand Samuele was talking about mapping in the Python sense >(existance of dictionary-style containers); he also mentioned that >Jython creates a Python wrapper object for each "foreign" Java object. Your summery is quite accurate. When this was discussed on jython-dev, I said I preferred a solution where all objects was inserted in your "map" dictionary when id() was called on them. Not just the wrapped java instances. I picked that preference because I think using id() is a relative uncommon operation. In the Lib modules, id() is used to detect cycles in copy.py, pickle.py, pprint.py and xmlrpclib. I would rather have a slow id() operation on python objects too, than burden all python objects with an additional int or long. Is that a wrong call? In the repr() of a lot of internal objects, the id() is used in the return string. Would anyone rightly expect that hex number to match the id() value of the object? In our discussions we agreed that the repr() string does not have to match the value return by id(). regards, finn From chrism@zope.com Fri Jan 4 22:27:12 2002 From: chrism@zope.com (Chris McDonough) Date: Fri, 04 Jan 2002 17:27:12 -0500 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... References: <200201040715.g047F8D08868@mbuna.arbhome.com.au><15413.24423.772132.175722@anthem.wooz.org><200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> Message-ID: <3C362C40.8010002@zope.com> > Well, you did not describe exactly how it fails to compile for > you. Assuming you got an error that something is not an integral type, > then that is clearly an error in the code. You might want to > investigate the error message you get more closely; please confirm > that it refers to the return value of fgetpos. Yes, apologies. I should have provided more details. I'm using a stock Red Hat Linux 7.2, which has glibc 2.2.4 (Linux kernel version 2.4.7). With a Python built successfully from the 21-maintbranch without any additional compiler flags indicating that I want large file support, I get this when attempting to run the test_largefile test: [chrism@kurtz tmp]$ python /usr/local/lib/python2.1/test/test_largefile.py Traceback (most recent call last): File "/usr/local/lib/python2.1/test/test_largefile.py", line 22, in ? raise test_support.TestSkipped, \ test_support.TestSkipped: platform does not have largefile support What's going on "under the hood" here is that a bit of code like this: open('foo', 'w').seek(2147483649L) .. raises an IOError 22, (invalid argument) out of the seek. 
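In other words, the probe boils down to roughly this (a sketch of the idea from memory, not the exact code in test_largefile; the path is just an example):

    import os
    f = open('/tmp/lfs-probe', 'w')
    try:
        # without large file support, seeking past 2**31 fails (the IOError 22 above)
        f.seek(2147483649L)
        print 'large files look usable'
    except (IOError, OverflowError):
        print 'platform does not have largefile support'
    f.close()
    os.remove('/tmp/lfs-probe')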
When I attempt to compile the code from the same branch using the instructions for Solaris from http://www.python.org/doc/current/lib/posix-large-files.html, it craps out during a successive make: gcc -c -g -O2 -I. -I./Include -DHAVE_CONFIG_H -o Objects/fileobject.o Objects/fileobject.c Objects/fileobject.c: In function `_portable_ftell': Objects/fileobject.c:267: incompatible types in return make: *** [Objects/fileobject.o] Error 1 [chrism@kurtz src]$ As a result, I am not able to compile successfully. (Note: FYI, the same thing happens when following the slightly different current doc instructions for Linux.) So be it. With the patch you supplied earlier (and providing *either* the "Solaris" or "Linux" largefile support flags to configure) I am able to compile successfully and when invoking the resulting executable against test_largefile.py, I get what looks like success, e.g.: create large file via seek (may be sparse file) ... 2500000001L =?= 2500000001L ... yes check file size with os.fstat check file size with os.stat 2500000001L =?= 2500000001L ... yes play around with seek() and read() with the built largefile 0L =?= 0 ... yes .... So the question is this: is there reason to disbelieve test_largefile? There seems to be some disbelief from Barry that your patch is "enough", but it appears to work at least enough to fool test_largefile. ;-) - C From eric@enthought.com Fri Jan 4 21:48:49 2002 From: eric@enthought.com (eric) Date: Fri, 4 Jan 2002 16:48:49 -0500 Subject: [Python-Dev] weave -- inline C/C++ in Python, an implementation Message-ID: <062601c19569$984f67d0$777ba8c0@ericlaptop> Hello group, I'm pretty close to releasing weave 0.2, a tool that helps in combining C/C++ with Python code. There are basically three ways to use it. inline() offers inline C/C++ in Python. blitz() converts Python Numeric expression to C++ for fast execution. And, ext_tools offer a couple of classes that build extension classes. If you have a few cycles to spare, I'd appreciate a few eye balls on the documentation page and source. Also, if people could download the zip/exe/tar.gz files and let me know of any failures that would be helpful. The website provides info on how to test (it's very simple). Also, success reports on platforms would be good . W2K, RH 7.1, Debian are about all that has been tested. Here is the link: http://www.scipy.org/site_content/weave thanks, eric From martin@v.loewis.de Fri Jan 4 23:03:07 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sat, 5 Jan 2002 00:03:07 +0100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: <3C362C40.8010002@zope.com> (message from Chris McDonough on Fri, 04 Jan 2002 17:27:12 -0500) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au><15413.24423.772132.175722@anthem.wooz.org><200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> Message-ID: <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> > create large file via seek (may be sparse file) ... > 2500000001L =?= 2500000001L ... yes > check file size with os.fstat > check file size with os.stat > 2500000001L =?= 2500000001L ... yes > play around with seek() and read() with the built largefile > 0L =?= 0 ... yes > .... I have taken this is as success also. 
I don't know how Barry found that the tests fail, but must likely, one of the expect calls failed, resulting in a TestFailed exception - which would have been clearly visible. Regards, Martin From barry@zope.com Fri Jan 4 23:15:38 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 4 Jan 2002 18:15:38 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> Message-ID: <15414.14234.952188.980370@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> I don't know how Barry found that the tests fail, but must MvL> likely, one of the expect calls failed, resulting in a MvL> TestFailed exception - which would have been clearly visible. Okay, it's a build problem. For whatever reason, the -D flags set in configure weren't getting passed to gcc during the make. If I add that explicitly, everything works. So Py2.1.2 is fine with Martin's patch, which should be committed to the maint branch. If I come up with a better recipe for posix-large-files I'll submit it as a doc-fix. -Barry From barry@zope.com Sat Jan 5 00:28:09 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 4 Jan 2002 19:28:09 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> Message-ID: <15414.18585.311512.80269@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> I don't know how Barry found that the tests fail, but must MvL> likely, one of the expect calls failed, resulting in a MvL> TestFailed exception - which would have been clearly visible. >>>>> "BAW" == Barry A Warsaw writes: BAW> Okay, it's a build problem. For whatever reason, the -D BAW> flags set in configure weren't getting passed to gcc during BAW> the make. If I add that explicitly, everything works. So BAW> Py2.1.2 is fine with Martin's patch, which should be BAW> committed to the maint branch. BAW> If I come up with a better recipe for posix-large-files I'll BAW> submit it as a doc-fix. I think the following is a better suggestion: % CC='gcc -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' ./configure CC is propagated to the Makefile so that just "make" is necessary, but OPT and CFLAGS is not. (Although, I seem to vaguely remember that OPT /used/ to propagate -- I must be mis-remebering.) -Barry From fdrake@acm.org Sat Jan 5 00:32:44 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 4 Jan 2002 19:32:44 -0500 (EST) Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) 
In-Reply-To: <15414.18585.311512.80269@anthem.wooz.org> References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> Message-ID: <15414.18860.123306.760743@grendel.zope.com> Barry A. Warsaw writes: > CC is propagated to the Makefile so that just "make" is necessary, but > OPT and CFLAGS is not. (Although, I seem to vaguely remember that OPT > /used/ to propagate -- I must be mis-remebering.) Or it used to work -- that's how I remember it as well. Perhaps we should fix this. Feel free to file a bug report and assign it to me. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@zope.com Sat Jan 5 00:39:49 2002 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 4 Jan 2002 19:39:49 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <15414.18860.123306.760743@grendel.zope.com> Message-ID: <15414.19285.970977.725030@anthem.wooz.org> >>>>> "Fred" == Fred L Drake, Jr writes: Fred> Or it used to work -- that's how I remember it as well. Fred> Perhaps we should fix this. It looks like OPT propagates in Python2.2-cvs, e.g. try: % OPT=-g ./configure So maybe it's just a bug in release21-maint. Fred> Feel free to file a bug report and assign it to me. Done. -Barry From Anthony Baxter Sat Jan 5 02:50:52 2002 From: Anthony Baxter (Anthony Baxter) Date: Sat, 05 Jan 2002 13:50:52 +1100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objec ts fileobject.c,2.141,2.142 In-Reply-To: Message from "Martin v. Loewis" of "Fri, 04 Jan 2002 19:42:42 BST." <200201041842.g04IggA01405@mira.informatik.hu-berlin.de> Message-ID: <200201050250.g052oqR15435@mbuna.arbhome.com.au> >>> "Martin v. Loewis" wrote > Yes. But it also tightens the behaviour, so it should not be applied > to maintainance branches: no correct program would work better with > this patch, but currently broken programs may stop working. Yep, what he said. It's a bug, but it doesn't cause programs that are broken to work. It does potentially change how they break, though - this has been one of the criterion I've been using to say "nope". -- Anthony Baxter It's never too late to have a happy childhood. From martin@v.loewis.de Sat Jan 5 05:43:46 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sat, 5 Jan 2002 06:43:46 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) 
In-Reply-To: <15414.18585.311512.80269@anthem.wooz.org> (barry@zope.com) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> Message-ID: <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> > I think the following is a better suggestion: > > % CC='gcc -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' ./configure > > CC is propagated to the Makefile so that just "make" is necessary, but > OPT and CFLAGS is not. (Although, I seem to vaguely remember that OPT > /used/ to propagate -- I must be mis-remebering.) What version of configure are you using? On my system, with configure 1.207.2.7, doing CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' OPT="-g -O2 $CFLAGS" ./configure will result in a line OPT= -g -O2 -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 in the Makefile. This, in turn, will result in a compilation line gcc -c -g -O2 -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -I. -I./Include -DHAVE_CONFIG_H -o Objects/fileobject.o Objects/fileobject.c Something else is going on on your system. Did you remove config.cache before running configure? Regards, Martin From martin@v.loewis.de Sat Jan 5 05:53:50 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sat, 5 Jan 2002 06:53:50 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <15414.19285.970977.725030@anthem.wooz.org> (barry@zope.com) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <15414.18860.123306.760743@grendel.zope.com> <15414.19285.970977.725030@anthem.wooz.org> Message-ID: <200201050553.g055roo05275@mira.informatik.hu-berlin.de> > >>>>> "Fred" == Fred L Drake, Jr writes: > > Fred> Or it used to work -- that's how I remember it as well. > Fred> Perhaps we should fix this. > > It looks like OPT propagates in Python2.2-cvs, e.g. try: > > % OPT=-g ./configure > > So maybe it's just a bug in release21-maint. > > Fred> Feel free to file a bug report and assign it to me. > > Done. Before changing the documentation, I'd like to understand the problem Barry is seeing first, or I'd like to hear independent confirmation that the docs have a bug. Chris' report, in http://mail.python.org/pipermail/python-dev/2002-January/019177.html is contradicting: On one hand, he says that following the instructions, he got an interpreter that does LFS correctly; but he also says that the compilation line is just gcc -c -g -O2 -I. -I./Include -DHAVE_CONFIG_H -o Objects/fileobject.o which cannot possibly have the desired effect, AFAICT. Regards, Martin From barry@zope.com Sat Jan 5 16:42:35 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Sat, 5 Jan 2002 11:42:35 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> Message-ID: <15415.11515.142516.734683@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> What version of configure are you using? On my system, with MvL> configure 1.207.2.7, doing 1.207.2.7 is at the head of the release21-maint branch, so that's definitely the version I'm using. MvL> CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' OPT="-g MvL> -O2 $CFLAGS" ./configure MvL> will result in a line MvL> OPT= -g -O2 -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 MvL> in the Makefile. Not for me. In fact the -D symbols never make it into the Makefile at all. MvL> This, in turn, will result in a compilation MvL> line MvL> gcc -c -g -O2 -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 MvL> -I. -I./Include -DHAVE_CONFIG_H -o Objects/fileobject.o MvL> Objects/fileobject.c MvL> Something else is going on on your system. Did you remove MvL> config.cache before running configure? Of course! I always "make distclean" before running configure again. -Barry From marpet@linuxpl.org Sat Jan 5 17:02:34 2002 From: marpet@linuxpl.org (Marek Pętlicki) Date: 05 Jan 2002 18:02:34 +0100 Subject: [Python-Dev] RPM *.spec file Message-ID: <1010250155.2251.5.camel@marek.almaran.home> Hi! Could someone point me to the RPM *.spec file with which was built the Python 2.2 (ftp://ftp.python.org/pub/python/2.2/rpms/)? I've already installed from source, but I would like to make some order in my system, and currently I have 3 versions of Python in various dirs :-) Q: why is the specfile not distributed with the sources? I've found some outdated BeOpen specfile, but it doesn't work. Yes, I can fix it, but what for when somewhere there seems to exist a fixed one, since *.rpm binaries are supported on ftp.python.org? bye -- Marek Pętlicki Linux User ID=162988 From guido@python.org Sat Jan 5 17:29:01 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 05 Jan 2002 12:29:01 -0500 Subject: [Python-Dev] RPM *.spec file In-Reply-To: Your message of "05 Jan 2002 18:02:34 +0100." <1010250155.2251.5.camel@marek.almaran.home> References: <1010250155.2251.5.camel@marek.almaran.home> Message-ID: <200201051729.MAA12651@cj20424-a.reston1.va.home.com> > Hi! > > Could someone point me to the RPM *.spec file with which was built the > Python 2.2 (ftp://ftp.python.org/pub/python/2.2/rpms/)? I'm cc'ing Sean Reifschneider, who created them. I'm sure he has them somewhere. > I've already installed from source, but I would like to make some order in > my system, and currently I have 3 versions of Python in various dirs :-) > > Q: why is the specfile not distributed with the sources? I've found some > outdated BeOpen specfile, but it doesn't work. > Yes, I can fix it, but what for when somewhere there seems to exist a > fixed one, since *.rpm binaries are supported on ftp.python.org?
I've asked Sean to contribute his specfile, but he hasn't given them to me yet. You did read http://www.python.org/2.2/rpms.html I hope? --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Sat Jan 5 17:48:47 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sat, 5 Jan 2002 18:48:47 +0100 Subject: [Python-Dev] RPM *.spec file In-Reply-To: <1010250155.2251.5.camel@marek.almaran.home> (message from Marek =?iso-8859-13?Q?P=E6tlicki?= on 05 Jan 2002 18:02:34 +0100) References: <1010250155.2251.5.camel@marek.almaran.home> Message-ID: <200201051748.g05Hml211527@mira.informatik.hu-berlin.de> > Could someone point me to the RPM *.spec file with which was built the > Python 2.2 (ftp://ftp.python.org/pub/python/2.2/rpms/)? > > I've already installed from source, but I would like make some order in > my system, and currently I have 3 versions of Python in various dirs :-) Did you look at the src.rpm? That should definitely include a spec file (didn't check, though). Just do rpm -i of the src.rpm, then look into your packages/SPECS directory. You may also look at Misc/RPM, but Guido suggests that this is likely *not* the spec file that was used. > Q: why is the specfile not distributed with the sources? Because they have been contributed, and because the contributor did not contribute a stand-alone spec file (although he implicitly did so through the src.rpm). > I've found some outdated BeOpen specfile, but it doesn't work. Yes, > I can fix it, but what for when somewhere there seems to exist a > fixed one, since *.rpm binaries are supported on ftp.python.org? They are available on ftp.python.org. They are supported only if their creator supports them. Regards, Martin From martin@v.loewis.de Sat Jan 5 18:02:30 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sat, 5 Jan 2002 19:02:30 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <15415.11515.142516.734683@anthem.wooz.org> (barry@zope.com) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> Message-ID: <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> > MvL> OPT= -g -O2 -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 > > MvL> in the Makefile. > > Not for me. In fact the -D symbols never make it into the Makefile at > all. That is very puzzling. I just did a fresh checkout on cf.sf.net (the debian installation), using cvs -z9 -d:pserver:anonymous@cvs.python.sourceforge.net:/cvsroot/python co -d py21 -rrelease21-maint python/dist/src cd py21 CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' OPT="-g -O2 $CFLAGS" ./configure make The earliest indication that it was accepted correctly is in checking whether the C compiler (gcc -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 ) works... yes In the end, Makefile will have OPT= -g -O2 -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 Can you please try the same sequence of actions on the SF compile farm, and report what it does for you? 
Alternatively, can you spot the error in the commands I used? Regards, Martin From marpet@linuxpl.org Sat Jan 5 21:00:58 2002 From: marpet@linuxpl.org (Marek Pętlicki) Date: 05 Jan 2002 22:00:58 +0100 Subject: [Python-Dev] RPM *.spec file In-Reply-To: <200201051748.g05Hml211527@mira.informatik.hu-berlin.de> References: <1010250155.2251.5.camel@marek.almaran.home> <200201051748.g05Hml211527@mira.informatik.hu-berlin.de> Message-ID: <1010264459.3051.9.camel@marek.almaran.home> In a message of Sat, 05-01-2002, 18:48, Martin v. Loewis writes: > > Could someone point me to the RPM *.spec file with which was built the > > Python 2.2 (ftp://ftp.python.org/pub/python/2.2/rpms/)? > > > > I've already installed from source, but I would like to make some order in > > my system, and currently I have 3 versions of Python in various dirs :-) > > Did you look at the src.rpm? That should definitely include a spec > file (didn't check, though). Just do rpm -i of the src.rpm, then look > into your packages/SPECS directory. I'm sure src.rpm will have the correct version :-) I only hoped to get a few kilo heavy specfile instead of a few mega src.rpm :-) Sorry if I disturb you hackers with such a silly request. > You may also look at Misc/RPM, but Guido suggests that this is likely > *not* the spec file that was used. definitely > > I've found some outdated BeOpen specfile, but it doesn't work. Yes, > > I can fix it, but what for when somewhere there seems to exist a > > fixed one, since *.rpm binaries are supported on ftp.python.org? > They are available on ftp.python.org. They are supported only if their > creator supports them. I understand that, it is the same as with Windows installers (you don't _have to_ supply the installer creator of course), but since they _are_ present in SRPMs they _could_ be present in *.tgz, couldn't they? (well, at least you could erase those outdated ones from Misc/RPM, they are useless anyway :-) regards -- Marek Pętlicki Linux User ID=162988 From marpet@linuxpl.org Sat Jan 5 21:01:01 2002 From: marpet@linuxpl.org (Marek Pętlicki) Date: 05 Jan 2002 22:01:01 +0100 Subject: [Python-Dev] RPM *.spec file In-Reply-To: <200201051748.g05Hml211527@mira.informatik.hu-berlin.de> References: <1010250155.2251.5.camel@marek.almaran.home> <200201051748.g05Hml211527@mira.informatik.hu-berlin.de> Message-ID: <1010264088.3050.8.camel@marek.almaran.home> In a message of Sat, 05-01-2002, 18:48, Martin v. Loewis writes: > > Could someone point me to the RPM *.spec file with which was built the > > Python 2.2 (ftp://ftp.python.org/pub/python/2.2/rpms/)? > > > > I've already installed from source, but I would like to make some order in > > my system, and currently I have 3 versions of Python in various dirs :-) > > Did you look at the src.rpm? That should definitely include a spec > file (didn't check, though). Just do rpm -i of the src.rpm, then look > into your packages/SPECS directory. well, I hoped to get only a few kilo heavy specfile instead of a few mega src.rpm :-) > You may also look at Misc/RPM, but Guido suggests that this is likely > *not* the spec file that was used. definitely > > I've found some outdated BeOpen specfile, but it doesn't work. Yes, > > I can fix it, but what for when somewhere there seems to exist a > > fixed one, since *.rpm binaries are supported on ftp.python.org? > They are available on ftp.python.org. They are supported only if their > creator supports them. I understand that, it is the same as with Windows installers (you don't _have to_ supply the installer creator of course), but since they _are_ in SRPMs they _could_ be present in *.tgz, couldn't they? :-) And: where can I find those SRPMs anyway? regards -- Marek Pętlicki Linux User ID=162988 From marpet@linuxpl.org Sat Jan 5 21:20:19 2002 From: marpet@linuxpl.org (Marek Pętlicki) Date: 05 Jan 2002 22:20:19 +0100 Subject: [Python-Dev] RPM *.spec file In-Reply-To: <1010264088.3050.8.camel@marek.almaran.home> References: <1010250155.2251.5.camel@marek.almaran.home> <200201051748.g05Hml211527@mira.informatik.hu-berlin.de> <1010264088.3050.8.camel@marek.almaran.home> Message-ID: <1010265603.3049.14.camel@marek.almaran.home> sorry for this one, must've sent it out of the trash :-( -- Marek Pętlicki Linux User ID=162988 From barry@zope.com Sat Jan 5 21:53:40 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sat, 5 Jan 2002 16:53:40 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> Message-ID: <15415.30180.542499.182362@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> That is very puzzling. I just did a fresh checkout on MvL> cf.sf.net (the debian installation), using MvL> Can you please try the same sequence of actions on the SF MvL> compile farm, and report what it does for you? Alternatively, MvL> can you spot the error in the commands I used? Actually, let's do something different. My laptop is a pretty stock Mandrake 8.1 installation, while my desktop is a RH 6.1-ish system (with a few kernel and other package updates). On a fresh checkout of the release21-maint branch on the RH system, everything works fine; the posix-large-file recipe does indeed propagate the extended OPT macro into the Makefile. A fresh checkout on the Mandrake system does not. Weird. What's different? On both systems we're using autoconf 2.13, and m4 version 1.4. On the RH system I've got GNU make version 3.77, but on Mandrake it's 3.79.1 so that's one obvious difference. But I'd be surprised if this is a make bug because I didn't think make was invoked during the configure phase. I've gotta run, but I'll try to look into this some more later on.
-Barry From nhodgson@bigpond.net.au Sat Jan 5 22:37:39 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Sun, 6 Jan 2002 09:37:39 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> Message-ID: <016e01c19639$94c909b0$0acc8490@neil> Explored the possibility of detecting Unicode arguments to open and using _wfopen on Windows NT. This led to trying to store Unicode strings in the f_name and f_mode fields of the file object which started to escalate into complexity making Mark's mbcs choice more understandable. Another approach is to use utf-8 as the Py_FileSystemDefaultEncoding and then convert to and from in each file system access function. The core file open function from fileobject.c changed to work with utf-8 is at the end of this message with the important lines in the #ifdef MS_WIN32 section. Along with that change goes a change in Py_FileSystemDefaultEncoding to be "utf-8" rather than "mbcs". This change works for me on Windows 2000 and allows access to all files no matter what the current code page is set to. On Windows 9x (not yet tested), the _wfopen call should fail causing a fallback to fopen. Possibly the OS should be detected instead and _wfopen not attempted on 9x. On 9x, mbcs may be a better choice of encoding although it may also be possible to ask the file system to find the wide character file name and return the mangled short name that can then be used by fopen. The best approach to me seems to be to make Py_FileSystemDefaultEncoding settable by the user, at least allowing the choice between 'utf-8' and 'mbcs' with a default of 'utf-8' on NT and 'mbcs' on 9x. This approach can be extended to other file system calls with, for example, os.listdir and glob.glob upon detecting a utf-8 default encoding, using wide character system calls and converting to utf-8. Please criticise any stylistic or correctness issues in the code as it is my first modification to the Python sources. Neil static PyObject * open_the_file(PyFileObject *f, char *name, char *mode) { assert(f != NULL); assert(PyFile_Check(f)); assert(name != NULL); assert(mode != NULL); assert(f->f_fp == NULL); /* rexec.py can't stop a user from getting the file() constructor -- all they have to do is get *any* file object f, and then do type(f). Here we prevent them from doing damage with it. 
*/ if (PyEval_GetRestricted()) { PyErr_SetString(PyExc_IOError, "file() constructor not accessible in restricted mode"); return NULL; } errno = 0; #ifdef HAVE_FOPENRF if (*mode == '*') { FILE *fopenRF(); f->f_fp = fopenRF(name, mode+1); } else #endif { Py_BEGIN_ALLOW_THREADS #ifdef MS_WIN32 if (strcmp(Py_FileSystemDefaultEncoding, "utf-8") == 0) { PyObject *wname; PyObject *wmode; wname = PyUnicode_DecodeUTF8(name, strlen(name), "strict"); wmode = PyUnicode_DecodeUTF8(mode, strlen(mode), "strict"); if (wname && wmode) { f->f_fp = _wfopen(PyUnicode_AS_UNICODE(wname), PyUnicode_AS_UNICODE(wmode)); } Py_XDECREF(wname); Py_XDECREF(wmode); } if (NULL == f->f_fp) { f->f_fp = fopen(name, mode); } #else f->f_fp = fopen(name, mode); #endif Py_END_ALLOW_THREADS } if (f->f_fp == NULL) { #ifdef NO_FOPEN_ERRNO /* Metroworks only, wich does not always sets errno */ if (errno == 0) { PyObject *v; v = Py_BuildValue("(is)", 0, "Cannot open file"); if (v != NULL) { PyErr_SetObject(PyExc_IOError, v); Py_DECREF(v); } return NULL; } #endif if (errno == EINVAL) PyErr_Format(PyExc_IOError, "invalid argument: %s", mode); else PyErr_SetFromErrnoWithFilename(PyExc_IOError, name); f = NULL; } return (PyObject *)f; } From jack@oratrix.nl Sat Jan 5 23:05:57 2002 From: jack@oratrix.nl (Jack Jansen) Date: Sun, 06 Jan 2002 00:05:57 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: Message by "M.-A. Lemburg" , Fri, 04 Jan 2002 18:06:53 +0100 , <3C35E12D.4A72AC02@lemburg.com> Message-ID: <20020105230603.1758BE8451@oratrix.oratrix.nl> Recently, "M.-A. Lemburg" said: > Jack Jansen wrote: > > > > Off on a slight tangent: > > On Mac OS X the default 8-bit encoding is UTF8. os.listdir() handles > > this fine and so does open(). The OS does all the hard work for > > you [...] > > But in Python (unix-Python we're talking here, not MacPython), > > unicode(filename) fails, because site.encoding is "ascii". > > > > Would it be safe to set site.encoding to utf8 on Mac OS X by default? > > I'd rather suggest to use UTF-8 as default encoding in the > subsystem layer I was talking about. Uhm... Do you mean Py_FileSystemDefaultEncoding? Otherwise: what do you mean? And, if you do mean Py_FSDE, would that also work for listdir()? No, I guess it can't because listdir() returns simple strings, so by the time I pass them to unicode() all knowledge that they came from listdir is gone... Hmm, shouldn't StringObjects themselves carry an encoding field (defaulting to sys.encoding)? That would solve quite a few issues. read() from a binary file would return the special encoding "binary", for instance, and then the "u" and "u#" formats could make a distinction between character strings (which would be converted to unicode using the encoding they carry) and binary strings (which would be interpreted as 16-bit chars). But interning may be a showstopper, now that I think of it... > Making UTF-8 the default Python system encoding would have many other > consequences -- and you'd probably lose a great deal of portability > since UTF-8 conversion (nearly) always will succeed while ASCII can > easily fail on other systems which use e.g. Latin-1 as native > encoding. What are your reasons for asserting this? If I read this correctly this would make Python compatible to the least common denominator of all platforms, while I think I would prefer it to allow access to all the niceties a platform gives. On Unix you really don't have a good guess for the encoding, but on MacOS and Windows you do... 
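For concreteness, a minimal sketch of the failure (and the explicit workaround) on such a system -- the loop and names here are only illustrative:

import os

for n in os.listdir("."):            # on Mac OS X these are UTF-8 byte strings
    try:
        u = unicode(n)               # goes through site.encoding, i.e. "ascii"
    except UnicodeError:             # any non-ASCII byte ends up here
        u = unicode(n, "utf-8")      # explicit decode works; the OS guarantees UTF-8
    print repr(u)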
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From jack@oratrix.nl Sat Jan 5 23:18:36 2002 From: jack@oratrix.nl (Jack Jansen) Date: Sun, 06 Jan 2002 00:18:36 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: Message by "Martin v. Loewis" , Fri, 4 Jan 2002 19:40:34 +0100 , <200201041840.g04IeY201401@mira.informatik.hu-berlin.de> Message-ID: <20020105231841.54F1BE8451@oratrix.oratrix.nl> Recently, "Martin v. Loewis" said: > When the discussion of tagging binary strings in source code came up, > I started to look into the standard library which string literals > would have to be tagged as byte strings, and which are really > character strings. > > I found that the overwhelming majority of string literals in the > standard Python library really denotes byte strings, if you ignore doc > strings. Sometimes, it isn't obvious that they are binary strings, > hence the smiley. [leaving only one example in:] > version = "HTTP/0.9" > status = "200" > reason = "" > > Protocol elements, thus byte string. I think you're taking it too far now. I think we should assume that ASCII survives. If Python runs on an EBCDIC machine (does it?) I assume that at some point the conversion of EBCDIC<->ASCII is handled semi-transparently. Also, as these things are readable they should be treated as such. It should be possible to do >>> print u"Funny reply to my "+unicode(version)+u" message" especially when the "funny reply" bit is in Japanese. What I would agree with, I think, is if we tag these strings as "ascii". And that is also what the BDFL pronounced at some point: Python sourcecode is ASCII, and if you put 8 bit characters in there you're living dangerously. Only when octal or hex escapes appear in a sourcecode string can it be anything other than ascii. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From martin@v.loewis.de Sat Jan 5 23:58:13 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 6 Jan 2002 00:58:13 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <15415.30180.542499.182362@anthem.wooz.org> (barry@zope.com) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> <15415.30180.542499.182362@anthem.wooz.org> Message-ID: <200201052358.g05NwDN14410@mira.informatik.hu-berlin.de> > What's different? On both systems we're using autoconf 2.13, and m4 > version 1.4. Should be irrelevant, since you are using the generated configure (I hope). > On the RH system I've got GNU make version 3.77, but on > Mandrake it's 3.79.1 so that's one obvious difference. 
But I'd be > surprised if this is a make bug because I didn't think make was > invoked during the configure phase. Right. That leaves it to /bin/sh. Regards, Martin From martin@v.loewis.de Sun Jan 6 00:10:42 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 6 Jan 2002 01:10:42 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: <20020105231841.54F1BE8451@oratrix.oratrix.nl> (message from Jack Jansen on Sun, 06 Jan 2002 00:18:36 +0100) References: <20020105231841.54F1BE8451@oratrix.oratrix.nl> Message-ID: <200201060010.g060AgF14443@mira.informatik.hu-berlin.de> > [leaving only one example in:] > > version = "HTTP/0.9" > > status = "200" > > reason = "" > > > > Protocol elements, thus byte string. > > I think you're taking it too far now. I think we should assume that > ASCII survives. That is not the issue. That string *is* a byte string. The HTTP protocol is not defined in terms of character sequences, but in terms of byte sequences, or else interoperability would be lost. If those strings would converted to character strings (i.e. Unicode strings), it would still work, but it won't be correct anymore. That's just like giving a file size as a double: it would probably work, but it won't be correct. > Also, as these things are readable they should be treated as such. It > should be possible to do > >>> print u"Funny reply to my "+unicode(version)+u" message" > especially when the "funny reply" bit is in Japanese. That is a nice property of so-called "text" protocols. That still doesn't make it a character-oriented protocol; HTTP *is* a byte oriented protocol. If you have a binary protocol, there is likely also a version field in it, but you'd have to write print u"Funny reply to my "+XDRversion2string(version)+u" message" > What I would agree with, I think, is if we tag these strings as > "ascii". That is pointless. Having strings tagged with their encoding is also a possible architecture for a programming language, but none that Python has chosen to take. Instead, Python has selected to have only a single data type for character data, namely Unicode. > Python sourcecode is ASCII, and if you put 8 bit characters in there > you're living dangerously. [...] > Only when octal or hex escapes appear in a sourcecode string can it be > anything other than ascii. The octal escapes, in themselves, are also ASCII, or else you could not put them into source code. The traditional string type in Python really is a byte string type first of all. It can be used as a character string type only if you imply a character set and an encoding. The source being ASCII just gives you a guarantee about the bytes you get at runtime. Regards, Martin From martin@v.loewis.de Sun Jan 6 00:20:27 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 6 Jan 2002 01:20:27 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <20020105230603.1758BE8451@oratrix.oratrix.nl> (message from Jack Jansen on Sun, 06 Jan 2002 00:05:57 +0100) References: <20020105230603.1758BE8451@oratrix.oratrix.nl> Message-ID: <200201060020.g060KRR14467@mira.informatik.hu-berlin.de> > Hmm, shouldn't StringObjects themselves carry an encoding field > (defaulting to sys.encoding)? 
That approach has been discussed during the design phase of the Unicode API; Bill Janssen was the first to propose this in response to my talk http://www.python.org/workshops/1997-10/proceedings/loewis.html During the Unicode design, this idea came up sometimes, but it always turned out that proposers could not give a coherent semantics to such tags. Just explain what happens if you add two strings that have different encodings. > That would solve quite a fewb issues. And introduce many new ones. > > Making UTF-8 the default Python system encoding would have many other > > consequences -- and you'd probably lose a great deal of portability > > since UTF-8 conversion (nearly) always will succeed while ASCII can > > easily fail on other systems which use e.g. Latin-1 as native > > encoding. > > What are your reasons for asserting this? If I understand this claim correctly, he means: "Currently, if auto-conversion (to ASCII) succeeds, the result is likely correc. If the default encoding was UTF-8, conversion would succeed for all Unicode objects, but give incorrect results for many users, e.g. if they use Latin-1 on their terminal" This is actually a frequent problem since the introduction of UTF-8: Some applications display the bytes that make up an UTF-8 string as if it was a Latin-1 string, rendering it completely unreadable (although I can already recognize my name if I run into such an application). This problem may go unnoticed during testing, whereas an exception is likely noticed. > If I read this correctly this would make Python compatible to the > least common denominator of all platforms, while I think I would > prefer it to allow access to all the niceties a platform gives. It does no such thing. The application has full control over all conversions, if it initiates them explicitly. Explicit is better then implicit. Regards, Martin From martin@v.loewis.de Sun Jan 6 00:33:08 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 6 Jan 2002 01:33:08 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <016e01c19639$94c909b0$0acc8490@neil> (nhodgson@bigpond.net.au) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> Message-ID: <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> > This change works for me on Windows 2000 and allows access to all files > no matter what the current code page is set to. On Windows 9x (not yet > tested), the _wfopen call should fail causing a fallback to fopen. Possibly > the OS should be detected instead and _wfopen not attempted on 9x. Now that you have that change, please try to extend it to posixmodule.c. This is where I gave up. Notice that, with changing Py_FileSystemDefaultEncoding and open() alone, you have worsened the situation: os.stat will now fail on files with non-ASCII names on which it works under the mbcs encoding, because windows won't find the file (correct me if I'm wrong). > On 9x, mbcs may be a better choice of encoding although it may also > be possible to ask the file system to find the wide character file > name and return the mangled short name that can then be used by > fopen. 
It is not just 9x: if you have ten (*) different APIs to open a file, 10 different APIs to stat a file, and so on, and have to select some of them at compile time, and some of them at run-time, it gets messy very quickly. (*) I'd expect that other systems may also have proprietary system calls to do these things, using either wchar_t* or a proprietary Unicode type. > The best approach to me seems to be to make > Py_FileSystemDefaultEncoding settable by the user, at least allowing > the choice between 'utf-8' and 'mbcs' with a default of 'utf-8' on > NT and 'mbcs' on 9x. By the user, or by the application? How can the application make a more educated guess than Python proper? Alternatively, how can the user (or her Administrator) know what value to put in there? On Windows, probably neither is a good idea; if the file system default encoding is used in the future, fixing it at mbcs is the best I can think of. > Please criticise any stylistic or correctness issues in the code > as it is my first modification to the Python sources. The code looks fine. I'd encourage you to continue on that topic; just expect that it will need many more rounds for completion. Regards, Martin From jafo@tummy.com Sun Jan 6 01:16:58 2002 From: jafo@tummy.com (Sean Reifschneider) Date: Sat, 5 Jan 2002 18:16:58 -0700 Subject: [Python-Dev] RPM *.spec file In-Reply-To: <20020106003817.7055.qmail@scrye.com>; from kevin@scrye.com on Sat, Jan 05, 2002 at 05:38:17PM -0700 References: <20020106003817.7055.qmail@scrye.com> Message-ID: <20020105181658.A26453@tummy.com> >> Could someone point me to the RPM *.spec file with which was built the >> Python 2.2 (ftp://ftp.python.org/pub/python/2.2/rpms/)? The nice thing about an source RPM file (.src.rpm) is that includes *ALL* things necessary to reproduce the build of the software. This includes the pristine source, any modifications required to build, shell commands to build/install it, and the meta-data. So, pick up the .src.rpm file from the URL you mention above. It's got everything you need... You can either install it and find the .spec file in /usr/src/redhat/SPECS, or you can extract the .src.rpm using "rpm2cpio" and using CPIO to get just the files you want. >I've asked Sean to contribute his specfile, but he hasn't given them >to me yet. Well, at some point we were talking about wether to eliminate the patches and how to do it for expat and ... I was actually thinking last night of putting them into CVS (I made some modifications that I thought would let Zope 2.4.3 build, but didn't and I backed them out). How would you like to deal with the .spec file and patches? I can easily enough turn the patches into sed commands in the setup, which would mean you could build the RPMs from the Python tar file directly, if included there. Do you want me to just mail the new .spec file to you when I ask you to upload the new RPMs, or do you want a script that would check out the latest .spec file from my CVS into your tree, update the version number in it, and go from there, or can I get access to your CVS for checking in new .specs? Or, I guess we could get it updated into the current CVS and I could submit patches through the Sourceforge. Except I always sourceFORGET to click the "click here to attach file" button. Comments? Sean -- "McGuyver stole all his tricks from Dr. Who." Sean Reifschneider, Inimitably Superfluous tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python From barry@zope.com Sun Jan 6 01:26:48 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Sat, 5 Jan 2002 20:26:48 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> <15415.30180.542499.182362@anthem.wooz.org> <200201052358.g05NwDN14410@mira.informatik.hu-berlin.de> Message-ID: <15415.42968.453024.267482@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> Right. That leaves it to /bin/sh. Yup. A bash bug? /bin/sh (aka bash) version 2.03.8 on RH6.1 vs. 2.05.1 on MD8.1. It isn't sed, which is at version 3.02 on both. Hmm, a bash bug? -Barry From nhodgson@bigpond.net.au Sun Jan 6 01:47:43 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Sun, 6 Jan 2002 12:47:43 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> Message-ID: <021901c19654$21f2e3f0$0acc8490@neil> Martin v. Loewis: > Now that you have that change, please try to extend it to > posixmodule.c. This is where I gave up. OK. os.open, os.stat, and os.listdir now work. Placed temporarily at http://pythoncard.sourceforge.net/posixmodule.c os.stat is ugly because the posix_do_stat function is parameterised over a stat function pointer but it is always _stati64 on Windows so the patch just assumes _wstati64 is right. os.listdir returns Unicode objects rather than strings. This makes glob.glob work as well so my earlier script that finds the *.html files and opens them works. Unfortunately, I expect most callers of glob() will be expecting narrow strings. > Notice that, with changing > Py_FileSystemDefaultEncoding and open() alone, you have worsened the > situation: os.stat will now fail on files with non-ASCII names on > which it works under the mbcs encoding, because windows won't find the > file (correct me if I'm wrong). If you give it a file name encoded in the current code page then it may fail where it did not before. Neil From guido@python.org Sun Jan 6 03:26:23 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 05 Jan 2002 22:26:23 -0500 Subject: [Python-Dev] RPM *.spec file In-Reply-To: Your message of "Sat, 05 Jan 2002 18:16:58 MST." <20020105181658.A26453@tummy.com> References: <20020106003817.7055.qmail@scrye.com> <20020105181658.A26453@tummy.com> Message-ID: <200201060326.WAA07572@cj20424-a.reston1.va.home.com> FYI, I've checked in Sean's RPM spec file and the patches under Misc/RPM/, replacing the previous (outdated) contents there. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From marpet@linuxpl.org Sun Jan 6 11:50:21 2002 From: marpet@linuxpl.org (Marek =?iso-8859-13?Q?P=E6tlicki?=) Date: 06 Jan 2002 12:50:21 +0100 Subject: [Python-Dev] RPM *.spec file In-Reply-To: <200201060326.WAA07572@cj20424-a.reston1.va.home.com> References: <20020106003817.7055.qmail@scrye.com> <20020105181658.A26453@tummy.com> <200201060326.WAA07572@cj20424-a.reston1.va.home.com> Message-ID: <1010317823.1347.0.camel@marek.almaran.home> W li=B6cie z nie, 06-01-2002, godz. 04:26, Guido van Rossum pisze:=20 > FYI, I've checked in Sean's RPM spec file and the patches under > Misc/RPM/, replacing the previous (outdated) contents there. thank you very much: this is what suits me best - I prefer the specfile in the main tarfile (AND cvs). In this way if building RPM-s doesn't work I can always build it in the 'classic' way but still I don't have to wait for the src.rpms to appear (don't want to say that this is the case with Python releases, but generarly I prefer original tarballs _with_ rpm specs to src.rpms with nobody-knows-what-changes-applied). thanks and best regards --=20 Marek P=EAtlicki Linux User ID=3D162988 From marpet@linuxpl.org Sun Jan 6 11:50:20 2002 From: marpet@linuxpl.org (Marek =?iso-8859-13?Q?P=E6tlicki?=) Date: 06 Jan 2002 12:50:20 +0100 Subject: [Python-Dev] RPM *.spec file In-Reply-To: <200201060326.WAA07572@cj20424-a.reston1.va.home.com> References: <20020106003817.7055.qmail@scrye.com> <20020105181658.A26453@tummy.com> <200201060326.WAA07572@cj20424-a.reston1.va.home.com> Message-ID: <1010317824.1636.1.camel@marek.almaran.home> W li=B6cie z nie, 06-01-2002, godz. 04:26, Guido van Rossum pisze:=20 > FYI, I've checked in Sean's RPM spec file and the patches under > Misc/RPM/, replacing the previous (outdated) contents there. thank you very much: this is what suits me best - I prefer the specfile in the main tarfile (AND cvs). In this way if building RPM-s doesn't work I can always build it in the 'classic' way but still I don't have to wait for the src.rpms to appear (don't want to say that this is the case with Python releases, but generarly I prefer original tarballs _with_ rpm specs to src.rpms with nobody-knows-what-changes-applied). thanks and best regards --=20 Marek P=EAtlicki Linux User ID=3D162988 From martin@v.loewis.de Sun Jan 6 12:14:55 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 6 Jan 2002 13:14:55 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <021901c19654$21f2e3f0$0acc8490@neil> (nhodgson@bigpond.net.au) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> Message-ID: <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> > > Now that you have that change, please try to extend it to > > posixmodule.c. This is where I gave up. > > OK. os.open, os.stat, and os.listdir now work. Placed temporarily at > http://pythoncard.sourceforge.net/posixmodule.c Looks good. The posix_do_stat changes contain an error; you have put Python API calls inside the BEGIN_ALLOW_THREADS block. 
That is wrong: you must always hold the interpreter lock when calling Python API. Also, when calling _wstati64, you might want to assert that the function pointer is _stati64. Likewise, the code inside posix_open should hold the interpreter lock. > os.listdir returns Unicode objects rather than strings. This makes > glob.glob work as well so my earlier script that finds the *.html > files and opens them works. Unfortunately, I expect most callers of > glob() will be expecting narrow strings. That is not that much of a problem; we could try to define API where it is the caller's choice. However, the size of your changes is really disturbing here. There used to be already four versions of listing a directory; now you've added a fifth one. And it isn't even clear whether this code works on W9x, is it? There must be a way to fold the different Windows versions into a single one; perhaps it is acceptable to drop Win16 support. I think three different versions should be offered to the end user: - path is plain string, result is list of plain strings - path is Unicode string, result is list of Unicode strings - path is Unicode string, result is list of plain strings Perhaps one could argue that the third version isn't really needed: anybody passing Unicode strings to listdir should be expected to get them back also. That would leave us with two functional features on windows. I envision a fragment that looks like this #ifdef windows if (argument is unicode string) { #define strings wide #include "listdir_win.h" #undef strings } else { convert argument to string #define strings narrow #include "listdir_win.h" #undef strings #endif If you provide a similar listdir_posix and listdir_os2, it should be possible to get a uniform implementation. > > Notice that, with changing > > Py_FileSystemDefaultEncoding and open() alone, you have worsened the > > situation: os.stat will now fail on files with non-ASCII names on > > which it works under the mbcs encoding, because windows won't find the > > file (correct me if I'm wrong). > > If you give it a file name encoded in the current code page then it may > fail where it did not before. I was actually talking about stat as a function that you haven't touched, yet. Now, os.rename will fail if you pass two Unicode strings referring to non-ASCII file names. posix_1str and posix_2str are like the stat implementation, except that you cannot know statically what the function pointer is. Regards, Martin From martin@v.loewis.de Sun Jan 6 11:37:00 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 6 Jan 2002 12:37:00 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) 
In-Reply-To: <15415.42968.453024.267482@anthem.wooz.org> (barry@zope.com) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> <15415.30180.542499.182362@anthem.wooz.org> <200201052358.g05NwDN14410@mira.informatik.hu-berlin.de> <15415.42968.453024.267482@anthem.wooz.org> Message-ID: <200201061137.g06Bb0o01516@mira.informatik.hu-berlin.de> > Yup. A bash bug? > > /bin/sh (aka bash) version 2.03.8 on RH6.1 vs. 2.05.1 on MD8.1. It > isn't sed, which is at version 3.02 on both. > > Hmm, a bash bug? Could be a test problem as well. Line 1451 in configure currently reads if test -z "$OPT" My guess that this is where the environment setting is overwritten. Just put echo "Current value of OPT is x${OPT}x" before this test, and echo "New value of OPT is x${OPT}x" after the if statement. Actually, after re-reading the autoconf documentation, I think I see what's happending. $OPT starts with a - (HYPHEN MINUS), so test treats it as an option. Please try replacing the test with if test ${OPT+set} != set HTH, Martin From mal@lemburg.com Sun Jan 6 16:39:19 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 06 Jan 2002 17:39:19 +0100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <3C358DD0.815BC93D@lemburg.com> <200201041817.g04IH3p01371@mira.informatik.hu-berlin.de> Message-ID: <3C387DB7.1000201@lemburg.com> Martin v. Loewis wrote: >>We'd still need to support other OSes as well, though, and I >>don't think that putting all this code into fileobject.c is >>a good idea -- after all opening files is needed by some other >>parts of Python as well and may also be useful for extensions. >> > > The stuff isn't in fileobject.c. Py_FileSystemDefaultEncoding > is defined in bltinmodule.c. That's the global, sure but the code using it is scattered across fileobject.c and the posix module. I think it would be a good idea to put all this file naming code into some Python/fileapi.c file which then also provides C APIs for extensions to use. These APIs should then take the file name as PyObject* rather than char* to enable them to handle Unicode directly. > Also, on other OSes: You can pass Unicode object to open on all > systems. If Py_FileSystemDefaultEncoding is NULL, it will fall back to > site.encoding. > > Of course, if the system has an open function that expects wchar_t*, > we might want to use that instead of going through a codec. Off hand, > Win32 seems to be the only system where this might work, and even > there, it won't work on Win95. I expect this to become a standard in the next few years. 
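For what it's worth, the idea can be modelled at the Python level roughly as follows; all names are invented and this only sketches the intent of such a subsystem, not its C API:

import sys

if sys.platform == "win32":
    FILE_SYSTEM_ENCODING = "mbcs"    # stand-in for Py_FileSystemDefaultEncoding
else:
    FILE_SYSTEM_ENCODING = "utf-8"

def filename_to_bytes(name):
    # the single choke point that open(), os.stat(), os.listdir() etc.
    # would all funnel their file name arguments through
    if isinstance(name, unicode):
        return name.encode(FILE_SYSTEM_ENCODING)
    return name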
>>I'd suggest to implement something similiar to the DLL loading >>code which is also implemented as subsystem in Python. >> > > I'd say this is over-designed. It is not that there are ten > alternative approaches to doing encodings in file names, and we only > support two of them, but it is rather that there are only two, and we > support all three of them :-) > > Also, it is more difficult than threads: for threads, there is a fixed > set of API features that need to be represented. Doing Py_UNICODE* > opening alone is easy, but look at the number of posixmodule functions > that all expect file names of some sort. Doesn't that support the idea of having a small subsystem in Python which exposes the Unicode aware APIs to Python and its extensions ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Sun Jan 6 16:58:41 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 06 Jan 2002 17:58:41 +0100 Subject: [Python-Dev] Unicode support in getargs.c References: Message-ID: <3C388241.909@lemburg.com> Jack Jansen wrote: > I'm going to jump out of this discussion for a while. Martin and Mark have > a completely different view on Unicode than I do, apparently, and I think > I should first try and see if I can use the current implementation. > > For the record: my view of Unicode is really "ascii done right", i.e. a > datatype that allows you to get richer characters than what 1960s ascii > gives you. For this it should be as backward-compatible as possible, i.e. > if some API expects a unicode filename and I pass "a.out" it should > interpret it as u"a.out". All the converting to different charsets is > icing on the cake, the number one priority should be that unicode is as > compatible as possible with the 8-bit convention used on the platform > (whatever it may be). No, make that the number 2 priority: the number one > pritority is compatibility with 7-bit ascii. Using Python StringObjects as > binary buffers is also far less common than using StringObjects to store > plain old strings, so if either of these uses bites the other it's the > binary buffer that needs to suffer. UnicodeObjects and StringObjects > should behave pretty orthogonal to how FloatObjects and IntObjects behave. It would be nice if Unicode could be made to behave that way, but unfortunately, the 8-bit world is so differentiated with lots of different encodings that not even Harry Potter would have much luck finding the right magic to apply. Another problem is that of the getargs.c API itself: since it returns pointers to data buffers, auto-conversions (if at all possible) which involve temporary objects must be handled differently than normal Python string objects. Now, the question is whether you are willing to pay for the comfort of getting direct access to a Py_UNICODE buffer (or char buffer) with extra copy-action and additional PyMem_Free() cleanup overhead or not. The "O" parser marker doesn't provide any magic on its own, but also reduces the need for copying data and handling memory management in you APIs. In my last message on this thread, I proposed to add "eu#" which returns a Py_UNICODE buffer, possibly decoding a string object using the given encoding first. As Martin noted, this option requires extra copying but simplifies the C coding somewhat. 
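Expressed in Python rather than C, the conversion such a marker would apply is roughly this (the helper name is made up; the real thing would of course live in getargs.c):

def as_unicode(obj, encoding):
    # pass a unicode object through untouched; decode a plain string with
    # the given encoding first
    if isinstance(obj, unicode):
        return obj
    return unicode(obj, encoding)

as_unicode("a.out", "utf-8")      # -> u'a.out'
as_unicode(u"a.out", "utf-8")     # -> u'a.out' (no conversion needed)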
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Sun Jan 6 17:16:31 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 06 Jan 2002 18:16:31 +0100 Subject: [Python-Dev] Re: [XML-SIG] printing Unicode xml to StringIO References: <200112271506.fBRF6uF26242@spinaker.ins.cwi.nl> <200112271736.fBRHaom02848@mira.informatik.hu-berlin.de> <3C2C3406.CB6D49DB@lemburg.com> <200112281013.fBSADHu01819@mira.informatik.hu-berlin.de> <3C2C4E71.2BA58BF3@lemburg.com> <200112281125.fBSBPkC02403@mira.informatik.hu-berlin.de> <200112281420.JAA22961@cj20424-a.reston1.va.home.com> <3C2C901A.D61001D9@lemburg.com> <200112281527.KAA23737@cj20424-a.reston1.va.home.com> Message-ID: <3C38866F.90309@lemburg.com> Guido van Rossum wrote: >>>- Since we added a note to the docs that StringIO supports Unicode, we >>> clearly should continue to support that, and it's a bug if it >>> doesn't. >>> >>I still believe that the docs are wrong, but nevermind. I'll fix >>StringIO.py to continue to support Unicode in addition to strings >>and buffer objects. It's basically only about special casing >>Unicode in the .write() method. >> > > Thanks. > > >>BTW, I was never aware of the doc changes in this area and the >>test suite didn't bring up the issues either. >> > > Can you please add something to the test suite that makes sure this > feature works? > > >>>- OTOH, Unicode for cStringIO should be considered at best a feature >>> request. I don't mind if cStringIO doesn't support Unicode -- it >>> never has, AFAIK, so it won't break much code. I don't believe it's >>> much faster than StringIO, unless you use the C API (like cPickle >>> does). >>> >>Unicode support in cStringIO would require a new implementation >>since the machinery uses raw byte buffers. >> > > That's why I don't care much about it. :-) > > >>>- Of course, when Unicode is supported, mixing ASCII and Unicode >>> should be supported too. (But not necessarily mixing 8-bit strings >>> containing characters in the range \200-\377, since there's no >>> default encoding for this range.) >>> >>In StringIO.py this is not much of a problem since it uses >>a list of snippets. Note that this is also why StringIO.py "supported" >>Unicode in the first place (and that's why I think it was more an >>artifact of the implementation than true intent). >> > > But it was useful! :-) > > >>>- Since this changed from 2.1 to 2.2, we should restore this >>> capability in 2.2.1; I would say that 2.2.1 can't go out until this >>> is fixed. >>> > > Try to mark the checkin messages as "2.2.1 bugfix", for the 2.2.1 > patch czar. Checked in. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Sun Jan 6 17:30:17 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 06 Jan 2002 18:30:17 +0100 Subject: [Python-Dev] Add platform.py to the standard lib ?! Message-ID: <3C3889A9.6050607@lemburg.com> This is a multi-part message in MIME format. --------------030706070909040208000601 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Should I go ahead and checkin platform.py into the Python 2.2 tree together with some docs ? 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ --------------030706070909040208000601 Content-Type: message/rfc822; name="[Python-bugs-list] [ python-Feature Requests-494854 ] add platform.py" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="[Python-bugs-list] [ python-Feature Requests-494854 ] add platform.py" Received: from mail.python.org (mail.python.org [63.102.49.29]) by www.egenix.com (8.11.2/8.11.2/SuSE Linux 8.11.1-0.5) with ESMTP id fBJGsRv07455 for ; Wed, 19 Dec 2001 17:54:27 +0100 Received: from localhost.localdomain ([127.0.0.1] helo=mail.python.org) by mail.python.org with esmtp (Exim 3.21 #1) id 16Gjyu-0002Bp-00; Wed, 19 Dec 2001 11:54:12 -0500 Received: from [216.136.171.253] (helo=usw-sf-netmisc.sourceforge.net) by mail.python.org with esmtp (Exim 3.21 #1) id 16Gjy6-00025X-00 for python-bugs-list@python.org; Wed, 19 Dec 2001 11:53:22 -0500 Received: from usw-sf-web3-b.sourceforge.net ([10.3.1.7] helo=usw-sf-web3.sourceforge.net) by usw-sf-netmisc.sourceforge.net with esmtp (Exim 3.22 #1 (Debian)) id 16Gjy4-0002BT-00; Wed, 19 Dec 2001 08:53:20 -0800 Received: from nobody by usw-sf-web3.sourceforge.net with local (Exim 3.22 #1 (Debian)) id 16Gjy4-0005mw-00; Wed, 19 Dec 2001 08:53:20 -0800 To: noreply@sourceforge.net From: noreply@sourceforge.net Message-Id: Subject: [Python-bugs-list] [ python-Feature Requests-494854 ] add platform.py Sender: python-bugs-list-admin@python.org Errors-To: python-bugs-list-admin@python.org X-BeenThere: python-bugs-list@python.org X-Mailman-Version: 2.0.8 (101270) Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: List which receives bug reports on Python List-Unsubscribe: , List-Archive: Date: Wed, 19 Dec 2001 08:53:20 -0800 MIME-Version: 1.0 Feature Requests item #494854, was opened at 2001-12-18 17:16 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=355470&aid=494854&group_id=5470 Category: Python Library Group: None Status: Open Priority: 5 Submitted By: Jason R. Mastaler (jasonrm) >Assigned to: M.-A. Lemburg (lemburg) Summary: add platform.py Initial Comment: Here's a request to add Marc-Andre Lemburg's platform.py to the Python standard library. It provides more complete platform information than either sys.platform or distutils.util.get_platform() For more info, see: http://www.lemburg.com/files/python/platform.py ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-12-19 01:27 Message: Logged In: YES user_id=38388 No problem from here :-) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=355470&aid=494854&group_id=5470 _______________________________________________ Python-bugs-list maillist - Python-bugs-list@python.org http://mail.python.org/mailman/listinfo/python-bugs-list --------------030706070909040208000601-- From mal@lemburg.com Sun Jan 6 17:41:32 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 06 Jan 2002 18:41:32 +0100 Subject: [Python-Dev] Add platform.py to the standard lib ?! References: <3C3889A9.6050607@lemburg.com> Message-ID: <3C388C4C.6030305@lemburg.com> M.-A. Lemburg wrote: > Should I go ahead and checkin platform.py into the Python 2.2 > tree together with some docs ? I meant CVS tree... of course. 
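For anyone who hasn't tried it: a quick illustration of what the module reports beyond sys.platform, assuming platform.py is somewhere on sys.path (the values in the comments are made up and machine dependent):

import sys, platform

print sys.platform          # e.g. 'linux2' -- the coarse tag only
print platform.platform()   # e.g. 'Linux-2.4.9-i686-with-glibc2.2'
print platform.uname()      # (system, node, release, version, machine, processor)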
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jason@jorendorff.com Sun Jan 6 18:10:06 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Sun, 6 Jan 2002 12:10:06 -0600 Subject: [Python-Dev] Add platform.py to the standard lib ?! In-Reply-To: <3C3889A9.6050607@lemburg.com> Message-ID: > Should I go ahead and checkin platform.py into the Python 2.2 > tree together with some docs ? I noticed that the regular expressions in this module, throughout, don't use raw strings. Don't know if that's intentional. ## Jason Orendorff http://www.jorendorff.com/ From guido@python.org Sun Jan 6 18:54:03 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 06 Jan 2002 13:54:03 -0500 Subject: [Python-Dev] Add platform.py to the standard lib ?! In-Reply-To: Your message of "Sun, 06 Jan 2002 18:30:17 +0100." <3C3889A9.6050607@lemburg.com> References: <3C3889A9.6050607@lemburg.com> Message-ID: <200201061854.NAA05416@cj20424-a.reston1.va.home.com> > Should I go ahead and checkin platform.py into the Python 2.2 > tree together with some docs ? There is no Python 2.2 tree. Maybe you mean the 2.3 tree? No problem for me. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Sun Jan 6 19:44:45 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 6 Jan 2002 20:44:45 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <3C387DB7.1000201@lemburg.com> (mal@lemburg.com) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <3C358DD0.815BC93D@lemburg.com> <200201041817.g04IH3p01371@mira.informatik.hu-berlin.de> <3C387DB7.1000201@lemburg.com> Message-ID: <200201061944.g06Jij801344@mira.informatik.hu-berlin.de> > That's the global, sure but the code using it is scattered > across fileobject.c and the posix module. I think it would be > a good idea to put all this file naming code into some > Python/fileapi.c file which then also provides C APIs for > extensions to use. These APIs should then take the file name > as PyObject* rather than char* to enable them to handle > Unicode directly. What do you gain by that? Most of the posixmodule functions that take filenames are direct wrappers around the system call. Using another level of indirection is only useful if the fileapi.c functions are used in different places. Notice that each function (open, access, stat, etc) is used exactly *once* currently, so putting this all into a single place just makes the code more complex. The extensions module argument is a red herring: I don't think there are many extension modules out there which want to call access(2) but would like to do so using a PyObject* as the first argument, but numbers as the other arguments. > > Of course, if the system has an open function that expects wchar_t*, > > we might want to use that instead of going through a codec. Off hand, > > Win32 seems to be the only system where this might work, and even > > there, it won't work on Win95. > > I expect this to become a standard in the next few years. I doubt that. 
Posix people (including developers of various posixish systems) have frequently rejected that idea in recent years. Even for the most recent system in this respect (OS X), we hear that they still open files with a char*, where char is byte - the only advancement is that there is a guarantee that those bytes are UTF-8. It turns out that this is all you need: with that guarantee, there is no need for an additional set of APIs. UTF-8 was originally invented precisely to represent file names (and was called UTF-1 at that time); it is more likely that more systems will follow this convention. If so, a global per-system file system encoding is all that's needed. The only problem is that on Windows, MS has already decided that the APIs are in CP_ANSI, so they cannot change it to UTF-8 now; that's why Windows will need special casing if people are unhappy with the "mbcs" approach (which some apparantly are). > > Also, it is more difficult than threads: for threads, there is a fixed > > set of API features that need to be represented. Doing Py_UNICODE* > > opening alone is easy, but look at the number of posixmodule functions > > that all expect file names of some sort. > > > Doesn't that support the idea of having a small subsystem > in Python which exposes the Unicode aware APIs to Python > and its extensions ? No. It is a lot of work, and an additional layer of indirection, with no apparent advantage. Feel free to write a PEP, though. Regards, Martin From martin@v.loewis.de Sun Jan 6 19:48:41 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 6 Jan 2002 20:48:41 +0100 Subject: [Python-Dev] Unicode support in getargs.c In-Reply-To: <3C388241.909@lemburg.com> (mal@lemburg.com) References: <3C388241.909@lemburg.com> Message-ID: <200201061948.g06Jmfa01367@mira.informatik.hu-berlin.de> > In my last message on this thread, I proposed to add "eu#" which > returns a Py_UNICODE buffer, possibly decoding a string object > using the given encoding first. As Martin noted, this option > requires extra copying but simplifies the C coding somewhat. Also, while it simplifies processing compared to "O", I cannot see any simplification compared to "O&". So I'd be more in favor of offering standard conversion functions for O& instead of inventing new getargs modifiers all the time. This would also simplify creation of cross-version extension modules: people could just incorporate the code of the conversion function into their code base, trusting that O& had been available for ages. Regards, Martin From jack@oratrix.nl Sun Jan 6 21:36:45 2002 From: jack@oratrix.nl (Jack Jansen) Date: Sun, 06 Jan 2002 22:36:45 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: Message by "Martin v. Loewis" , Sun, 6 Jan 2002 01:33:08 +0100 , <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> Message-ID: <20020106213650.8E224E8451@oratrix.oratrix.nl> Recently, "Martin v. Loewis" said: > > This change works for me on Windows 2000 and allows access to all files > > no matter what the current code page is set to. On Windows 9x (not yet > > tested), the _wfopen call should fail causing a fallback to fopen. Possibly > > the OS should be detected instead and _wfopen not attempted on 9x. > > Now that you have that change, please try to extend it to > posixmodule.c. This is where I gave up. 
Notice that, with changing > Py_FileSystemDefaultEncoding and open() alone, you have worsened the > situation: os.stat will now fail on files with non-ASCII names on > which it works under the mbcs encoding, because windows won't find the > file (correct me if I'm wrong). Could someone who really understands this issue (Martin?) perhaps write a test case for this? I think something like creating a file with some nonascii chars in the name, and verifying that open(), readdir(), os.stat() and various others work as expected is what would be needed (but I'm not sure I fully understand it:-). -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From jack@oratrix.nl Sun Jan 6 21:50:55 2002 From: jack@oratrix.nl (Jack Jansen) Date: Sun, 06 Jan 2002 22:50:55 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: Message by "Martin v. Loewis" , Sun, 6 Jan 2002 20:44:45 +0100 , <200201061944.g06Jij801344@mira.informatik.hu-berlin.de> Message-ID: <20020106215100.1D2DDE8451@oratrix.oratrix.nl> Recently, "Martin v. Loewis" said: > > That's the global, sure but the code using it is scattered > > across fileobject.c and the posix module. I think it would be > > a good idea to put all this file naming code into some > > Python/fileapi.c file which then also provides C APIs for > > extensions to use. These APIs should then take the file name > > as PyObject* rather than char* to enable them to handle > > Unicode directly. > > What do you gain by that? Most of the posixmodule functions that take > filenames are direct wrappers around the system call. Using another > level of indirection is only useful if the fileapi.c functions are > used in different places. Well, I only know about the Mac and (to a lesser extent) about Windows, but there's lots of methods that are not in {posix,mac,nt}module.c there that want filenames. And I think mmap also uses filenames, no? All in all I'm in favor of a single place where file name encoding magic is handled. Whether a fileapi.c is needed or something simpler can do the trick (a PyArg_Parse fmt that returns two items: the filename to use plus a routine you're expected to call on it before you return?) I'm not sure. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From mal@lemburg.com Sun Jan 6 22:15:51 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 06 Jan 2002 23:15:51 +0100 Subject: [Python-Dev] Add platform.py to the standard lib ?! References: Message-ID: <3C38CC97.2D6AE2F0@lemburg.com> Jason Orendorff wrote: > > > Should I go ahead and checkin platform.py into the Python 2.2 > > tree together with some docs ? > > I noticed that the regular expressions in this module, throughout, > don't use raw strings. Don't know if that's intentional. It's not necessary since the escapes used in the module are not unescaped by the Python parser, but you're probably right: better safe than sorry... 
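The escapes that do bite are the ones the string literal parser recognises itself; a small example:

import re

re.search("\d+", "abc 42")        # works: "\d" is not a string escape, so the
                                  # backslash happens to survive into the pattern
len("\b")                         # -> 1: "\b" *is* an escape (backspace) ...
len(r"\b")                        # -> 2: ... the raw string keeps backslash + "b"
re.search(r"\bfoo\b", "a foo b")  # word boundary matches as intended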
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jack@oratrix.nl Sun Jan 6 22:19:04 2002 From: jack@oratrix.nl (Jack Jansen) Date: Sun, 06 Jan 2002 23:19:04 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects Message-ID: <20020106221917.15318E8451@oratrix.oratrix.nl> Something I've wanted for a long time, and maybe I should drop the idea now. There's a lot of Python objects that are really little more than wrappers around an opaque C pointer (plus all the methods to operate on it, etc). These objects usually have accompanying Parse() and Build() functions, that you pass to the O& format for PyArg_Parse and Py_BuildValue. This all works fine, but I think we can do one better. If we have slots in the type structure to store the Parse and Build functions we could add a new format specifier O@ (or whatever other character is free:-) that has a typeobject parameter and a C pointer parameter. One advantage is that this would fit a lot better with the new class inheritance scheme. Moreover, and more importantly, this would give us a handle to use from Python code, so structmodule could (un)pack structures that contain pointers to objects that are python-wrappable, calldll could neatly wrap functions that have python-wrappable objects, etc. Is this a good idea? -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From martin@v.loewis.de Sun Jan 6 22:42:34 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 6 Jan 2002 23:42:34 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <20020106213650.8E224E8451@oratrix.oratrix.nl> (message from Jack Jansen on Sun, 06 Jan 2002 22:36:45 +0100) References: <20020106213650.8E224E8451@oratrix.oratrix.nl> Message-ID: <200201062242.g06MgYa02308@mira.informatik.hu-berlin.de> --Multipart_Sun_Jan__6_23:42:34_2002-1 Content-Type: text/plain; charset=US-ASCII > Could someone who really understands this issue (Martin?) perhaps > write a test case for this? I think something like creating a file > with some nonascii chars in the name, and verifying that open(), > readdir(), os.stat() and various others work as expected is what would > be needed (but I'm not sure I fully understand it:-). I'll attach a script below. It contains UTF-8 encoded data, so to prevent transmission errors, it comes base-64 attached. Running it creates a three additional files in the current directory; I recommend to run it in an empty directory. In case you cannot view the source code properly, I attach a screenshot of my editor. 
Regards, Martin

--Multipart_Sun_Jan__6_23:42:34_2002-1
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="uni.py"

# -*- coding: utf-8 -*-
import locale, os

locale.setlocale(locale.LC_ALL, "")

filenames = [
    unicode("Grüß-Gott","utf-8"),
    unicode("Γειά-σας","utf-8"),
    unicode("Здравствуйте","utf-8"),
    ]

for name in filenames:
    print repr(name)
    f = open(name,"w")
    f.write((name+'\n').encode("utf-8"))
    f.close()
    os.stat(name)

print os.listdir(".")

for name in filenames:
    os.rename(name,"tmp")
    os.rename("tmp",name)

--Multipart_Sun_Jan__6_23:42:34_2002-1
Content-Type: image/png
Content-Disposition: inline; filename="uni.png"

[uni.png: screenshot of the script above as shown in the editor; binary image data omitted]
ogA25Nz5ecNvZK+nTlbyNRauyhkYh7nbLVfpXLJv/tJmqU/ucWgP5AMQVKQ438nHbj1g+qbbv04v f8jyU7a7RH3W/HLQdxy2qg/AmjaI8/TApp3joBc9D2DPHe88XXjfS405uZ3JZYS0G9xLx3DR01pt nLA/O3+uzvZE3HZudy7p+mugA5dhX63zvjHRi67nxOWOd94o/Gw5E/Ul2dINxLNfF9pt1vqK67df +7a7dMwDXJJ9xfmUHM19Tvual0HlRpGrr4v8rxf1LpBC/ANXZl9xvpV5oz19IVtuf3W7U7qvqNwL 6ILqOw4rnLwA2C1xXhRzX2TXl76zFLv+dvdGSAO0RbrvHADotHGcz9jLPfqxMOmXxp1lz20iN5Yf 3cJev2l+uDmscHyW03fl47j9AtjKBp3tueOaF5njqQ+Zn6jD6I73xt1oZxMrsXznjW255aQvDate HTi/z/CvO+njk7vddPlTyimtfLEkwES/KYri9Hj69OnT8Xg87qbNtDduTU6Ie3DSNY+7X0BcjUew NDQelXF7e3t7e3s8Ht/+5a1z54P4WO8TOvNkOXAxxDmTXGrmXep+AZdKnO/Ufi4WA2D/xHm2Sx0/ 27jmAHFFinPjnS/KuOYAcXkq3HkDr4o6+xDWLJf0RaHOuOYAS4gU5zv5OE4MjxY6g41rDhCX8c5T y7drW4zqZE60JjuDp/5AlYHJ2phTf3573wgljYWNaw4Q175a55c03nkld6SvzubmiOUT5WzSeDWu OcBy9hXnlzreed3wuCqfWjp9+cb8vXVEG9ccYLp9xflWFo32ueJn3kvt9s+45gDDifOiWPgiu7ny 5tpC69r2F2CKSPedM4sho70BEIvxzs+8NPtZ9uFRWjbrp0dvu5zy16zHthjXHGDPjHc+23jnfXLH Hc8d+btvcPR0OblXxhnXHGDPjHc+yAXfsjz81H7cg2BccyCK0eOduxRukMv7uO9s0CeEzjxZDlw8 cX6lcvuoLzXzLnW/gGvjynbOaJz83s/VbQBUxHk242qPYzx1gOVEinPjnW+iflfblEfiGE8dYDnO nZ83/GqpS32WeKB9MZ46cJ0ixflOPi77hjPxLPE046kDLMd456nl27Ut+jtdzw572liy8bC29GDk ResatNzlq1f7yhkYY/X6d5bfVx/jqQMsZ1+t84sc77zSjrrOBl+ilT9kOPPq10RTcpZehIHdEqs1 ZI2nDlyzfcV59PHO+8659kXa8JnDTXx4amL08bmeIb8y46kD12Bfcb6VuaI93dE9XN/qc8XM0nEV IvKNpw5cEnFeFIsNu5LbLiy146RebN9iudXrLCcxPkrW5kLkX4hKAgwU6b5zAKCT8c7PvDT8LPuI HuYhT0abvcyzJjbNx9XHeOoAUxjvfLbxzhOXUPWlTueNaonxwvtuAMs9F77oufNxl5IZTx1gCuOd DzLv3WvVJeJ7O327bZXi3rdtPHVgLqPHO3fufJB5P473luKlbTuuQ2eeLAc2t5cr2xPfRFauyWr2 Furb1udS3+hL3S9gb7TOASA8cQ4A4W0zBEs5oR8SAGaxQev89HgS5AAwI53tABCeOAeA8MQ5AIQn zgEgPHEOAOGJcwAIT5wDQHjiHADCE+cAEN6WD3kFAGaxQZx7wisAzGsv453LeAAYzblzAAhPnANA eOIcAMIT5wAQnjgHgPDEOQCEJ84BIDxxDgDhiXMACE+cA0B44hwAwhPnABCeOAeA8MQ5AIQnzgEg PHEOAOGJcwAIT5wDQHipOH/37PO7Z59Xq8pq5tqpw83hcHPonD9L+QAw0E3fC++efX7105M1qxLL 4eZwejxlLV8URdYqADDQNXa2L/01pTOzBTkAy+ltnXeqd1NXoVi246uXzoZl2a6teqTrOVfvpq7m 15dstHE7lx+yC/VK5tY/LasV3re/Sx+fxHHTiwAQUUecV8HWSL5G93v9177pPvWe6mq60X1d/7Wx QBV4fcsn1JP77L70Vb4+0dhoPYnTEvVf9PiMO24A7FlHnJdhlnXuPLdFu7fu6Kz6d4bivNY8Pu2v I0tsBYBF5XW2DzGlszr3mvDo15Avvb+dy7f75AGIbv44H63dCXx2leiBlHtt/FzHp+pgyK0DAPt0 jVe2Uzo9ntpn+vtupgdgz3Ya5yMSJXoIZdV/yvGJfqAAaMvobG9cEz773duNu7POpk5jmSGdxn0X 7c+l76L3zvm59Z/r+KS3q+8dIKLeOO+MuuEzExL50Tk9ZGKgIfUfuDtZF59nzV/h+MhsgAuz0852 AGA4cQ4A4YlzAAhPnANAeJHi/AIGX1/5JjH3pAFciUhxnmWH2b/+YCfDx4MBILRIcb70OOUAENQG z2zvG188Pe748HHKc58VM2J88c7lE+UUXU3zdcZ9b4+UCsDl2WYIliFjpTdGaB0+TvmIAV5zxxfP nZ5ruzOO+w7AJdmms70vaHO70+fqfs/Nv7metra3cd8BCGpHA6Q2bHWmPHcQ0iHl5Lq2cd8BmGi/ cb4HmzSUr3DcdwAminRlOwDQaZs43+FN4WfVW8kjOrfH3QI+fRVXxgFcg2062xM3pHUaMU55tYnR 5+AT44InbkibboVx3wG4MJudO2+nbCJ3x41TPmXk8hEvnc3Rxk1lK4xrrmkOcCWcO1/V+g95XXNz AGxlgziP/qxWGQnA3midA0B44jy8++Px/nhsT8+1PAD7t2WcH24O0R9ntof6v7i97ZwesfwedgeA ETaL8/Ki69DnoS/vunHjowMEpbM9vPvjsWpk16fnWh6A/bvMOC+78Rud+eV0e/6I5Yue8ctzy6/m 9L00pK3c13nePilezkl3zmugA0S0wWNkqrQoJ4YMCt6Yk14ld5zyEeOaD9n0fsYv1wQHuHgbxHln yJ1dfsZNr7z8hZ1fB2CHDJBaFMtf0W38cgAWJc6LYuEGtPHLAVhagDjPPXcOANcmQJzPGN65l7Ml tK9ZG1iBrOXb9Rl4/eBol3czPcA1CBDnuXLHKV96vPB5xy8ffiPZi9vb6s609i1qAFySzeJ80SZg 7hDmQ2Z2DnC+8vjluar8HhjkmuYAQV3mY2RWs374LZq4shwgqCuK88vIqsvYCwDmdUVxDgCXSpxf iLlGLr+eEdD3Nu773uoDxHLJ453PWHhfUft5fNtcV613lrOf3WzLqlt94dxx4s+amMEjxqHf8/sC rGyzK9u3uoh63M3inavs/DrwGUdeGXeT/d7Msgt9RzXraNdTf8hafcf/Mt4XYBYXeN955ao+5uYK 7+sZfm2rcd8b26p+NQ49MEWAc+edY43Xf/aNR9456HjipYHjmvfNX228886Z7U//6ukxDeXMsme4 vUBfimSNg547fnzf/p6dP2S71UuN92vgOPHt89kLndgeNw698emBUoDxzusrthO0bzzy9sdc38Cs M44vvrfxzstEb2fDog3BRP2zjs+Q+Y2vC/OOE9/QbkOX6nlfePQesJEA452fLWo/oox3vlXkDHwY 
flaBm7dN+2I+sXzuuXOAsy7n3PmU1Fw6EnLLP9vH3n7yfJ++BvqitorYzaO9ru9ERt+5c4ApLifO pwgx3nnfaYUd2qp6uzosQhpYU4BL4Zio75o4AC7G7uJ86WfLDKlA/dd5ryheZ5W26YmeuMI/XcNZ TjSMWHjg+5i2n69BQR9+AKwmQGd79UFcn0ivkrh4vl1IrPHOl6jhvHKPZ9/yQ+Ynju3E2wHa0405 o/vSXQoHLOE3RVGcHk+fPn06Ho/H3bRF9ubK20YXs5tBd+TK//zgqlR3LHe+2riZ+fb29vb29ng8 vv3L2911tu9T34fmlXyYXsxuBt2RK//zA4YQ5wAQnjgHgPDEOQCEd8njnc+oPQTIVjUBgLbN4ry8 Nm/6tTwrfCdoXz+8h0eFA0AlfGe7i3sBIMBjZIovO7eH5Hff8rnzi/5be9sjmQLAVgKMd547jvWU 8bOHlA8AexNvvPO5slZmA3AxYnS2zyLxyHfXtQEQ2hXFeVHrGCi+DHUtdQBCC39l+wjlDXJa5ABc jN3F+ezjZ+euOHCcbFfMAbAfATrbE+NYd14kP2X87ELHOwABbRbnWWM+5g4QOdf86lU3swGwZ7vr bN+n9kNet6oJALSJcwAIT5wDQHjiHADCE+eDrHyTunviAchysXGem4iJ5de/jt1TbgDIcrFxDgDX Y5v7zkeMOz68nMQArLnLF11N83JOtdbw8ttjwCT213jqAAy3zXjns4w73rd83wCsucsP3PSQ+jcW qL4QeDQNALPYb2d7bradXX7GsMx6dB0ALG1Hz2xPjEc+y/LF8leMz3j9HQAMt6M4L/rHI59l+Xbn 9pSqzlK+Bj0As9hjZ3vueOTGLwfgyu0ozs8Oc95YYEp+Z31XGLGh6au4Mg6A4TbobJ9r3PH08u0z 6427yxrxOeJMfKI+Q74EGGcdgLlsc+58xLjjIy4mb79an5N+tTG/fdfZ2TLbyycmGjTNAciyo872 PVv/Ia9rbg6A6MQ5AIQnzgEgPHEOAOGJcwAIT5wDQHjiHADCE+cAEJ44B4DwxDkAhCfOASA8cQ4A 4YlzAAhPnANAeOIcAMIT5wAQnjgHgPDEOQCEJ84BIDxxHsb98Xh/PLan51oegLjmjPN3zz7PWNr6 DjeHRUsbUX59lRe3t53TfRLLz7unAGxus9b5Vtl/uDnMEmYrJOLh5nB6PC1R8unxJNEBLsmccf7q pyczlraQREAulJ1zlX9/PFaN7Pr0XMsDENdNe9a7Z59f/fSkaj1XId03v3q1MTNRTt8qfepNyXoo 9s0foSyqUUhn+dXM9irVS+3KdC5cbyW3N92YM2Nne/FrA33pbzAArKMjzotfk3j4dFEU9eROl1P+ bKye0Eid6te++eO0+5/7yu/cemNOuyu7s3+7sYpwBWCc7s72vqDN7U5fs/t9V13lAxeW3wDMYvy5 883PlJft3eUu6Vq6/Pbm1tkQAJenu7M9iqrfu1gmDpcuHwBmcQmPkTk9nha982rp8gFgou44D/FA mHS+Tu8nn7L69OzP/QKR+9w3V94BXJLuzvbEDWmdRtx7Vm3i7MKNYKtCqG9+Qt89Zp3z07rQi58A AAxnSURBVOVXr3bWp7Fu4t42AJiu99x5O2UTudv5UmNmVoENffmX+0yY3HLSudt+tT6nb3pg4UXm reFZT4nRNAe4MJdw7vyCLfeQ1yWKBWArHXG++R1oAEAWrXMACE+cA0B44hwAwhPnABCeOAeA8MQ5 AIQnzgEgPHEOAOGJcwAIT5wDQHjiHADCE+cAEJ44B4DwxDkAhCfOASA8cQ4A4YlzAAhPnANAeKk4 f/fs87tnn1erymrm2qnDzeFwc+icP0v5ADDQTd8L7559fvXTkzWrEsvh5nB6PGUtXxRF1ioAMNA1 drYv/TWlM7MFOQDL6W2dd6p3U1ehWLbjq5fOhmXZrq16pOs5V++mrubXl2y0cTuXH7IL9Urm1j8t qxXet79LH5/EcdOLABBRR5xXwdZIvkb3e/3Xvuk+9Z7qarrRfV3/tbFAFXh9yyfUk/vsvvRVvj7R 2Gg9idMS9V/0+Iw7bgDsWUecl2GWde48t0W7t+7orPp3huK81jw+7a8jS2wFgEXldbYPMaWzOvea 8OjXkC+9v53Lt/vkAYhu/jgfrd0JfHaV6IGUe238XMen6mDIrQMA+3SNV7ZTOj2e2mf6+26mB2DP dhrnIxIleghl1X/K8Yl+oABoy+hsb1wTPvvd2427s86mTmOZIZ3GfRftz6XvovfO+bn1n+v4pLer 7x0got4474y64TMTEvnROT1kYqAh9R+4O1kXn2fNX+H4yGyAC7PTznYAYDhxDgDhiXMACE+cX6z7 4/H+eGxPRykfgOHE+SL2cDPYi9vbzukVyt/D7gNcFXE+vxCDmizamB4+Dg0AsxDnF+v+eKwazfXp KOUDMNw2z2wfMQ53ZyG544KvMM56u2m+1fjufZ3h9RPe9Zeq+S9ub+svlVFdfzVdfr3y+++lALgM G8T5jONw544LvtU46xuO795WD+n0S+2JvhUB2NZ+O9uHZFVunm01zvrexnfPJb8Bdm5HA6ROHIc7 d/DQuq3GWb+28d0BWMiO4ryYbxzurcYRz3KF47sDsJA9drZ3jsMNAPTZUZyn8/twcxgX8KuNI14a 90Vk+iqjj88QI25Sd1k7wJo26GwfNw739PKHLL/QOOsrbDf3a0TjbrSBCw9cHoCVbXPufMQ43O2X shZuz1xunPXGTWUbju+e1gjm9g3lox8Tq2kOsLIddbZfkvXDbFcJup+aAFwJcX4hlktQvesA+yfO ASA8cb44444DsDRxvoj6ReYbjju+Fc8MAFiZOJ9f1lVpF9mY9hQggJWJ88UZdxyApRnvPLV8NWfK eOfbjjvets5xMN45wJqMd36N447v6jgAMN1+O9svabzzuczVkR79OADQsKMBUq9wvPO9cRwAgtpR nBdXNt753jgOAHHtsbPdeOcAkGVHcX7N450PNG7c8a3GX9d2B1iN8c43G++8tPS448O/Xmx7HACY wnjnvdNzjXd+1nLjjvdVr+/XuY6DpjnAynbU2X5J9hNmmyTrfnYf4EqI8zDGtcslK8A1EOcAEJ44 B4DwxDkAhCfOASA8cQ4A4YlzAAhPnANAeOIcAMIT5wAQnjgHgPDEOQCEJ84BIDxxDgDhiXMACE+c A0B44hwAwhPnABCeOAeA8MQ5AIQnzgEgPHEOAOGJcwAIT5wDQHjiHADCE+cAEJ44B4DwxDkAhCfO ASA8cQ4A4YlzAAhPnANAeOIcAMIT5wAQnjgHgPDEOQCEJ84BIDxxDgDhiXMACE+cA0B44hwAwhPn ABCeOAeA8MQ5AIQnzgEgPHEOAOGJcwAIT5wDQHjiHADCE+cAEJ44B4DwxDkAhCfOASA8cQ4A4Ylz AAhPnANAeOIcAMIT5wAQnjgHgPDEOQCEJ84BIDxxDgDhiXMACE+cA0B44hwAwhPnABCeOAeA8MQ5 
AIQnzgEgPHEOAOGJcwAIT5wDQHjiHADCE+cAEJ44B4DwxDkAhCfOASA8cQ4A4YlzAAhPnANAeOIc AMIT5wAQnjgHgPDEOQCEJ84BIDxxDgDhiXMACE+cA0B44hwAwhPnABCeOAeA8MQ5AIQnzgEgPHEO AOGJcwAIT5wDQHjiHADCE+cAEJ44B4DwxDkAhCfOASA8cQ4A4YlzAAhPnANAeOIcAMIT5wAQnjgH gPDEOQCEJ84BIDxxDgDhiXMACE+cA0B44hwAwhPnABCeOAeA8MQ5AIQnzgEgPHEOAOGJcwAIT5wD QHjiHADCE+cAEJ44B4DwxDkAhCfOASA8cQ4A4YlzAAhPnANAeOIcAMIT5wAQnjgHgPDEOQCEJ84B IDxxDgDhiXMACE+cA0B44hwAwhPnABCeOAeA8MQ5AIQnzgEgPHEOAOGJcwAIT5wDQHjiHADCE+cA EJ44B4DwxDkAhCfOASA8cQ4A4YlzAAhPnANAeOIcAMIT5wAQnjgHgPDEOQCEJ84BIDxxDgDhiXMA CE+cA0B44hwAwhPnABCeOAeA8MQ5AIQnzgEgPHEOAOGJcwAIT5wDQHjiHADCE+cAEJ44B4DwxDkA hCfOASA8cQ4A4YlzAAhPnANAeOIcAMIT5wAQnjgHgPDEOQCEJ84BIDxxDgDhiXMACE+cA0B44hwA whPnABCeOAeA8MQ5AIQnzgEgPHEOAOGJcwAIT5wDQHjiHADCE+cAEJ44B4DwxDkAhCfOASA8cQ4A 4YlzAAhPnANAeOIcAMIT5wAQnjgHgPDEOQCEJ84BIDxxDgDhiXMACE+cA0B44hwAwhPnABCeOAeA 8MQ5AIQnzgEgPHEOAOGJcwAIT5wDQHjiHADCE+cAEJ44B4DwxDkAhCfOASA8cQ4A4YlzAAhPnANA eOIcAMIT5wAQnjgHgPDEOQCEJ84BIDxxDgDhiXMACE+cA0B44hwAwhPnABCeOAeA8MQ5AIQnzgEg PHEOAOHdbF0BAOALp8dT7iriHAB25HBzGLFWM87HlQIATFG2yEe0y0vOnQNAeOIcAMK7xjgf3ZXB XJZ4C2Ys018IEM41xjnbOj2e6pdonB5P9X/DC1mmdkVRFIebw/Tyz5bQuUDWQUgXnlX+XNsFtnKN ce5yv7053Byqf1cSKn2ZOstBSJTT98fvPwVElxHnjY+G9q8rfxD31ae6OLCzSsPr2VdO+jiQ1mia p5fs/LV6OxLvb+cfZ+f72Pd3sugXi86DUJ95PV9rgLlc5n3n9U/Gxkdn1gdlohy2Ur4LZxOxmm73 7Xe+p2u+v4tuqP21wN8tXIPZ4nxXHxlzVWZXO3XB6l+wphzz3HW9v8DF6Ijz4efb0vrKmat8LkaI P4Ct/p5HbHeui+mytmu++eZPnz9FR5wv3bQN8dkNDVv9PY/YbuPkwjrbNd9886fPn2K2K9uv50aX +tn3K9nlTTjOAMONjPO9fbzOcpdw52XSnQuXSeMioxGyLkXc8Divv9HG1xd/WkCWjEvh6h837Q/l 9T990vXp1Gjtna1zYxONl3IrzAiJrqoRb+KQ5WfU93+k7+9w+E5NrE9jK7n/L4Adyruyvf7/fA// 5zvrk8jdgacbh8xntMOXd0+NPsLpL1hn3/fE8rM0jkf8saVfGrfdIX///sjhAlzmfedL26q1dzEG HrStjrP3FAinGed7Oym+rawOfBblOAPXYEhzovPzUOscAHYk3Xrpy3txDgD70pfZiaS/5BHV9nkG dJ+1AiC03jg/3ByygkdKAcBWuuP8Mh6TMu7iqdC7DMB1GvkYmfqvjRuI+24uGhKufct3zu+rT6Iy jQEx2+UvdPl0X/mJ/UrXc7mqAhBR9xAsjYdNppOjCvV2ViXiNr3d4suvC1nl1Gue2ERn+VmPIB2o 79Fyif06W8+FqgpAUPu9FG6J4Tc2z7+BFdi8ngDE0t3Zvuj543bhZXpV7emBA8HWew6upKnqvD4A nbrjfNGHayZyt34yPnGOfPb6LH3ufC47rx4AW9njY2Q2aXBLSgDiGnnufIle3yFlNpYpG9azBH9Z TufVc33jcWXNr16dWM8ligIguozWeaNTujNZiy8vYcu6sapv+fR2OyXunetceNErxvvqP9fxAYCO OG/kRP3Xvunhc9L6lk9st3OVITMnXmSeNX/4MUzXU4QDXIMRn/Z7PHe+Mq1eAPZj3LlUcV4UIhyA fRidR1/E+e3t7RyVAQBW9UucP3369Hg85q58e3s7Yi0AoG56nv4S53d3d6OL+PTp05QaAADFtCze 7zPbAYCBxDkAhCfOASC83xRF8fqPr7euBgAw0tu/vC3ufhx/4h0A2IP/BWHAMceM3laAAAAAB3RJ TUUH0gEGFickdUTongAAAABJRU5ErkJggg== --Multipart_Sun_Jan__6_23:42:34_2002-1 Content-Type: text/plain; charset=US-ASCII --Multipart_Sun_Jan__6_23:42:34_2002-1-- From martin@v.loewis.de Sun Jan 6 22:47:45 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 6 Jan 2002 23:47:45 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <20020106215100.1D2DDE8451@oratrix.oratrix.nl> (message from Jack Jansen on Sun, 06 Jan 2002 22:50:55 +0100) References: <20020106215100.1D2DDE8451@oratrix.oratrix.nl> Message-ID: <200201062247.g06MljV02361@mira.informatik.hu-berlin.de> > Well, I only know about the Mac and (to a lesser extent) about > Windows, but there's lots of methods that are not in > {posix,mac,nt}module.c there that want filenames. And I think mmap > also uses filenames, no? All in all I'm in favor of a single place > where file name encoding magic is handled. I think Marc not only things about encoding, he also wants that the single place actually performs the system calls. 
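Very roughly, I understand the idea to be something along these lines - an illustrative sketch only, with an invented helper name, not code that exists anywhere today:

/* fileapi.c sketch: one place that picks the narrow or the wide
   system call; the helper name is made up for illustration. */
#include "Python.h"
#include <stdio.h>

FILE *
PyFile_SystemOpen(PyObject *name, const char *mode)
{
    if (PyString_Check(name))
        return fopen(PyString_AS_STRING(name), mode);
    if (PyUnicode_Check(name)) {
#ifdef MS_WINDOWS
        wchar_t wmode[10];
        size_t i;
        for (i = 0; mode[i] != '\0' && i < 9; i++)
            wmode[i] = (wchar_t)mode[i];
        wmode[i] = L'\0';
        /* Py_UNICODE is wchar_t in the Windows build */
        return _wfopen((const wchar_t *)PyUnicode_AS_UNICODE(name), wmode);
#else
        /* no wide API available: encode with the file system encoding */
        PyObject *bytes = PyUnicode_AsEncodedString(name,
                              Py_FileSystemDefaultEncoding, "strict");
        FILE *fp;
        if (bytes == NULL)
            return NULL;
        fp = fopen(PyString_AS_STRING(bytes), mode);
        Py_DECREF(bytes);
        return fp;
#endif
    }
    PyErr_SetString(PyExc_TypeError, "file name must be a string");
    return NULL;
}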
So if you want to support mmap, or an additional system call that expects or returns a file name, you cannot put it into your module; instead, you must put it in fileapi.c first, and *then* call the function in fileapi.c from your module. It may be necessary to call different routines depending on whether you have a byte or a character string; this is not something a getargs converter can do. It also may be that, depending on which system routine you call, the system will *return* either wide or narrow strings to you. Every time you find another use of file names, Marc suggests you put that into fileapi.c. Regards, Martin From martin@v.loewis.de Sun Jan 6 22:51:37 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 6 Jan 2002 23:51:37 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: <20020106221917.15318E8451@oratrix.oratrix.nl> (message from Jack Jansen on Sun, 06 Jan 2002 23:19:04 +0100) References: <20020106221917.15318E8451@oratrix.oratrix.nl> Message-ID: <200201062251.g06Mpbv02364@mira.informatik.hu-berlin.de> > There's a lot of Python objects that are really little more than > wrappers around an opaque C pointer (plus all the methods to operate > on it, etc). Can you give a few examples? I'm not aware of any such types, off-hand. Regards, Martin From nhodgson@bigpond.net.au Sun Jan 6 23:19:11 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Mon, 7 Jan 2002 10:19:11 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <20020106213650.8E224E8451@oratrix.oratrix.nl> <200201062242.g06MgYa02308@mira.informatik.hu-berlin.de> Message-ID: <024101c19708$8c64e6c0$0acc8490@neil> This is a multi-part message in MIME format. ------=_NextPart_000_023E_01C19764.BFBE5450 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Martin v. Loewis: > I'll attach a script below. It contains UTF-8 encoded data, so to > prevent transmission errors, it comes base-64 attached. Running it > creates a three additional files in the current directory; I recommend > to run it in an empty directory. I have added some more cases to your example Martin, in Hebrew, Chinese and Japanese and a combination. The combination is an interesting case as it will not work with mbcs with a particular code page, as no code page (to my knowledge) contains all the characters. This works using my modifications except for the calls to os.rename. 
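Presumably os.rename just needs the same sort of treatment as open: when both arguments are Unicode, go straight to the wide CRT call instead of encoding the names. An untested sketch of what I mean (the function name is made up; it would sit in posixmodule.c next to the existing rename, and the narrow-string path stays as it is):

/* untested sketch, not part of any posted patch: wide rename on NT */
static PyObject *
posix_rename_wide(PyObject *self, PyObject *args)
{
    PyObject *osrc, *odst;
    const wchar_t *src, *dst;
    int res;

    if (!PyArg_ParseTuple(args, "UU:rename", &osrc, &odst))
        return NULL;
    /* Py_UNICODE is wchar_t in the Windows build, so no conversion */
    src = (const wchar_t *)PyUnicode_AS_UNICODE(osrc);
    dst = (const wchar_t *)PyUnicode_AS_UNICODE(odst);
    Py_BEGIN_ALLOW_THREADS
    res = _wrename(src, dst);
    Py_END_ALLOW_THREADS
    if (res != 0)
        return PyErr_SetFromErrno(PyExc_OSError);
    Py_INCREF(Py_None);
    return Py_None;
}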
Neil ------=_NextPart_000_023E_01C19764.BFBE5450 Content-Type: text/plain; name="uni.py" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="uni.py" # -*- coding: utf-8 -*-=0A= import locale, os=0A= =0A= locale.setlocale(locale.LC_ALL, "")=0A= =0A= filenames =3D [=0A= unicode("Gr=C3=BC=C3=9F-Gott","utf-8"),=0A= unicode("=CE=93=CE=B5=CE=B9=CE=AC-=CF=83=CE=B1=CF=82","utf-8"),=0A= = unicode("=D0=97=D0=B4=D1=80=D0=B0=D0=B2=D1=81=D1=82=D0=B2=D1=83=D0=B9=D1=82= =D0=B5","utf-8"), unicode("=E3=81=AB=E3=81=BD=E3=82=93","utf-8"), unicode("=D7=94=D7=A9=D7=A7=D7=A6=D7=A5=D7=A1","utf-8"),=0A= unicode("=E6=9B=A8=E6=9B=A9=E6=9B=AB","utf-8"),=0A= unicode("=E6=9B=A8=D7=A9=E3=82=93=D0=B4=CE=93=C3=9F","utf-8"),=0A= ]=0A= =0A= for name in filenames:=0A= print repr(name)=0A= f =3D open(name, 'w') f.write((name+u'\n').encode("utf-8"))=0A= f.close()=0A= os.stat(name)=0A= =0A= print os.listdir(".")=0A= =0A= for name in filenames:=0A= os.rename(name,"tmp")=0A= os.rename("tmp",name)=0A= =0A= ------=_NextPart_000_023E_01C19764.BFBE5450-- From barry@zope.com Mon Jan 7 00:05:08 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sun, 6 Jan 2002 19:05:08 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> <15415.30180.542499.182362@anthem.wooz.org> <200201052358.g05NwDN14410@mira.informatik.hu-berlin.de> <15415.42968.453024.267482@anthem.wooz.org> <200201061137.g06Bb0o01516@mira.informatik.hu-berlin.de> Message-ID: <15416.58932.135700.16016@anthem.wooz.org> Okay, I'm totally confuggled now. Let's boil this down. Take this simple program: -------------------- snip snip --------------------/tmp/foo.sh #! /bin/sh echo "OPT = x${OPT}x" echo "CFLAGS= x${CFLAGS}x" -------------------- snip snip -------------------- and invoke it like: % CFLAGS='one' OPT="two $CFLAGS" /tmp/foo.sh What do you get? What do you *expect* to get? Am I boiling things down correctly? On every system I've tested, the following output is what I get: % CFLAGS='one' OPT="two $CFLAGS" /tmp/foo.sh OPT = xtwo x CFLAGS= xonex So, why should any of this work anywhere? Should we ever expect $OPT to get the right value? i-must-be-missing-something-really-obvious,-obvious-ly y'rs, -Barry From guido@python.org Mon Jan 7 00:20:32 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 06 Jan 2002 19:20:32 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: Your message of "Sun, 06 Jan 2002 19:05:08 EST." 
<15416.58932.135700.16016@anthem.wooz.org> References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> <15415.30180.542499.182362@anthem.wooz.org> <200201052358.g05NwDN14410@mira.informatik.hu-berlin.de> <15415.42968.453024.267482@anthem.wooz.org> <200201061137.g06Bb0o01516@mira.informatik.hu-berlin.de> <15416.58932.135700.16016@anthem.wooz.org> Message-ID: <200201070020.TAA13961@cj20424-a.reston1.va.home.com> > Okay, I'm totally confuggled now. Let's boil this down. Take this > simple program: > > -------------------- snip snip --------------------/tmp/foo.sh > #! /bin/sh > echo "OPT = x${OPT}x" > echo "CFLAGS= x${CFLAGS}x" > -------------------- snip snip -------------------- > > and invoke it like: > > % CFLAGS='one' OPT="two $CFLAGS" /tmp/foo.sh > > What do you get? What do you *expect* to get? Am I boiling things > down correctly? > > On every system I've tested, the following output is what I get: > > % CFLAGS='one' OPT="two $CFLAGS" /tmp/foo.sh > OPT = xtwo x > CFLAGS= xonex > > So, why should any of this work anywhere? Should we ever expect $OPT > to get the right value? I haven't followed this, but from the above it appears that if you use the form VAR1=val1 VAR2=val2 ... program args then all of val1, val2, ... are evaluated simultaneously using the previous values of VAR1, VAR2, ... rather than left-to-right. That's mildly surprising but not really upsetting to me. --Guido van Rossum (home page: http://www.python.org/~guido/) From neal@metaslash.com Mon Jan 7 00:27:31 2002 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 06 Jan 2002 19:27:31 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> <15415.30180.542499.182362@anthem.wooz.org> <200201052358.g05NwDN14410@mira.informatik.hu-berlin.de> <15415.42968.453024.267482@anthem.wooz.org> <200201061137.g06Bb0o01516@mira.informatik.hu-berlin.de> <15416.58932.135700.16016@anthem.wooz.org> Message-ID: <3C38EB73.437BE9E6@metaslash.com> "Barry A. Warsaw" wrote: > > Okay, I'm totally confuggled now. Let's boil this down. Take this > simple program: > > -------------------- snip snip --------------------/tmp/foo.sh > #! 
/bin/sh > echo "OPT = x${OPT}x" > echo "CFLAGS= x${CFLAGS}x" > -------------------- snip snip -------------------- > > and invoke it like: > > % CFLAGS='one' OPT="two $CFLAGS" /tmp/foo.sh I think the intent was to use single quotes for OPT='two $CFLAGS'. (You could also do OPT="two \$CFLAGS".) This will pass the string "$CFLAGS" in OPT, not the value of the shell variable $CFLAGS. While your shell script will print out: OPT = xtwo $CFLAGSx This is ok since it will/should get expanded properly in the Makefile. Or I've totally missed the point too. :-) Neal From mhammond@skippinet.com.au Mon Jan 7 02:01:46 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Mon, 7 Jan 2002 13:01:46 +1100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <200201062247.g06MljV02361@mira.informatik.hu-berlin.de> Message-ID: > It may be necessary to call different routines depending on whether > you have a byte or a character string; this is not something a getargs > converter can do. It also may be that, depending on which system > routine you call, the system will *return* either wide or narrow > strings to you. Every time you find another use of file names, Marc > suggests you put that into fileapi.c. I'm sure that is not what Marc meant. I think he simply meant a conversion function that would return the filename as either byte or Unicode. Get your arg from PyArg_ParseTuple, and convert it with this function. Have I missed it all these years, or should we define a PyArg_ParseTuple format that takes a "void **" and a function pointer to a type conversion function? Mark. From mhammond@skippinet.com.au Mon Jan 7 02:50:10 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Mon, 7 Jan 2002 13:50:10 +1100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <024101c19708$8c64e6c0$0acc8490@neil> Message-ID: > I have added some more cases to your example Martin, in Hebrew, Chinese > and Japanese and a combination. The combination is an interesting > case as it will not work with mbcs with a particular code page, as no > code page (to my knowledge) contains all the characters. > > This works using my modifications except for the calls to os.rename. This looks interesting :) Any chance of putting all this together in a patch at source-forge? Ultimately uni.py should be rolled into test/test_unicode_filename.py, and it is unclear if http://pythoncard.sourceforge.net/posixmodule.c is the latest with Martin's comments - and it appears posix_open may leave 'fd' uninitialized before comparing < 0. Thanks, Mark. From nhodgson@bigpond.net.au Mon Jan 7 03:46:14 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Mon, 7 Jan 2002 14:46:14 +1100 Subject: [Python-Dev] Unicode strings as filenames References: Message-ID: <036d01c1972d$db4ccad0$0acc8490@neil> Mark Hammond: > This looks interesting :) Any chance of putting all this together in a > patch at source-forge? Eventually although I'm not yet sure the direction is sound. It does expand the code horribly. Also not sure if I'll have the determination to push this through to completion - there are still plenty of issues to be resolved. For me, just having open work is the most important bit - all the others are far less used. > Ultimately uni.py should be rolled into > test/test_unicode_filename.py, Directory tests added to uni.py. > and it is unclear if > http://pythoncard.sourceforge.net/posixmodule.c is the latest with Martin's > comments - and it appears posix_open may leave 'fd' uninitialized before > comparing < 0. 
New version just uploaded fixing that at http://scintilla.sourceforge.net/winunichanges.zip Neil From nhodgson@bigpond.net.au Mon Jan 7 03:52:30 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Mon, 7 Jan 2002 14:52:30 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <20020106215100.1D2DDE8451@oratrix.oratrix.nl> <200201062247.g06MljV02361@mira.informatik.hu-berlin.de> Message-ID: <038a01c1972e$bb18f7b0$0acc8490@neil> Martin: > Looks good. The posix_do_stat changes contain an error; you have put > Python API calls inside the BEGIN_ALLOW_THREADS block. That is wrong: > you must always hold the interpreter lock when calling Python > API. OK, moved the thread stuff so no API calls are inside. However, PyUnicode_AS_UNICODE left in as that is just a macro for accessing a field that should be stable over the call. Or is it? Other methods don't seem to worry that GC will move buffers, during calls. > Also, when calling _wstati64, you might want to assert that the > function pointer is _stati64. Likewise, the code inside posix_open > should hold the interpreter lock. OK, assert in for stat. > However, the size of your changes is really disturbing here. There > used to be already four versions of listing a directory; now you've > added a fifth one. And it isn't even clear whether this code works on > W9x, is it? Currently it won't work on Windows 9x. That is more work and code bulk. > There must be a way to fold the different Windows versions into a > single one; perhaps it is acceptable to drop Win16 support. I think > three different versions should be offered to the end user: Windows does this with the preprocessor - you are either building a Unicode version or an ANSI version. > - path is plain string, result is list of plain strings > - path is Unicode string, result is list of Unicode strings > - path is Unicode string, result is list of plain strings > > Perhaps one could argue that the third version isn't really needed: Sounds good to me. I'm moving back towards not using the 'utf-8' system encoding but rather checking of Unicode arguments and handling them explicitly even at the cost of code expansion. > Now, os.rename will fail if you pass two Unicode strings > referring to non-ASCII file names. posix_1str and posix_2str are like > the stat implementation, except that you cannot know statically what > the function pointer is. The code now passes both narrow and wide functions to posix_nstr and there are two null functions to make this compile on non-Windows. Added mkdir to allow testing the chdir and rmdir functions. Now handled are open, os.open, os.stat. os.listdir, os.rename, os.remove, os.mkdir, os.chdir, os.rmdir. Updated files available from http://scintilla.sourceforge.net/winunichanges.zip Neil From barry@zope.com Mon Jan 7 05:30:52 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 7 Jan 2002 00:30:52 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) 
References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> <15415.30180.542499.182362@anthem.wooz.org> <200201052358.g05NwDN14410@mira.informatik.hu-berlin.de> <15415.42968.453024.267482@anthem.wooz.org> <200201061137.g06Bb0o01516@mira.informatik.hu-berlin.de> <15416.58932.135700.16016@anthem.wooz.org> <3C38EB73.437BE9E6@metaslash.com> Message-ID: <15417.12940.186180.661369@anthem.wooz.org> >>>>> "NN" == Neal Norwitz writes: NN> I think the intent was to use single quotes for OPT='two NN> $CFLAGS'. (You could also do OPT="two \$CFLAGS".) This will NN> pass the string "$CFLAGS" in OPT, not the value of the shell NN> variable $CFLAGS. NN> While your shell script will print out: OPT = xtwo $CFLAGSx NN> This is ok since it will/should get expanded properly in the NN> Makefile. Unfortunately, none of this really helps. Getting $(CFLAGS) into $OPT just results in this: Makefile:737: *** Recursive variable `CFLAGS' references itself (eventually). Stop. Let me suggest the following, and then I'm going to stop here. Martin's patch to fileobject.c should be applied -- that's a given. As for configure: CC='gcc -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' ./configure works for me. I'll leave it up to others to decide what to change, although IMHO posix-large-file is broken (and also because those instructions shouldn't be necessary for Python 2.2). -Barry From Anthony Baxter Mon Jan 7 05:43:39 2002 From: Anthony Baxter (Anthony Baxter) Date: Mon, 07 Jan 2002 16:43:39 +1100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: Message from barry@zope.com (Barry A. Warsaw) of "Mon, 07 Jan 2002 00:30:52 CDT." <15417.12940.186180.661369@anthem.wooz.org> Message-ID: <200201070543.g075hdc01442@mbuna.arbhome.com.au> >>> Barry A. Warsaw wrote > Let me suggest the following, and then I'm going to stop here. > Martin's patch to fileobject.c should be applied -- that's a given. > As for configure: > CC='gcc -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' ./configure > works for me. That's good enough for me - I'll test it on the boxes I can find... > I'll leave it up to others to decide what to change, Documentation? > although IMHO posix-large-file is broken You mean, even with these new build instructions? > (and also because those > instructions shouldn't be necessary for Python 2.2). They are still going to be necessary for 2.1.2 - I don't want to try and play the game of getting this change in and turned on by default at this stage of the game... :/ Anthony From martin@v.loewis.de Mon Jan 7 06:52:20 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 7 Jan 2002 07:52:20 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) 
In-Reply-To: <15416.58932.135700.16016@anthem.wooz.org> (barry@zope.com) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> <15415.30180.542499.182362@anthem.wooz.org> <200201052358.g05NwDN14410@mira.informatik.hu-berlin.de> <15415.42968.453024.267482@anthem.wooz.org> <200201061137.g06Bb0o01516@mira.informatik.hu-berlin.de> <15416.58932.135700.16016@anthem.wooz.org> Message-ID: <200201070652.g076qKP01878@mira.informatik.hu-berlin.de> > What do you get? martin@mira:~> CFLAGS='one' OPT="two $CFLAGS" ./foo.sh OPT = xtwo onex CFLAGS= xonex martin@mira:~> echo $BASH_VERSION 2.05.0(1)-release > What do you *expect* to get? What I get, both in zsh and bash. I'd expect environment variable assignments to be evaluated from left to right, one-by-one. The bash documentation says # The order of expansions is: brace expansion, tilde expansion, # parameter, variable and arithmetic expansion and command # substitution (done in a left-to-right fashion), word splitting, and # pathname expansion. The only way I can produce an error is by martin@mira:~> env CFLAGS='one' OPT="two $CFLAGS" ./foo.sh OPT = xtwo x CFLAGS= xonex This is the result of the exact procedure used by bash: # When a simple command is executed, the shell performs the following # expansions, assignments, and redirections, from left to right. # # 1. The words that the parser has marked as variable assignments # (those preceding the command name) and redirections are # saved for later processing. # # 2. The words that are not variable assignments or redirections are # expanded. If any words remain after expansion, the first word # is taken to be the name of the command and the remaining words # are the arguments. # # 3. Redirections are performed as described above under REDIRECTION. # # 4. The text after the = in each variable assignment undergoes tilde # expansion, parameter expansion, command substitution, arithmetic # expansion, and quote removal before being assigned to the # variable. So variable left-more assignments have effect on right-more assignments, but not on any other words in the command line. > Am I boiling things down correctly? I would say so. That also indicates the right change to the documentation: Just put each assignment in an individual export statement: export CFLAGS OPT;CFLAGS='one';OPT="two $CFLAGS";./foo.sh I'm still surprised that it fails on your bash; I get the same (IMO correct) behaviour with bash 2.03 on Solaris. I get failures with bash 2.02, and with /bin/sh on Solaris. /bin/ksh and /usr/xpg4/bin/sh work fine (/usr/xpg4/bin/sh actually is ksh). > So, why should any of this work anywhere? Should we ever expect $OPT > to get the right value? > > i-must-be-missing-something-really-obvious,-obvious-ly y'rs, I'd say (without further research) that this was unspecified for Bourne Shell, and got clarified for POSIX Shell - so both recent Bash versions, and the Solaris ksh work fine. 
Regards, Martin From martin@v.loewis.de Mon Jan 7 06:56:16 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 7 Jan 2002 07:56:16 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <200201070020.TAA13961@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Sun, 06 Jan 2002 19:20:32 -0500) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> <15415.30180.542499.182362@anthem.wooz.org> <200201052358.g05NwDN14410@mira.informatik.hu-berlin.de> <15415.42968.453024.267482@anthem.wooz.org> <200201061137.g06Bb0o01516@mira.informatik.hu-berlin.de> <15416.58932.135700.16016@anthem.wooz.org> <200201070020.TAA13961@cj20424-a.reston1.va.home.com> Message-ID: <200201070656.g076uGO01899@mira.informatik.hu-berlin.de> > I haven't followed this, but from the above it appears that if you use > the form > > VAR1=val1 VAR2=val2 ... program args > > then all of val1, val2, ... are evaluated simultaneously using the > previous values of VAR1, VAR2, ... rather than left-to-right. > > That's mildly surprising but not really upsetting to me. What *is* upsetting is that different shells behave differently; or else the current documentation would not have been written the way it is now (and Barry and me would not have spent the week-end researching that). Recent bash versions, and Korn shell evaluate from left to right (bash now documents that assignments occur *after* args have been expanded). Regards, Martin From martin@v.loewis.de Mon Jan 7 07:00:20 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 7 Jan 2002 08:00:20 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <3C38EB73.437BE9E6@metaslash.com> (message from Neal Norwitz on Sun, 06 Jan 2002 19:27:31 -0500) References: <200201040715.g047F8D08868@mbuna.arbhome.com.au> <15413.24423.772132.175722@anthem.wooz.org> <200201041038.g04AcSQ05750@mira.informatik.hu-berlin.de> <15413.55039.802450.257239@anthem.wooz.org> <01e801c19551$222145a0$c617a8c0@kurtz> <200201041949.g04JnOq01749@mira.informatik.hu-berlin.de> <3C362C40.8010002@zope.com> <200201042303.g04N37Y01314@mira.informatik.hu-berlin.de> <15414.14234.952188.980370@anthem.wooz.org> <15414.18585.311512.80269@anthem.wooz.org> <200201050543.g055hkD05245@mira.informatik.hu-berlin.de> <15415.11515.142516.734683@anthem.wooz.org> <200201051802.g05I2Ug11568@mira.informatik.hu-berlin.de> <15415.30180.542499.182362@anthem.wooz.org> <200201052358.g05NwDN14410@mira.informatik.hu-berlin.de> <15415.42968.453024.267482@anthem.wooz.org> <200201061137.g06Bb0o01516@mira.informatik.hu-berlin.de> <15416.58932.135700.16016@anthem.wooz.org> <3C38EB73.437BE9E6@metaslash.com> Message-ID: <200201070700.g0770KI01924@mira.informatik.hu-berlin.de> > I think the intent was to use single quotes for OPT='two $CFLAGS'. > (You could also do OPT="two \$CFLAGS".) 
This will pass the string > "$CFLAGS" in OPT, not the value of the shell variable $CFLAGS. > > While your shell script will print out: OPT = xtwo $CFLAGSx > This is ok since it will/should get expanded properly in the Makefile. > > Or I've totally missed the point too. :-) The intent really was that the later assignment takes into account the earlier one, by means of shell expansion. Setting OPT to a value that depends on CFLAGS would give you a cyclic expansion in the Makefile - so that clearly was not the intent. You need to set both because one ends up in the Makefile (OPT) whereas the other (CFLAGS) is needed to convince configure that HAVE_LARGEFILE should be turned on. Regards, Martin From martin@v.loewis.de Mon Jan 7 07:07:05 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 7 Jan 2002 08:07:05 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: References: Message-ID: <200201070707.g07775k01927@mira.informatik.hu-berlin.de> > I'm sure that is not what Marc meant. I think he simply meant a > conversion function that would return the filename as either byte or > Unicode. Get your arg from PyArg_ParseTuple, and convert it with > this function. If you have this, how do you know whether to call fopen or wfopen? If it was a byte string, you need to pass it to fopen; if it was a Unicode string, you pass it to wfopen. Maybe that's what MAL meant, but then it won't work. > Have I missed it all these years, or should we define a PyArg_ParseTuple > format that takes a "void **" and a function pointer to a type conversion > function? This is what O& does. Unless it fills a PyObject*, you have a hard time telling what it is that you got. It works for the void** case only if it always fills in the same type (e.g. Py_UNICODE*); filling in Py_UNICODE* in some cases and char* in others, without telling you, is useless. Regards, Martin From martin@v.loewis.de Mon Jan 7 07:09:27 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 7 Jan 2002 08:09:27 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: References: Message-ID: <200201070709.g0779Rp01930@mira.informatik.hu-berlin.de> > This looks interesting :) Any chance of putting all this together in a > patch at source-forge? I do hope Neil will create a patch eventually; so far, it seems to be more convenient to him to post snippets. This is fine with me, since this project still has some way to go for completion. Regards, Martin From martin@v.loewis.de Mon Jan 7 07:33:47 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 7 Jan 2002 08:33:47 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <200201070543.g075hdc01442@mbuna.arbhome.com.au> (message from Anthony Baxter on Mon, 07 Jan 2002 16:43:39 +1100) References: <200201070543.g075hdc01442@mbuna.arbhome.com.au> Message-ID: <200201070733.g077Xlh01990@mira.informatik.hu-berlin.de> > > CC='gcc -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' ./configure > > That's good enough for me - I'll test it on the boxes I can find... I'd strongly advise against putting that into the documentation. There are numerous assignments to CC inside configure.in, which would override this setting. Setting OPT and CFLAGS is the right way to pass these configuration options. > > I'll leave it up to others to decide what to change, > > Documentation? Please, not the way Barry proposes. Regards, Martin From mal@lemburg.com Mon Jan 7 09:22:50 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Mon, 07 Jan 2002 10:22:50 +0100 Subject: [Python-Dev] Unicode strings as filenames References: Message-ID: <3C3968EA.E0636E65@lemburg.com> Mark Hammond wrote: > > > It may be necessary to call different routines depending on whether > > you have a byte or a character string; this is not something a getargs > > converter can do. It also may be that, depending on which system > > routine you call, the system will *return* either wide or narrow > > strings to you. Every time you find another use of file names, Marc > > suggests you put that into fileapi.c. > > I'm sure that is not what Marc meant. I think he simply meant a conversion > function that would return the filename as either byte or Unicode. Get your > arg from PyArg_ParseTuple, and convert it with this function. What I meant is to move all the file name code from fileobject.c and posixmodule.c to a new file fileapi.c which lives in the Python/ subdir and then let it expose C APIs which the other two files then use in their machinery. It's basically about cleaning up the various bits and pieces in the source code; note that this does not only involve APIs which work on file names, but also other APIs which take filenames as arguments and or return filenames (even though starting out with a file name mapping API would already go a long way). The benefits of such an approach would be two-fold: 1. You centralize the need for #ifdefs and other platform specific quirks in one file. As a result future fixes will only involve this one file. (Py_FileSystemDefaultEncoding should also live in this file, BTW) 2. The C APIs can well be used by other parts of the Python interpreter which need to open and handle files. Extensions would also benefit from this, e.g. they could use the API functions to open files with Unicode names in a cross-platform way using a single API. > Have I missed it all these years, or should we define a PyArg_ParseTuple > format that takes a "void **" and a function pointer to a type conversion > function? Isn't this what "O&" is meant for ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From nhodgson@bigpond.net.au Mon Jan 7 09:30:11 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Mon, 7 Jan 2002 20:30:11 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> Message-ID: <003b01c1975d$e7dd3070$0acc8490@neil> [Replacing the other mail destinations as I didn't do a reply all last time so python-dev dropped off. You may want to resend your last mail to python-dev.] > I don't think we can drop W9x support for Python 2.3, although I'm > still waiting for comments on dropping W3.1 support... I wouldn't want to drop either. > > Sounds good to me. 
I'm moving back towards not using the 'utf-8' system > > encoding but rather checking of Unicode arguments and handling them > > explicitly even at the cost of code expansion. > > That is very good. I don't know what is best for the file name; > perhaps it is acceptable to encode it with the file system default > encoding (even if it ends up having question marks in it). Programs > relying on the file name to be correct are broken, IMO. My thinking now is that there are two modules here, fileobject and posixmodule which should be handled differently. posixmodule is just a library with calls and no state. IIRC there used to be multiple modules, one per OS, and the correct one was chosen and called os. I think it is perfectly reasonable for there to be an extra 'ntos' module that just works on NT that treats all arguments as Unicode (coercing up using the current locale when given narrow strings) and always calling the wide APIs. It would contain the same methods (when available) as os. NT specific code can use it directly and sufficiently interested portable client code could say something like if nt: filesys = ntos else: filesys = os This hides away all the code bloat from posix code, ensures there are no regressions in posix while developing and debugging ntos, and allows ntos to just convert all arguments into wide strings without worrying about 9x. Maybe call the module osu if there may be implementations on other OS's like OS X. Could have an enquiry method in the module if osu.working: filesys = osu else: filesys = os fileobject is more complex because it holds two strings as state. The mode can probably be assumed to be ASCII so can be left as a narrow string (although it does have to be widened to call _wfopen) but the name is more complex as some client code may just know that it is always a narrow string and thus die if given a file with a wide name. > Looks very good indeed. When producing patches, you might want to > check line endings: currently, your files are a mix of LF only (which > was there before) and CRLF. OK. > In open_the_file, you are still checking for utf-8; that should be > removed also. It seems that open_the_file will always get an > initialized filed, so passing name does not seem to be necessary: one > could look at f_name. OK. So why are the name and mode passed when they are already available? > I suggest that f_name stays as a byte string for the moment, and > open_the_file gets an optional "original name" or "unicode name" > argument, whatever is more convenient. If that is given, open_the_file > should consider it, else it should fall back to f_name. If this is done then the unicode name should also be available as a field of the object as those mangled "z??.html" strings are totally useless. I'm feeling more like making f_name be wide now but I'd expect some opposition now from backwards compatibility advocates. > In posixmodule, I cannot see the move towards passing Unicode objects > directly, either - I guess you were talking about a future plan, > above. Yes, I'm thinking ahead of the coding. Seeing where I'm already going or about to go wrong. > I cannot see the rationale for wfuncNull - wouldn't passing > passing NULL as a function pointer be sufficient as well? Yes, must get used to thinking in C again. I don't think I have written C for 8 years. 
WTF can't I declare variables just when I need them Neil From jack@oratrix.nl Mon Jan 7 11:55:39 2002 From: jack@oratrix.nl (Jack Jansen) Date: Mon, 07 Jan 2002 12:55:39 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: Message by "Martin v. Loewis" , Sun, 6 Jan 2002 23:51:37 +0100 , <200201062251.g06Mpbv02364@mira.informatik.hu-berlin.de> Message-ID: <20020107115544.95F3FE8451@oratrix.oratrix.nl> Recently, "Martin v. Loewis" said: > > There's a lot of Python objects that are really little more than > > wrappers around an opaque C pointer (plus all the methods to operate > > on it, etc). > > Can you give a few examples? I'm not aware of any such types, off-hand. All the Mac toolbox objects (Windows, Dialogs, Controls, Menus and a zillion more), All the Windows HANDLEs, all the MFC objects (although they might be a bit more difficult), the objects in the X11 and Motif modules, the pyexpat parser object, *dbm objects, dlmodule objects, mpz objects, zlib objects, SGI cl and al objects.... Enough examples? :-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From mal@lemburg.com Mon Jan 7 12:08:17 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 07 Jan 2002 13:08:17 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects References: <20020107115544.95F3FE8451@oratrix.oratrix.nl> Message-ID: <3C398FB1.EF972E50@lemburg.com> Jack Jansen wrote: > > Recently, "Martin v. Loewis" said: > > > There's a lot of Python objects that are really little more than > > > wrappers around an opaque C pointer (plus all the methods to operate > > > on it, etc). > > > > Can you give a few examples? I'm not aware of any such types, off-hand. > > All the Mac toolbox objects (Windows, Dialogs, Controls, Menus and a > zillion more), All the Windows HANDLEs, all the MFC objects (although > they might be a bit more difficult), the objects in the X11 and Motif > modules, the pyexpat parser object, *dbm objects, dlmodule objects, > mpz objects, zlib objects, SGI cl and al objects.... > > Enough examples? :-) Sounds like you want to introduce a "buffer" interface for these objects. If that's the case, please write a PEP for it -- I don't think anyone on this list wants to see a second can of worms like the buffer interface in Python :-/ -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jack@oratrix.nl Mon Jan 7 12:12:16 2002 From: jack@oratrix.nl (Jack Jansen) Date: Mon, 07 Jan 2002 13:12:16 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: Message by "M.-A. Lemburg" , Mon, 07 Jan 2002 13:08:17 +0100 , <3C398FB1.EF972E50@lemburg.com> Message-ID: <20020107121221.693D4E8451@oratrix.oratrix.nl> Recently, "M.-A. Lemburg" said: > Jack Jansen wrote: > > > > Recently, "Martin v. Loewis" said: > > > > There's a lot of Python objects that are really little more than > > > > wrappers around an opaque C pointer (plus all the methods to operate > > > > on it, etc). > > > > > > Can you give a few examples? I'm not aware of any such types, off-hand. 
> > > > All the Mac toolbox objects (Windows, Dialogs, Controls, Menus and a > > zillion more), All the Windows HANDLEs, all the MFC objects (although > > they might be a bit more difficult), the objects in the X11 and Motif > > modules, the pyexpat parser object, *dbm objects, dlmodule objects, > > mpz objects, zlib objects, SGI cl and al objects.... > > > > Enough examples? :-) > > Sounds like you want to introduce a "buffer" interface for these > objects. No, that is something completely different. I want a replacement for PyArg_Parse("O&", funcptr, void**) that has the form PyArg_Parse("O@", typeobject, void**) and similarly for Py_BuildValue. Because the typeobject has a Python representation (whereas the function pointer does not) this would allow modules like struct and calldll to support objects that have this interface, because these modules are driven from specifications in Python. There is currently no way to get from the typeobject to the function pointer needed for O&. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From fredrik@pythonware.com Mon Jan 7 12:25:48 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 7 Jan 2002 13:25:48 +0100 Subject: [Python-Dev] Unicode support in getargs.c References: <20020105231841.54F1BE8451@oratrix.oratrix.nl> Message-ID: <042701c19776$71cb2b80$0900a8c0@spiff> jack wrote: > If Python runs on an EBCDIC machine (does it?) http://home.no.net/pgummeda/ (2.2 on as/400) http://www-1.ibm.com/servers/eserver/zseries/zos/unix/python.html (1.4 on os/390) From barry@zope.com Mon Jan 7 13:19:27 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 7 Jan 2002 08:19:27 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <15417.12940.186180.661369@anthem.wooz.org> <200201070543.g075hdc01442@mbuna.arbhome.com.au> Message-ID: <15417.41055.352175.670714@anthem.wooz.org> >>>>> "AB" == Anthony Baxter writes: >> I'll leave it up to others to decide what to change, AB> Documentation? >> although IMHO posix-large-file is broken AB> You mean, even with these new build instructions? Oops, sorry, I meant: I think the instructions on that page are broken! LFS support seems to work just fine w/ Martin's patch and the new instructions. >> (and also because those instructions shouldn't be necessary for >> Python 2.2). AB> They are still going to be necessary for 2.1.2 - I don't want AB> to try and play the game of getting this change in and turned AB> on by default at this stage of the game... :/ I agree completely! The 2.2 docs should probably say that those instructions aren't necessary, but in the 2.1.2 branch it should say they /are/ needed to turn on LFS. -Barry From barry@zope.com Mon Jan 7 13:45:00 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 7 Jan 2002 08:45:00 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201070543.g075hdc01442@mbuna.arbhome.com.au> <200201070733.g077Xlh01990@mira.informatik.hu-berlin.de> Message-ID: <15417.42588.807910.631165@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> I'd strongly advise against putting that into the MvL> documentation. There are numerous assignments to CC inside MvL> configure.in, which would override this setting. Setting OPT MvL> and CFLAGS is the right way to pass these configuration MvL> options. >> I'll leave it up to others to decide what to change, >> Documentation? 
MvL> Please, not the way Barry proposes. Here's another suggestion: add a make variable that isn't used or anything else, has a default empty value, and is used to create the compilation command. Let's say $LARGEFILE. Then the configure command would be % LARGEFILE='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' ./configure and that should work on all shells, and without having to permanently export a variable to the environment, which I think we should avoid recommending. -Barry From fdrake@acm.org Mon Jan 7 14:51:37 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 7 Jan 2002 09:51:37 -0500 (EST) Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <15417.42588.807910.631165@anthem.wooz.org> References: <200201070543.g075hdc01442@mbuna.arbhome.com.au> <200201070733.g077Xlh01990@mira.informatik.hu-berlin.de> <15417.42588.807910.631165@anthem.wooz.org> Message-ID: <15417.46585.379179.935739@cj42289-a.reston1.va.home.com> Barry A. Warsaw writes: > Here's another suggestion: add a make variable that isn't used or > anything else, has a default empty value, and is used to create the > compilation command. Let's say $LARGEFILE. This seems tolerable. We should probably look for getconf in the configure script, and make the default value the result of "getconf LFS_CFLAGS" if available. This seems like it would do "the right thing" more often without user intervention and is safe when getconf is not available. If LARGEFILE were set in the environment (by a command line such as you suggest), we'd just use that instead. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@zope.com Mon Jan 7 15:00:48 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 7 Jan 2002 10:00:48 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201070543.g075hdc01442@mbuna.arbhome.com.au> <200201070733.g077Xlh01990@mira.informatik.hu-berlin.de> <15417.42588.807910.631165@anthem.wooz.org> <15417.46585.379179.935739@cj42289-a.reston1.va.home.com> Message-ID: <15417.47136.269212.653134@anthem.wooz.org> >>>>> "Fred" == Fred L Drake, Jr writes: Fred> This seems tolerable. We should probably look for getconf Fred> in the configure script, and make the default value the Fred> result of "getconf LFS_CFLAGS" if available. This seems Fred> like it would do "the right thing" more often without user Fred> intervention and is safe when getconf is not available. If Fred> LARGEFILE were set in the environment (by a command line Fred> such as you suggest), we'd just use that instead. +1 BTW, does anybody have a manpage for getconf? -Barry From fdrake@acm.org Mon Jan 7 15:06:13 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 7 Jan 2002 10:06:13 -0500 (EST) Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <15417.47136.269212.653134@anthem.wooz.org> References: <200201070543.g075hdc01442@mbuna.arbhome.com.au> <200201070733.g077Xlh01990@mira.informatik.hu-berlin.de> <15417.42588.807910.631165@anthem.wooz.org> <15417.46585.379179.935739@cj42289-a.reston1.va.home.com> <15417.47136.269212.653134@anthem.wooz.org> Message-ID: <15417.47461.780720.52217@cj42289-a.reston1.va.home.com> --IGxTcpdnrL Content-Type: text/plain; charset=us-ascii Content-Description: message body and .signature Content-Transfer-Encoding: 7bit Barry A. Warsaw writes: > +1 > > BTW, does anybody have a manpage for getconf? 
Not for Linux, but you aleady have the command line we care about. I've attached a Solaris manpage. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation --IGxTcpdnrL Content-Type: text/plain Content-Description: getconf manpage Content-Disposition: inline; filename="getconf.txt" Content-Transfer-Encoding: 7bit User Commands getconf(1) NAME getconf - get configuration values SYNOPSIS getconf [ -v _s_p_e_c_i_f_i_c_a_t_i_o_n ] _s_y_s_t_e_m__v_a_r getconf [ -v _s_p_e_c_i_f_i_c_a_t_i_o_n ] _p_a_t_h__v_a_r _p_a_t_h_n_a_m_e getconf -a DESCRIPTION In the first synopsis form, the getconf utility will write to the standard output the value of the variable specified by _s_y_s_t_e_m__v_a_r, in accordance with _s_p_e_c_i_f_i_c_a_t_i_o_n if the -v option is used. In the second synopsis form, getconf will write to the stan- dard output the value of the variable specified by _p_a_t_h__v_a_r for the path specified by _p_a_t_h_n_a_m_e, in accordance with _s_p_e_c_i_f_i_c_a_t_i_o_n if the -v option is used. In the third synopsis form, config will write to the stan- dard output the names of the current system configuration variables. The value of each configuration variable will be determined as if it were obtained by calling the function from which it is defined to be available. The value will reflect condi- tions in the current operating environment. OPTIONS The following options are supported: -a Writes the names of the current system configuration variables to the standard output. -v _s_p_e_c_i_f_i_c_a_t_i_o_n Gives the specification which governs the selection of values for configuration variables. OPERANDS The following operands are supported: _p_a_t_h__v_a_r A name of a configuration variable whose value is available from the pathconf(2) function. All of the values in the following table are supported: LINK_MAX _N_A_M_E__M_A_X POSIX_CHOWN_RESTRICTED MAX_CANON _P_A_T_H__M_A_X POSIX_NO_TRUNC MAX_INPUT PIPE_BUF POSIX_VDISABLE SunOS 5.8 Last change: 30 Jan 1998 1 User Commands getconf(1) _p_a_t_h_n_a_m_e A path name for which the variable specified by _p_a_t_h__v_a_r is to be determined. _s_y_s_t_e_m__v_a_r A name of a configuration variable whose value is available from confstr(3C) or sysconf(3C). 
All of the values in the following table are supported: ARG_MAX _B_C__B_A_S_E__M_A_X BC_DIM_MAX _B_C__S_C_A_L_E__M_A_X BC_STRING_MAX CHAR_BIT CHARCLASS_NAME_MAX CHAR_MAX CHAR_MIN CHILD_MAX CLK_TCK COLL_WEIGHTS_MAX CS_PATH EXPR_NEST_MAX INT_MAX INT_MIN LFS64_CFLAGS LFS64_LDFLAGS LFS64_LIBS LFS64_LINTFLAGS LFS_CFLAGS LFS_LDFLAGS LFS_LIBS LFS_LINTFLAGS LINE_MAX LONG_BIT LONG_MAX LONG_MIN MB_LEN_MAX NGROUPS_MAX NL_ARGMAX NL_LANGMAX NL_MSGMAX NL_NMAX NL_SETMAX NL_TEXTMAX NZERO OPEN_MAX POSIX2_BC_BASE_MAX POSIX2_BC_DIM_MAX POSIX2_BC_SCALE_MAX POSIX2_BC_STRING_MAX POSIX2_C_BIND POSIX2_C_DEV POSIX2_CHAR_TERM POSIX2_COLL_WEIGHTS_MAX POSIX2_C_VERSION POSIX2_EXPR_NEST_MAX POSIX2_FORT_DEV POSIX2_FORT_RUN POSIX2_LINE_MAX POSIX2_LOCALEDEF POSIX2_RE_DUP_MAX POSIX2_SW_DEV POSIX2_UPE POSIX2_VERSION _POSIX_ARG_MAX _POSIX_CHILD_MAX _POSIX_JOB_CONTROL _POSIX_LINK_MAX _POSIX_MAX_CANON _POSIX_MAX_INPUT _POSIX_NAME_MAX _POSIX_NGROUPS_MAX _POSIX_OPEN_MAX _POSIX_PATH_MAX _POSIX_PIPE_BUF _POSIX_SAVED_IDS _POSIX_SSIZE_MAX _POSIX_STREAM_MAX _POSIX_TZNAME_MAX _POSIX_VERSION RE_DUP_MAX SCHAR_MAX SCHAR_MIN SHRT_MAX SHRT_MIN SSIZE_MAX STREAM_MAX TMP_MAX TZNAME_MAX UCHAR_MAX UINT_MAX ULONG_MAX USHRT_MAX WORD_BIT SunOS 5.8 Last change: 30 Jan 1998 2 User Commands getconf(1) XBS5_ILP32_OFF32 XBS5_ILP32_OFF32_CFLAGS XBS5_ILP32_OFF32_LDFLAGS XBS5_ILP32_OFF32_LIBS XBS5_ILP32_OFF32_LINTFLAGS XBS5_ILP32_OFFBIG XBS5_ILP32_OFFBIG_CFLAGS XBS5_ILP32_OFFBIG_LDFLAGS XBS5_ILP32_OFFBIG_LIBS XBS5_ILP32_OFFBIG_LINTFLAGS XBS5_LP64_OFF64 XBS5_LP64_OFF64_CFLAGS XBS5_LP64_OFF64_LDFLAGS XBS5_LP64_OFF64_LIBS XBS5_LP64_OFF64_LINTFLAGS XBS5_LPBIG_OFFBIG XBS5_LPBIG_OFFBIG_CFLAGS XBS5_LPBIG_OFFBIG_LDFLAGS XBS5_LPBIG_OFFBIG_LIBS XBS5_LPBIG_OFFBIG_LINTFLAGS _XOPEN_CRYPT _XOPEN_ENH_I18N _XOPEN_LEGACY _XOPEN_SHM _XOPEN_VERSION _XOPEN_XCU_VERSION _XOPEN_XPG2 _XOPEN_XPG3 _XOPEN_XPG4 The symbol PATH also is recognized, yielding the same value as the confstr() name value CS_PATH. USAGE See largefile(5) for the description of the behavior of getconf when encountering files greater than or equal to 2 Gbyte ( 2**31 bytes). EXAMPLES Example 1: Writing the value of a variable This example illustrates the value of {NGROUPS_MAX}: example% getconf NGROUPS_MAX Example 2: Writing the value of a variable for a specific directory This example illustrates the value of NAME_MAX for a specific directory: example% getconf NAME_MAX /usr Example 3: Dealing with unspecified results This example shows how to deal more carefully with results that might be unspecified: if value=$(getconf PATH_MAX /usr); then if [ "$value" = "undefined" ]; then echo PATH_MAX in /usr is infinite. else echo PATH_MAX in /usr is $value. fi else SunOS 5.8 Last change: 30 Jan 1998 3 User Commands getconf(1) echo Error in getconf. fi Note that sysconf(_SC_POSIX_C_BIND); and system("getconf POSIX2_C_BIND"); in a C program could give different answers. The sysconf call supplies a value that corresponds to the conditions when the program was either compiled or executed, depending on the implementation; the system call to getconf always supplies a value corresponding to conditions when the pro- gram is executed. ENVIRONMENT VARIABLES See environ(5) for descriptions of the following environment variables that affect the execution of getconf: LC_CTYPE, LC_MESSAGES, and NLSPATH. EXIT STATUS The following exit values are returned: 0 The specified variable is valid and information about its current state was written successfully. >0 An error occurred. 
ATTRIBUTES See attributes(5) for descriptions of the following attri- butes: ____________________________________________________________ | ATTRIBUTE TYPE | ATTRIBUTE VALUE | |______________________________|______________________________| | Availability | SUNWcsu | |______________________________|______________________________| SEE ALSO pathconf(2), confstr(3C), sysconf(3C), attributes(5), environ(5), largefile(5) SunOS 5.8 Last change: 30 Jan 1998 4 --IGxTcpdnrL-- From akuchlin@mems-exchange.org Mon Jan 7 17:10:45 2002 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Mon, 07 Jan 2002 12:10:45 -0500 Subject: [Python-Dev] eval() slowdown in 2.2 on MacOS X? Message-ID: [CC'ed to python-dev, Barbara Mattson] Barbara's encountered an apparent problem with test_longexp in Python 2.2 on MacOS X. test_longexp creates a big list expression and eval()'s it. The problem is that it takes an exceedingly long time to run, at least more than half an hour (at which point she interrupted it). The two curious things are that 1) while test_longexp requires a lot of memory and often thrashes on a low-memory machine (I found there are 2 or 3 bugs in the SF bugtracker to this effect), the MacOS box in question has a gigabyte of RAM, and 2) Python 2.1.1 *doesn't* show the problem. Quoting from her report: I tried the test_longexp by hand: REPS = XXX l = eval("[" + "2," * REPS + "]") print len(l) changing REPS from 1000 to 50000. 1000 and 10000 ran fairly quickly - under a minute. However, 25000 took about 5 minutes and 50000 took 23 minutes. I'm not about to try 65580 (I need to get some real work done today, after all :-). BTW, out of curiosity, I tried the same thing under 2.1.1, and even for REPS = 70000 it took less than a minute. Any clues? --amk (www.amk.ca) "Peri, how would you like to meet a genius?" "I thought I already have." -- The Doctor and Peri, in "Mark of the Rani" From skip@pobox.com Mon Jan 7 17:27:17 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 7 Jan 2002 11:27:17 -0600 Subject: [Python-Dev] eval() slowdown in 2.2 on MacOS X? In-Reply-To: References: Message-ID: <15417.55925.808372.906770@beluga.mojam.com> amk> test_longexp creates a big list expression and eval()'s it. The amk> problem is that it takes an exceedingly long time to run, at least amk> more than half an hour (at which point she interrupted it). ... amk> changing REPS from 1000 to 50000. 1000 and 10000 ran fairly amk> quickly - under a minute. However, 25000 took about 5 minutes and amk> 50000 took 23 minutes. ... amk> Any clues? Try configuring using --with-pymalloc to see if Vladimir's Python-specific object allocator helps. Even with a gigabyte of RAM, perhaps the malloc free list is getting badly fragmented, causing it to churn forever trying to coalesce memory blocks. -- Skip Montanaro (skip@pobox.com - http://www.mojam.com/) From martin@v.loewis.de Mon Jan 7 22:50:47 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Mon, 7 Jan 2002 23:50:47 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: <20020107115544.95F3FE8451@oratrix.oratrix.nl> (message from Jack Jansen on Mon, 07 Jan 2002 12:55:39 +0100) References: <20020107115544.95F3FE8451@oratrix.oratrix.nl> Message-ID: <200201072250.g07Molc01523@mira.informatik.hu-berlin.de> > All the Mac toolbox objects (Windows, Dialogs, Controls, Menus and a > zillion more), All the Windows HANDLEs, all the MFC objects (although > they might be a bit more difficult), the objects in the X11 and Motif > modules, the pyexpat parser object, *dbm objects, dlmodule objects, > mpz objects, zlib objects, SGI cl and al objects.... Could you please try once more, being serious this time? AFAICT, I was asking for examples of types that are parsed by means of O& currently, and do so just to get a void** from the python object. Looking at pyexpat.c, I find a few uses of O&, none related to the pyexpat parser object. In zlibmodule.c, I find not a single mentioning of O&, likewise in dlmodule.c, clmodule.c, almodule.c, dbmmodule.c, and now I'm losing interest into verifying more of your examples. AFAICT, you are trying to replace O& with something. Where, exactly (specific source file and line number), would you want to do that? Regards, Martin From martin@v.loewis.de Mon Jan 7 22:55:47 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 7 Jan 2002 23:55:47 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <15417.42588.807910.631165@anthem.wooz.org> (barry@zope.com) References: <200201070543.g075hdc01442@mbuna.arbhome.com.au> <200201070733.g077Xlh01990@mira.informatik.hu-berlin.de> <15417.42588.807910.631165@anthem.wooz.org> Message-ID: <200201072255.g07Mtli01529@mira.informatik.hu-berlin.de> > Here's another suggestion: add a make variable that isn't used or > anything else, has a default empty value, and is used to create the > compilation command. Let's say $LARGEFILE. > > Then the configure command would be > > % LARGEFILE='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' ./configure > > and that should work on all shells, and without having to permanently > export a variable to the environment, which I think we should avoid > recommending. "is used to create the compilation command" may be tricky to implement. Anyway, what is wrong with my earlier suggestion export CFLAGS OPT CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' OPT="-g -O2 $CFLAGS" ./configure ??? Regards, Martin From skip@pobox.com Mon Jan 7 23:03:41 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 7 Jan 2002 17:03:41 -0600 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <200201072255.g07Mtli01529@mira.informatik.hu-berlin.de> References: <200201070543.g075hdc01442@mbuna.arbhome.com.au> <200201070733.g077Xlh01990@mira.informatik.hu-berlin.de> <15417.42588.807910.631165@anthem.wooz.org> <200201072255.g07Mtli01529@mira.informatik.hu-berlin.de> Message-ID: <15418.10573.203528.723560@beluga.mojam.com> (trimming the cc list... i think everyone on it is a p-dev'er) Martin> Anyway, what is wrong with my earlier suggestion Martin> export CFLAGS OPT Martin> CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' Martin> OPT="-g -O2 $CFLAGS" Martin> ./configure I know I'm coming into this discussion late, but why even involve CFLAGS? 
export OPT OPT='-g -O2 -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' ./configure Skip From martin@v.loewis.de Mon Jan 7 23:17:14 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 8 Jan 2002 00:17:14 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <003b01c1975d$e7dd3070$0acc8490@neil> (nhodgson@bigpond.net.au) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> Message-ID: <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> > posixmodule is just a library with calls and no state. IIRC there used to > be multiple modules, one per OS, and the correct one was chosen and called > os. I think it is perfectly reasonable for there to be an extra 'ntos' > module that just works on NT that treats all arguments as Unicode (coercing > up using the current locale when given narrow strings) and always calling > the wide APIs. It would contain the same methods (when available) as os. I'd be all in favour of bringing ntmodule back into life, especially if that is to become a module that does not need to work on Win9x. Perhaps it can be compiled twice, once into w9x.pyd and once into nt.pyd, or the common code can be shared by means of #include. I'd also be in favour of killing all 16-bit Windows support in Python for 2.3; not sure whether 16-bit DOS needs to stay. > If this is done then the unicode name should also be available as a field > of the object as those mangled "z??.html" strings are totally useless. It is not totally useless. Most users will never see the problem, because their file names represent well in mbcs. In cases where you do get replacement characters, it is still useful, since you may roughly recognize what file it is in debugging output (e.g. the file extension will be ASCII-representable in most applications, perhaps you get a meaningful path in there also). > I'm feeling more like making f_name be wide now but I'd expect some > opposition now from backwards compatibility advocates. I think the major problem is that performing repr on a file should work. If that turns out to use the repr of the string (can't check right now), instead of raising UnicodeErrors, my opposition to putting Unicode objects into file names is not that strong anymore. > Yes, I'm thinking ahead of the coding. Seeing where I'm already > going or about to go wrong. That looks very good indeed. I was worried about using UTF-8 as file system default encoding, because I believe that this encoding should be mandated by the system API, instead of being our choice. Regards, Martin From martin@v.loewis.de Mon Jan 7 23:20:15 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 8 Jan 2002 00:20:15 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...)
In-Reply-To: <15418.10573.203528.723560@beluga.mojam.com> (skip@pobox.com) References: <200201070543.g075hdc01442@mbuna.arbhome.com.au> <200201070733.g077Xlh01990@mira.informatik.hu-berlin.de> <15417.42588.807910.631165@anthem.wooz.org> <200201072255.g07Mtli01529@mira.informatik.hu-berlin.de> <15418.10573.203528.723560@beluga.mojam.com> Message-ID: <200201072320.g07NKF601967@mira.informatik.hu-berlin.de> > I know I'm coming into this discussion late, but why even involve CFLAGS? Because without it, autoconf won't detect that large file support is available, and fail to define HAVE_LARGEFILE. This is because configure uses CFLAGS on its own for the test scripts, but won't use OPT. I don't think anything in the configure machinery should change for 2.1.2, since 2.2 does it all in a different and better way. Regards, Martin From martin@v.loewis.de Tue Jan 8 00:16:28 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 8 Jan 2002 01:16:28 +0100 Subject: [Python-Dev] Including BSDDB3 Message-ID: <200201080016.g080GS602121@mira.informatik.hu-berlin.de> What do people think about including bsddb3 in Python 2.3, along with deprecating the existing bsddb module? You'll find the package at http://pybsddb.sourceforge.net/ It would come as a bsddb3 package, which acts interface-compatible with the current bsddb module. Various submodules give access to more advanced features. The main rationale for dropping bsddb is that it still relies on the db_185.h interface, which will be phased out sooner or later. Existance of this interface, in turn, results in problems with anydbm: There are multiple versions of the database files available in the world, and any BSDDB installation can only handle so many of these versions. Now, on Linux, it is common that bsddb3 is installed, but that glibc offers bsddb2 simultaneously. For anydbm to analyse this situation properly, it would need some of the more advanced bsddb facilities. While this is the rationale for dropping the existing bsddb module sooner or later, there are, of course, numerous advantages in exposing the additional BSDDB features, like concurrency, transactions, and cursors. Any opinions? Regards, Martin From barry@zope.com Tue Jan 8 00:47:44 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 7 Jan 2002 19:47:44 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201070543.g075hdc01442@mbuna.arbhome.com.au> <200201070733.g077Xlh01990@mira.informatik.hu-berlin.de> <15417.42588.807910.631165@anthem.wooz.org> <200201072255.g07Mtli01529@mira.informatik.hu-berlin.de> Message-ID: <15418.16816.651860.723486@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> "is used to create the compilation command" may be tricky to MvL> implement. Anyway, what is wrong with my earlier suggestion | export CFLAGS OPT | CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' | OPT="-g -O2 $CFLAGS" | ./configure Two problems: 1) This requires you to export two variables into the outer shell's environment. As a general rule, I think this is a bad idea for tricking configure. What else might you be affecting? Others might not care as much. 2) Any time you overload a make variable that has existing semantics, you have to worry about losing the original value. Personally, I think it's easier to get CC overloading right than get OPT or CFLAGS overloading (and easier than getting them both right). But maybe that's just me. 
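For concreteness, by "CC overloading" I mean the usual one-liner, something like

    % CC='gcc -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' ./configure

with the exact flags taken from whatever the LFS instructions end up recommending.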
-Barry From doko@cs.tu-berlin.de Tue Jan 8 01:07:07 2002 From: doko@cs.tu-berlin.de (Matthias Klose) Date: Tue, 8 Jan 2002 02:07:07 +0100 Subject: [Python-Dev] building python info documentation Message-ID: <15418.17979.711869.453474@gargle.gargle.HOWL> The info docs cannot be built with the current 2.1/2.2 and HEAD branches. I found updated versions of the conversion scripts at: http://pag.lcs.mit.edu/~mernst/software/#html2texi http://pag.lcs.mit.edu/~mernst/software/#checkargs with http://pag.lcs.mit.edu/~mernst/software/python-info-Makefile from the same site I get a step further ... but get: emacs -batch api.texi --eval '(progn (goto-char (point-min)) (while (re-search-forward "\\(@setfilename \\)\\([-a-z]*\\)\n" nil t) (replace-match "\\1python-\\2.info\n")) (while (search-forward "@node Front Matter\n@chapter Abstract\n" nil t) (replace-match "@node Abstract\n@section Abstract\n" nil t)) (progn (mark-whole-buffer) (texinfo-master-menu (quote update-all-nodes))) (save-buffer))' End of file during parsing Is there an updated version available, which works for the python2.2 info files as well? btw, who is pdm/pdm, who builds the info tarballs for download? From aahz@rahul.net Tue Jan 8 01:38:33 2002 From: aahz@rahul.net (Aahz Maruch) Date: Mon, 7 Jan 2002 17:38:33 -0800 (PST) Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <15418.16816.651860.723486@anthem.wooz.org> from "Barry A. Warsaw" at Jan 07, 2002 07:47:44 PM Message-ID: <20020108013834.1FC12E8C6@waltz.rahul.net> Barry A. Warsaw wrote: > >>>>> "MvL" == Martin v Loewis writes: > > MvL> "is used to create the compilation command" may be tricky to > MvL> implement. Anyway, what is wrong with my earlier suggestion > > | export CFLAGS OPT > | CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' > | OPT="-g -O2 $CFLAGS" > | ./configure > > Two problems: > > 1) This requires you to export two variables into the outer shell's > environment. As a general rule, I think this is a bad idea for > tricking configure. What else might you be affecting? Others > might not care as much. OTOH, if MvL's code is in a shell script, this objection doesn't apply. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From guido@python.org Tue Jan 8 01:44:45 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 07 Jan 2002 20:44:45 -0500 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: Your message of "Tue, 08 Jan 2002 00:17:14 +0100." 
<200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> Message-ID: <200201080144.UAA26381@cj20424-a.reston1.va.home.com> > I'd also be in favour of killing all 16-bit Windows support in Python > for 2.3; not sure whether 16-bit DOS needs to stay. I think both can be killed. Hans Novak has long stopped supporting his DOS version of Python. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jan 8 01:54:09 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 07 Jan 2002 20:54:09 -0500 Subject: [Python-Dev] Including BSDDB3 In-Reply-To: Your message of "Tue, 08 Jan 2002 01:16:28 +0100." <200201080016.g080GS602121@mira.informatik.hu-berlin.de> References: <200201080016.g080GS602121@mira.informatik.hu-berlin.de> Message-ID: <200201080154.UAA26450@cj20424-a.reston1.va.home.com> > What do people think about including bsddb3 in Python 2.3, along with > deprecating the existing bsddb module? You'll find the package at > > http://pybsddb.sourceforge.net/ > > It would come as a bsddb3 package, which acts interface-compatible > with the current bsddb module. Various submodules give access to more > advanced features. > > The main rationale for dropping bsddb is that it still relies on the > db_185.h interface, which will be phased out sooner or > later. Existance of this interface, in turn, results in problems with > anydbm: > > There are multiple versions of the database files available in the > world, and any BSDDB installation can only handle so many of these > versions. Now, on Linux, it is common that bsddb3 is installed, but > that glibc offers bsddb2 simultaneously. For anydbm to analyse this > situation properly, it would need some of the more advanced bsddb > facilities. > > While this is the rationale for dropping the existing bsddb module > sooner or later, there are, of course, numerous advantages in exposing > the additional BSDDB features, like concurrency, transactions, and > cursors. > > Any opinions? Sounds like a good plan, but we should make sure it can all be re-released under the PSF license. For the Zope Corp. portions of the code I promise that's no problem :-) -- but there are so many other contributors that it's getting a little tangled... 
--Guido van Rossum (home page: http://www.python.org/~guido/) From nhodgson@bigpond.net.au Tue Jan 8 03:04:01 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Tue, 8 Jan 2002 14:04:01 +1100 Subject: [Python-Dev] os.listdir("") bug on Windows References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <200201080144.UAA26381@cj20424-a.reston1.va.home.com> Message-ID: <04be01c197f1$1ff30fa0$0acc8490@neil> There is an out-of-bounds error on Windows when using os.listdir("") which could result in indeterminate behaviour. After parsing the args, it does ch = namebuf[len-1]; which indexes before the array as len = 0. Possibly change this to ch = (len > 0) ? namebuf[len-1] : '\0'; Neil From guido@python.org Tue Jan 8 03:19:20 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 07 Jan 2002 22:19:20 -0500 Subject: [Python-Dev] os.listdir("") bug on Windows In-Reply-To: Your message of "Tue, 08 Jan 2002 14:04:01 +1100." <04be01c197f1$1ff30fa0$0acc8490@neil> References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <200201080144.UAA26381@cj20424-a.reston1.va.home.com> <04be01c197f1$1ff30fa0$0acc8490@neil> Message-ID: <200201080319.WAA26929@cj20424-a.reston1.va.home.com> Neil, thanks for the bug report, but can you please submit it to SourceForge? We don't regularly scan the archives of python-dev looking for bugs we haven't fixed yet -- but we do use SF as a reminder (and triage) system. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Tue Jan 8 03:41:50 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 7 Jan 2002 22:41:50 -0500 Subject: [Python-Dev] Including BSDDB3 References: <200201080016.g080GS602121@mira.informatik.hu-berlin.de> Message-ID: <15418.27262.381031.53951@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> What do people think about including bsddb3 in Python 2.3, MvL> along with deprecating the existing bsddb module? You'll find MvL> the package at MvL> http://pybsddb.sourceforge.net/ +1, for several reasons. - Robin's done a great job with the module. It feels quite solid and reliable. 
I've used it quite a bit working on Berkeley storage for ZODB/Zope. - Berkeley support in 2.2 is broken -- at least the setup.py rules are. On my stock, but stocked Mandrake 8.1 system, bsddbmodule never links right and the standard setup.py always deletes it because oflink problems. Fixing this is on My List, although I'd prefer to work with pybsddb. - I've talked to the Sleepycat guys, and if we wanted to, we could provide the Berkeley libraries with our distros with no licensing problems. Using Berkeley through the pybsddb binding is perfectly legal for any programs using them through Python. - It'd be great if we actually provided bsddb1, bsddb2, bsddb3 (and bsddb4?) modules which compile against the older libraries so databases written with any version could be accessed in Python. Maybe that's not exactly the right way to do it, but I don't think Python should be limited to just one version of Berkeley db. I've no idea what the default ought to be -- there's no clear winner. MvL> It would come as a bsddb3 package, which acts MvL> interface-compatible with the current bsddb module. Various MvL> submodules give access to more advanced features. I often "import bsddb3 as bsddb". MvL> The main rationale for dropping bsddb is that it still relies MvL> on the db_185.h interface, which will be phased out sooner or MvL> later. Existance of this interface, in turn, results in MvL> problems with anydbm: As mentioned above, I can see reasons for wanting to access any version of Berkeley db. -Barry From barry@zope.com Tue Jan 8 04:06:23 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 7 Jan 2002 23:06:23 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <15418.16816.651860.723486@anthem.wooz.org> <20020108013834.1FC12E8C6@waltz.rahul.net> Message-ID: <15418.28735.16464.124951@anthem.wooz.org> >>>>> "AM" == Aahz Maruch writes: AM> OTOH, if MvL's code is in a shell script, this objection AM> doesn't apply. I must have missed that. Was Martin suggesting a shell script, like "configure-lfs"? -Barry From barry@zope.com Tue Jan 8 04:10:00 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 7 Jan 2002 23:10:00 -0500 Subject: [Python-Dev] Including BSDDB3 References: <200201080016.g080GS602121@mira.informatik.hu-berlin.de> <200201080154.UAA26450@cj20424-a.reston1.va.home.com> Message-ID: <15418.28952.542216.58713@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> Sounds like a good plan, but we should make sure it can all GvR> be re-released under the PSF license. For the Zope GvR> Corp. portions of the code I promise that's no problem :-) -- GvR> but there are so many other contributors that it's getting a GvR> little tangled... I /think/ we're just talking mostly about Robin Dunn and Andrew Kuchling. From the description on the page, I can't quite tell whether any of Gregory P. Smith's original code remains. i'm-sure-andrew-won't-mind-either-ly y'rs, -Barry From fdrake@acm.org Tue Jan 8 04:09:38 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 7 Jan 2002 23:09:38 -0500 (EST) Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <15418.28735.16464.124951@anthem.wooz.org> References: <15418.16816.651860.723486@anthem.wooz.org> <20020108013834.1FC12E8C6@waltz.rahul.net> <15418.28735.16464.124951@anthem.wooz.org> Message-ID: <15418.28930.856764.646759@grendel.zope.com> Barry A. Warsaw writes: > I must have missed that. 
Was Martin suggesting a shell script, like > "configure-lfs"? As long as configure captures the values to the Makefile, it doesn't matter whether the user types CFLAGS=... OPT=... export OPT CFLAGS ./configure or CFLAGS=... OPT=... ./configure is a matter of syntax, not functionality. We should not rely on any special environment variables being set after configure has been run. I think we're wasting time arguing about syntax at this point, and not making any progress. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From aahz@rahul.net Tue Jan 8 05:15:13 2002 From: aahz@rahul.net (Aahz Maruch) Date: Mon, 7 Jan 2002 21:15:13 -0800 (PST) Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <15418.28735.16464.124951@anthem.wooz.org> from "Barry A. Warsaw" at Jan 07, 2002 11:06:23 PM Message-ID: <20020108051513.D953FE8C6@waltz.rahul.net> Barry A. Warsaw wrote: > >>>>> "AM" == Aahz Maruch writes: > > AM> OTOH, if MvL's code is in a shell script, this objection > AM> doesn't apply. > > I must have missed that. Was Martin suggesting a shell script, like > "configure-lfs"? Martin didn't, but it answers your objection. ;-) -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From barry@zope.com Tue Jan 8 06:01:51 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 8 Jan 2002 01:01:51 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <15418.28735.16464.124951@anthem.wooz.org> <20020108051513.D953FE8C6@waltz.rahul.net> Message-ID: <15418.35663.459612.299157@anthem.wooz.org> >>>>> "AM" == Aahz Maruch writes: AM> Martin didn't, but it answers your objection. ;-) Yes, it would. -Barry From martin@v.loewis.de Tue Jan 8 07:08:27 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 8 Jan 2002 08:08:27 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <15418.16816.651860.723486@anthem.wooz.org> (barry@zope.com) References: <200201070543.g075hdc01442@mbuna.arbhome.com.au> <200201070733.g077Xlh01990@mira.informatik.hu-berlin.de> <15417.42588.807910.631165@anthem.wooz.org> <200201072255.g07Mtli01529@mira.informatik.hu-berlin.de> <15418.16816.651860.723486@anthem.wooz.org> Message-ID: <200201080708.g0878Rb01366@mira.informatik.hu-berlin.de> > 2) Any time you overload a make variable that has existing semantics, > you have to worry about losing the original value. Personally, I > think it's easier to get CC overloading right than get OPT or > CFLAGS overloading (and easier than getting them both right). But > maybe that's just me. Ok. For Solaris and Linux, the instruction about setting CC is about right, so I'm no longer objecting to changing the documentation in that direction. It is just that if you specify --without-gcc, or are on SGI or BSD/OS, that your environment setting of CC will be ignored. Regards, Martin From martin@v.loewis.de Tue Jan 8 07:20:12 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Tue, 8 Jan 2002 08:20:12 +0100 Subject: [Python-Dev] Including BSDDB3 In-Reply-To: <200201080154.UAA26450@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Mon, 07 Jan 2002 20:54:09 -0500) References: <200201080016.g080GS602121@mira.informatik.hu-berlin.de> <200201080154.UAA26450@cj20424-a.reston1.va.home.com> Message-ID: <200201080720.g087KC701410@mira.informatik.hu-berlin.de> > Sounds like a good plan, but we should make sure it can all be > re-released under the PSF license. For the Zope Corp. portions of the > code I promise that's no problem :-) -- but there are so many other > contributors that it's getting a little tangled... Ok, I'll investigate. Martin From barry@zope.com Tue Jan 8 07:24:20 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 8 Jan 2002 02:24:20 -0500 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) References: <200201070543.g075hdc01442@mbuna.arbhome.com.au> <200201070733.g077Xlh01990@mira.informatik.hu-berlin.de> <15417.42588.807910.631165@anthem.wooz.org> <200201072255.g07Mtli01529@mira.informatik.hu-berlin.de> <15418.16816.651860.723486@anthem.wooz.org> <200201080708.g0878Rb01366@mira.informatik.hu-berlin.de> Message-ID: <15418.40612.331674.439878@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: >> 2) Any time you overload a make variable that has existing >> semantics, you have to worry about losing the original value. >> Personally, I think it's easier to get CC overloading right >> than get OPT or CFLAGS overloading (and easier than getting >> them both right). But maybe that's just me. MvL> Ok. For Solaris and Linux, the instruction about setting CC MvL> is about right, so I'm no longer objecting to changing the MvL> documentation in that direction. It is just that if you MvL> specify --without-gcc, or are on SGI or BSD/OS, that your MvL> environment setting of CC will be ignored. Good point. This should definitely be mentioned in the docs. then-again-future-google-searches-are-sure-to-turn-up-this-entire-painful -thread-ly y'rs, -Barry From martin@v.loewis.de Tue Jan 8 07:33:53 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 8 Jan 2002 08:33:53 +0100 Subject: [Python-Dev] Including BSDDB3 In-Reply-To: <15418.27262.381031.53951@anthem.wooz.org> (barry@zope.com) References: <200201080016.g080GS602121@mira.informatik.hu-berlin.de> <15418.27262.381031.53951@anthem.wooz.org> Message-ID: <200201080733.g087XrJ01436@mira.informatik.hu-berlin.de> > - It'd be great if we actually provided bsddb1, bsddb2, bsddb3 (and > bsddb4?) modules which compile against the older libraries so > databases written with any version could be accessed in Python. > Maybe that's not exactly the right way to do it, but I don't think > Python should be limited to just one version of Berkeley db. I've > no idea what the default ought to be -- there's no clear winner. I'm not sure how that would work, though. Are you thinking of different code bases for the modules, or just compiling the same module multiple times? If the latter, how do you deal with features that are available only in later versions? E.g. I doubt that the current _db.c compiles with bsddb2 (not sure it even compiles with 3.0; it may be that 3.1 is required as a minimum). This *could* be solved with lots of #ifdefs in _db.c, but that sounds difficult to get right (who has so many versions installed to actually test that?). 
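Just to make the shape of that concrete, every version-sensitive call would end up wrapped in a guard like this (a rough sketch from memory, relying on the DB_VERSION_MAJOR macro from db.h; the 2.x and 1.85 branches are only hinted at):

    DB *db;
    int err;
#if DB_VERSION_MAJOR >= 3
    /* 3.x: create a handle first, then open it */
    err = db_create(&db, NULL, 0);
    if (err == 0)
        err = db->open(db, filename, NULL, DB_HASH, DB_CREATE, 0664);
#elif DB_VERSION_MAJOR == 2
    /* 2.x: the old one-shot db_open() call, with different arguments */
#else
    /* 1.85: the dbopen() interface from db_185.h */
#endif

Multiply that by every call that changed between releases and it becomes very hard to keep correct.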
Also, I think it is rare that multiple versions are installed on a single system: I doubt BSDDB even supports simultaneous installation of multiple header file sets, on Unix. So even while you can have multiple versions of the shared library installed, compiling it for use with these libraries may be tricky. About the only case where I know about different systems is on Linux, where glibc incorporates a version of BSDDB2, so you might find database file of that version that the more recent BSDDB3 cannot open, anymore. For any other scenario, users are to blame for forgetting to update their database files when updating the libraries. Regards, Martin From martin@v.loewis.de Tue Jan 8 07:39:28 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 8 Jan 2002 08:39:28 +0100 Subject: Large file system support in 2.1.2 (was Re: [Python-Dev] release for 2.1.2, plus 2.2.1...) In-Reply-To: <15418.28735.16464.124951@anthem.wooz.org> (barry@zope.com) References: <15418.16816.651860.723486@anthem.wooz.org> <20020108013834.1FC12E8C6@waltz.rahul.net> <15418.28735.16464.124951@anthem.wooz.org> Message-ID: <200201080739.g087dSh01439@mira.informatik.hu-berlin.de> > AM> OTOH, if MvL's code is in a shell script, this objection > AM> doesn't apply. > > I must have missed that. Was Martin suggesting a shell script, like > "configure-lfs"? No, I was really talking about the instructions in the manual, which would then indeed result in OPT being in the environment after configure has completed. If that is considered unacceptable, I'm fine with documenting that CC should be set in the environment - even though such instruction may also break on some systems. Enhancing configure to take into account more environment variables is worse: the risk of introducing new errors is just too high. Regards, Martin From nhodgson@bigpond.net.au Tue Jan 8 07:46:01 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Tue, 8 Jan 2002 18:46:01 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> Message-ID: <060d01c19818$8476cda0$0acc8490@neil> Martin: > I'd be all in favour of bringing ntmodule back into life, especially > if that is to become a module that does not need to work on > Win9x. Perhaps it can be compiled twice, once into w9x.pyd and once > into nt.pyd, or the common code can be shared by means if #include. I reversed again, posixmodule now detects Unicode arguments and handles them in UCS-2 rather than converting to UTF-8 and back again. This now looks like the right way to me. The total amount of code bloat is about 8K over a 150K file and this doesn't appear to be too much for me. A check is made to see if the platform supports Unicode file names and if it does not then the old conversion to Py_FileSystemDefaultEncoding is done. 
This means that Windows 9x should work the same as it currently does. This check is exposed as os.unicodefilenames() so that client code can decide whether to use Unicode. For other OSs that can support Unicode file names, additional cases can be added into posixmodule. The other platforms (OS X for example) may not provide these functions as taking UCS-2 arguments but instead UTF-8 arguments. They should still work similarly to the NT code but encode into UTF-8 before making system calls. The basic idea is that if you use a Unicode string for a file or path name in a call then returned information is in Unicode strings. > > I'm feeling more like making f_name be wide now but I'd expect some > opposition now from backwards compatibility advocates. This is now done. > I think the major problem is that performing repr on a file should > work. If that turns out to use the repr of the string (can't check > right now), instead of raising UnicodeErrors, my opposition to putting > Unicode objects into file names is not that strong anymore. Changed the repr to display Unicode names using escapes so it does not raise errors. _getfullpathname, which is available from nt and is used in ntpath, now accepts a Unicode argument and then returns a Unicode path. Haven't checked ntpath to see if it will work with Unicode. New code at http://scintilla.sourceforge.net/winunichanges.zip After waiting a while for comments, I'll package this up as a patch. Neil From mal@lemburg.com Tue Jan 8 09:56:58 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 08 Jan 2002 10:56:58 +0100 Subject: [Python-Dev] PEP-time ? (Unicode strings as filenames) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> Message-ID: <3C3AC26A.D40842FB@lemburg.com> [Martin and Neil discussing various ways to add Unicode support to posixmodule] Guys, this discussion is getting somewhat out of hand. I believe that no-one on python-dev is seriously following this anymore, yet OTOH you are working on a rather important part of the Python file API. I'd suggest writing up the problem and your conclusions as a PEP for everyone to understand before actually starting to check in anything. One thing I'd like to note (again) is that the code base is getting somewhat confusing in this area. It may be better to rip out the various bits and pieces for each supported platform and put the implementations into separate files -- much like what Greg has done for the DLL import machinery. This will reduce the levels of #ifdefs and make the whole API much more readable and understandable.
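Concretely, either the dynload_*.c approach (configure picks exactly one file to compile) or a simple include-based split would do; as a sketch of the latter (file names purely illustrative):

    /* posixmodule.c */
#ifdef MS_WINDOWS
#include "posix_nt.c"      /* wide-character Win32 implementations */
#elif defined(macintosh)
#include "posix_mac.c"     /* Mac toolbox calls */
#else
#include "posix_unix.c"    /* plain POSIX implementations */
#endif

so that each implementation file only needs to know about its own platform.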
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From skip@pobox.com Tue Jan 8 15:10:25 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 8 Jan 2002 09:10:25 -0600 Subject: [Python-Dev] Including BSDDB3 In-Reply-To: <200201080733.g087XrJ01436@mira.informatik.hu-berlin.de> References: <200201080016.g080GS602121@mira.informatik.hu-berlin.de> <15418.27262.381031.53951@anthem.wooz.org> <200201080733.g087XrJ01436@mira.informatik.hu-berlin.de> Message-ID: <15419.3041.198325.982111@12-248-41-177.client.attbi.com> >> - It'd be great if we actually provided bsddb1, bsddb2, bsddb3 (and >> bsddb4?) modules which compile against the older libraries so >> databases written with any version could be accessed in Python. Martin> I'm not sure how that would work, though. Agreed. I think trying to use multiple versions of libdb-generated files simultaneously is a disaster waiting to happen. It's unfortunate that the folks at Sleepycat haven't been able to provide a more consistent data format, but I understand that stuff is internal details and can change. They have been pretty good about providing update tools. What would be useful is if whatever bsddb module is installed could be more intelligent about file version errors. Instead of reporting something inscrutable like >>> db = bsddb.hashopen("tour.db") Traceback (most recent call last): File "", line 1, in ? bsddb.error: (-30990, 'Unknown error 4294936306') I'd like it to realize that it was asked to open an old format file and give a useful error message like: bsddb.error: (-30990, 'Attempt to open old format file - see db_upgrade(1)') Sleepycat's tools can do this in the face of old files: % file tour.db tour.db: Berkeley DB (Hash, version 5, native byte-order) % db_dump tour.db > tour.txt db_dump: tour.db: hash version 5 requires a version upgrade db_dump: open: tour.db: DB_OLDVERSION: Database requires a version upgrade % db_upgrade tour.db % file tour.db tour.db: Berkeley DB (Hash, version 7, native byte-order) % db_dump tour.db > tour.txt Martin> Also, I think it is rare that multiple versions are installed on Martin> a single system: I doubt BSDDB even supports simultaneous Martin> installation of multiple header file sets, on Unix. Actually, RedHat & Mandrake do. This leads to as many problems as it solves. Take a look at the code in setup.py: dblib = [] if self.compiler.find_library_file(lib_dirs, 'db-3.2'): dblib = ['db-3.2'] elif self.compiler.find_library_file(lib_dirs, 'db-3.1'): dblib = ['db-3.1'] elif self.compiler.find_library_file(lib_dirs, 'db3'): dblib = ['db3'] elif self.compiler.find_library_file(lib_dirs, 'db2'): dblib = ['db2'] elif self.compiler.find_library_file(lib_dirs, 'db1'): dblib = ['db1'] elif self.compiler.find_library_file(lib_dirs, 'db'): dblib = ['db'] db185_incs = find_file('db_185.h', inc_dirs, ['/usr/include/db3', '/usr/include/db2']) db_inc = find_file('db.h', inc_dirs, ['/usr/include/db1']) And it's still not correct, as Barry indicated yesterday. For example, suppose that even though db3 is installed on your system you want to only manipulate db2 databases (perhaps for compatibility with another machine). You're stuck and have to edit setup.py or use Modules/Setup to build bsddb. Martin> So even while you can have multiple versions of the shared Martin> library installed, compiling it for use with these libraries may Martin> be tricky. 
Got that right... ;-) Martin> For any other scenario, users are to blame for forgetting to Martin> update their database files when updating the libraries. In the presence of anydbm, it's not obvious that users should know what file format their underlying databases are. Skip From barry@zope.com Tue Jan 8 16:47:41 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 8 Jan 2002 11:47:41 -0500 Subject: [Python-Dev] Including BSDDB3 References: <200201080016.g080GS602121@mira.informatik.hu-berlin.de> <15418.27262.381031.53951@anthem.wooz.org> <200201080733.g087XrJ01436@mira.informatik.hu-berlin.de> <15419.3041.198325.982111@12-248-41-177.client.attbi.com> Message-ID: <15419.8877.488862.541659@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: >> - It'd be great if we actually provided bsddb1, bsddb2, bsddb3 >> (and bsddb4?) modules which compile against the older libraries >> so databases written with any version could be accessed in >> Python. Martin> I'm not sure how that would work, though. SM> Agreed. Oops. I thought I had read that pybsddb could be compiled against older APIs. But on a re-read of the pages, that's obviously wrong, so forget this dumb idea. SM> What would be useful is if whatever bsddb module is installed SM> could be more intelligent about file version errors. +1 Martin> Also, I think it is rare that multiple versions are Martin> installed on a single system: I doubt BSDDB even supports Martin> simultaneous installation of multiple header file sets, on Martin> Unix. SM> Actually, RedHat & Mandrake do. This leads to as many SM> problems as it solves. Indeed, this is broken on Mandrake. I was trying to get Postfix and Python to at least agree on the BDB version they were going to use and it wasn't until I installed pybsddb from source, and rebuilt Postfix against the separately downloaded Berkeley 3.3.11 libs/API that I got it all to work. SM> Take a look at the code in setup.py: BTW, I think this a large part of the problem when building Py2.2 on Mandrake 8.1. Maybe these lines in the setup are /too/ smart? I seem to remember having no problems w/ Py2.1.1. But that's excusable I suppose since pybsddb's setup.py has its own problems! It should at least recognize a default from-source install of Sleepycat's libs w/o lots of cryptic command line options. And getting "python setup.py clean -a" to work right would be a bonus. :) Also note that pybsddb should now (or soon) work with Berkeley DB 4 so calling it bsddb3 isn't right either. I don't think there's a db format change from BDB 3 -> BDB 4. bsddb-ng? :) Okay, I'm rambling. Let's add pybsddb (under a better name) and keep bsddbmodule around and /try/ to fix some of the worst installation problems. The state of Berkeley DB on various distros doesn't make our lives easy here, but let's not add to the problems, if at all possible. I'm willing to help out with all this. We should also get buy-in from Robin since we also don't want to fork develoment or have to keep the two in sync. -Barry From tim.one@home.com Tue Jan 8 18:41:37 2002 From: tim.one@home.com (Tim Peters) Date: Tue, 8 Jan 2002 13:41:37 -0500 Subject: [Python-Dev] eval() slowdown in 2.2 on MacOS X? In-Reply-To: Message-ID: [Andrew Kuchling] > [CC'ed to python-dev, Barbara Mattson] > > Barbara's encountered an apparent problem with test_longexp in Python > 2.2 on MacOS X. test_longexp creates a big list expression and > eval()'s it. 
The problem is that it takes an exceedingly long time to > run, at least more than half an hour (at which point she interrupted > it). > > The two curious things are that 1) while test_longexp requires a lot > of memory and often thrashes on a low-memory machine (I found there > are 2 or 3 bugs in the SF bugtracker to this effect), the MacOS box in > question has a gigabyte of RAM, and 2) Python 2.1.1 *doesn't* show the > problem. The test takes about 2 seconds on my box (Win98SE, 256MB, 866MHz), in 2.2 or 2.1.1, and I don't know of any Mac-specific code that might get touched here except for the C library. So Skip's suggestion to try pymalloc is a good one -- although it's hard to see in advance why that would make a difference in this specific case. > Quoting from her report: > > I tried the test_longexp by hand: > > REPS = XXX > l = eval("[" + "2," * REPS + "]") > print len(l) Break it into smaller steps so we can narrow down possible causes: REPS = 50000 print "building list guts" guts = "2," * REPS print "building input string" input = "[" + guts + "]" print "compiling the input string" code = compile(input, "", "eval") print "executing" thelist = eval(code) print len(thelist) When REPS is large, what's the last thing that gets printed before the huge delay starts? From thomas.heller@ion-tof.com Tue Jan 8 19:23:09 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Tue, 8 Jan 2002 20:23:09 +0100 Subject: [Python-Dev] unicode/string asymmetries Message-ID: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> I noticed several unicode/string asymmetries: 1. No support for unicode in the struct and array modules. Is this an oversight? 2. What would be the corresponding unicode format character for 'z' in the struct module (string or None)? 3. There does not seem to be an equivalent to the 's' format character for PyArg_Parse() or Py_BuildValue(). Thomas From martin@v.loewis.de Tue Jan 8 19:52:29 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 8 Jan 2002 20:52:29 +0100 Subject: [Python-Dev] PEP-time ? (Unicode strings as filenames) In-Reply-To: <3C3AC26A.D40842FB@lemburg.com> (mal@lemburg.com) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> Message-ID: <200201081952.g08JqTC01580@mira.informatik.hu-berlin.de> > I'd suggest to write up the problem and your conclusions as a > PEP for everyone to understand before actually starting to > checkin anything. We certainly would, if we had achieved any conclusions yet. If you want, we can continue discussion in private. Regards, Martin From greg@electricrain.com Tue Jan 8 20:21:27 2002 From: greg@electricrain.com (Gregory P. 
Smith) Date: Tue, 8 Jan 2002 12:21:27 -0800 Subject: [pybsddb] Re: [Python-Dev] Including BSDDB3 In-Reply-To: <15418.28952.542216.58713@anthem.wooz.org>; from barry@zope.com on Mon, Jan 07, 2002 at 11:10:00PM -0500 References: <200201080016.g080GS602121@mira.informatik.hu-berlin.de> <200201080154.UAA26450@cj20424-a.reston1.va.home.com> <15418.28952.542216.58713@anthem.wooz.org> Message-ID: <20020108122127.A18130@zot.electricrain.com> On Mon, Jan 07, 2002 at 11:10:00PM -0500, Barry A. Warsaw wrote: > > >>>>> "GvR" == Guido van Rossum writes: > > GvR> Sounds like a good plan, but we should make sure it can all > GvR> be re-released under the PSF license. For the Zope > GvR> Corp. portions of the code I promise that's no problem :-) -- > GvR> but there are so many other contributors that it's getting a > GvR> little tangled... > > I /think/ we're just talking mostly about Robin Dunn and Andrew > Kuchling. From the description on the page, I can't quite tell > whether any of Gregory P. Smith's original code remains. > > i'm-sure-andrew-won't-mind-either-ly y'rs, > -Barry Consider any of my pybsddb/bsddb3 code that remains [some does i'm sure] placed under whatever open source license is needed, (PSF license, etc). (I prefer the code to be used, not bickered about :). -g -- Gregory P. Smith From martin@v.loewis.de Tue Jan 8 20:24:57 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 8 Jan 2002 21:24:57 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> (thomas.heller@ion-tof.com) References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> Message-ID: <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> > I noticed several unicode/string asymmetries: > > 1. No support for unicode in the struct and array modules. > Is this an oversight? I'd call it intentional. What exactly would you like to happen? > 2. What would be the corresponding unicode format character for 'z' > in the struct module (string or None)? You mean, in getargs? There is no corresponding thing. I'd recommend against adding new formats. Instead, I'd propose to add new conversion functions: Py_UNICODE *str; PyArg_ParseTuple(args, "O&", &str, PyArg_UnicodeZ); int PyArg_UnicodeZ(PyObject *o, void *d){ PyUnicode **dest = (Py_UNICODE**)d; if (o == Py_None) { *dest = NULL; return 1; } if (PyUnicode_Check(o)){ *dest = PyUnicode_AS_UNICODE(o); return 1; } PyErr_SetString(PyExc_TypeError, "unicode or None expected"); return 0; } It may be desirable to allow passing of : or ; strings to conversion functions, and helper API to format the errors. > 3. There does not seem to be an equivalent to the 's' format character > for PyArg_Parse() or Py_BuildValue(). That would be 'u'. However, is this really needed? PyArg_Parse is deprecated, and I doubt you have Py_UNICODE* often enough to need it to pass to Py_BuildValue. Regards, Martin From martin@v.loewis.de Tue Jan 8 20:52:27 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Tue, 8 Jan 2002 21:52:27 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <060d01c19818$8476cda0$0acc8490@neil> (nhodgson@bigpond.net.au) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <060d01c19818$8476cda0$0acc8490@neil> Message-ID: <200201082052.g08KqRJ01826@mira.informatik.hu-berlin.de> > I reversed again, posixmodule now detects Unicode arguments and handles > them in UCS-2 rather than converting to UTF-8 and back again. This now looks > like the right way to me. The total amount of code bloat is about 8K over a > 150K file and this doesn't appear to be too much for me. I agree. We still should keep "mbcs", so extension modules that don't want to go through the troubles of special-casing Windows will be able to get it right most of the time. > A check is made to see if the platform supports Unicode file names and if > it does not then the old conversion to Py_FileSystemDefaultEncoding is done. > This means that Windows 9x should work the same as it currently does. This > check is exposed as os.unicodefilenames() so that client code can decide > whether to use Unicode. That has unclear semantics for me. It sounds like "if true, you can pass Unicode strings to open etc." However, then it should return 1 on all systems, since you always can - the default encoding may apply, and restrict file names to ASCII. Or, it may mean "if true, you can pass all Unicode strings to open". This is not true, either, because there are always reserved characters (such as the path delimiter). > For other OSs that can support Unicode file names, adiitional cases can > be added into posixmodule. The other platforms (OS X for example) may not > provide these functions as taking UCS-2 arguments but instead UTF-8 > arguments. They should still work similarly to the NT code but encode into > UTF-8 before making system calls. I think this is not needed. Instead, using setting the file system encoding to UTF-8 should be sufficient. > After waiting a while for comments, I'll package this up as a patch. Very good. Would you also write the PEP? If not, I will, but that may take some time. Regards, Martin From thomas.heller@ion-tof.com Tue Jan 8 21:24:55 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Tue, 8 Jan 2002 22:24:55 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> Message-ID: <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> > > I noticed several unicode/string asymmetries: > > > > 1. No support for unicode in the struct and array modules. > > Is this an oversight? > > I'd call it intentional. What exactly would you like to happen? 
I would like to create struct's containing unicode characters (be gentle with me, maybe I mean wide characters, or mbcs, but I'm really not sure) Thomas From nhodgson@bigpond.net.au Tue Jan 8 21:55:14 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Wed, 9 Jan 2002 08:55:14 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <060d01c19818$8476cda0$0acc8490@neil> <200201082052.g08KqRJ01826@mira.informatik.hu-berlin.de> Message-ID: <01f601c1988f$27222920$0acc8490@neil> Martin: > That has unclear semantics for me. It sounds like "if true, you can > pass Unicode strings to open etc." However, then it should return 1 on > all systems, since you always can - the default encoding may apply, > and restrict file names to ASCII. Or, it may mean "if true, you can > pass all Unicode strings to open". This is not true, either, because > there are always reserved characters (such as the path delimiter). OK, it means: If true, the underlying system supports file names containing most Unicode characters and any valid file name may be passed to open as a Unicode string. Yes, the "most" is fuzzy but just as with normal strings, the file system gets to put special meaning on delimiters, restrict file name length, and disallow characters such as \u0000. > > After waiting a while for comments, I'll package this up as a patch. > > Very good. Would you also write the PEP? If not, I will, but that may > take some time. I'll try in the next day or so but may bail if not able to work on it much as I have some backlog from spending time on this rather than other projects. Neil From martin@v.loewis.de Tue Jan 8 22:17:53 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 8 Jan 2002 23:17:53 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> (thomas.heller@ion-tof.com) References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> Message-ID: <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> > > > 1. No support for unicode in the struct and array modules. > > > Is this an oversight? > > > > I'd call it intentional. What exactly would you like to happen? > > I would like to create struct's containing unicode characters > (be gentle with me, maybe I mean wide characters, or mbcs, but I'm really > not sure) Well, that is precisely the problem: When putting a Unicode object into a C structure, there are too many alternatives to pick a sensible default. It is not even clear what a "wide character" is: it mide be a value of wchar_t, or it might be a value of Py_UNICODE (those differ on Unix, in the default installation). 
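[The build dependence described just above can be seen from Python itself. A minimal probe, assuming a Python 2.2-era interpreter as used elsewhere in this thread; this is illustrative code, not code from the thread:

    import sys

    # Size in bytes of the interpreter's internal Py_UNICODE unit, read
    # through the buffer interface of a one-character unicode string.
    unit = len(buffer(u"A"))
    print "internal unit size:", unit        # 2 on a narrow (UCS-2) build, 4 on a wide (UCS-4) build
    print "sys.maxunicode:", sys.maxunicode  # 65535 (narrow) or 1114111 (wide)

Neither value is guaranteed to match the platform's sizeof(wchar_t), which is exactly the ambiguity being pointed out.]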
For "MBCS", the most reasonable default might be "utf-8", since this capable of encoding all characters. On Windows, "mbcs" is also a good choice, since it uses the encoding that all character API uses. Why are you asking? Do you have a specific implementation in mind, or are you just worried that Unicode objects cannot be put into structures? Don't worry, file objects cannot be put into structures, either :-) Regards, Martin From martin@v.loewis.de Tue Jan 8 22:19:40 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 8 Jan 2002 23:19:40 +0100 Subject: [Python-Dev] Unicode strings as filenames In-Reply-To: <01f601c1988f$27222920$0acc8490@neil> (nhodgson@bigpond.net.au) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <060d01c19818$8476cda0$0acc8490@neil> <200201082052.g08KqRJ01826@mira.informatik.hu-berlin.de> <01f601c1988f$27222920$0acc8490@neil> Message-ID: <200201082219.g08MJe808681@mira.informatik.hu-berlin.de> > If true, the underlying system supports file names containing most > Unicode characters and any valid file name may be passed to open as a > Unicode string. So what is the value of exposing this to Python? It seems to be Windows-specific, so I doubt it should be generalized. Regards, Martin From nhodgson@bigpond.net.au Tue Jan 8 23:35:29 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Wed, 9 Jan 2002 10:35:29 +1100 Subject: [Python-Dev] Unicode strings as filenames References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <060d01c19818$8476cda0$0acc8490@neil> <200201082052.g08KqRJ01826@mira.informatik.hu-berlin.de> <01f601c1988f$27222920$0acc8490@neil> <200201082219.g08MJe808681@mira.informatik.hu-berlin.de> Message-ID: <035c01c1989d$28710ef0$0acc8490@neil> Martin: > > If true, the underlying system supports file names containing most > > Unicode characters and any valid file name may be passed to open as a > > Unicode string. > > So what is the value of exposing this to Python? It seems to be > Windows-specific, so I doubt it should be generalized. 
It differentiates between those systems where open decodes Unicode file names into a particular locale (possibly losing information) and those systems that preserve Unicode file names. The set of systems where this is true could change in the future. A sufficiently motivated Windows 9x user could make it work there, possibly by looking for the long names in the directory data and converting them to short names. When this is false, client code may be prepared to offer a more reasonable error message indicating the the locale may be set incorrectly or even try multiple locales in order to open a file. Mmm, there is a Japanese character in that file name so I'll try temporarily changing the locale to Japanese to open the file. Neil From guido@python.org Wed Jan 9 01:48:28 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 08 Jan 2002 20:48:28 -0500 Subject: [Python-Dev] Please help making the Python track at OSCON 2002 a success! Message-ID: <200201090148.UAA25635@cj20424-a.reston1.va.home.com> July 22-26 is the date for O'Reilly's Open Source Convention. San Diego is the location. I've been enlisted by O'Reilly to try and make the Python track a success. But I can't do it by myself: I need people to help rustle up speakers and review proposals for presentations and tutorials. If you think you'll be able to make it to the conference this year, please consider helping out! See here for more info: http://conferences.oreillynet.com/cs/os2002/create/e_sess If you want to help, please let me know! --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Wed Jan 9 07:51:15 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 9 Jan 2002 08:51:15 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> Message-ID: <024a01c198e2$823d2280$e000a8c0@thomasnotebook> > > I would like to create struct's containing unicode characters > > (be gentle with me, maybe I mean wide characters, or mbcs, but I'm really > > not sure) > > Well, that is precisely the problem: When putting a Unicode object > into a C structure, there are too many alternatives to pick a sensible > default. It is not even clear what a "wide character" is: it mide be a > value of wchar_t, or it might be a value of Py_UNICODE (those differ > on Unix, in the default installation). > > For "MBCS", the most reasonable default might be "utf-8", since this > capable of encoding all characters. On Windows, "mbcs" is also a good > choice, since it uses the encoding that all character API uses. > > Why are you asking? Do you have a specific implementation in mind, or > are you just worried that Unicode objects cannot be put into > structures? Don't worry, file objects cannot be put into structures, > either :-) Hehe, I don't want to put objects in structures, I just want to buid structures containing "Unicode strings". Actually, in this case I'm trying to build a win32 VS_VERSIONINFO structure, which contains a field WCHAR szKey[]. MSDN says: szKey Contains the Unicode string "VS_VERSION_INFO". 
Currently I use something like the following code to access the raw buffer: struct.pack("32s", str(buffer(u"VS_VERSION_INFO"))) Looks strange but works: >>> print repr(struct.pack("32s", (str(buffer(u"VS_VERSION_INFO"))))) 'V\x00S\x00_\x00V\x00E\x00R\x00S\x00I\x00O\x00N\x00_\x00I\x00N\x00F\x00O\x00\x00\x00' Thomas From mal@lemburg.com Wed Jan 9 09:02:54 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 09 Jan 2002 10:02:54 +0100 Subject: [Python-Dev] PEP-time ? (Unicode strings as filenames) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <200201081952.g08JqTC01580@mira.informatik.hu-berlin.de> Message-ID: <3C3C073E.3DC68349@lemburg.com> "Martin v. Loewis" wrote: > > > I'd suggest to write up the problem and your conclusions as a > > PEP for everyone to understand before actually starting to > > checkin anything. > > We certainly would, if we had achieved any conclusions yet. If you > want, we can continue discussion in private. No, please keep it on python-dev; at least then the arguments will be kept in the archives. Still, I don't expect anyone here to closely follow the discussion and with most of the PythonLabs team being busy on other tasks you'll have to find some way to summarize the discussion for them and others to review at some later point in time. PEPs are the right method for this, IMHO. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From fredrik@pythonware.com Wed Jan 9 09:02:12 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 9 Jan 2002 10:02:12 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> <024a01c198e2$823d2280$e000a8c0@thomasnotebook> Message-ID: <077201c198ec$56b9b740$0900a8c0@spiff> thomas wrote: > Hehe, I don't want to put objects in structures, I just want to buid > structures containing "Unicode strings". there is no such thing. what you want is a binary buffer with an *encoded* unicode string. to get one, figure out what encoding you need (probably utf-16-le), convert the string to a byte string using the encode method, and store that byte string in your struct. def wu(str): # encode unicode string for win32 apis return str.encode("utf-16-le") struct.pack("32s", wu(u"VS_VERSION_INFO")) > > struct.pack("32s", str(buffer(u"VS_VERSION_INFO"))) that's evil: you're assuming that Python will always use the same internal representation for unicode strings. that's not the case. 
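[To make the contrast concrete, a minimal side-by-side sketch; it reuses Thomas's 32-byte szKey field from earlier in the thread purely as an example, and assumes the Python 2.x interpreter under discussion:

    import struct

    key = u"VS_VERSION_INFO"

    # Fragile: copies whatever the interpreter's internal Py_UNICODE layout
    # happens to be (UCS-2 vs UCS-4, native byte order), so it only yields
    # the bytes Win32 expects on a narrow little-endian build.
    fragile = struct.pack("32s", str(buffer(key)))

    # Explicit: always produces little-endian 16-bit code units, regardless
    # of how this particular Python was built.
    explicit = struct.pack("32s", key.encode("utf-16-le"))

    print repr(explicit)
    # 'V\x00S\x00_\x00V\x00E\x00R\x00S\x00I\x00O\x00N\x00_\x00I\x00N\x00F\x00O\x00\x00\x00'

On a narrow little-endian build the two results happen to be identical; on any other build only the explicit encode() gives the layout the Win32 structure expects.]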
From mal@lemburg.com Wed Jan 9 09:33:19 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 09 Jan 2002 10:33:19 +0100 Subject: [Python-Dev] parser markers vs. conversion functions (unicode/string asymmetries) References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> Message-ID: <3C3C0E5F.4B86BB4E@lemburg.com> "Martin v. Loewis" wrote: > > > 2. What would be the corresponding unicode format character for 'z' > > in the struct module (string or None)? > > You mean, in getargs? There is no corresponding thing. > > I'd recommend against adding new formats. Instead, I'd propose to add > new conversion functions: > > Py_UNICODE *str; > PyArg_ParseTuple(args, "O&", &str, PyArg_UnicodeZ); > > int PyArg_UnicodeZ(PyObject *o, void *d){ > ... > } Why do you think that adding the conversion functions to getargs.c would be any different from adding new parser markers ? As I understand "O&", it is meant for user-space conversion functions, not system provided ones. The latter can easily be intergated as parser markers or options to parser markers. Unless, of course, you want to start shifting from parser markers to conversion functions completely (which I doubt). Note that "O&" doesn't really buy you anything much: you could just as well use "O" and then switch on the returned object type or call a converter (with all the extra error handling or other extra information needed for your particular case). > It may be desirable to allow passing of : or ; strings to conversion > functions, and helper API to format the errors. You'd need a new parser marker option to support this new interface. In the end, I don't believe we gain much from beefing up the "O&" interface. I'd rather like to see the Unicode parser markers extended to be more useful (I'll checkin a patch for "u#" later today). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Wed Jan 9 10:17:30 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 9 Jan 2002 11:17:30 +0100 Subject: [Python-Dev] parser markers vs. conversion functions (unicode/string asymmetries) In-Reply-To: <3C3C0E5F.4B86BB4E@lemburg.com> (mal@lemburg.com) References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <3C3C0E5F.4B86BB4E@lemburg.com> Message-ID: <200201091017.g09AHU801428@mira.informatik.hu-berlin.de> > Why do you think that adding the conversion functions to getargs.c > would be any different from adding new parser markers ? For two reasons: - people who want portability across Python versions can better maintain their source code. They just need to provide a definition of the conversion function for older Python versions, which they can copy literally from the more recent version. - the code becomes more readable, since function names are more self-documenting than single letter codes. > As I understand "O&", it is meant for user-space conversion functions, > not system provided ones. It may have been originally defined for that purpose. I believe it would useful to provide a standard library of such functions. > Unless, of course, you want to start shifting from parser markers to > conversion functions completely (which I doubt). 
I would, in fact, prefer if the set of conversion codes is frozen, and extended only for cases that are likely to get wide applicability. I believe many of the codes invented for Unicode have never been used in any module, it seems that some have been invented just for an abstract notion of "symmetry". > Note that "O&" doesn't really buy you anything much: you could > just as well use "O" and then switch on the returned object > type or call a converter (with all the extra error handling > or other extra information needed for your particular case). People are apparently fond of a single function that simultaneously checks the validity of all arguments. If it fails, it will completely clean up. That makes me wonder about the existing converters and their cleanup capabilities: Suppose I do char *buffer = NULL; int i; if (PyArg_ParseTuple(args, "eti", &buffer, &i)) return NULL; Now suppose I pass a Unicode object for the first argument, and a list for the second. Is it true that this code will leak? since the first argument has already been converted, and the second leads to an error, the encoded string has already been produced. > In the end, I don't believe we gain much from beefing up the > "O&" interface. I'd rather like to see the Unicode parser > markers extended to be more useful (I'll checkin a patch for > "u#" later today). How will that deal with string objects? Regards, Martin From jack@oratrix.nl Wed Jan 9 11:55:12 2002 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 09 Jan 2002 12:55:12 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: Message by "Martin v. Loewis" , Mon, 7 Jan 2002 23:50:47 +0100 , <200201072250.g07Molc01523@mira.informatik.hu-berlin.de> Message-ID: <20020109115512.5B6A1E8451@oratrix.oratrix.nl> > > All the Mac toolbox objects (Windows, Dialogs, Controls, Menus and a > > zillion more), All the Windows HANDLEs, all the MFC objects (although > > they might be a bit more difficult), the objects in the X11 and Motif > > modules, the pyexpat parser object, *dbm objects, dlmodule objects, > > mpz objects, zlib objects, SGI cl and al objects.... > > Could you please try once more, being serious this time? AFAICT, I was > asking for examples of types that are parsed by means of O& currently, > and do so just to get a void** from the python object. Shall we try to keep this civil, please? I *am* being serious, and I'm getting slightly upset that with this subject (again) you appear to start shooting away without trying very hard to understand the issue I'm raising. > Looking at pyexpat.c, I find a few uses of O&, none related to the > pyexpat parser object. In zlibmodule.c, I find not a single mentioning > of O&, likewise in dlmodule.c, clmodule.c, almodule.c, dbmmodule.c, > and now I'm losing interest into verifying more of your examples. Ok, let me rephrase my list then. The first five items in my list, which you carefully ignored, are examples of objects that now already make heavy use of O&. The rest are examples of other objects that wrap a C pointer, and which could potentially also be opened up to use in struct or calldll. And to give a complete example of how useful this would be consider the following. I'll give a mac-centric example, because I don't know enough about calldll on windows (and I don't think there's a unix version yet). Assume you're using Python to extend Photoshop. Assume Photoshop has an API to allow the plugin to get at the screen. 
Let's assume that there's a C call extern GrafPtr ps_GetDrawableSurface(void); to get at the datastructure you need to draw to. These GrafPtr's are (in Mac/Modules/qd/_Quickdraw.c) wrapped in Carbon.Qd.GrafPortType objects in Python. In the current situation, if you would want to wrap this ps_GetDrawableSurface function you would need to write a C wrapper (which means you would need a C compiler, etc etc) because you would need to convert the return value with ("O&", GrafObj_new). If we had something like ("O@", typeobject) calldll could be extended so you could do something like psapilib = calldll.getlibrary(....) ps_GetDrawableSurface = calldll.newcall(psapilib.ps_GetDrawableSurface, Carbon.Qd.GrafPortType) (newcall() arguments are funcpointer, return value type, arg1 type, ...) You cannot do this currently, because there is no way to get from the type object (which is the only thing you have available in Python) to the functions you need to pass to O&. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From jack@oratrix.nl Wed Jan 9 12:12:56 2002 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 09 Jan 2002 13:12:56 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: Message by "Martin v. Loewis" , Tue, 8 Jan 2002 21:24:57 +0100 , <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> Message-ID: <20020109121256.2758DE8451@oratrix.oratrix.nl> > > 3. There does not seem to be an equivalent to the 's' format character > > for PyArg_Parse() or Py_BuildValue(). > > That would be 'u'. However, is this really needed? PyArg_Parse is > deprecated, Huh, what did I miss? Why is PyArg_Parse deprecated, and by what should it be replaced? > and I doubt you have Py_UNICODE* often enough to need > it to pass to Py_BuildValue. Martin, have you ever wrapped any Unicode API's? (As opposed to using unicode as a purely internal datatype, which you clearly know a lot about). Thomas' question are similar to mine from last week, and Neil's are related too. All the niceties we have for strings (optional ones with z, autoconversion from unicode, s# to get the size) are missing for unicode, and that's a pain when you're wrapping an existing C api. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From jack@oratrix.nl Wed Jan 9 12:25:19 2002 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 09 Jan 2002 13:25:19 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: Message by "Fredrik Lundh" , Wed, 9 Jan 2002 10:02:12 +0100 , <077201c198ec$56b9b740$0900a8c0@spiff> Message-ID: <20020109122519.75A4AE8451@oratrix.oratrix.nl> > thomas wrote: > > > Hehe, I don't want to put objects in structures, I just want to buid > > structures containing "Unicode strings". > > there is no such thing. > > what you want is a binary buffer with an *encoded* > unicode string. It becomes more and more clear to me that there are two groups of people on this list: those who understand unicode (and may or may not actually use it) and those who want to use unicode (but apparently don't understand it). I'm in the second group:-) > to get one, figure out what encoding you need (probably > utf-16-le), convert the string to a byte string using the > encode method, and store that byte string in your struct. 
> > def wu(str): > # encode unicode string for win32 apis > return str.encode("utf-16-le") > > struct.pack("32s", wu(u"VS_VERSION_INFO")) Why would you have to specify the encoding if what you want is the normal, standard encoding? Or, to rephrase the question, why do C programmers only have to s/char/wchar_t/, add a "w" to the front of the routine names and a u in front of the string constants, whereas Python programmers are now suddenly expected to learn all this mumbo-jumbo about encodings and such? -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From fredrik@pythonware.com Wed Jan 9 13:16:30 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 9 Jan 2002 14:16:30 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <20020109122519.75A4AE8451@oratrix.oratrix.nl> Message-ID: <08b701c1990f$dbdb0e10$0900a8c0@spiff> jack wrote: > > struct.pack("32s", wu(u"VS_VERSION_INFO")) > > Why would you have to specify the encoding if what you want is the normal, > standard encoding? because there is no such thing as a "normal, standard encoding" for a unicode character, just like there's no "normal, standard encoding" for an integer (big endian, little endian?), a floating point number (ieee, vax, etc), a screen coordinate, etc. as soon as something gets too large to store in a byte, there's always more than one obvious way to store it ;-) > Or, to rephrase the question, why do C programmers only > have to s/char/wchar_t/ because they're tend to prefer to quickly get the wrong result? ;-) C makes no guarantees about wchar_t, so Python's Unicode type doesn't rely on it (it can use it, though: you can check the HAVE_USABLE_WCHAR_T macro to see if it's the same thing; see PyUnicode_FromWideChar for an example). in the Mac case, it might be easiest to configure things so that HAVE_USABLE_WCHAR_T is always true, and assume that Py_UNICODE is the same thing as wchar_t. (checking this in the module init function won't hurt, of course) but you cannot rely on that if you're writing truly portable code. From thomas.heller@ion-tof.com Wed Jan 9 14:00:48 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 9 Jan 2002 15:00:48 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects References: <20020109115512.5B6A1E8451@oratrix.oratrix.nl> Message-ID: <04bc01c19916$22437120$e000a8c0@thomasnotebook> From: "Jack Jansen" > And to give a complete example of how useful this would be consider the > following. I'll give a mac-centric example, because I don't know enough about > calldll on windows (and I don't think there's a unix version yet). > > Assume you're using Python to extend Photoshop. Assume Photoshop has an API to > allow the plugin to get at the screen. Let's assume that there's a C call > extern GrafPtr ps_GetDrawableSurface(void); > to get at the datastructure you need to draw to. > These GrafPtr's are (in Mac/Modules/qd/_Quickdraw.c) wrapped in > Carbon.Qd.GrafPortType objects in Python. > > In the current situation, if you would want to wrap this ps_GetDrawableSurface > function you would need to write a C wrapper (which means you would need a C > compiler, etc etc) because you would need to convert the return value with > ("O&", GrafObj_new). If we had something like ("O@", typeobject) calldll could > be extended so you could do something like > psapilib = calldll.getlibrary(....) 
> ps_GetDrawableSurface = calldll.newcall(psapilib.ps_GetDrawableSurface, > Carbon.Qd.GrafPortType) > > (newcall() arguments are funcpointer, return value type, arg1 type, ...) > > You cannot do this currently, because there is no way to get from the type > object (which is the only thing you have available in Python) to the functions > you need to pass to O&. In Python 2.2, the type object can itself be an instance, and you could call classmethods on it... I'm doing something similar on windows. Thomas From thomas.heller@ion-tof.com Wed Jan 9 14:07:57 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 9 Jan 2002 15:07:57 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> <024a01c198e2$823d2280$e000a8c0@thomasnotebook> <077201c198ec$56b9b740$0900a8c0@spiff> Message-ID: <04ca01c19917$2229b220$e000a8c0@thomasnotebook> From: "Fredrik Lundh" > thomas wrote: > > > Hehe, I don't want to put objects in structures, I just want to buid > > structures containing "Unicode strings". > > there is no such thing. > > what you want is a binary buffer with an *encoded* > unicode string. > > to get one, figure out what encoding you need (probably > utf-16-le), convert the string to a byte string using the > encode method, and store that byte string in your struct. > > def wu(str): > # encode unicode string for win32 apis > return str.encode("utf-16-le") > > struct.pack("32s", wu(u"VS_VERSION_INFO")) Thanks, works great. And utf-16-le *seems* to be what I want... Next question ;-), sorry for beeing off-topic for python-dev: How can I do the equivalent of u"some string" in terms of unicode("some string", encoding) Thanks, Thomas From fdrake@acm.org Wed Jan 9 14:52:39 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 9 Jan 2002 09:52:39 -0500 (EST) Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <20020109121256.2758DE8451@oratrix.oratrix.nl> References: <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <20020109121256.2758DE8451@oratrix.oratrix.nl> Message-ID: <15420.22839.70151.27559@cj42289-a.reston1.va.home.com> Jack Jansen writes: > Huh, what did I miss? Why is PyArg_Parse deprecated, and by what > should it be replaced? I think it is only recommended to avoid this as the argument-parsing function for an extension function/method; PyArg_ParseTuple() should be used instead since it can give better error messages using the :funcname syntax for the format string (which is strongly recommended!). -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From jack@oratrix.nl Wed Jan 9 14:55:11 2002 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 09 Jan 2002 15:55:11 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: Message by "Fredrik Lundh" , Wed, 9 Jan 2002 14:16:30 +0100 , <08b701c1990f$dbdb0e10$0900a8c0@spiff> Message-ID: <20020109145511.BE0FCE8451@oratrix.oratrix.nl> > jack wrote: > > > struct.pack("32s", wu(u"VS_VERSION_INFO")) > > > > Why would you have to specify the encoding if what you want is the normal, > > standard encoding? > > because there is no such thing as a "normal, standard > encoding" for a unicode character, just like there's no > "normal, standard encoding" for an integer (big endian, > little endian?), a floating point number (ieee, vax, etc), > a screen coordinate, etc. 
What I here call the "normal, standard encoding" is what the C library supports. Your analogy of integers and floats is exactly the right one: even though there are many ways to represent an integer what you get back from PyArg_Parse("l") is a standard C "long". Maybe the confusion is that whereever I have said "unicode" in the past I should have said "wchar_t". I know there are, in theory, many encodings of Unicode but in practice there is only one that I'm interested in most of the time and that's wchar_t, because that's what all my APIs want. So, I would like PyArg_Parse/Py_BuildValue formats that are symmetric to "s", "s#" and "z" but that return wchar_t strings and that work with both UnicodeObjects and StringObjects. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From jack@oratrix.nl Wed Jan 9 15:00:32 2002 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 09 Jan 2002 16:00:32 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: Message by "Thomas Heller" , Wed, 9 Jan 2002 15:00:48 +0100 , <04bc01c19916$22437120$e000a8c0@thomasnotebook> Message-ID: <20020109150032.6FB34E8451@oratrix.oratrix.nl> > > You cannot do this currently, because there is no way to get from the type > > object (which is the only thing you have available in Python) to the functions > > you need to pass to O&. > > In Python 2.2, the type object can itself be an instance, and you could call > classmethods on it... > I'm doing something similar on windows. Could you explain how you do this? If I have the typeobject, how would I get to the address of the "int (*converter)(PyObject *, void *)" function? -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From guido@python.org Wed Jan 9 14:56:15 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Jan 2002 09:56:15 -0500 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: Your message of "Wed, 09 Jan 2002 09:52:39 EST." <15420.22839.70151.27559@cj42289-a.reston1.va.home.com> References: <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <20020109121256.2758DE8451@oratrix.oratrix.nl> <15420.22839.70151.27559@cj42289-a.reston1.va.home.com> Message-ID: <200201091456.JAA04425@cj20424-a.reston1.va.home.com> > Jack Jansen writes: > > Huh, what did I miss? Why is PyArg_Parse deprecated, and by what > > should it be replaced? > > I think it is only recommended to avoid this as the argument-parsing > function for an extension function/method; PyArg_ParseTuple() should > be used instead since it can give better error messages using the > :funcname syntax for the format string (which is strongly > recommended!). > > -Fred The other problem with PyArg_Parse that PyArg_ParseTuple avoids is that a function declared as taking N arguments can also be called with a single tuple of N items. This is not supposed to happen (you should use apply or the *args call notation for that). --Guido van Rossum (home page: http://www.python.org/~guido/) From jack@oratrix.nl Wed Jan 9 15:03:45 2002 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 09 Jan 2002 16:03:45 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: Message by "Fred L. Drake, Jr." , Wed, 9 Jan 2002 09:52:39 -0500 (EST) , <15420.22839.70151.27559@cj42289-a.reston1.va.home.com> Message-ID: <20020109150439.BE28FE8451@oratrix.oratrix.nl> > > Jack Jansen writes: > > Huh, what did I miss? 
Why is PyArg_Parse deprecated, and by what > > should it be replaced? > > I think it is only recommended to avoid this as the argument-parsing > function for an extension function/method; PyArg_ParseTuple() should > be used instead [...] Ow, ok, I knew about that one. Silly me:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From thomas.heller@ion-tof.com Wed Jan 9 15:11:13 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 9 Jan 2002 16:11:13 +0100 Subject: Was Re: [Python-Dev] unicode/string asymmetries References: <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de><20020109121256.2758DE8451@oratrix.oratrix.nl> <15420.22839.70151.27559@cj42289-a.reston1.va.home.com> Message-ID: <006a01c1991f$f8949ed0$e000a8c0@thomasnotebook> From: "Fred L. Drake, Jr." > > Jack Jansen writes: > > Huh, what did I miss? Why is PyArg_Parse deprecated, and by what > > should it be replaced? > > I think it is only recommended to avoid this as the argument-parsing > function for an extension function/method; PyArg_ParseTuple() should > be used instead since it can give better error messages using the > :funcname syntax for the format string (which is strongly > recommended!). Offtopic again: PyArg_ParseTuple() is also nice for parsing a tuple in C code, which you for example receive as a result from calling a method. IIRC the only problem here is that it may throw weird error messages if the object is not a tuple. Instead of 'TypeError: unpack non-sequence' you get a 'SystemError: new style getargs format but argument is not a tuple'. Should this be changed? Thomas From just@letterror.com Wed Jan 9 15:22:05 2002 From: just@letterror.com (Just van Rossum) Date: Wed, 9 Jan 2002 16:22:05 +0100 Subject: Was Re: [Python-Dev] unicode/string asymmetries In-Reply-To: <006a01c1991f$f8949ed0$e000a8c0@thomasnotebook> Message-ID: <20020109162207-r01010800-f5d854de-0920-010c@10.0.0.23> Thomas Heller wrote: > Offtopic again: PyArg_ParseTuple() is also nice for parsing a tuple > in C code, which you for example receive as a result from calling a method. > IIRC the only problem here is that it may throw weird error > messages if the object is not a tuple. > Instead of 'TypeError: unpack non-sequence' you get a > 'SystemError: new style getargs format but argument is not a tuple'. You can do that with PyArg_Parse(), too, if you point parens around your format string, as in this converter function: int CGPoint_Convert(PyObject *v, CGPoint *p_itself) { if( !PyArg_Parse(v, "(ff)", &p_itself->x, &p_itself->y) ) return 0; return 1; } The nice is that this will accept _any_ (length 2) sequence, not just tuples! So this seems to be a case where PyArg_Parse() is actually better than PyArg_ParseTuple(). Just From guido@python.org Wed Jan 9 15:30:50 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 09 Jan 2002 10:30:50 -0500 Subject: [Python-Dev] Re: PyArg_ParseTuple In-Reply-To: Your message of "Wed, 09 Jan 2002 16:11:13 +0100." <006a01c1991f$f8949ed0$e000a8c0@thomasnotebook> References: <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de><20020109121256.2758DE8451@oratrix.oratrix.nl> <15420.22839.70151.27559@cj42289-a.reston1.va.home.com> <006a01c1991f$f8949ed0$e000a8c0@thomasnotebook> Message-ID: <200201091530.KAA04516@cj20424-a.reston1.va.home.com> > Offtopic again: PyArg_ParseTuple() is also nice for parsing a tuple > in C code, which you for example receive as a result from calling a method. 
> IIRC the only problem here is that it may throw weird error > messages if the object is not a tuple. > Instead of 'TypeError: unpack non-sequence' you get a > 'SystemError: new style getargs format but argument is not a tuple'. > > Should this be changed? No, you should test for PyTuple_Check before calling PyArg_ParseTuple. Why do you think it's called that? The other problem with this use, alas, is that when it catches a legitimate error, the error it reports is confusing if you don't change it. Example: >>> from socket import * >>> s = socket(AF_INET, SOCK_STREAM) >>> s.bind(()) Traceback (most recent call last): File "", line 1, in ? TypeError: getsockaddrarg() takes exactly 2 arguments (0 given) >>> --Guido van Rossum (home page: http://www.python.org/~guido/) From walter@livinglogic.de Wed Jan 9 16:02:13 2002 From: walter@livinglogic.de (Walter =?ISO-8859-15?Q?D=F6rwald?=) Date: Wed, 09 Jan 2002 17:02:13 +0100 Subject: [Python-Dev] Re: PyArg_ParseTuple References: <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de><20020109121256.2758DE8451@oratrix.oratrix.nl> <15420.22839.70151.27559@cj42289-a.reston1.va.home.com> <006a01c1991f$f8949ed0$e000a8c0@thomasnotebook> <200201091530.KAA04516@cj20424-a.reston1.va.home.com> Message-ID: <3C3C6985.4060708@livinglogic.de> Guido van Rossum wrote: > [...] > No, you should test for PyTuple_Check before calling > PyArg_ParseTuple. Why do you think it's called that? > > The other problem with this use, alas, is that when it catches a > legitimate error, the error it reports is confusing if you don't > change it. Example: > > >>>>from socket import * >>>>s = socket(AF_INET, SOCK_STREAM) >>>>s.bind(()) >>>> > Traceback (most recent call last): > File "", line 1, in ? > TypeError: getsockaddrarg() takes exactly 2 arguments (0 given) This should be fixed by using ;error message in the format string. Bye, Walter Dörwald From mal@lemburg.com Wed Jan 9 16:45:19 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 09 Jan 2002 17:45:19 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <20020109145511.BE0FCE8451@oratrix.oratrix.nl> Message-ID: <3C3C739F.9784054A@lemburg.com> Jack Jansen wrote: > ... > So, I would like PyArg_Parse/Py_BuildValue formats that are symmetric to "s", > "s#" and "z" but that return wchar_t strings and that work with both > UnicodeObjects and StringObjects. How about this: we add a wchar_t codec to Python and the "eu#" parser marker. Then you could write: wchar_t value = NULL; int len = 0; if (PyArg_ParseTuple(tuple, "eu#", "wchar_t", &value, &len) < 0) return NULL; ... PyMem_Free(value); return ... or, for 8-bit strings: char value = NULL; int len = 0; if (PyArg_ParseTuple(tuple, "es#", "latin-1", &value, &len) < 0) return NULL; ... PyMem_Free(value); return ... Is that symmetric enough ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jan 9 16:50:32 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 09 Jan 2002 17:50:32 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <20020109121256.2758DE8451@oratrix.oratrix.nl> Message-ID: <3C3C74D8.F7C26930@lemburg.com> Jack Jansen wrote: > > > > 3. There does not seem to be an equivalent to the 's' format character > > > for PyArg_Parse() or Py_BuildValue(). 
> Martin: > > and I doubt you have Py_UNICODE* often enough to need > > it to pass to Py_BuildValue. > > Martin, have you ever wrapped any Unicode API's? (As opposed to using unicode > as a purely internal datatype, which you clearly know a lot about). Thomas' > question are similar to mine from last week, and Neil's are related too. All > the niceties we have for strings (optional ones with z, autoconversion from > unicode, s# to get the size) are missing for unicode, and that's a pain when > you're wrapping an existing C api. Jack, please take a look at the very complete C API we have for Unicode. AFACTL, the Unicode API has more to offer than even the string C API. BTW, the "u" and "u#" build markers are available too, so there should be no problem using them for Py_BuildValue(). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From thomas.heller@ion-tof.com Wed Jan 9 16:56:47 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 9 Jan 2002 17:56:47 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects References: <20020109150032.6FB34E8451@oratrix.oratrix.nl> Message-ID: <015a01c1992e$b8557c40$e000a8c0@thomasnotebook> > > > You cannot do this currently, because there is no way to get from the type > > > object (which is the only thing you have available in Python) to the functions > > > you need to pass to O&. > > > > In Python 2.2, the type object can itself be an instance, and you could call > > classmethods on it... > > I'm doing something similar on windows. > > Could you explain how you do this? If I have the typeobject, how would I get > to the address of the "int (*converter)(PyObject *, void *)" function? Jack, it seems I misunderstood you (slightly?). I was talking about the other direction (constructing Python objects from C pointers or handles). I had to invent a special convention: I use O& with a function which calls obj->as_parameter() to convert from Python to C, but of course this gives you no typechecks as your O@ proposal does. I've reread your original O@ proposal, and I like it very much. Aren't there really any other positive responses? Thomas From mal@lemburg.com Wed Jan 9 18:42:11 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 09 Jan 2002 19:42:11 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects References: <20020107121221.693D4E8451@oratrix.oratrix.nl> Message-ID: <3C3C8F03.B4D6479D@lemburg.com> Jack Jansen wrote: > > Recently, "M.-A. Lemburg" said: > > Sounds like you want to introduce a "buffer" interface for these > > objects. > > No, that is something completely different. I want a replacement for > PyArg_Parse("O&", funcptr, void**) that has the form > PyArg_Parse("O@", typeobject, void**) and similarly for Py_BuildValue. > > Because the typeobject has a Python representation (whereas the > function pointer does not) this would allow modules like struct and > calldll to support objects that have this interface, because these > modules are driven from specifications in Python. There is currently > no way to get from the typeobject to the function pointer needed for > O&. If I'm not mistaken this looks like an interface which resembles the copyreg registry where you ask an object for a way to pickle itself and a way to restore itself from the pickle. 
(I think one of the ways pickle supports this more directly is by looking for a reduce method.) That would be nice to have indeed. For the simple objects you have in mind, the void* could be wrapped into a PyCObject, BTW. Could you write this up as a short PEP ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Wed Jan 9 19:36:58 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 9 Jan 2002 20:36:58 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <20020109122519.75A4AE8451@oratrix.oratrix.nl> (message from Jack Jansen on Wed, 09 Jan 2002 13:25:19 +0100) References: <20020109122519.75A4AE8451@oratrix.oratrix.nl> Message-ID: <200201091936.g09Jawq01658@mira.informatik.hu-berlin.de> > Why would you have to specify the encoding if what you want is the normal, > standard encoding? Well, because utf-16-le definitely is *not* the normal, standard encoding. It is only the right thing if the C type is WCHAR[], which is a Microsoft invention. > Or, to rephrase the question, why do C programmers only have to > s/char/wchar_t/, add a "w" to the front of the routine names and a u > in front of the string constants, whereas Python programmers are now > suddenly expected to learn all this mumbo-jumbo about encodings and > such? That is definitely not the only thing that C programmers have to do. They need to invoke conversion functions all the time. Plus, they are faced with the problem that, when integrating different Unicode-supporting libraries, they have to convert forth and back between different Unicode types. Regards, Martin From martin@v.loewis.de Wed Jan 9 21:14:26 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 9 Jan 2002 22:14:26 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <20020109145511.BE0FCE8451@oratrix.oratrix.nl> (message from Jack Jansen on Wed, 09 Jan 2002 15:55:11 +0100) References: <20020109145511.BE0FCE8451@oratrix.oratrix.nl> Message-ID: <200201092114.g09LEQH01895@mira.informatik.hu-berlin.de> > So, I would like PyArg_Parse/Py_BuildValue formats that are > symmetric to "s", "s#" and "z" but that return wchar_t strings and > that work with both UnicodeObjects and StringObjects. Unfortunately, that is quite difficult. Python does not guarantee that the internal representation of Unicode strings uses wchar_t, so such a conversion definitely requires explicit memory management. This is unlike plain strings, which do guarantee that the internal representation is char[]. Regards, Martin From martin@v.loewis.de Wed Jan 9 21:24:28 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 9 Jan 2002 22:24:28 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <3C3C739F.9784054A@lemburg.com> (mal@lemburg.com) References: <20020109145511.BE0FCE8451@oratrix.oratrix.nl> <3C3C739F.9784054A@lemburg.com> Message-ID: <200201092124.g09LOSp01918@mira.informatik.hu-berlin.de> > How about this: we add a wchar_t codec to Python and the "eu#" parser > marker. Then you could write: > > wchar_t value = NULL; > int len = 0; > if (PyArg_ParseTuple(tuple, "eu#", "wchar_t", &value, &len) < 0) > return NULL; Wouldn't that code be incorrect if there are further format argument whose conversion could fail also? I think format specifiers that require explicit memory management are so difficult to use that they must be avoided. 
I'd be in favour of extending the argtuple type to include additional slots for objects that go away when the tuple goes away. Regards, Martin From martin@v.loewis.de Wed Jan 9 21:11:50 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 9 Jan 2002 22:11:50 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <20020109121256.2758DE8451@oratrix.oratrix.nl> (message from Jack Jansen on Wed, 09 Jan 2002 13:12:56 +0100) References: <20020109121256.2758DE8451@oratrix.oratrix.nl> Message-ID: <200201092111.g09LBoj01892@mira.informatik.hu-berlin.de> > Huh, what did I miss? Why is PyArg_Parse deprecated, and by what > should it be replaced? Not precisely; METH_OLDARGS and its combination with Py_ArgParse is deprecated, use PyArg_ParseTuple instead. That still leaves a few uses of PyArg_Parse, but these are really to special to worry about. > > and I doubt you have Py_UNICODE* often enough to need > > it to pass to Py_BuildValue. > Martin, have you ever wrapped any Unicode API's? (As opposed to > using unicode as a purely internal datatype, which you clearly know > a lot about). Certainly, I've tried providing libiconv interfacing. I was strongly pushing the notion that Py_UNICODE is equal to wchar_t on all platforms, that notion was unfortunately rejected. As a result, using wchar_t together with Python Unicode objects is difficult. No existing C library reliably accepts Py_UNICODE*, if anything, they accept wchar_t* (although Microsoft, and apparently also Apple, manages to use yet another type, further complicating issues). There are exceptions: on some platforms, Py_UNICODE currently is equal to wchar_t, like Windows. That may change in the future, if people request full Unicode support (i.e. a 4-byte Unicode type) - then Py_UNICODE might differ from WCHAR even on Windows. At that time, any code that currently assumes they are equal will break. So I'd rather educate people about the issues now than having to come up with work-arounds when they eventually run into them. > Thomas' question are similar to mine from last week, and Neil's are > related too. All the niceties we have for strings (optional ones > with z, autoconversion from unicode, s# to get the size) are missing > for unicode, and that's a pain when you're wrapping an existing C > api. These problems are inherent in the subject matter: the C support of Unicode, and its relationship to the char type is inherently inconsistent. If Python would offer a struct code that translates into wchar_t, he'd get away with that on Window. However, it seemed to me that the specific structure was primarily used in files, so code that tries to fill it should use formats that are platform-independent. For the integer types, that means you cannot just use the "i" format, but you need to know what the integer range is (i.e. 8, 16, 32, or 64 bits). Likewise, for strings, you need to know what the width of each character, and the endianness is. Furthermore, apart from Windows, I doubt *anybody* puts wide strings in platform encoding into files. I'd hope anybody else is so smart to clearly define the encoding used when representing Unicode strings in byte-oriented files, streams, and structures. 
Regards, Martin From nhodgson@bigpond.net.au Wed Jan 9 21:41:52 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Thu, 10 Jan 2002 08:41:52 +1100 Subject: [Python-Dev] unicode/string asymmetries References: <20020109145511.BE0FCE8451@oratrix.oratrix.nl> <200201092114.g09LEQH01895@mira.informatik.hu-berlin.de> Message-ID: <008f01c19956$73c4f790$0acc8490@neil> Martin: > Unfortunately, that is quite difficult. Python does not guarantee that > the internal representation of Unicode strings uses wchar_t, so such a > conversion definitely requires explicit memory management. This could be a problem with my file patches as I have been using PyUnicode_AS_UNICODE which will 4 byte strings if Py_UNICODE_WIDE is defined. 4 byte strings can not be passed to the Windows API. So it looks like PyUnicode_AsWideChar has to be used instead with a wrapper to allocate enough memory to hold the resulting string. Neil From martin@v.loewis.de Wed Jan 9 22:12:59 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 9 Jan 2002 23:12:59 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: <20020109115512.5B6A1E8451@oratrix.oratrix.nl> (message from Jack Jansen on Wed, 09 Jan 2002 12:55:12 +0100) References: <20020109115512.5B6A1E8451@oratrix.oratrix.nl> Message-ID: <200201092212.g09MCxQ02275@mira.informatik.hu-berlin.de> > > Could you please try once more, being serious this time? AFAICT, I was > > asking for examples of types that are parsed by means of O& currently, > > and do so just to get a void** from the python object. > > Shall we try to keep this civil, please? I *am* being serious Please accept my apologies. I was expecting a single specific example, and was somewhat surprised to get a list of unspecific ones. > Ok, let me rephrase my list then. The first five items in my list, > which you carefully ignored I have ignored the Mac toolbox objects, since I don't know what they are, and where to find their source code. I have ignored Windows HANDLEs, since I don't have PythonWin sources readily available; I don't know what the X11 and Motif modules are. Now I've looked somewhat throught the Python source, and found Mac/Modules/Win/_Winmodule.c:WinObj_SetWindowModality (taking an arbitrary that seemed to match your description of "Windows"). Is that one of the examples you were referring to? If so, I still cannot understand the example. It reads if (!PyArg_ParseTuple(_args, "lO&", &inModalKind, WinObj_Convert, &inUnavailableWindow)) so it appears that you would like to rewrite this as if (!PyArg_ParseTuple(_args, "lO@", &inModalKind, WinObj_Type, &inUnavailableWindow)) Now, if that is how it is supposed to look like: How exactly would it work? WinObj_Convert accepts None, integers, and WinObjs. It seems that the rewritten version would only accept WinObj objects. > extern GrafPtr ps_GetDrawableSurface(void); [...] > If we had something like ("O@", typeobject) calldll could > be extended so you could do something like > psapilib = calldll.getlibrary(....) > ps_GetDrawableSurface = calldll.newcall(psapilib.ps_GetDrawableSurface, > Carbon.Qd.GrafPortType) > > (newcall() arguments are funcpointer, return value type, arg1 type, ...) > > You cannot do this currently Please let me try to summarize what this is doing: Given a type object and a long, create an instance of that type. Is that a correct analysis of what has to be done? I see two ways to do that currently: 1. Arrange that it is possible to construct GrafPortType objects from integers. 
Then you do curarg = PyObject_Call(returntype, "l", c_rv); inside calldll.c:cdc_call 2. Extend the type object to, say, MacType, which offers special support for calldll, to allow creation of instances given a long value. > because there is no way to get from the type object (which is the > only thing you have available in Python) to the functions you need > to pass to O&. I completely fail to see how O& fits into the puzzle. AFAICT, conversion of the return value occurs inside cdc_call. There is no tuple to parse anyway nearby. Regards, Martin From martin@v.loewis.de Wed Jan 9 22:15:24 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 9 Jan 2002 23:15:24 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <04ca01c19917$2229b220$e000a8c0@thomasnotebook> (thomas.heller@ion-tof.com) References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> <024a01c198e2$823d2280$e000a8c0@thomasnotebook> <077201c198ec$56b9b740$0900a8c0@spiff> <04ca01c19917$2229b220$e000a8c0@thomasnotebook> Message-ID: <200201092215.g09MFO902299@mira.informatik.hu-berlin.de> > How can I do the equivalent of > u"some string" > in terms of > unicode("some string", encoding) Again, what do you need that for? If there won't be any escape sequences or non-ASCII characters inside, then unicode("some string", "ascii") will work fine. In the general case, unicode("some string", "unicode-escape") should work. Regards, Martin From mal@lemburg.com Wed Jan 9 23:14:25 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 10 Jan 2002 00:14:25 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <20020109145511.BE0FCE8451@oratrix.oratrix.nl> <3C3C739F.9784054A@lemburg.com> <200201092124.g09LOSp01918@mira.informatik.hu-berlin.de> Message-ID: <3C3CCED1.DBCE3B98@lemburg.com> "Martin v. Loewis" wrote: > > > How about this: we add a wchar_t codec to Python and the "eu#" parser > > marker. Then you could write: > > > > wchar_t value = NULL; > > int len = 0; > > if (PyArg_ParseTuple(tuple, "eu#", "wchar_t", &value, &len) < 0) > > return NULL; > > Wouldn't that code be incorrect if there are further format argument > whose conversion could fail also? Yes; you'd currently have to write: wchar_t value = NULL; int len = 0; if (PyArg_ParseTuple(tuple, "eu#", "wchar_t", &value, &len) < 0) goto onError; ... onError: if (value) PyMem_Free(value); return NULL; > I think format specifiers that require explicit memory management are > so difficult to use that they must be avoided. I'd be in favour of > extending the argtuple type to include additional slots for objects > that go away when the tuple goes away. I don't understand that last comment. Anyway, you've got a point there: allocated buffers should be freed in case the PyArg_ParserTuple() API fails (and then reset the *buffer pointer to NULL). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jack@oratrix.nl Wed Jan 9 23:57:15 2002 From: jack@oratrix.nl (Jack Jansen) Date: Thu, 10 Jan 2002 00:57:15 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: Message by "Martin v. 
Loewis" , Wed, 9 Jan 2002 23:12:59 +0100 , <200201092212.g09MCxQ02275@mira.informatik.hu-berlin.de> Message-ID: <20020109235720.6CE49E8451@oratrix.oratrix.nl> Recently, "Martin v. Loewis" said: > Now I've looked somewhat throught the Python source, and found > Mac/Modules/Win/_Winmodule.c:WinObj_SetWindowModality (taking an > arbitrary that seemed to match your description of "Windows"). Is that > one of the examples you were referring to? If so, I still cannot > understand the example. It reads > > if (!PyArg_ParseTuple(_args, "lO&", > &inModalKind, > WinObj_Convert, &inUnavailableWindow)) > > so it appears that you would like to rewrite this as > > > if (!PyArg_ParseTuple(_args, "lO@", > &inModalKind, > WinObj_Type, &inUnavailableWindow)) > > Now, if that is how it is supposed to look like: How exactly would it > work? WinObj_Convert accepts None, integers, and WinObjs. It seems > that the rewritten version would only accept WinObj objects. Basically correct, but there is no reason why the rewritten version would accept only WinObj's. ("O@", typeobj, ptr) would call typeobj->tp_convert(arg[i], ptr) and the semantics of tp_convert would be '"cast" arg PyObject to whatever your type is and store the C pointer value for that thing in ptr'. Or, to make things clearer, WinObj_Type->tp_convert would simply point to the current WinObj_Convert function. > > If we had something like ("O@", typeobject) calldll could > > be extended so you could do something like > > psapilib = calldll.getlibrary(....) > > ps_GetDrawableSurface = calldll.newcall(psapilib.ps_GetDrawableSurface, > > Carbon.Qd.GrafPortType) > > > > (newcall() arguments are funcpointer, return value type, arg1 type, ...) > > > > You cannot do this currently > > Please let me try to summarize what this is doing: Given a type object > and a long, create an instance of that type. Is that a correct > analysis of what has to be done? That would allow you to do the same thing, but rather more error prone (i.e. I think it is much more of a hack than what I'm trying to get at). As you noted above WinObj's unfortunately need such a hack, but I would expect to get rid of it as soon as possible. I really don't like passing C pointers around in Python integers. > I completely fail to see how O& fits into the puzzle. AFAICT, > conversion of the return value occurs inside cdc_call. There is no > tuple to parse anyway nearby. Not at the moment, but in calldll version 2 there would be. In stead of passing types as "l" or "h" you would pass type objects to newcall(). Newcall() would probably special-case the various ints but for all other types simply call PyArg_Parse(arg, "O@", typeobj, &voidptr). -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From jack@oratrix.nl Thu Jan 10 00:17:57 2002 From: jack@oratrix.nl (Jack Jansen) Date: Thu, 10 Jan 2002 01:17:57 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: Message by "M.-A. Lemburg" , Wed, 09 Jan 2002 17:45:19 +0100 , <3C3C739F.9784054A@lemburg.com> Message-ID: <20020110001802.54361E8451@oratrix.oratrix.nl> Recently, "M.-A. Lemburg" said: > How about this: we add a wchar_t codec to Python and the "eu#" parser > marker. Then you could write: > > wchar_t value = NULL; > int len = 0; > if (PyArg_ParseTuple(tuple, "eu#", "wchar_t", &value, &len) < 0) > return NULL; I like it! Even though I have to do the memory management myself (and have to think of the error case) it at least looks reasonable. 
I'm assuming here that if I pass a StringObject it will be unicode-encoded using the default encoding, and that unicode value will then be converted to wchar_t and put in value, right? Or, in other words, passing "a.out" will do the same as passing u"a.out"... One minor misgiving is that this call will *always* copy the string, even if the internal coding of unicode objects is wchar_t. That's a bit of a nuisance, but we can try to fix that later. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman -
From martin@v.loewis.de Thu Jan 10 00:18:09 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 10 Jan 2002 01:18:09 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <008f01c19956$73c4f790$0acc8490@neil> (nhodgson@bigpond.net.au) References: <20020109145511.BE0FCE8451@oratrix.oratrix.nl> <200201092114.g09LEQH01895@mira.informatik.hu-berlin.de> <008f01c19956$73c4f790$0acc8490@neil> Message-ID: <200201100018.g0A0I9B02928@mira.informatik.hu-berlin.de> > This could be a problem with my file patches as I have been using > PyUnicode_AS_UNICODE which will 4 byte strings if Py_UNICODE_WIDE is > defined. 4 byte strings can not be passed to the Windows API. So it looks > like PyUnicode_AsWideChar has to be used instead with a wrapper to allocate > enough memory to hold the resulting string. Yes. Unfortunately, that would be much more inefficient. So I'd suggest you just put an assertion into the code that Py_UNICODE is the same size as WCHAR (that can be even done through a preprocessor #error, using the _SIZE #defines). I'll expect people will resist changing Py_UNICODE on Windows for quite some time, even if other platforms move on. Regards, Martin
From jack@oratrix.nl Thu Jan 10 00:21:14 2002 From: jack@oratrix.nl (Jack Jansen) Date: Thu, 10 Jan 2002 01:21:14 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: Message by "Thomas Heller" , Wed, 9 Jan 2002 17:56:47 +0100 , <015a01c1992e$b8557c40$e000a8c0@thomasnotebook> Message-ID: <20020110002119.D8C9BE8451@oratrix.oratrix.nl> Recently, "Thomas Heller" said: > I've reread your original O@ proposal, and I like it very much. > Aren't there really any other positive responses? You and Marc-Andre, so far. I'll write a PEP, as MAL suggested. Sigh, two PEPs on my plate:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman -
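(Editorial aside: Martin's suggestion earlier in this exchange -- assert at build time that Py_UNICODE and the Win32 WCHAR have the same size -- could look roughly like the sketch below. It assumes the Py_UNICODE_SIZE define that configure writes into pyconfig.h; WCHAR is a 16-bit type on Windows, so a wide (UCS-4) Python build would trip the check.)

    #include "Python.h"

    #if Py_UNICODE_SIZE != 2
    #error "this module assumes Py_UNICODE has the same size as the Win32 WCHAR"
    #endif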
MwCC9tTDwAH/9A7w5gIJ3VOgPPUoOPC8AjCsDwMExBOvAfIoKO8B9ApoQP3/vbmf8RjgPwCa gAHFY0A0ADi95pFPAebrDQz09jtK+QAHt8AFJzQnChR0Ym2lgAAnNIWATvCNFBrgxNpg0AkE nIIEhCCFCzhhgRSqUBQPGQQFWjEXHaKChYN4myriwQlPlcI+UGoNcSjFAxtkUIOC8BknEFA2 UohwEHJTQCd2VYocAkAAa2NBJxSgHR3IAwAKKIUWB+EDD3Qia22kIQ87scNTAFEQQkyFGzkB R2PtTYm6YKIToTgIbEmEE6MzGCeyuEVU2PCLYRxjFXEoCAH00Qci4ETQvCgIOO5R/xAWmOMi f6jJVbDgiQCoGr4MUMGFNBGVGjSkD9YoiDopEouga+QpHglGBEmSFDY4IxlH88SlcRIAnpQj K3ooiDqa4o4AyOMpqlFJEBqLlRMKpCsHSUgAyNIGnZikBEYpCloKgosoHEQvfSDGIopzEJW7 YicpyUdRfBIAoVwmHUkZRFXMkBOfMwUGsAnIWW4TlriQZX1GeEtBMJIT6BwFLyPpzlGoIIaj kAE5HfJGeypzFcwEgDNlVUpU2KCE80jFQFtpgFd205upYKYQx4nLcurSFBP1ZUXrUY1FjYIH NAUAAj5Fz0EkcxD5BOk+7VjSUxzzhNdkqUu7qdB6iGQQZf8LKgAeOoiIbkqdFB2EAsoGgn6O oqyDINkxjwrKVvSEEyOVTFNNMUZVrDSbBm0pNwlZVXYWk1Hy3GouIepIsOp0EOrJYEtJcVUA KO6r9YwjUt3aiaSS1Kw47QTaBEpQGgkSoZ3oK0dxGViunrOw1TysIGADw9OO4p6bRatRPTpZ Vry1ts+c61/OKAhVclaqe4WiaEvCCRfcVrA2JewuDcvOTmwGBhl0AGdO6QkSgEAE2BVBYJFJ 27batrL8xCO0OmrXzvrksy8dLjgRW9rBdhW1kFStY30gT5JR86Wzlax3V3FcfIY3mqbQaBFZ cdeCopeqrZAtLkwLAK8uNLXNHaH/gJGLNPzigq3+/e5HLyveUqC0kps9RYE9e9D0uqK/NZ3l TY/I3HaKlQGVJGo9YDwPfjyvsfPUb4b5C16mYraDPV7FiM9bYgS3AoMaZLA0JdpiQkLuVm9k E5B00F8MWzYV/X1y7nSrVQBYU6XmXeJCdBBcXAiAg6zI5ILd20wUgMUrKAiqAA7l4k4IgDPB vfIoqtxdNHqAA4AOdKA1cI3+KuDPguZAL+ba2kE4AC0kiLSkJS0CcA1ZzHoFrTrRzAoTrDm5 FlYnnaHIRWaOFRV81jF+64RiC+cRA6HmREMurU0DXFDTg5CxK8xJYV5bWABEhaajt8wJvOEr tH1+aRp5/xJrgJLimBaeyUSSSGIEeOCRhISGB7bN7W53W9GwLiIIto3jWEMDBK3GALfPwm0Q YEDRiv42CGg8CHWDYLvdhMC4k+OhfvubQRL4swdAcN9QO+DeMTPNvz3EDtQAZWWpWfjKlMMg bhDAskAlx5h6Ryt/c9wjL5K4yDP0O5PESOT/rg2A8OQRlgMIH7r2wJQWvnIIhQjlNE+XRkSF H40QII/2cHnPz1PyoWckVqTwgMiMfpHhNMQpTD+IS6x5pxhFPT+seSsooH71qchtxlbvumN+ Tl9BNIRrYh+IA9CMgimlnegk0AFK6YGhsBsdASKI3/q68/adG00HHkyrD0CgcIucG/7wiE/8 7hafo8Q7/vGQf/xr+mROQ/lABdzLR/c2z/nOe/7zoAe95kNP+tKb/vSo37wDNMAuT+NCBIzi gexnT/va2/72uM+97nfP+977/vfAD37vRaEDfM/DAt9joPKXz/zmO//50I++9KdPfReYgANd brb2t8/97nv/++APv/jHT/7ym//82w8EADs= ------=_NextPart_000_001__5510939_76535,04-- From martin@v.loewis.de Thu Jan 10 07:17:30 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 10 Jan 2002 08:17:30 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <3C3CCED1.DBCE3B98@lemburg.com> (mal@lemburg.com) References: <20020109145511.BE0FCE8451@oratrix.oratrix.nl> <3C3C739F.9784054A@lemburg.com> <200201092124.g09LOSp01918@mira.informatik.hu-berlin.de> <3C3CCED1.DBCE3B98@lemburg.com> Message-ID: <200201100717.g0A7HUg01381@mira.informatik.hu-berlin.de> > > I think format specifiers that require explicit memory management are > > so difficult to use that they must be avoided. I'd be in favour of > > extending the argtuple type to include additional slots for objects > > that go away when the tuple goes away. > > I don't understand that last comment. I was suggesting that the tuple passed to C API should not be of , but of , which should have a method add_object(o), which puts a reference to o into the tuple. Then, whenever you want to return memory to the user, you create a string object whose contents is that memory, and you put a reference to the string into the argument tuple. The author of the C function then does not need to worry about memory management: the memory will be deallocated when the argument tuple is released. Unfortunately, that approach cannot be used for the existing conversion codes that return memory, since it is the extension's job to release the memory; changing that would break extensions which do properly release memory. 
Regards, Martin From martin@v.loewis.de Thu Jan 10 07:32:20 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 10 Jan 2002 08:32:20 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <20020110001802.54361E8451@oratrix.oratrix.nl> (message from Jack Jansen on Thu, 10 Jan 2002 01:17:57 +0100) References: <20020110001802.54361E8451@oratrix.oratrix.nl> Message-ID: <200201100732.g0A7WKB01423@mira.informatik.hu-berlin.de> > One minor misgiving is that this call will *always* copy the string, > even if the internal coding of unicode objects is wchar_t. That's a > bit of a nuisance, but we can try to fix that later. Not sure what you mean by "later". Once this is being used, you cannot fix it anymore. Extensions *will* have to call PyMem_Free, and when they do so, changing the format specifier to do something better won't be possible, anymore, since the call to PyMem_Free will be in the way. Regards, Martin From martin@v.loewis.de Thu Jan 10 07:27:39 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 10 Jan 2002 08:27:39 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: <20020109235720.6CE49E8451@oratrix.oratrix.nl> (message from Jack Jansen on Thu, 10 Jan 2002 00:57:15 +0100) References: <20020109235720.6CE49E8451@oratrix.oratrix.nl> Message-ID: <200201100727.g0A7Rdd01384@mira.informatik.hu-berlin.de> > Or, to make things clearer, WinObj_Type->tp_convert would simply > point to the current WinObj_Convert function. So what do you gain with that extension? It seem all that is done is you can replace _Convert by _Type everywhere, with no additional change to the semantics. > > > ps_GetDrawableSurface = calldll.newcall(psapilib.ps_GetDrawableSurface, > > > Carbon.Qd.GrafPortType) [...] > Not at the moment, but in calldll version 2 there would be. In stead > of passing types as "l" or "h" you would pass type objects to > newcall(). Newcall() would probably special-case the various ints but > for all other types simply call PyArg_Parse(arg, "O@", typeobj, > &voidptr). I still don't understand. In your example, GrafPortType is a return type, not an argument type. So you *have* an anything, and you *want* the GrafPortType. How exactly do you use PyArg_Parse in that scenario? Also, why would you use this extension inside newcall()? I'd rather expect it in ps_GetDrawableSurface.__call__ instead (i.e. when you deal with a specific call, not when you create the callable instance). Regards, Martin From Anthony Baxter Thu Jan 10 08:17:02 2002 From: Anthony Baxter (Anthony Baxter) Date: Thu, 10 Jan 2002 19:17:02 +1100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... In-Reply-To: Message from barry@zope.com (Barry A. Warsaw) of "Fri, 04 Jan 2002 02:53:11 CDT." <15413.24423.772132.175722@anthem.wooz.org> Message-ID: <200201100817.g0A8H2Y01871@mbuna.arbhome.com.au> >>> Barry A. Warsaw wrote > AB> Ok, I'd like to make the 2.1.2 release some time in the first > AB> half of the week starting 7th Jan, assuming that's ok for the > AB> folks who'll need to do the work on the PC/Mac packaging. I'm doing this this evening; i.e. now. > I'd be more inclined to clone PEP 101 into a PEP 102 with micro > release instructions. The nice thing about 101 is that you can just > go down the list, checking things off in a linear fashion as you > complete each item. I'd be loathe to break up the linearity of that. Ok. I'm doing this as I go. Should I just check in PEP 102 directly, or is that Not The Done Thing? 
> AB> I don't have access to creosote.python.org, so someone else's > AB> going to need to do this. > I can certainly help with any fiddling necessary on creosote. Then > again... > ...if this is going to be a recurring role, we might just want to give > you access to the web cvs tree and creosote. Whichever works for you. thanks, Anthony -- Anthony Baxter It's never too late to have a happy childhood. From mal@lemburg.com Thu Jan 10 08:49:32 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 10 Jan 2002 09:49:32 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <20020110001802.54361E8451@oratrix.oratrix.nl> Message-ID: <3C3D559C.19B58C79@lemburg.com> Jack Jansen wrote: > > Recently, "M.-A. Lemburg" said: > > How about this: we add a wchar_t codec to Python and the "eu#" parser > > marker. Then you could write: > > > > wchar_t value = NULL; > > int len = 0; > > if (PyArg_ParseTuple(tuple, "eu#", "wchar_t", &value, &len) < 0) > > return NULL; > > I like it! Even though I have to do the memory management myself (and > have to think of the error case) it at least looks reasonable. Good :-) > I'm > assuming here that if I pass a StringObject it will be unicode-encoded > using the default encoding, and that unicode value will then be > converted to wchar_t and put in value, right? Or, in other words, > passing "a.out" will do the same as passing u"a.out"... Yes. > One minor misgiving is that this call will *always* copy the string, > even if the internal coding of unicode objects is wchar_t. That's a > bit of a nuisance, but we can try to fix that later. Copying will always take place (either into a preallocated buffer or one which the PyArg_ParseTuple() API allocates), but then: that's the cost you have to pay for the simplicity of the approach. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From thomas.heller@ion-tof.com Thu Jan 10 09:10:58 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 10 Jan 2002 10:10:58 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects References: <20020109235720.6CE49E8451@oratrix.oratrix.nl> Message-ID: <03be01c199b6$cfcfe350$e000a8c0@thomasnotebook> > > > If we had something like ("O@", typeobject) calldll could > > > be extended so you could do something like > > > psapilib = calldll.getlibrary(....) > > > ps_GetDrawableSurface = calldll.newcall(psapilib.ps_GetDrawableSurface, > > > Carbon.Qd.GrafPortType) > > > > > > (newcall() arguments are funcpointer, return value type, arg1 type, ...) > > > > > > You cannot do this currently > > > > Please let me try to summarize what this is doing: Given a type object > > and a long, create an instance of that type. Is that a correct > > analysis of what has to be done? > > That would allow you to do the same thing, but rather more error prone > (i.e. I think it is much more of a hack than what I'm trying to get > at). As you noted above WinObj's unfortunately need such a hack, but I > would expect to get rid of it as soon as possible. I really don't like > passing C pointers around in Python integers. > > > I completely fail to see how O& fits into the puzzle. AFAICT, > > conversion of the return value occurs inside cdc_call. There is no > > tuple to parse anyway nearby. > > Not at the moment, but in calldll version 2 there would be. 
In stead > of passing types as "l" or "h" you would pass type objects to > newcall(). Newcall() would probably special-case the various ints but > for all other types simply call PyArg_Parse(arg, "O@", typeobj, > &voidptr). Here's an outline which could work in 2.2: Create a subtype of type, having a tp_convert slot: typedef int (*convert_func)(PyTypeObject *, void **); typedef struct { PyTypeObject type; convert_func tp_convert; } WrapperTypeType; and use it as metaclass (metatype?) for your WindowObj: class WindowObj(...): __metaclass__ = WrapperTypeType Write a function to return a conversion function: convert_func *get_converter(PyTypeObject *type) { if (WrapperTypeType_Check(type)) return ((WrapperTypeType *)type)->tp_convert; /* code to check additional types and return their converters */ .... } and then if (!PyArg_ParseTuple(args, "O&", get_converter(WinObj_Type), &Window)) How does this sound? Thomas From thomas.heller@ion-tof.com Thu Jan 10 09:22:29 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 10 Jan 2002 10:22:29 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> <024a01c198e2$823d2280$e000a8c0@thomasnotebook> <077201c198ec$56b9b740$0900a8c0@spiff> <04ca01c19917$2229b220$e000a8c0@thomasnotebook> <200201092215.g09MFO902299@mira.informatik.hu-berlin.de> Message-ID: <03d001c199b8$6c70d290$e000a8c0@thomasnotebook> > > How can I do the equivalent of > > u"some string" > > in terms of > > unicode("some string", encoding) > > Again, what do you need that for? If there won't be any escape > sequences or non-ASCII characters inside, then > > unicode("some string", "ascii") > > will work fine. In the general case, > > unicode("some string", "unicode-escape") > > should work. In the case of pure ASCII, unicode("some string") also works. Here's what I'm trying to do: I have a string variable containing some non-ascii characters (from a characterset which was previously called 'ansi' instead of 'oem' on windows). For example the copyright symbol "=A9" (repr("=A9") gives "\xa9"). Now I want to convert this string to unicode. u"=A9" works fine, unicode(variable) gives an ASCII decoding error. Thomas From mal@lemburg.com Thu Jan 10 11:14:31 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 10 Jan 2002 12:14:31 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> <024a01c198e2$823d2280$e000a8c0@thomasnotebook> <077201c198ec$56b9b740$0900a8c0@spiff> <04ca01c19917$2229b220$e000a8c0@thomasnotebook> <200201092215.g09MFO902299@mira.informatik.hu-berlin.de> <03d001c199b8$6c70d290$e000a8c0@thomasnotebook> Message-ID: <3C3D7797.BB5A8088@lemburg.com> Thomas Heller wrote: >=20 > > > How can I do the equivalent of > > > u"some string" > > > in terms of > > > unicode("some string", encoding) > For example the copyright symbol "=A9" (repr("=A9") gives "\xa9"). > Now I want to convert this string to unicode. > u"=A9" works fine, unicode(variable) gives an ASCII decoding error. u"something" maps to unicode("something", "latin-1"). 
This is because Unicode literals in Python are interpreted as being Latin-1.=20 See the source code encoding PEP (0263) for details on what could be=20 done to make this user-configurable. --=20 Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From anthony@interlink.com.au Thu Jan 10 12:24:29 2002 From: anthony@interlink.com.au (Anthony Baxter) Date: Thu, 10 Jan 2002 23:24:29 +1100 Subject: [Python-Dev] 2.1.2 testing. Message-ID: <200201101224.g0ACOTG05498@mbuna.arbhome.com.au> Has anyone had a chance to test that 2.1.2 builds and works correctly on anything? I'm testing on the following systems. sourceforge compile farm boxes are marked as such, compaq testdrive boxes to arrive as well[1]. For each, a fresh cvs export, followed by ./configure ; make ; make test. Are there additional useful tests that could be run? Linux/x86 Redhat 6.2 PASSED Linux/x86 Redhat 7.1 PASSED Linux/x86 Redhat 7.2 PASSED Solaris/sparc 2.7 (gcc-2.95.2) PASSED Linux/x86 Debian 2.2 (cf.sf.net) PASSED Linux/PPC [RS/6000] Debian 2.2 (cf.sf.net) PASSED Linux/alpha Debian 2.2 (cf.sf.net) PASSED FreeBSD 4.4 (cf.sf.net) PASSED Solaris/sparc 2.8 (cf.sf.net) (gcc-2.95.2) PASSED Tru64/Alpha 4.0 (compaq) ... still building ... Tru64/Alpha 5.1 (compaq) ... to be done ... Linux/sparc Debian 2.2 (cf.sf.net) FAILED This is scary. I don't know why this one alone fails - it fails the test_math test. Running the test by hand: anthonybaxter@usf-cf-sparc-linux-1:~/python212_linxsparc$ PYTHONPATH= ./python ./Lib/test/test_math.py math module, testing with eps 1e-05 constants acos Traceback (most recent call last): File "./Lib/test/test_math.py", line 21, in ? testit('acos(-1)', math.acos(-1), math.pi) OverflowError: math range error Running math.acos(-1) gives the correct answer. Anyone got any idea? I couldn't get py212 to build on our remaining solaris/x86 box, but then I can't get 2.1.1 to build on it either, without a whole lot of manual hackery - so I don't care about that. It's just a stuffed machine. :) I was hoping to test on MacOS X, but the cf.sf.net boxes aren't answering... anyone else want to give it a go? Anthony [1] sheesh. had to install telnet for the compaq boxes. first time I've not had ssh access somewhere for a while. . . (plus, they don't have cvs. sigh.) From skip@pobox.com Thu Jan 10 13:33:43 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 10 Jan 2002 07:33:43 -0600 Subject: [Python-Dev] 2.1.2 testing. In-Reply-To: <200201101224.g0ACOTG05498@mbuna.arbhome.com.au> References: <200201101224.g0ACOTG05498@mbuna.arbhome.com.au> Message-ID: <15421.38967.922983.126297@12-248-41-177.client.attbi.com> Anthony> Has anyone had a chance to test that 2.1.2 builds and works Anthony> correctly on anything? I will give it a quick try on my Mandrake 8.1 system. What's the relevant CVS branch? I didn't see anything obvious like "r212". Skip From Anthony Baxter Thu Jan 10 13:34:56 2002 From: Anthony Baxter (Anthony Baxter) Date: Fri, 11 Jan 2002 00:34:56 +1100 Subject: [Python-Dev] 2.1.2 testing. In-Reply-To: Message from "Skip Montanaro" of "Thu, 10 Jan 2002 07:33:43 MDT." <15421.38967.922983.126297@12-248-41-177.client.attbi.com> Message-ID: <200201101334.g0ADYud06133@mbuna.arbhome.com.au> It's still release21-maint. I'm waiting til I've finished my testing before making the tag. 
(As it's a bugfix release, I'm not making a release branch off the existing maintenance branch (that path leads to madness)) Ta, Anthony >>> "Skip Montanaro" wrote > > Anthony> Has anyone had a chance to test that 2.1.2 builds and works > Anthony> correctly on anything? > > I will give it a quick try on my Mandrake 8.1 system. What's the relevant > CVS branch? I didn't see anything obvious like "r212". > > Skip > -- Anthony Baxter It's never too late to have a happy childhood. From thomas.heller@ion-tof.com Thu Jan 10 14:04:18 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 10 Jan 2002 15:04:18 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> <024a01c198e2$823d2280$e000a8c0@thomasnotebook> <077201c198ec$56b9b740$0900a8c0@spiff> <04ca01c19917$2229b220$e000a8c0@thomasnotebook> <200201092215.g09MFO902299@mira.informatik.hu-berlin.de> <03d001c199b8$6c70d290$e000a8c0@thomasnotebook> <3C3D7797.BB5A8088@lemburg.com> Message-ID: <06e901c199df$cb1ff330$e000a8c0@thomasnotebook> My problem is solved. I'm using now unicode(some_string, "latin-1").encode("utf-16-le") or unicode(some_string, "unicode-escape").encode("utf-16-le") to pack "unicode strings" (not sure about the terminology) into my structures. It seems PEP100 and the unicode standard (link in PEP 100) should be required reading for everyone using unicode. Thanks again, MaL, Martin, /F. Thomas From guido@python.org Thu Jan 10 14:31:58 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 10 Jan 2002 09:31:58 -0500 Subject: [Python-Dev] 2.1.2 release -- do we need a beta? Message-ID: <200201101431.JAA30746@cj20424-a.reston1.va.home.com> Do we need a beta for the 2.1.2 release? I think it might be prudent -- Anthony's last-minute checking of a critical fix to a bug that prevented compilation on one platform points this out again. The alternative is to be optimistic, and to quickly release 2.1.3 if 2.1.2 has a problem that we discover after its release. Opinions? I think a beta is prudent, and it shouldn't cost too much more in effort -- if we're lucky, nothing changes and we just fiddle some version numbers. If it turns out to be needed, it's better than having to wear a brown bag over your head. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From Anthony Baxter Thu Jan 10 14:44:44 2002 From: Anthony Baxter (Anthony Baxter) Date: Fri, 11 Jan 2002 01:44:44 +1100 Subject: [Python-Dev] Re: 2.1.2 release -- do we need a beta? In-Reply-To: Message from Guido van Rossum of "Thu, 10 Jan 2002 09:31:58 CDT." <200201101431.JAA30746@cj20424-a.reston1.va.home.com> Message-ID: <200201101444.g0AEiie07296@mbuna.arbhome.com.au> >>> Guido van Rossum wrote > Do we need a beta for the 2.1.2 release? I think it might be prudent > -- Anthony's last-minute checking of a critical fix to a bug that > prevented compilation on one platform points this out again. Maybe. But on the other hand, I've also done a bunch of different builds on as many platforms as I could find. [The oopsie I found was actually probably the most complex merge of the lot, and that's not saying much. put it down to too many CVS checkouts and not enough brain :)] The ugliness potential is from either those platforms that are an offense against nature that no-one thinks to try, or from some sort of weird compilation options. 
I don't think that there's many of the fixes in the 2.1.2 code that are going to break something that worked before - with the list of platforms I've hit tonight, I think I've got most of the new code exercised. (One of the minor-ish constraints I put on candidate fixes was whether or not I could easily test it.) The other question I have to ask is whether people will actually download and test a beta/release candidate of a bugfix release. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From skip@pobox.com Thu Jan 10 15:01:52 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 10 Jan 2002 09:01:52 -0600 Subject: [Python-Dev] 2.1.2 build on Mandrake Message-ID: <15421.44256.208557.830065@12-248-41-177.client.attbi.com> I got the usual output on my Mandrake 8.1 system when building the release21-maint branch: 126 tests OK. 1 test failed: test_linuxaudiodev 13 tests skipped: test_al test_cd test_cl test_dbm test_dl test_gl test_imgfile test_largefile test_nis test_socketserver test_sunaudiodev test_winreg test_winsound Skip From thomas.heller@ion-tof.com Thu Jan 10 15:07:49 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 10 Jan 2002 16:07:49 +0100 Subject: [Python-Dev] Re: 2.1.2 release -- do we need a beta? References: <200201101444.g0AEiie07296@mbuna.arbhome.com.au> Message-ID: <075701c199e8$afb9b410$e000a8c0@thomasnotebook> > >>> Guido van Rossum wrote > > Do we need a beta for the 2.1.2 release? I think it might be prudent > > -- Anthony's last-minute checking of a critical fix to a bug that > > prevented compilation on one platform points this out again. > > Maybe. But on the other hand, I've also done a bunch of different builds > on as many platforms as I could find. > [The oopsie I found was actually probably the most complex merge of the > lot, and that's not saying much. put it down to too many CVS checkouts > and not enough brain :)] > > The ugliness potential is from either those platforms that are an offense > against nature that no-one thinks to try, or from some sort of weird > compilation options. I don't think that there's many of the fixes in > the 2.1.2 code that are going to break something that worked before - > with the list of platforms I've hit tonight, I think I've got most of > the new code exercised. (One of the minor-ish constraints I put on > candidate fixes was whether or not I could easily test it.) > > The other question I have to ask is whether people will actually download > and test a beta/release candidate of a bugfix release. Given my current schedule I cannot offord to build 2.1.2 from CVS and test it, but I would certainly try out a beta or rc on win2000. Been burned too often by a buggy bdist_wininst ;-( Thomas From fdrake@acm.org Thu Jan 10 15:08:25 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 10 Jan 2002 10:08:25 -0500 (EST) Subject: [Python-Dev] 2.1.2 testing. In-Reply-To: <200201101457.g0AEvJg07475@mbuna.arbhome.com.au> References: <15421.43373.880967.798299@grendel.zope.com> <200201101457.g0AEvJg07475@mbuna.arbhome.com.au> Message-ID: <15421.44649.241947.301731@grendel.zope.com> [Sending to python-dev so people know the results for Solaris 2.8.] Anthony Baxter writes: > > I'm exporting onto Solaris 2.8 now; will report the results. > > Great. If you get a chance to try it with some non-standard build > args, that would also be appreciated... Specific suggestions please! I've not looked at those in a while, so I don't know which would be most useful. 
A note summarizing desired alternate builds would be good. Results for Solaris 2.8, gcc 2.95.2: 117 tests OK. 1 test failed: test_sunaudiodev 22 tests skipped: test_al test_bsddb test_cd test_cl test_dl test_gdbm test_gl test_gzip test_imgfile test_largefile test_linuxaudiodev test_minidom test_nis test_openpty test_pyexpat test_sax test_socketserver test_sundry test_winreg test_winsound test_zipfile test_zlib The sunaudiodev failure is "permission denied", which is not a real failure; treat this as skipped. (The machine is not local to me, so there's no way for me to know if the test worked anyway.) Note that many of the optional modules don't get built on that machine, but I can't do much (and nothing quickly) to change the availability of additional libraries. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From sjoerd@acm.org Thu Jan 10 15:20:51 2002 From: sjoerd@acm.org (Sjoerd Mullender) Date: Thu, 10 Jan 2002 16:20:51 +0100 Subject: [Python-Dev] 2.1.2 testing. References: <200201101224.g0ACOTG05498@mbuna.arbhome.com.au> Message-ID: <3C3DB153.B73E6457@acm.org> I tried it on IRIX 6.5.13m (SGI) using gcc, and I saw two problems in the test set. One was in test_locale and can be written off as a bug in the IRIX environment. The other was in test_pty for which there is a fix. Just get the latest version of test_pty (the bug is in the test). One more problem I saw was that test_sundry was skipped with the message that there was an unresolvable symbol in bsddb.so by the name of dbopen. I don't quite understand why this is. Anthony Baxter wrote: > > Has anyone had a chance to test that 2.1.2 builds and works correctly > on anything? I'm testing on the following systems. sourceforge compile > farm boxes are marked as such, compaq testdrive boxes to arrive as well[1]. > > For each, a fresh cvs export, followed by ./configure ; make ; make test. > > Are there additional useful tests that could be run? > > Linux/x86 Redhat 6.2 PASSED > Linux/x86 Redhat 7.1 PASSED > Linux/x86 Redhat 7.2 PASSED > Solaris/sparc 2.7 (gcc-2.95.2) PASSED > Linux/x86 Debian 2.2 (cf.sf.net) PASSED > Linux/PPC [RS/6000] Debian 2.2 (cf.sf.net) PASSED > Linux/alpha Debian 2.2 (cf.sf.net) PASSED > FreeBSD 4.4 (cf.sf.net) PASSED > Solaris/sparc 2.8 (cf.sf.net) (gcc-2.95.2) PASSED > Tru64/Alpha 4.0 (compaq) ... still building ... > Tru64/Alpha 5.1 (compaq) ... to be done ... > > Linux/sparc Debian 2.2 (cf.sf.net) FAILED > This is scary. I don't know why this one alone fails - it fails the > test_math test. > > Running the test by hand: > anthonybaxter@usf-cf-sparc-linux-1:~/python212_linxsparc$ PYTHONPATH= ./python ./Lib/test/test_math.py > math module, testing with eps 1e-05 > constants > acos > Traceback (most recent call last): > File "./Lib/test/test_math.py", line 21, in ? > testit('acos(-1)', math.acos(-1), math.pi) > OverflowError: math range error > > Running math.acos(-1) gives the correct answer. Anyone got any idea? > > I couldn't get py212 to build on our remaining solaris/x86 box, but then > I can't get 2.1.1 to build on it either, without a whole lot of manual > hackery - so I don't care about that. It's just a stuffed machine. :) > > I was hoping to test on MacOS X, but the cf.sf.net boxes aren't > answering... anyone else want to give it a go? > > Anthony > > [1] sheesh. had to install telnet for the compaq boxes. first time I've not > had ssh access somewhere for a while. . . (plus, they don't have cvs. sigh.) 
> > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From barry@zope.com Thu Jan 10 15:29:57 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 10 Jan 2002 10:29:57 -0500 Subject: [Python-Dev] 2.1.2 release -- do we need a beta? References: <200201101431.JAA30746@cj20424-a.reston1.va.home.com> Message-ID: <15421.45941.29585.415550@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> Do we need a beta for the 2.1.2 release? I think it might be GvR> prudent -- Anthony's last-minute checking of a critical fix GvR> to a bug that prevented compilation on one platform points GvR> this out again. GvR> The alternative is to be optimistic, and to quickly release GvR> 2.1.3 if 2.1.2 has a problem that we discover after its GvR> release. We should be prepared for this in any case. GvR> Opinions? I think a beta is prudent, and it shouldn't cost GvR> too much more in effort -- if we're lucky, nothing changes GvR> and we just fiddle some version numbers. If it turns out to GvR> be needed, it's better than having to wear a brown bag over GvR> your head. :-) I think micro releases should be as lightweight as possible so we /can/ quickly get a new one out if a small, but important fix becomes necessary. I'd say do a release candidate (which will probably not get much testing beyond those who test cvs anyway), and then get 2.1.2 final out. -Barry From Anthony Baxter Thu Jan 10 15:31:54 2002 From: Anthony Baxter (Anthony Baxter) Date: Fri, 11 Jan 2002 02:31:54 +1100 Subject: [Python-Dev] 2.1.2 release -- do we need a beta? In-Reply-To: Message from barry@zope.com (Barry A. Warsaw) of "Thu, 10 Jan 2002 10:29:57 CDT." <15421.45941.29585.415550@anthem.wooz.org> Message-ID: <200201101531.g0AFVsR11565@mbuna.arbhome.com.au> >>> Barry A. Warsaw wrote > I'd say do a release candidate (which will probably not get much > testing beyond those who test cvs anyway), and then get 2.1.2 final > out. Ok. In that case I'll put the version back to rc1, and will start rolling the tarball? (Not going to do it immediately - but soonish...) Anthony From barry@zope.com Thu Jan 10 15:34:24 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 10 Jan 2002 10:34:24 -0500 Subject: [Python-Dev] 2.1.2 testing. References: <200201101224.g0ACOTG05498@mbuna.arbhome.com.au> <3C3DB153.B73E6457@acm.org> Message-ID: <15421.46208.47478.817367@anthem.wooz.org> >>>>> "SM" == Sjoerd Mullender writes: SM> One more problem I saw was that test_sundry was skipped with SM> the message that there was an unresolvable symbol in bsddb.so SM> by the name of dbopen. I don't quite understand why this is. Hmm, if this was 2.2.1 I'd say it's the known brokenness of setup.py w.r.t. bsddbmodule on some systems. I think the setup.py is okay in 2.1.x but I'm doing a build on Mandrake 8.1 now... -Barry From guido@python.org Thu Jan 10 15:42:35 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 10 Jan 2002 10:42:35 -0500 Subject: [Python-Dev] 2.1.2 testing. In-Reply-To: Your message of "Thu, 10 Jan 2002 16:20:51 +0100." <3C3DB153.B73E6457@acm.org> References: <200201101224.g0ACOTG05498@mbuna.arbhome.com.au> <3C3DB153.B73E6457@acm.org> Message-ID: <200201101542.KAA31093@cj20424-a.reston1.va.home.com> > One more problem I saw was that test_sundry was skipped with the > message that there was an unresolvable symbol in bsddb.so by the > name of dbopen. I don't quite understand why this is. test_sundry shouldn't import dbhash. 
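If it has to stay for some reason, a guard along these lines (just a sketch, not the actual Lib/test/test_sundry.py code) would at least let the import fail gracefully on boxes where bsddb.so can't be loaded:

    # Hypothetical guard, for illustration only.
    try:
        import dbhash          # drags in bsddb, which may fail to resolve dbopen
    except ImportError:
        dbhash = None          # treat it as an optional module and carry on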
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 10 15:43:39 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 10 Jan 2002 10:43:39 -0500 Subject: [Python-Dev] 2.1.2 release -- do we need a beta? In-Reply-To: Your message of "Fri, 11 Jan 2002 02:31:54 +1100." <200201101531.g0AFVsR11565@mbuna.arbhome.com.au> References: <200201101531.g0AFVsR11565@mbuna.arbhome.com.au> Message-ID: <200201101543.KAA31111@cj20424-a.reston1.va.home.com> [Barry] > > I'd say do a release candidate (which will probably not get much > > testing beyond those who test cvs anyway), and then get 2.1.2 final > > out. +1 [Anthony] > Ok. In that case I'll put the version back to rc1, and will start rolling > the tarball? > > (Not going to do it immediately - but soonish...) But how about the Windows installer? A release isn't done without it. --Guido van Rossum (home page: http://www.python.org/~guido/) From Anthony Baxter Thu Jan 10 15:51:57 2002 From: Anthony Baxter (Anthony Baxter) Date: Fri, 11 Jan 2002 02:51:57 +1100 Subject: [Python-Dev] 2.1.2 release -- do we need a beta? In-Reply-To: Message from Guido van Rossum of "Thu, 10 Jan 2002 10:43:39 CDT." <200201101543.KAA31111@cj20424-a.reston1.va.home.com> Message-ID: <200201101551.g0AFpvJ11880@mbuna.arbhome.com.au> >>> Guido van Rossum wrote > [Anthony] > > Ok. In that case I'll put the version back to rc1, and will start rolling > > the tarball? > > > > (Not going to do it immediately - but soonish...) > > But how about the Windows installer? A release isn't done without it. That, I can't help you with. I don't have access to MSVC, and I don't have the requisite level of windows knowledge to do the build, anyway. (PEP-0101 refers to 'Windows Magic'. What's Tim's time like? Anthony -- Anthony Baxter It's never too late to have a happy childhood. From fdrake@acm.org Thu Jan 10 16:04:00 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 10 Jan 2002 11:04:00 -0500 (EST) Subject: [Python-Dev] Python 2.1.2c1 docs Message-ID: <15421.47984.453766.444940@grendel.zope.com> The documentation for Python 2.1.2c1 is online at: http://python.sourceforge.net/maint21-docs/ Please report any real problems with these docs to me or Anthony with a level 7 priority and set the "Group" to "Python 2.1.2". To file a bug, log into SourceForge and then visit: http://sourceforge.net/tracker/index.php?group_id=5470&atid=105470 Thanks! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@zope.com Thu Jan 10 16:13:28 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 10 Jan 2002 11:13:28 -0500 Subject: [Python-Dev] 2.1.2 testing. References: <200201101224.g0ACOTG05498@mbuna.arbhome.com.au> <3C3DB153.B73E6457@acm.org> <200201101542.KAA31093@cj20424-a.reston1.va.home.com> Message-ID: <15421.48552.779672.846007@anthem.wooz.org> All the tests pass on Mandrake 8.1, including LFS with the CC='...' configure instruction. -Barry From Anthony Baxter Thu Jan 10 16:15:15 2002 From: Anthony Baxter (Anthony Baxter) Date: Fri, 11 Jan 2002 03:15:15 +1100 Subject: [Python-Dev] Re: [Zope.Com Geeks] 2.1.2 testing. In-Reply-To: Message from Jens Vagelpohl of "Thu, 10 Jan 2002 11:13:54 CDT." <0AD45C29-05E5-11D6-8583-003065C7DEAE@zope.com> Message-ID: <200201101615.g0AGFFn17870@mbuna.arbhome.com.au> Jens tested MacOS X. 
Results below: >>> Jens Vagelpohl wrote > anthony, > > here is what i found: > > as a "basic" build with the minimum configure options required to compile > (--with-dyld --with-suffix) i get the following test results (after upping > the stacksize to allow the re and sre-tests to succeed): > > 119 tests OK. > 1 test failed: test_largefile > 20 tests skipped: test_al test_cd test_cl test_dl test_fcntl test_gdbm > test_gl test_imgfile test_linuxaudiodev test_locale test_minidom test_nis > test_poll test_pty test_pyexpat test_sax test_socketserver > test_sunaudiodev test_winreg test_winsound > make: *** [test] Error 1 > > this is the same result as with 2.1.1. then, as a last test, i built and > tested zope with it. all unit tests run except for 3 ZODB tests which are > most likely not due to python misbehaving. > > lookin' good! > > jens > -- Anthony Baxter It's never too late to have a happy childhood. From fredrik@pythonware.com Thu Jan 10 16:38:27 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 10 Jan 2002 17:38:27 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> <024a01c198e2$823d2280$e000a8c0@thomasnotebook> <077201c198ec$56b9b740$0900a8c0@spiff> <04ca01c19917$2229b220$e000a8c0@thomasnotebook> <200201092215.g09MFO902299@mira.informatik.hu-berlin.de> <03d001c199b8$6c70d290$e000a8c0@thomasnotebook> Message-ID: <0da701c199f5$3bf4c9e0$0900a8c0@spiff> thomas wrote: > I have a string variable containing some non-ascii characters (from > a characterset which was previously called 'ansi' instead of 'oem' > on windows). short answer: "iso-8859-1" should work ::: longer answer: windows "ansi" is an alias for the encoding you get from import locale language, encoding = locale.getdefaultlocale() for people in western europe/north america, that's usually "cp1252", which is a microsoft version of latin-1: http://www.microsoft.com/typography/unicode/1252.htm (characters 0x80-0x9f isn't part of iso-8859-1, aka latin-1) cheers /F From gward@mems-exchange.org Thu Jan 10 18:04:52 2002 From: gward@mems-exchange.org (Greg Ward) Date: Thu, 10 Jan 2002 13:04:52 -0500 Subject: [Python-Dev] Change in unpickle order in 2.2? Message-ID: <20020110180452.GA11414@mems-exchange.org> I have an application (Grouch) that has to do a lot of trickery at pickle-time and unpickle-time, and as a result it happens to be sensitive to the order of unpickling. (The reason for the pickle-time intervention is that Grouch stores type objects in its data structure, and you can't pickle type objects. So it hangs on to a representive value of the type for pickling -- eg. for the "integer" type, it keeps both IntType and 0 in memory, but only pickles 0, and uses type(0) to get IntType back at unpickle time.) The reason that Grouch is sensitive to the order of unpickling is because its data structure is a gnarly, incestuous knot of mutually interdependent classes, and I stopped tinkering with the pickle code as soon as I got something that worked with Python 2.0 and 2.1. Now it fails under 2.2. Under 2.1, it appears that certain more-deeply nested objects were unpickled first; under 2.2, that is no longer the case, and that screws up Grouch's test suite. Anyone got a vague, hand-waving explanation for my vague, hand-waving complaint? Or should I try to come up with a test case? 
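(In case it helps to see the shape of it, the pickle-time trick looks roughly like this -- a simplified sketch with invented names, not Grouch's actual code:

    class TypeHolder:
        # Keeps a representative value; the type object itself is never pickled.
        def __init__(self, sample):
            self.sample = sample             # e.g. 0 for the "integer" type
            self.type = type(sample)         # e.g. IntType, held only in memory

        def __getstate__(self):
            return {'sample': self.sample}   # drop the unpicklable type object

        def __setstate__(self, state):
            self.sample = state['sample']
            self.type = type(self.sample)    # IntType comes back via type(0)

The order-sensitivity comes in because one object's __setstate__ may need other objects to have been rebuilt already.)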
Thanks -- Greg -- Greg Ward - software developer gward@mems-exchange.org MEMS Exchange http://www.mems-exchange.org From tim.one@home.com Thu Jan 10 18:24:13 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 10 Jan 2002 13:24:13 -0500 Subject: [Python-Dev] 2.1.2 release -- do we need a beta? In-Reply-To: <200201101431.JAA30746@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > Do we need a beta for the 2.1.2 release? Yes, and whether or not I do a Windows release. We could call it a "release candidate" (i.e., 2.1.2c1). From mal@lemburg.com Thu Jan 10 19:02:02 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 10 Jan 2002 20:02:02 +0100 Subject: [Python-Dev] Change in unpickle order in 2.2? References: <20020110180452.GA11414@mems-exchange.org> Message-ID: <3C3DE52A.83AE0B26@lemburg.com> Greg Ward wrote: > > I have an application (Grouch) that has to do a lot of trickery at > pickle-time and unpickle-time, and as a result it happens to be > sensitive to the order of unpickling. What's Grouch ? > (The reason for the pickle-time intervention is that Grouch stores type > objects in its data structure, and you can't pickle type objects. So it > hangs on to a representive value of the type for pickling -- eg. for the > "integer" type, it keeps both IntType and 0 in memory, but only pickles > 0, and uses type(0) to get IntType back at unpickle time.) Why don't you use a special reduce function which takes the tp_name as index into the types module ? Storing strings should avoid all complicated type object saving. > The reason that Grouch is sensitive to the order of unpickling is > because its data structure is a gnarly, incestuous knot of mutually > interdependent classes, and I stopped tinkering with the pickle code as > soon as I got something that worked with Python 2.0 and 2.1. Now it > fails under 2.2. Under 2.1, it appears that certain more-deeply nested > objects were unpickled first; under 2.2, that is no longer the case, and > that screws up Grouch's test suite. > > Anyone got a vague, hand-waving explanation for my vague, hand-waving > complaint? Or should I try to come up with a test case? You should probably first check wether the pickle string is identical in 2.1 and 2.2 and then go on from there. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From akuchlin@mems-exchange.org Thu Jan 10 19:42:18 2002 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Thu, 10 Jan 2002 14:42:18 -0500 Subject: [Python-Dev] eval() slowdown in 2.2 on MacOS X? In-Reply-To: References: Message-ID: <20020110194218.GA3810@ute.mems-exchange.org> On Tue, Jan 08, 2002 at 01:41:37PM -0500, Tim Peters wrote: >Break it into smaller steps so we can narrow down possible causes: You should have cc'ed Barbara on that. I forwarded your message to her and she wrote back (eventually): >BTW, I forgot to pass this on yesterday, but I tried the code in Tim Peters' >e-mail yesterday and the delay happens during the code = compile(...) >statement. She's going to install sshd on her machine, so maybe this weekend I'll be able to log in, compile Python from source, and poke around in an effort to figure out what's going on. --amk (www.amk.ca) Our lives are different from anybody else's. That's the exciting thing. Nobody in the universe can do what we're doing! 
-- The Doctor, in "Tomb of the Cybermen" From tim.one@home.com Thu Jan 10 20:26:32 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 10 Jan 2002 15:26:32 -0500 Subject: [Python-Dev] Ouch -- CVS troubles with 2.1.2c1 Message-ID: Trying to add a new file to the release21-maint branch caused CVS commit to die with an assertion error: C:\Code\python\dist\src\PCbuild>cvs commit uninstal.wse RCS file: /cvsroot/python/python/dist/src/PCbuild/Attic/uninstal.wse,v done cvs: commit.c:2104: checkaddfile: Assertion `*rcsnode == ((void *)0)' failed. Terminated with fatal signal 6 CVS.EXE commit: saving log message in c:\windows\TEMP\3 Trying again finds a stale lock: C:\Code\python\dist\src\PCbuild>cvs commit uninstal.wse cvs server: [12:22:36] waiting for tim_one's lock in /cvsroot/python/python/dist/src/PCbuild cvs server: [12:23:06] waiting for tim_one's lock in /cvsroot/python/python/dist/src/PCbuild ... Anyone got a clue? From martin@v.loewis.de Thu Jan 10 20:27:46 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 10 Jan 2002 21:27:46 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <0da701c199f5$3bf4c9e0$0900a8c0@spiff> (fredrik@pythonware.com) References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> <024a01c198e2$823d2280$e000a8c0@thomasnotebook> <077201c198ec$56b9b740$0900a8c0@spiff> <04ca01c19917$2229b220$e000a8c0@thomasnotebook> <200201092215.g09MFO902299@mira.informatik.hu-berlin.de> <03d001c199b8$6c70d290$e000a8c0@thomasnotebook> <0da701c199f5$3bf4c9e0$0900a8c0@spiff> Message-ID: <200201102027.g0AKRkl01775@mira.informatik.hu-berlin.de> > windows "ansi" is an alias for the encoding you get from > > import locale > language, encoding = locale.getdefaultlocale() > > for people in western europe/north america Isn't that also known as "mbcs" in Python? And it is different from "oem", which is not exposed to Python, right? > "cp1252", which is a microsoft version of latin-1: > > http://www.microsoft.com/typography/unicode/1252.htm > > (characters 0x80-0x9f isn't part of iso-8859-1, aka latin-1) Strictly speaking, the characters 0x80-0x9f *are* assigned in latin-1, to control characters - so these assignments differ in CP 1252. Regards, Martin From tim.one@home.com Thu Jan 10 20:37:46 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 10 Jan 2002 15:37:46 -0500 Subject: [Python-Dev] eval() slowdown in 2.2 on MacOS X? In-Reply-To: <20020110194218.GA3810@ute.mems-exchange.org> Message-ID: [Tim] >> Break it into smaller steps so we can narrow down possible causes: [Andrew Kuchling] > You should have cc'ed Barbara on that. I did a Reply-All. In the copy I got back from Python-Dev, mattson@milkyway.gsfc.nasa.gov was in the cc list. If that didn't reach her, sorry, but I don't think I could have done more than I did. > I forwarded your message to her and she wrote back (eventually): > >> BTW, I forgot to pass this on yesterday, but I tried the code in >> Tim Peters' e-mail yesterday and the delay happens during the code = >> compile(...) statement. So it's somehere in the front end -- that's a real help . > She's going to install sshd on her machine, so maybe this weekend I'll > be able to log in, compile Python from source, and poke around in an > effort to figure out what's going on. Did she try Skip's suggestion to try pymalloc? 
Given that we believe there is no Mac-specific code here outside libc, the first suggestion was (and remains) the best. The front end will be doing a whale of a lot of mallocs. If it's like "the usual" malloc disease under glibc, the delays would appear during the free()s. From tim.one@home.com Thu Jan 10 20:54:56 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 10 Jan 2002 15:54:56 -0500 Subject: [Python-Dev] Ouch -- CVS troubles with 2.1.2c1 In-Reply-To: Message-ID: This looks hopeless. I submitted an SF support request to get the stale lock removed: http://sf.net/tracker/?func=detail&aid=502032&group_id=1&atid=200001 In the meantime, you should expect this: > cvs server: [12:22:36] waiting for tim_one's lock in > /cvsroot/python/python/dist/src/PCbuild > cvs server: [12:23:06] waiting for tim_one's lock in > /cvsroot/python/python/dist/src/PCbuild > ... Perhaps you can arrange to skip the PCbuild directory? From martin@v.loewis.de Thu Jan 10 21:04:00 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 10 Jan 2002 22:04:00 +0100 Subject: [Python-Dev] Ouch -- CVS troubles with 2.1.2c1 In-Reply-To: References: Message-ID: <200201102104.g0AL40T02169@mira.informatik.hu-berlin.de> [Tim Peters] > This looks hopeless. I submitted an SF support request to get the stale > lock removed: > > http://sf.net/tracker/?func=detail&aid=502032&group_id=1&atid=200001 Well, Jacob Moorman is *really* quick with this kind of stuff these days. Thanks, Jacob! Martin From skip@pobox.com Thu Jan 10 21:12:31 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 10 Jan 2002 15:12:31 -0600 Subject: [Python-Dev] eval() slowdown in 2.2 on MacOS X? In-Reply-To: <20020110194218.GA3810@ute.mems-exchange.org> References: <20020110194218.GA3810@ute.mems-exchange.org> Message-ID: <15422.959.242927.178606@beluga.mojam.com> >> BTW, I forgot to pass this on yesterday, but I tried the code in Tim >> Peters' e-mail yesterday and the delay happens during the code = >> compile(...) statement. I saw the same effect on my Linux laptop (with a mere 128MB). The disk went nuts when it tried compiling "[" + "2," * 200000 + "]" VM size as reported by top went to 98.5MB. This does not appear to be exclusively a 2.2 issue, as I got this with the fresh 2.1.2 I built this morning. If you consider what this compiles to: LOAD_CONST 1 (2) LOAD_CONST 1 (2) LOAD_CONST 1 (2) LOAD_CONST 1 (2) ... LOAD_CONST 1 (2) LOAD_CONST 1 (2) LOAD_CONST 1 (2) LOAD_CONST 1 (2) BUILD_LIST 200000 To generate that it has to generate and parse a pretty deep abstract syntax tree. It looks like symtable_node gets called once for each list element. There are probably other functions that are called once per list element as well. Skip From martin@v.loewis.de Thu Jan 10 19:44:26 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Thu, 10 Jan 2002 20:44:26 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <03d001c199b8$6c70d290$e000a8c0@thomasnotebook> (thomas.heller@ion-tof.com) References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> <024a01c198e2$823d2280$e000a8c0@thomasnotebook> <077201c198ec$56b9b740$0900a8c0@spiff> <04ca01c19917$2229b220$e000a8c0@thomasnotebook> <200201092215.g09MFO902299@mira.informatik.hu-berlin.de> <03d001c199b8$6c70d290$e000a8c0@thomasnotebook> Message-ID: <200201101944.g0AJiQl01632@mira.informatik.hu-berlin.de> > > unicode("some string", "unicode-escape") [...] > For example the copyright symbol "©" (repr("©") gives "\xa9"). > Now I want to convert this string to unicode. > u"©" works fine, unicode(variable) gives an ASCII decoding error. As I said: unicode-escape is the precise encoding that is used to parse Unicode strings from source files. It interprets all bytes above 128 as Latin-1. Regards, Martin From thomas.heller@ion-tof.com Thu Jan 10 21:21:27 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 10 Jan 2002 22:21:27 +0100 Subject: [Python-Dev] unicode/string asymmetries References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> <024a01c198e2$823d2280$e000a8c0@thomasnotebook> <077201c198ec$56b9b740$0900a8c0@spiff> <04ca01c19917$2229b220$e000a8c0@thomasnotebook> <200201092215.g09MFO902299@mira.informatik.hu-berlin.de> <03d001c199b8$6c70d290$e000a8c0@thomasnotebook> <200201101944.g0AJiQl01632@mira.informatik.hu-berlin.de> Message-ID: <039701c19a1c$db9f3350$e000a8c0@thomasnotebook> From: "Martin v. Loewis" > > > unicode("some string", "unicode-escape") > [...] > > For example the copyright symbol "=A9" (repr("=A9") gives "\xa9"). > > Now I want to convert this string to unicode. > > u"=A9" works fine, unicode(variable) gives an ASCII decoding error. > > As I said: unicode-escape is the precise encoding that is used to > parse Unicode strings from source files. It interprets all bytes above > 128 as Latin-1. > I must apologize, because first it didn't seem to work: >>> print unicode("\xa9", "unicode-escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: ASCII encoding error: ordinal not in range(128) >>> but then I found out that the result simply cannot be printed out, while the repr of it can be: >>> unicode("\xa9", "unicode-escape") u'\xa9' >>> Thanks, Thomas From akuchlin@mems-exchange.org Thu Jan 10 21:29:31 2002 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Thu, 10 Jan 2002 16:29:31 -0500 Subject: [Python-Dev] eval() slowdown in 2.2 on MacOS X? In-Reply-To: References: <20020110194218.GA3810@ute.mems-exchange.org> Message-ID: <20020110212931.GB4302@ute.mems-exchange.org> On Thu, Jan 10, 2002 at 03:37:46PM -0500, Tim Peters wrote: >I did a Reply-All. In the copy I got back from Python-Dev, ... Oops, I misread the headers of the original mail. Sorry! >Did she try Skip's suggestion to try pymalloc? Given that we believe there >is no Mac-specific code here outside libc, the first suggestion was (and She hasn't compiled it herself yet, but that's the first thing I'll try. 
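(For anyone else who wants to poke at it in the meantime, a minimal reproducer along the lines of Skip's earlier snippet -- my own sketch, assuming any 2.x interpreter -- is just:

    import time
    src = "[" + "2," * 200000 + "]"             # the test_longexp-style expression
    t0 = time.time()
    code = compile(src, "<longexp>", "eval")    # the delay shows up in this call
    print "compile() took %.1f seconds" % (time.time() - t0)
)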
--amk From moorman@sourceforge.net Thu Jan 10 21:08:15 2002 From: moorman@sourceforge.net (Jacob Moorman) Date: 10 Jan 2002 16:08:15 -0500 Subject: [Python-Dev] Ouch -- CVS troubles with 2.1.2c1 In-Reply-To: <200201102104.g0AL40T02169@mira.informatik.hu-berlin.de> Message-ID: <200201102122.g0ALMLf02237@mira.informatik.hu-berlin.de> On Thu, 2002-01-10 at 16:04, Martin v. Loewis wrote: > [Tim Peters] > > This looks hopeless. I submitted an SF support request to get the stale > > lock removed: > > > > http://sf.net/tracker/?func=detail&aid=502032&group_id=1&atid=200001 > > Well, Jacob Moorman is *really* quick with this kind of stuff these > days. > > Thanks, Jacob! As always, we are glad to assist :-) If ever in the future you or any other member of your team has support concerns which do not appear to be receiving the level of response they deserve, feel free to contact me directly (please include the support request number of the issue in question) via e-mail at moorman@sourceforge.net Issues related to CVS stale locks, repository manipulation, etc. are treated with our highest priority. Our stated response time is 'two business days' (roughly 48-72 hours), however we tend to respond to these issues much faster than that. Once again, thanks for the feedback; and do let me know if we may be of further assistance in the future. Jacob Moorman Quality of Service Manager, SourceForge.net moorman@sourceforge.net moorman@vasoftware.com From martin@v.loewis.de Thu Jan 10 21:31:06 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 10 Jan 2002 22:31:06 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <039701c19a1c$db9f3350$e000a8c0@thomasnotebook> (thomas.heller@ion-tof.com) References: <012501c1987a$0622caa0$e000a8c0@thomasnotebook> <200201082024.g08KOvl01737@mira.informatik.hu-berlin.de> <01f601c1988b$03b00d30$e000a8c0@thomasnotebook> <200201082217.g08MHrQ08678@mira.informatik.hu-berlin.de> <024a01c198e2$823d2280$e000a8c0@thomasnotebook> <077201c198ec$56b9b740$0900a8c0@spiff> <04ca01c19917$2229b220$e000a8c0@thomasnotebook> <200201092215.g09MFO902299@mira.informatik.hu-berlin.de> <03d001c199b8$6c70d290$e000a8c0@thomasnotebook> <200201101944.g0AJiQl01632@mira.informatik.hu-berlin.de> <039701c19a1c$db9f3350$e000a8c0@thomasnotebook> Message-ID: <200201102131.g0ALV6e02303@mira.informatik.hu-berlin.de> > >>> unicode("\xa9", "unicode-escape") > u'\xa9' As a follow up, in source code, you might want to write u"\N{COPYRIGHT SIGN}" instead, for better readability. Regards, Martin From skip@pobox.com Thu Jan 10 22:14:44 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 10 Jan 2002 16:14:44 -0600 Subject: [Python-Dev] eval() slowdown in 2.2 on MacOS X? In-Reply-To: <20020110212931.GB4302@ute.mems-exchange.org> References: <20020110194218.GA3810@ute.mems-exchange.org> <20020110212931.GB4302@ute.mems-exchange.org> Message-ID: <15422.4692.859570.594788@beluga.mojam.com> >> Did she try Skip's suggestion to try pymalloc? amk> She hasn't compiled it herself yet, but that's the first thing I'll amk> try. I did try that when the problem was first raised. I just tried it again. It did have a positive effect: w/ threads and w/o pymalloc: user system elapsed CPU 7.64 0.72 0:09.73 85% 7.86 0.45 0:08.66 95% 7.66 0.66 0:08.32 99% w/o threads and w/ pymalloc: user system elapsed CPU 5.44 0.58 0:06.85 87% 5.57 0.46 0:06.02 100% 5.57 0.45 0:06.02 99% The above was with my memory usage trimmed about as far down as I could get it (turned off X, for example). 
My apologies that both sets of numbers don't have threads disabled. It's just what I happened to have laying around on the disk. Skip From jack@oratrix.nl Thu Jan 10 22:47:58 2002 From: jack@oratrix.nl (Jack Jansen) Date: Thu, 10 Jan 2002 23:47:58 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: Message by "Martin v. Loewis" , Thu, 10 Jan 2002 08:32:20 +0100 , <200201100732.g0A7WKB01423@mira.informatik.hu-berlin.de> Message-ID: <20020110224803.F008CE8451@oratrix.oratrix.nl> Recently, "Martin v. Loewis" said: > > One minor misgiving is that this call will *always* copy the string, > > even if the internal coding of unicode objects is wchar_t. That's a > > bit of a nuisance, but we can try to fix that later. > > Not sure what you mean by "later". Once this is being used, you cannot > fix it anymore. By "later" I meant "when your argtuple idea has been accepted":-) Remember: most of my code is generated anyway, so fixing things like this is a minor effort. In case it wasn't clear yet: this is a firm +1 for the argtuple idea. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From jack@oratrix.nl Thu Jan 10 22:52:26 2002 From: jack@oratrix.nl (Jack Jansen) Date: Thu, 10 Jan 2002 23:52:26 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: Message by "Martin v. Loewis" , Thu, 10 Jan 2002 08:27:39 +0100 , <200201100727.g0A7Rdd01384@mira.informatik.hu-berlin.de> Message-ID: <20020110225231.B15DDE8451@oratrix.oratrix.nl> Recently, "Martin v. Loewis" said: > > Or, to make things clearer, WinObj_Type->tp_convert would simply > > point to the current WinObj_Convert function. > > So what do you gain with that extension? It seem all that is done is > you can replace _Convert by _Type everywhere, with no additional > change to the semantics. Because you can refer to the _Type from Python, that is the whole point of this exercise. And because you can refer to it from Python you can pass it to calldll.newcall and such. > > > > ps_GetDrawableSurface = calldll.newcall(psapilib.ps_GetDrawableSurface, > > > > Carbon.Qd.GrafPortType) > [...] > > Not at the moment, but in calldll version 2 there would be. In stead > > of passing types as "l" or "h" you would pass type objects to > > newcall(). Newcall() would probably special-case the various ints but > > for all other types simply call PyArg_Parse(arg, "O@", typeobj, > > &voidptr). > > I still don't understand. In your example, GrafPortType is a return > type, not an argument type. So you *have* an anything, and you *want* > the GrafPortType. How exactly do you use PyArg_Parse in that scenario? Sorry, you're right. My example was for a return value, so we're talking Py_BuildValue here. But this situation is equivalent to a GrafPort argument, where PyArg_Parse would be used. > Also, why would you use this extension inside newcall()? I'd rather > expect it in ps_GetDrawableSurface.__call__ instead (i.e. when you > deal with a specific call, not when you create the callable instance). Absolutely right, sloppy typing on my part. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From jack@oratrix.nl Thu Jan 10 22:55:00 2002 From: jack@oratrix.nl (Jack Jansen) Date: Thu, 10 Jan 2002 23:55:00 +0100 Subject: [Python-Dev] release for 2.1.2, plus 2.2.1... 
In-Reply-To: Message by Anthony Baxter , Thu, 10 Jan 2002 19:17:02 +1100 , <200201100817.g0A8H2Y01871@mbuna.arbhome.com.au> Message-ID: <20020110225505.97E26E8451@oratrix.oratrix.nl> Recently, Anthony Baxter said: > >>> Barry A. Warsaw wrote > > AB> Ok, I'd like to make the 2.1.2 release some time in the first > > AB> half of the week starting 7th Jan, assuming that's ok for the > > AB> folks who'll need to do the work on the PC/Mac packaging. > > I'm doing this this evening; i.e. now. And I'm not going to do a MacPython 2.1.2. The effort needed is too much, and people seem to be happy enough with 2.1.1 (most have switched to 2.2 anyway). Oh yes, Anthony: I tried the current 2.1.2 CVS on Mac OS X (unix-Python), and all problems appear to be solved. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From jack@oratrix.nl Thu Jan 10 23:02:17 2002 From: jack@oratrix.nl (Jack Jansen) Date: Fri, 11 Jan 2002 00:02:17 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: Message by "Thomas Heller" , Thu, 10 Jan 2002 10:10:58 +0100 , <03be01c199b6$cfcfe350$e000a8c0@thomasnotebook> Message-ID: <20020110230223.13ABDE8451@oratrix.oratrix.nl> Recently, "Thomas Heller" said: > Here's an outline which could work in 2.2: This sounds very good! There's only one thing you'll have to explain to me: how would this work from C? My types are all in C, not in Python, so I'd need to do the magic in C. Where do I find examples of using metatypes from C? I could then put all this wrapper stuff in a file WrapperObject.c and it would be reusable by any object that wanted this functionality. > Create a subtype of type, having a tp_convert slot: > > typedef int (*convert_func)(PyTypeObject *, void **); > > typedef struct { > PyTypeObject type; > convert_func tp_convert; > } WrapperTypeType; > > and use it as metaclass (metatype?) for your WindowObj: > > class WindowObj(...): > __metaclass__ = WrapperTypeType > > Write a function to return a conversion function: > > convert_func *get_converter(PyTypeObject *type) > { > if (WrapperTypeType_Check(type)) > return ((WrapperTypeType *)type)->tp_convert; > /* code to check additional types and return their converters */ > .... > } > > and then > > if (!PyArg_ParseTuple(args, "O&", get_converter(WinObj_Type), &Window)) > > How does this sound? > > Thomas > > -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From nhodgson@bigpond.net.au Thu Jan 10 23:06:06 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Fri, 11 Jan 2002 10:06:06 +1100 Subject: [Python-Dev] unicode/string asymmetries References: <20020109145511.BE0FCE8451@oratrix.oratrix.nl> <200201092114.g09LEQH01895@mira.informatik.hu-berlin.de> <008f01c19956$73c4f790$0acc8490@neil> <200201100018.g0A0I9B02928@mira.informatik.hu-berlin.de> Message-ID: <040501c19a2b$6237f470$0acc8490@neil> Martin: > ... So I'd > suggest you just put an assertion into the code that Py_UNICODE is the > same size as WCHAR (that can be even done through a preprocessor > #error, using the _SIZE #defines). I'll expect people will resist > changing Py_UNICODE on Windows for quite some time, even if other > platforms move on. OK, I've turned off the wide character functions when Py_UNICODE_WIDE defined. It even compiles in wide mode although with a lot (about 30) warnings. 
The warnings are because I'm avoiding the wide char functions with a runtime check rather than a compile time check as the preprocessor checks would get messy with the extra case. The wide mode settings I used were: #define PY_UNICODE_TYPE unsigned long #define Py_UNICODE_SIZE SIZEOF_LONG Why isn't Py_UNICODE_SIZE defined as #define Py_UNICODE_SIZE sizeof(PY_UNICODE_TYPE) ? Changes at http://scintilla.sourceforge.net/winunichanges.zip Neil From mhammond@skippinet.com.au Thu Jan 10 23:11:56 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Fri, 11 Jan 2002 10:11:56 +1100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <200201102027.g0AKRkl01775@mira.informatik.hu-berlin.de> Message-ID: > > windows "ansi" is an alias for the encoding you get from > > > > import locale > > language, encoding = locale.getdefaultlocale() > > > > for people in western europe/north america > > Isn't that also known as "mbcs" in Python? And it is different from > "oem", which is not exposed to Python, right? My turn to speak of which I do not really understand :) mbcs is an "encoding", but a strange encoding in that it depends on the character set. The character set itself determines what bytes are lead bytes. Thus, the same mbcs string may be interpreted differently depending on the current character set/code page. Thus "ansi" and "oem" are code pages, where mbcs is an encoding. This is why Neil demonstrated problems referencing (say) a Japenese filename when the current code-page is not Japanese - there is only a valid mbcs representation in supported code pages. Mark. From guido@python.org Fri Jan 11 04:01:13 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 10 Jan 2002 23:01:13 -0500 Subject: [Python-Dev] RELEASED - Python 2.1.2c1 Message-ID: <200201110401.XAA07009@cj20424-a.reston1.va.home.com> We've issued a release candidate of Python 2.1.2: http://www.python.org/2.1.2/ Our thanks go out to Anthony Baxter, who almost singlehandedly produced this release. We're planning a final release of 2.1.2 early next week, probably Tuesday night (Wednesday morning for Anthony :-). Please report any bugs you find to the bug tracker: http://sourceforge.net/bugs/?group_id=5470 This being a bugfix release, there are no exciting new features -- we just fixed a lot of bugs; a few are outlined below. For a complete list, please see: http://sourceforge.net/project/shownotes.php?release_id=69287 - The socket object gained a new method, 'sendall()'. This method is guaranteed to send all data - this is not guaranteed by the 'send()' method. See also SF patch #474307. The standard library has been updated to use this method where appropriate. - Fix for incorrectly swapped arguments to PyFrame_BlockSetup in ceval.c. This bug could cause python to crash. It was related to using a 'continue' inside a 'try' block. - The Python compiler package was updated to correctly calculate stack depth in some cases. This was affecting Zope Python Scripts rather badly. - Largefile support was added (but not on by default, you'll need to follow the instructions in the documentation of the posix module). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jan 11 05:08:43 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 11 Jan 2002 00:08:43 -0500 Subject: [Python-Dev] Change in unpickle order in 2.2? In-Reply-To: Your message of "Thu, 10 Jan 2002 13:04:52 EST." 
<20020110180452.GA11414@mems-exchange.org> References: <20020110180452.GA11414@mems-exchange.org> Message-ID: <200201110508.AAA07290@cj20424-a.reston1.va.home.com> > I have an application (Grouch) that has to do a lot of trickery at > pickle-time and unpickle-time, and as a result it happens to be > sensitive to the order of unpickling. > > (The reason for the pickle-time intervention is that Grouch stores type > objects in its data structure, and you can't pickle type objects. So it > hangs on to a representive value of the type for pickling -- eg. for the > "integer" type, it keeps both IntType and 0 in memory, but only pickles > 0, and uses type(0) to get IntType back at unpickle time.) > > The reason that Grouch is sensitive to the order of unpickling is > because its data structure is a gnarly, incestuous knot of mutually > interdependent classes, and I stopped tinkering with the pickle code as > soon as I got something that worked with Python 2.0 and 2.1. Now it > fails under 2.2. Under 2.1, it appears that certain more-deeply nested > objects were unpickled first; under 2.2, that is no longer the case, and > that screws up Grouch's test suite. > > Anyone got a vague, hand-waving explanation for my vague, hand-waving > complaint? Or should I try to come up with a test case? Yes please, and post it to SourceForge. There aren't that many changes in the source of pickle.py since release 2.1. (Or are you using cPickle? If so, please say so. The two aren't 100% equivalent.) I see changes related to unicode, and type objects are pickled differently in 2.2. There's also a change that refuses to pickle an "global" (a reference by module and object name, used for classes, types and functions) when the name that the object claims to have doesn't refer to the same object. There's a new test on __safe_for_unpickling__. Hm, I think you must be using cPickle, I don't know enough about it to help. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Fri Jan 11 05:21:47 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 11 Jan 2002 06:21:47 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects In-Reply-To: <20020110225231.B15DDE8451@oratrix.oratrix.nl> (message from Jack Jansen on Thu, 10 Jan 2002 23:52:26 +0100) References: <20020110225231.B15DDE8451@oratrix.oratrix.nl> Message-ID: <200201110521.g0B5Lln01966@mira.informatik.hu-berlin.de> > Because you can refer to the _Type from Python, that is the whole > point of this exercise. And because you can refer to it from Python > you can pass it to calldll.newcall and such. I still fail to see why you need additional ParseTuple support in calldll. > Sorry, you're right. My example was for a return value, so we're > talking Py_BuildValue here. But this situation is equivalent to a > GrafPort argument, where PyArg_Parse would be used. In cdc_call, there is a loop over all arguments, rather than a ParseTuple call. I don't see how this could change: all arguments are processed uniformly. Precisely how would you use O@ in there? Actually, it may be worthwhile to get rid of the PyArg_ParseTuple call in call_newcall also: for performance reasons, to soften the dependency on MAXARG, and to give better diagnostics in case of user errors. There is a loop over argconv, anyway; this loop could have run over args in the first place. All you might want to have is additionals slots in type objects; as Thomas explains, you can have that using just the 2.2 facilities. 
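At the Python level the same arrangement looks roughly like this (a sketch only -- the names are invented, and the real thing would live in C, as in Thomas' outline):

    class WrapperType(type):
        # Metatype: every type created from it carries its own converter slot.
        tp_convert = None

    class WindowObj(object):
        __metaclass__ = WrapperType            # WindowObj is now an instance of WrapperType

    def get_converter(t):
        # Only wrapper types (instances of WrapperType) expose a converter.
        if isinstance(t, WrapperType):
            return t.tp_convert
        raise TypeError("no converter known for %r" % (t,))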
For the specific case of calldll, it seems that a generic mechanism would be harmful: You want to be absolutely sure that an object is convertible to a long *for the purposes of API calls*. So I'd even encourage to create a PyCallDll_RegisterTypeConverter function; extension types that want to support calldll should register a conventry and a rvconventry. That approach works for any Python version. Regards, Martin From martin@v.loewis.de Fri Jan 11 05:26:27 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 11 Jan 2002 06:26:27 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: <040501c19a2b$6237f470$0acc8490@neil> (nhodgson@bigpond.net.au) References: <20020109145511.BE0FCE8451@oratrix.oratrix.nl> <200201092114.g09LEQH01895@mira.informatik.hu-berlin.de> <008f01c19956$73c4f790$0acc8490@neil> <200201100018.g0A0I9B02928@mira.informatik.hu-berlin.de> <040501c19a2b$6237f470$0acc8490@neil> Message-ID: <200201110526.g0B5QR001972@mira.informatik.hu-berlin.de> > #define Py_UNICODE_SIZE sizeof(PY_UNICODE_TYPE) > ? Because you cannot use that in preprocessor tests. If you do #if Py_UNICODE_SIZE == SIZEOF_INT then the preprocessor is not supposed to do this properly unless you have a integral number on each side. Regards, Martin From tim.one@home.com Fri Jan 11 05:31:12 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 11 Jan 2002 00:31:12 -0500 Subject: [Python-Dev] eval() slowdown in 2.2 on MacOS X? In-Reply-To: <15422.4692.859570.594788@beluga.mojam.com> Message-ID: [Skip Montanaro] > I did try that when the problem was first raised. I just tried it > again. It did have a positive effect: > > w/ threads and w/o pymalloc: > > user system elapsed CPU > 7.64 0.72 0:09.73 85% > 7.86 0.45 0:08.66 95% > 7.66 0.66 0:08.32 99% > > w/o threads and w/ pymalloc: > > user system elapsed CPU > 5.44 0.58 0:06.85 87% > 5.57 0.46 0:06.02 100% > 5.57 0.45 0:06.02 99% Skip, I think this is irrelevant to the OP's problem. You're telling us you can save a few seconds running test_longexp on a box with barely enough memory to run it at all. Barbara is worried about shaving *hours* off it on a box with gobs of memory to spare. Still, I expect pymalloc will fix her problem (since malloc is the only suspect on the list, it better ). From martin@v.loewis.de Fri Jan 11 05:47:14 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 11 Jan 2002 06:47:14 +0100 Subject: [Python-Dev] unicode/string asymmetries In-Reply-To: References: Message-ID: <200201110547.g0B5lEr02026@mira.informatik.hu-berlin.de> > > > windows "ansi" is an alias for the encoding you get from [...] > > Isn't that also known as "mbcs" in Python? And it is different from > > "oem", which is not exposed to Python, right? [...] > mbcs is an "encoding", but a strange encoding in that it depends on the > character set. The character set itself determines what bytes are lead > bytes. That is my understanding also. > Thus, the same mbcs string may be interpreted differently depending on the > current character set/code page. Thus "ansi" and "oem" are code pages, > where mbcs is an encoding. That is not really true, is it: "ansi" and "oem" are not code pages, are they? Atleast, not constant code pages, but code pages that depend on the national version, right? "mbcs" uses MultiByteToWideChar with CP_ACP, so "mbcs" *is* CP_ACP, where ACP stands for "ANSI Code Page", right? CP_ACP is the code page that the "ANSI" functions, i.e. the *A functions, expect. It might be code page 1252, or it might be something else. 
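You can see what you actually have on a given box with something like this (a sketch; the output differs per locale, and the "mbcs" codec only exists on Windows builds):

    import locale
    print locale.getdefaultlocale()      # e.g. ('en_US', 'cp1252') on a western box
    print repr(u"\xe9".encode("mbcs"))   # goes through CP_ACP, whatever that happens to be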
Likewise, the OEM code page is not a fixed thing, either. Instead, it is what DOS would have used in this locale. So, CP_OEMCP might be 437, or it might be something else, again, e.g. 850. I think it might have been less confusing to call the "mbcs" encoding "ansi", and to expose the "oem" encoding (which can still be done). Please correct me if I'm wrong. Regards, Martin From tim.one@home.com Fri Jan 11 06:13:26 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 11 Jan 2002 01:13:26 -0500 Subject: [Python-Dev] 2.1.2 testing. In-Reply-To: <200201101224.g0ACOTG05498@mbuna.arbhome.com.au> Message-ID: [Anthony Baxter] > ... > Linux/sparc Debian 2.2 (cf.sf.net) FAILED > This is scary. I don't know why this one alone fails - it fails the > test_math test. > > Running the test by hand: > anthonybaxter@usf-cf-sparc-linux-1:~/python212_linxsparc$ > PYTHONPATH= ./python ./Lib/test/test_math.py > math module, testing with eps 1e-05 > constants > acos > Traceback (most recent call last): > File "./Lib/test/test_math.py", line 21, in ? > testit('acos(-1)', math.acos(-1), math.pi) > OverflowError: math range error > > Running math.acos(-1) gives the correct answer. Anyone got any idea? Sorry, not short of stepping into mathmodule.c under a debugger. The only interesting thing about that test is that math.acos(-1) is the very first call test_math.py makes to the platform libm. Perhaps if you commented it out, you'd get a bogus OverflowError from testit('acos(0)', math.acos(0), math.pi/2) on the following line. From martin@v.loewis.de Fri Jan 11 07:02:31 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 11 Jan 2002 08:02:31 +0100 Subject: [Python-Dev] Change in unpickle order in 2.2? In-Reply-To: <200201110508.AAA07290@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Fri, 11 Jan 2002 00:08:43 -0500) References: <20020110180452.GA11414@mems-exchange.org> <200201110508.AAA07290@cj20424-a.reston1.va.home.com> Message-ID: <200201110702.g0B72VN02573@mira.informatik.hu-berlin.de> > Yes please, and post it to SourceForge. There aren't that many > changes in the source of pickle.py since release 2.1. I think there have been changes to the order in which things come out of a dictionary, which could affect pickling order. Unpickling order, of course, should strictly follow the order in which things are in the file. Regards, Martin From martin@v.loewis.de Fri Jan 11 07:33:23 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 11 Jan 2002 08:33:23 +0100 Subject: [Python-Dev] 2.1.2 testing. In-Reply-To: References: Message-ID: <200201110733.g0B7XNk02621@mira.informatik.hu-berlin.de> > > Linux/sparc Debian 2.2 (cf.sf.net) FAILED > > This is scary. I don't know why this one alone fails - it fails the > > test_math test. [...] > Sorry, not short of stepping into mathmodule.c under a debugger. The only > interesting thing about that test is that math.acos(-1) is the very first > call test_math.py makes to the platform libm. Perhaps if you commented it > out, you'd get a bogus OverflowError from > > testit('acos(0)', math.acos(0), math.pi/2) > > on the following line. Seems to be a Sparclinux bug. If mathmodule is statically linked into python (via Modules/Setup), the test passes fine. Without further analysis, I'd say that assigning to errno does not work well when done in a shared library. I'd say this is bug #459464. Last time, I incorrectly diagnosed this as a sparc64 gcc issue, which it isn't: Even though 'uname -m' reports 'sparc64', all userland code is 32-bit. 
I'm probably wrong with my current guess as well. HTH, Martin From niemeyer@conectiva.com Fri Jan 11 00:49:08 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Thu, 10 Jan 2002 22:49:08 -0200 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] Message-ID: <20020110224908.C884@ibook.distro.conectiva> --VywGB/WGlW4DM4P8 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi everyone! Now that 2.2 is history (well, kind of ;-), would it be the time to think about this again? Thank you! ----- Forwarded message from Gustavo Niemeyer ----- Date: Wed, 14 Nov 2001 20:07:03 -0200 From: Gustavo Niemeyer To: python-dev@python.org Subject: Re: [Python-Dev] Python's footprint In-Reply-To: <20011108165105.A29947@gerg.ca> User-Agent: Mutt/1.3.23i > > It means that about 10% of python's executable is documentation. [...] > Anyways, that sounds like a useful idea. It would probably be a big > patch that touches lots of files, so it's unlikely to get into Python > 2.2. You might consider whipping up a patch now to get it under > consideration early in 2.3's life-cycle. Ok. The patch is ready (attached). It's very simple. Just introducing two new macros: Py_DOCSTR() to be used in usual doc strings, and WITH_DOC_STRINGS, for more complex ones (sys module's doc string comes into my mind). I'd just like to know the moment when it is going to be applied, so I can change every documentation string accordingly and submit the patch. I could do this right now, for sure. But if it's going to be applied just for 2.3, the patch will certainly be broken at that time. Thanks! --=20 Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] --- Python-2.2.orig/pyconfig.h.in Wed Nov 14 17:54:31 2001 +++ Python-2.2/pyconfig.h.in Wed Nov 14 19:08:08 2001 @@ -765,3 +765,13 @@ #define STRICT_SYSV_CURSES /* Don't use ncurses extensions */ #endif =20 +/* Define if you want to have inline documentation. */ +#undef WITH_DOC_STRINGS + +/* Define macro for inline documentation. */ +#ifdef WITH_DOC_STRINGS +#define Py_DOCSTR(x) x +#else +#define Py_DOCSTR(x) "" +#endif + --- Python-2.2.orig/configure.in Wed Nov 14 17:54:31 2001 +++ Python-2.2/configure.in Wed Nov 14 19:20:07 2001 @@ -1305,6 +1305,20 @@ fi AC_MSG_RESULT($with_cycle_gc) =20 +# Check for --with-doc-strings +AC_MSG_CHECKING(for --with-doc-strings) +AC_ARG_WITH(doc-strings, +[ --with(out)-doc-strings disable/enable documentation strings]) + +if test -z "$with_doc_strings" +then with_doc_strings=3D"yes" +fi +if test "$with_doc_strings" !=3D "no" +then + AC_DEFINE(WITH_DOC_STRINGS) +fi +AC_MSG_RESULT($with_doc_strings) + # Check for Python-specific malloc support AC_MSG_CHECKING(for --with-pymalloc) AC_ARG_WITH(pymalloc, ----- End forwarded message ----- --=20 Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] --VywGB/WGlW4DM4P8 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQE8PjaEIlOymmZkOgwRAkI2AKCMbpUhatkpfUqIkDOzjBXKCBwUPQCcCIcH qkLKz+rldrQBaoU7c5G23V4= =v6Cf -----END PGP SIGNATURE----- --VywGB/WGlW4DM4P8-- From martin@v.loewis.de Fri Jan 11 13:34:21 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Fri, 11 Jan 2002 14:34:21 +0100 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <20020110224908.C884@ibook.distro.conectiva> (message from Gustavo Niemeyer on Thu, 10 Jan 2002 22:49:08 -0200) References: <20020110224908.C884@ibook.distro.conectiva> Message-ID: <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> > Now that 2.2 is history (well, kind of ;-), would it be the time to > think about this again? By "consideration early in 2.3's life cycle", the OP probably meant that a patch should be posted to SF. Are you willing to implement the complete change (i.e. create a patch that changes each and every source file)? If so, please post one to SF. You may want to start this slowly, first creating only the infrastructure and touching a single file (say, stringobject.c) I'd personally like to see opportunities for more magic used. E.g. in a compiler that uses sections, putting all doc strings into a single section might be desirable. They will be a contiguous fragment of the python executable, which helps on demand-paged systems to reduce the startup time. Going further, it might be possible to strip off "unused sections" from the binary after it has been linked, deferring the choice of doc string presence to the installation time. For that to work, we'd first need to know what compilers offer what syntax to implement such magic, then generalize it to the right macro. If that is a desirable goal, I'd be willing to investigate how to achieve things with gcc, on ELF systems. Regards, Martin From skip@pobox.com Fri Jan 11 14:15:27 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 11 Jan 2002 08:15:27 -0600 Subject: [Python-Dev] PEP 100 references & wording Message-ID: <15422.62335.99150.451320@12-248-41-177.client.attbi.com> I just noticed that PEP 100 (Python/Unicode integration) references http://starship.python.net/~lemburg/unicode-proposal.txt as the latest version. Sure enough, I visited that and found that it's newer than the PEP (1.8 v. 1.7). Shouldn't the PEP be the most up-to-date public document? The comment right after that suggests this should be so: [ed. note: new revisions should be made to this PEP document, while the historical record previous to version 1.7 should be retrieved from MAL's url, or Misc/unicode.txt] Since this is now an informational PEP, I believe the wording should change to reflect functionality that has already been implemented. For instance, instead of Python should provide a built-in constructor for Unicode strings which is available through __builtins__: it should read Python provides a built-in constructor for Unicode strings which is available through __builtins__: Skip From niemeyer@conectiva.com Fri Jan 11 14:21:05 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Fri, 11 Jan 2002 12:21:05 -0200 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> Message-ID: <20020111122105.B1808@ibook.distro.conectiva> --E39vaYmALEf/7YXx Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi Martin! > > Now that 2.2 is history (well, kind of ;-), would it be the time to > > think about this again? >=20 > By "consideration early in 2.3's life cycle", the OP probably meant > that a patch should be posted to SF. 
Are you willing to implement the > complete change (i.e. create a patch that changes each and every > source file)? If so, please post one to SF. You may want to start this > slowly, first creating only the infrastructure and touching a single > file (say, stringobject.c) Yes, I'm going to implement it. I'd just like to know if there was interest in the patch. Implementing it slowly looks like a nice idea as well. I'll post a patch there. Thanks! > I'd personally like to see opportunities for more magic used. E.g. in > a compiler that uses sections, putting all doc strings into a single > section might be desirable. They will be a contiguous fragment of the > python executable, which helps on demand-paged systems to reduce the > startup time. Going further, it might be possible to strip off "unused > sections" from the binary after it has been linked, deferring the > choice of doc string presence to the installation time. Interesting. I know it's possible to discard a session. OTOH, I don't know what happens if somebody refer to discarded data. I'll have a look at this. > For that to work, we'd first need to know what compilers offer what > syntax to implement such magic, then generalize it to the right macro. > If that is a desirable goal, I'd be willing to investigate how to > achieve things with gcc, on ELF systems. This is something pretty easy with gcc. When reading your email, I remembered that the kernel uses this magic to discard a session with code used just when initializing. Looking in the kernel code, I found out this in include/linux/init.h: /* * Mark functions and data as being only used at initialization * or exit time. */ #define __init __attribute__ ((__section__ (".text.init"))) #define __exit __attribute__ ((unused, __section__(".text.exit"))) #define __initdata __attribute__ ((__section__ (".data.init"))) #define __exitdata __attribute__ ((unused, __section__ (".data.exit"))) #define __initsetup __attribute__ ((unused,__section__ (".setup.init"))) #define __init_call __attribute__ ((unused,__section__ (".initcall.init"))) #define __exit_call __attribute__ ((unused,__section__ (".exitcall.exit"))) After surrounding doc strings with a macro, this will be easy to achieve. Thanks! --=20 Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] --E39vaYmALEf/7YXx Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQE8PvTQIlOymmZkOgwRAh/yAJ0c5tWHiOhhr0tk6tmic8Zi1JmsigCePNIZ 7LNVHje7zOlwEfAZ9rYkbw4= =hI95 -----END PGP SIGNATURE----- --E39vaYmALEf/7YXx-- From gward@python.net Fri Jan 11 14:49:29 2002 From: gward@python.net (Greg Ward) Date: Fri, 11 Jan 2002 09:49:29 -0500 Subject: [Python-Dev] Change in unpickle order in 2.2? In-Reply-To: <3C3DE52A.83AE0B26@lemburg.com> References: <20020110180452.GA11414@mems-exchange.org> <3C3DE52A.83AE0B26@lemburg.com> Message-ID: <20020111144929.GA13139@gerg.ca> On 10 January 2002, M.-A. Lemburg said: > What's Grouch ? Grouch is a system for 1) describing a Python object schema, and 2) traversing an existing object graph (eg. a pickle or ZODB) to ensure that it conforms to that object schema. An object schema is a collection of classes (including the attributes in each class and the type of each attribute), atomic types, and type aliases. An atomic type is a type with no sub-types; by default every Grouch schema has five atomic types: int, string, long, complex, and float. 
You can easily add new atomic types, eg. the MEMS Exchange virtual fab has mxDateTime as an atomic type. A type alias is just what it sounds like, eg. "Foo" might be an alias for "foo.Foo" (a fully qualified class name representing a Grouch instance type), and "real" might be an alias for "int|long|float" (a Grouch union type). See http://www.mems-exchange.org/software/grouch/ Anyways, that's not terribly relevant, but it gives me an excuse to plug my most arcane and (IMHO) interesting Python hack. [me] > (The reason for the pickle-time intervention is that Grouch stores type > objects in its data structure, and you can't pickle type objects. So it > hangs on to a representive value of the type for pickling -- eg. for the > "integer" type, it keeps both IntType and 0 in memory, but only pickles > 0, and uses type(0) to get IntType back at unpickle time.) [MAL] > Why don't you use a special reduce function which takes the > tp_name as index into the types module ? Storing strings should > avoid all complicated type object saving. I'm not sure I understand what you're saying. Are you just suggesting that, when I need to pickle IntType, I pickle the string "int" instead of the integer 0? I don't see how that makes any difference: I still need to intercede at pickle/unpickle time to make this happen. Also, the fact that type(x).__name__ is not consistent across Python versions or implementations (Jython) screws this up. Grouch now has its own canonical set of type names because of this, and I could easily reverse that dictionary to make a typename->typeobject mapping. But I don't see how pickling "int" is a win over pickling 0, when what I *really* want to do is pickle IntType. > You should probably first check wether the pickle string is > identical in 2.1 and 2.2 and then go on from there. Excellent idea -- thanks! Greg -- Greg Ward - nerd gward@python.net http://starship.python.net/~gward/ "Eine volk, eine reich, eine führer" --Hitler "One world, one web, one program" --Microsoft From skip@pobox.com Fri Jan 11 14:49:33 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 11 Jan 2002 08:49:33 -0600 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <20020111122105.B1808@ibook.distro.conectiva> References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> Message-ID: <15422.64381.344299.899663@12-248-41-177.client.attbi.com> Gustavo> Yes, I'm going to implement it. I'd just like to know if there Gustavo> was interest in the patch. Implementing it slowly looks like a Gustavo> nice idea as well. I'll post a patch there. Thanks! Gustavo, I recommend you do the whole patch thing through SourceForge. Just post a link to your patch to python-dev. Skip From gward@python.net Fri Jan 11 14:52:06 2002 From: gward@python.net (Greg Ward) Date: Fri, 11 Jan 2002 09:52:06 -0500 Subject: [Python-Dev] Change in unpickle order in 2.2? In-Reply-To: <200201110508.AAA07290@cj20424-a.reston1.va.home.com> References: <20020110180452.GA11414@mems-exchange.org> <200201110508.AAA07290@cj20424-a.reston1.va.home.com> Message-ID: <20020111145206.GB13139@gerg.ca> [me] > I have an application (Grouch) that has to do a lot of trickery at > pickle-time and unpickle-time, and as a result it happens to be > sensitive to the order of unpickling. [...] > Anyone got a vague, hand-waving explanation for my vague, hand-waving > complaint? Or should I try to come up with a test case? 
[Guido] > Yes please, and post it to SourceForge. There aren't that many > changes in the source of pickle.py since release 2.1. (Or are you > using cPickle? If so, please say so. The two aren't 100% > equivalent.) Tried it with both pickle and cPickle, with the same result (ie. one of my test cases failed with the exact same traceback, apparently for the same reason). I'll see if I can't reduce this to something that doesn't rely on 1500 hairy lines of Grouch code. (Only fitting that something named for Oscar the Grouch is hairy, eh?) Greg -- Greg Ward - Linux weenie gward@python.net http://starship.python.net/~gward/ A man without religion is like a fish without a bicycle. From mal@lemburg.com Fri Jan 11 15:03:27 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jan 2002 16:03:27 +0100 Subject: [Python-Dev] PEP 100 references & wording References: <15422.62335.99150.451320@12-248-41-177.client.attbi.com> Message-ID: <3C3EFEBF.48D925C6@lemburg.com> Skip Montanaro wrote: > > I just noticed that PEP 100 (Python/Unicode integration) references > > http://starship.python.net/~lemburg/unicode-proposal.txt > > as the latest version. Sure enough, I visited that and found that it's > newer than the PEP (1.8 v. 1.7). True. I'm not sure why the above file is 1.8 and the CVS PEP at 1.7. I guess I forgot to update the PEP. FYI, here's adiff between the 1.7 and 1.8 version: --- unicode-proposal-1.7.txt Tue Oct 17 17:38:40 2000 +++ unicode-proposal.txt Tue Oct 17 17:38:40 2000 @@ -1,7 +1,7 @@ ============================================================================= - Python Unicode Integration Proposal Version: 1.7 + Python Unicode Integration Proposal Version: 1.8 ----------------------------------------------------------------------------- Introduction: ------------- @@ -612,11 +612,11 @@ Case Conversion: ---------------- Case conversion is rather complicated with Unicode data, since there are many different conditions to respect. See - http://www.unicode.org/unicode/reports/tr13/ + http://www.unicode.org/unicode/reports/tr21/ for some guidelines on implementing case conversion. For Python, we should only implement the 1-1 conversions included in Unicode. Locale dependent and other special case conversions (see the @@ -631,11 +631,15 @@ possible. Line Breaks: ------------ Line breaking should be done for all Unicode characters having the B property as well as the combinations CRLF, CR, LF (interpreted in that -order) and other special line separators defined by the standard. +order) and other special line separators defined by the standard. See + + http://www.unicode.org/unicode/reports/tr13/ + +for some guidelines on implementing line breaks and newline handling. The Unicode type should provide a .splitlines() method which returns a list of lines according to the above specification. See Unicode Methods. @@ -1010,11 +1014,11 @@ Unicode 3.0: Unicode-TechReports: http://www.unicode.org/unicode/reports/techreports.html Unicode-Mappings: - ftp://ftp.unicode.org/Public/MAPPINGS/ + http://www.unicode.org/Public/MAPPINGS/ Introduction to Unicode (a little outdated by still nice to read): http://www.nada.kth.se/i18n/ucs/unicode-iso10646-oview.html For comparison: @@ -1047,10 +1051,11 @@ Encodings: http://www.uazone.com/multiling/unicode/wg2n1035.html History of this Proposal: ------------------------- +1.8: Fixed some URLs to the unicode.org site. 1.7: Added note about the changed behaviour of "s#". 1.6: Changed to since this is the name used in the implementation. 
Added notes about the usage of in the buffer protocol implementation. 1.5: Added notes about setting the . Fixed some > Shouldn't the PEP be the most up-to-date public document? The comment right > after that suggests this should be so: > > [ed. note: new revisions should be made to this PEP document, while the > historical record previous to version 1.7 should be retrieved from > MAL's url, or Misc/unicode.txt] > > Since this is now an informational PEP, I believe the wording should change > to reflect functionality that has already been implemented. For instance, > instead of > > Python should provide a built-in constructor for Unicode strings which > is available through __builtins__: > > it should read > > Python provides a built-in constructor for Unicode strings which is > available through __builtins__: True again; I just didn't find time to rewrite these bits. The PEP is basically a reformatted proposal. That's where the "should" wording originates from. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Fri Jan 11 15:11:59 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jan 2002 16:11:59 +0100 Subject: [Python-Dev] Change in unpickle order in 2.2? References: <20020110180452.GA11414@mems-exchange.org> <3C3DE52A.83AE0B26@lemburg.com> <20020111144929.GA13139@gerg.ca> Message-ID: <3C3F00BF.BC3252D3@lemburg.com> Greg Ward wrote: > > On 10 January 2002, M.-A. Lemburg said: > > What's Grouch ? > > [Grouch is a system for 1) describing a Python object schema, and 2) > traversing an existing object graph (eg. a pickle or ZODB) to ensure > that it conforms to that object schema.] Sounds very interesting :-) > [me] > > (The reason for the pickle-time intervention is that Grouch stores type > > objects in its data structure, and you can't pickle type objects. So it > > hangs on to a representive value of the type for pickling -- eg. for the > > "integer" type, it keeps both IntType and 0 in memory, but only pickles > > 0, and uses type(0) to get IntType back at unpickle time.) > > [MAL] > > Why don't you use a special reduce function which takes the > > tp_name as index into the types module ? Storing strings should > > avoid all complicated type object saving. > > I'm not sure I understand what you're saying. Are you just suggesting > that, when I need to pickle IntType, I pickle the string "int" instead > of the integer 0? Right. It needn't be 'int', any string will do as long as you have a mapping from strings to type objects. > I don't see how that makes any difference: I still > need to intercede at pickle/unpickle time to make this happen. Well, I suppose with the new Python 2.2 version you could add a special __reduce__ method to type objects which takes of this for you. For older versions, you should probably register a pickle handler for type objects which does the same. Pickle should then use this handler for pickling the type object. > Also, the fact that type(x).__name__ is not consistent across Python > versions or implementations (Jython) screws this up. Grouch now has its > own canonical set of type names because of this, and I could easily > reverse that dictionary to make a typename->typeobject mapping. But I > don't see how pickling "int" is a win over pickling 0, when what I > *really* want to do is pickle IntType. 
True, but it saves you the trouble of storing global references to the type constructors in the pickle. Your system will do the mapping using the above hooks. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Fri Jan 11 15:54:17 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 11 Jan 2002 10:54:17 -0500 Subject: [Python-Dev] PEP 100 references & wording In-Reply-To: Your message of "Fri, 11 Jan 2002 16:03:27 +0100." <3C3EFEBF.48D925C6@lemburg.com> References: <15422.62335.99150.451320@12-248-41-177.client.attbi.com> <3C3EFEBF.48D925C6@lemburg.com> Message-ID: <200201111554.KAA12727@cj20424-a.reston1.va.home.com> Marc, can you update PEP 100? You might want to retire the starship URL and use the PEP URL as the official location. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Fri Jan 11 16:40:46 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 11 Jan 2002 17:40:46 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects References: <20020110230223.13ABDE8451@oratrix.oratrix.nl> Message-ID: <08f501c19abe$d6785a80$e000a8c0@thomasnotebook> From: "Jack Jansen" > > Recently, "Thomas Heller" said: > > Here's an outline which could work in 2.2: > > This sounds very good! There's only one thing you'll have to explain > to me: how would this work from C? My types are all in C, not in > Python, so I'd need to do the magic in C. Where do I find examples of > using metatypes from C? > I don't know of any, well, except ceval.c build_class(): result = PyObject_CallFunction(metaclass, "OOO", name, bases, methods); I had no need for this, because I'm very happy to write base classes/types in C, extend them by deriving subtypes from them in Python, and plugging everything together in Python. Thomas From mal@lemburg.com Fri Jan 11 17:46:06 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jan 2002 18:46:06 +0100 Subject: [Python-Dev] PEP 100 references & wording References: <15422.62335.99150.451320@12-248-41-177.client.attbi.com> <3C3EFEBF.48D925C6@lemburg.com> <200201111554.KAA12727@cj20424-a.reston1.va.home.com> Message-ID: <3C3F24DE.92586A6C@lemburg.com> Guido van Rossum wrote: > > Marc, can you update PEP 100? > > You might want to retire the starship URL and use the PEP URL as the > official location. Will do, but it might take a week or two. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Fri Jan 11 18:06:04 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jan 2002 19:06:04 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects References: <20020110230223.13ABDE8451@oratrix.oratrix.nl> Message-ID: <3C3F298C.4BFB5C66@lemburg.com> [Metatypes, callbacks, etc.] Wouldn't it be *much* easier to just use the copyreg/pickle API/protocol for dealing with all this ? AFAICTL, the actions needed by Jack are very similar to what pickle et al. do, and we already have all that in Python -- it's just not exposed too well at C level. 
Example: PyArg_ParseTuple(args, "O@", &factory, &tuple) would return a factory function and a tuple storing the data of the object passed to the function while Py_BuildValue("O@", factory, tuple) would simply call factory with tuple and use the return value as object. (Note that void* can be wrapped into PyCObjects for "use" in Python.) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From andymac@bullseye.apana.org.au Fri Jan 11 09:32:32 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Fri, 11 Jan 2002 20:32:32 +1100 (EDT) Subject: [Python-Dev] eval() slowdown in 2.2 on MacOS X? In-Reply-To: <20020110194218.GA3810@ute.mems-exchange.org> Message-ID: On Thu, 10 Jan 2002, Andrew Kuchling wrote: > On Tue, Jan 08, 2002 at 01:41:37PM -0500, Tim Peters wrote: > >Break it into smaller steps so we can narrow down possible causes: > > You should have cc'ed Barbara on that. I forwarded your message to her > and she wrote back (eventually): > > >BTW, I forgot to pass this on yesterday, but I tried the code in Tim Peters' > >e-mail yesterday and the delay happens during the code = compile(...) > >statement. > > She's going to install sshd on her machine, so maybe this weekend I'll > be able to log in, compile Python from source, and poke around in an > effort to figure out what's going on. IMHO, Barbara's problem is almost certainly related to the system malloc(), and if that is the case the only effective antidote is pymalloc. However, just enabling WITH_PYMALLOC isn't enough as its currently only configured to be used for object allocation and doesn't help the parser. I did attach a patch enabling pymalloc for all interpreter memory management to a long post to python-dev (which AMK might recall this from his python-dev summaries) about my research into test_longexp problems with the OS/2+EMX port. My research revealed that test_longexp causes the parser to go ballistic with small mallocs. While pymalloc solved the test_longexp problem, using it for all interpreter memory management caused about a 60% performance hit (on OS/2 + EMX). On OS/2 the problem appeared to be overallocation (allocating 3-4x as much memory as actually requested), but I recall reading a thread on python-list wherein people reported system malloc()s that attempt to coalesce blocks which had a similar slowdown effect in another set of circumstances (I don't recall the details - might have been related to dictionaries?). Although OS/X's BSD heritage goes back to FreeBSD 3.2, I wouldn't have expected either sort of problem from that source as I've had none of these problems with my own FreeBSD systems - in fact last night I ran the test suite on a CVS derived build on a 486/100 FreeBSD 4.4 system with only 32MB of RAM and 128MB of swap and test_longexp passed & in only minutes (the whole test suite took ~25-30 mins for the first pass, ie without .pyc files). OS/X may have acquired a malloc() of different heritage though. I did keep a copy of the instrumented malloc() output from my OS/2 research if you're interested, although it probably isn't as helpful as it could be (Python 2.0 vintage)... Likewise my crude debug malloc wrapper... and I might be able to dig up my pymalloc_for_all patch... -- Andrew I MacIntyre "These thoughts are mine alone..." 
E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From thomas.heller@ion-tof.com Fri Jan 11 20:40:56 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 11 Jan 2002 21:40:56 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects References: <20020110230223.13ABDE8451@oratrix.oratrix.nl> <3C3F298C.4BFB5C66@lemburg.com> Message-ID: <0a6201c19ae0$63d379c0$e000a8c0@thomasnotebook> From: "M.-A. Lemburg" > [Metatypes, callbacks, etc.] > > Wouldn't it be *much* easier to just use the copyreg/pickle > API/protocol for dealing with all this ? > I *don't* think it's complicated (once you get used to metatypes). > AFAICTL, the actions needed by Jack are very similar to what > pickle et al. do, and we already have all that in Python -- > it's just not exposed too well at C level. > > Example: > > PyArg_ParseTuple(args, "O@", &factory, &tuple) would > return a factory function and a tuple storing the data of > the object passed to the function > > while > > Py_BuildValue("O@", factory, tuple) would simply call factory > with tuple and use the return value as object. > > (Note that void* can be wrapped into PyCObjects for "use" in > Python.) > I'm not sure we talk about the same thing: we (at least me) do not want to serialize and reconstruct objects (what pickle does), we want to convert objects from Python to C (convert them to parameters usable in C API-calls), and back (convert them from handles, pointers, whatever into Python objects) having only the Python *type* object available in the latter case. Or am I missing something? Thomas From thomas.heller@ion-tof.com Fri Jan 11 20:58:45 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 11 Jan 2002 21:58:45 +0100 Subject: [Python-Dev] 2.2c1 test on windows - ok Message-ID: <0ace01c19ae2$da29e670$e000a8c0@thomasnotebook> installed from the windows installer - everything seems to be ok: 117 tests OK. 23 tests skipped: test_al test_cd test_cl test_crypt test_dbm test_dl test_fcntl test_fork1 test_gdbm test_gl test_grp t est_imgfile test_largefile test_linuxaudiodev test_nis test_openpty test_poll test_pty test_pwd test_signal test_sockets erver test_sunaudiodev test_timing Thomas From fdrake@acm.org Fri Jan 11 21:05:42 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 11 Jan 2002 16:05:42 -0500 (EST) Subject: [Python-Dev] 2.2c1 test on windows - ok In-Reply-To: <0ace01c19ae2$da29e670$e000a8c0@thomasnotebook> References: <0ace01c19ae2$da29e670$e000a8c0@thomasnotebook> Message-ID: <15423.21414.908979.772899@grendel.zope.com> Thomas Heller writes: > installed from the windows installer - everything seems to be ok: I presume you meant 2.1.2c1 ??? ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From thomas.heller@ion-tof.com Fri Jan 11 21:14:27 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 11 Jan 2002 22:14:27 +0100 Subject: [Python-Dev] 2.2c1 test on windows - ok References: <0ace01c19ae2$da29e670$e000a8c0@thomasnotebook> <15423.21414.908979.772899@grendel.zope.com> Message-ID: <0b3201c19ae5$11be0a60$e000a8c0@thomasnotebook> From: "Fred L. Drake, Jr." > > Thomas Heller writes: > > installed from the windows installer - everything seems to be ok: > > I presume you meant 2.1.2c1 ??? ;-) > Of course. Sorry. 
Thomas From tim.one@home.com Fri Jan 11 21:25:04 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 11 Jan 2002 16:25:04 -0500 Subject: [Python-Dev] 2.2c1 test on windows - ok In-Reply-To: <0ace01c19ae2$da29e670$e000a8c0@thomasnotebook> Message-ID: > installed from the windows installer - everything seems to be ok: Which flavor of Windows? If NT or 2000 or XP, did you install with or without admin privs? > 117 tests OK. > 23 tests skipped: test_al test_cd test_cl test_crypt test_dbm > test_dl test_fcntl test_fork1 test_gdbm test_gl test_grp t > est_imgfile test_largefile test_linuxaudiodev test_nis > test_openpty test_poll test_pty test_pwd test_signal test_sockets > erver test_sunaudiodev test_timing Unfortunately, the code in regrtest.py to format this list to fit screen width, and to say which skips are *expected* on win32, was new in 2.2, so ineligible for inclusion in a 2.1 bugfix release. From mal@lemburg.com Fri Jan 11 22:08:54 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jan 2002 23:08:54 +0100 Subject: [Python-Dev] Feature request: better support for "wrapper" objects References: <20020110230223.13ABDE8451@oratrix.oratrix.nl> <3C3F298C.4BFB5C66@lemburg.com> <0a6201c19ae0$63d379c0$e000a8c0@thomasnotebook> Message-ID: <3C3F6276.371F12A5@lemburg.com> Thomas Heller wrote: > > From: "M.-A. Lemburg" > > [Metatypes, callbacks, etc.] > > > > Wouldn't it be *much* easier to just use the copyreg/pickle > > API/protocol for dealing with all this ? > > I *don't* think it's complicated (once you get used to metatypes). I hear heads exploding already :-) > > AFAICTL, the actions needed by Jack are very similar to what > > pickle et al. do, and we already have all that in Python -- > > it's just not exposed too well at C level. > > > > Example: > > > > PyArg_ParseTuple(args, "O@", &factory, &tuple) would > > return a factory function and a tuple storing the data of > > the object passed to the function > > > > while > > > > Py_BuildValue("O@", factory, tuple) would simply call factory > > with tuple and use the return value as object. > > > > (Note that void* can be wrapped into PyCObjects for "use" in > > Python.) > > > I'm not sure we talk about the same thing: we (at least me) do not > want to serialize and reconstruct objects (what pickle does), > we want to convert objects from Python to C (convert them to parameters > usable in C API-calls), and back (convert them from handles, pointers, > whatever into Python objects) having only the Python *type* object > available in the latter case. > > Or am I missing something? I'm not really talking about serializing in the pickle sense (with the intent of storing the data as string), it's more about providing a way to recreate an object within the same process: given an object x, provide a factory function f and a tuple args such that x == apply(f, args). Now, the object Jack has in mind wrap C pointers, so the args would be a tuple containing one PyCObject. Getting the pointer out of a PyCObject is really easy and by using a tuple as intermediate storage form, you can also support more complex objects, e.g. objects wrapping more than one pointer or value. After you have accessed the internal values, possibily calculating new ones, you can then contruct a tuple, pass it to the factory and return the same type of input object as you received no input. Since the API would be fixed, helper functions could be added to make all this really easy at C level. 
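For illustration only (helper names invented, not an agreed API), such helpers could be as small as this, with the C pointer travelling inside the state tuple as a PyCObject:

    #include "Python.h"

    /* Wrap a C pointer so it can be carried in the state tuple. */
    static PyObject *
    wrap_pointer(void *ptr)
    {
        return PyCObject_FromVoidPtr(ptr, NULL);
    }

    /* The Py_BuildValue("O@", factory, tuple) case: call the factory
       with the state tuple and return whatever object it produces. */
    static PyObject *
    rebuild_object(PyObject *factory, PyObject *state)
    {
        if (!PyTuple_Check(state)) {
            PyErr_SetString(PyExc_TypeError, "state must be a tuple");
            return NULL;
        }
        return PyObject_CallObject(factory, state);
    }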
The fact that a registry similar to copyreg or a new method on the input object is used to contruct the factory function and the tuple, this mechanism can easily be extended in Python as well as C. Furthermore, the existing pickle mechanisms could be reused for the existing objects, since most of these use very reasonable state tuples for storing the object state. I'm just suggesting this to make the whole wrapper idea more flexible. One C void* pointer is really only useful for very simple objects. The above easily extends to complex objects such as e.g. mxDateTime objects. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From gward@python.net Fri Jan 11 22:08:44 2002 From: gward@python.net (Greg Ward) Date: Fri, 11 Jan 2002 17:08:44 -0500 Subject: [Python-Dev] Change in unpickle order in 2.2? In-Reply-To: <20020111145206.GB13139@gerg.ca> References: <20020110180452.GA11414@mems-exchange.org> <200201110508.AAA07290@cj20424-a.reston1.va.home.com> <20020111145206.GB13139@gerg.ca> Message-ID: <20020111220844.GA14753@gerg.ca> [me] > I have an application (Grouch) that has to do a lot of trickery at > pickle-time and unpickle-time, and as a result it happens to be > sensitive to the order of unpickling. [...] > Anyone got a vague, hand-waving explanation for my vague, hand-waving > complaint? Or should I try to come up with a test case? > [Guido] > Yes please, and post it to SourceForge. There aren't that many > changes in the source of pickle.py since release 2.1. (Or are you > using cPickle? If so, please say so. The two aren't 100% > equivalent.) False alarm. It appears that a change in dictionary order bit me; I was lucky that pickling Grouch objects ever worked at all. Lesson: when the code to support pickling is too complex too understand, it's too complex. Hmmm, that might have broader application. ;-) Greg -- Greg Ward - Linux geek gward@python.net http://starship.python.net/~gward/ Time flies like an arrow; fruit flies like a banana. From tim.one@home.com Fri Jan 11 22:46:34 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 11 Jan 2002 17:46:34 -0500 Subject: [Python-Dev] Change in unpickle order in 2.2? In-Reply-To: <20020111220844.GA14753@gerg.ca> Message-ID: [Greg Ward] > False alarm. It appears that a change in dictionary order bit me; I was > lucky that pickling Grouch objects ever worked at all. You were luckier we changed dict iteration order for your own good . > Lesson: when the code to support pickling is too complex too understand, > it's too complex. Hmmm, that might have broader application. ;-) No, I'm sure Zope Corporation would officially deny, denounce and decry any intimation that convolution in support of pickling is a vice. The true problem is more likely that you haven't yet added enough layers of abstraction around your pickling code. I'm especially suspicious of that because you were able to figure out the cause of the problem in less than a week ... From jason-dated-1011480852.3d2412@mastaler.com Fri Jan 11 22:54:10 2002 From: jason-dated-1011480852.3d2412@mastaler.com (Jason R. Mastaler) Date: Fri, 11 Jan 2002 15:54:10 -0700 Subject: [Python-Dev] sourceforge: where should feature requests go? Message-ID: I noticed that the sourceforge tracker has a "Feature Requests" category, but that "Bugs" also has a "Feature Request" group. Which is the right place to submit new feature requests? 
From tim.one@home.com Fri Jan 11 23:04:11 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 11 Jan 2002 18:04:11 -0500 Subject: [Python-Dev] sourceforge: where should feature requests go? In-Reply-To: Message-ID: [Jason R. Mastaler] > I noticed that the sourceforge tracker has a "Feature Requests" > category, but that "Bugs" also has a "Feature Request" group. > > Which is the right place to submit new feature requests? To the FR tracker. That didn't always exist, and all FRs ended up in the Bug tracker instead, so we added a FR group to Bugs to try to keep track of them. Unfortunately, once you add a group to an SF tracker, it can never be removed, so this confusion won't go away. Thanks for asking! From jason-dated-1011481811.c2564e@mastaler.com Fri Jan 11 23:10:09 2002 From: jason-dated-1011481811.c2564e@mastaler.com (Jason R. Mastaler) Date: Fri, 11 Jan 2002 16:10:09 -0700 Subject: [Python-Dev] sourceforge: where should feature requests go? In-Reply-To: ("Tim Peters"'s message of "Fri, 11 Jan 2002 18:04:11 -0500") References: Message-ID: "Tim Peters" writes: > To the FR tracker. That didn't always exist, and all FRs ended up > in the Bug tracker instead, so we added a FR group to Bugs to try to > keep track of them. OK. My next question is: when a new item is submitted to the FR tracker, does anyone get notice of it, or does it lie until one of you happens to stumble across it? In other words, should a new request be accompanied by an e-mail somewhere? Thanks. -- (TMDA (http://tmda.sourceforge.net/) (UCE intrusion prevention in Python) From tim.one@home.com Fri Jan 11 23:21:48 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 11 Jan 2002 18:21:48 -0500 Subject: [Python-Dev] sourceforge: where should feature requests go? In-Reply-To: Message-ID: [Jason R. Mastaler] > OK. My next question is: when a new item is submitted to the FR > tracker, does anyone get notice of it, or does it lie until one of > you happens to stumble across it? In other words, should a new > request be accompanied by an e-mail somewhere? All new items and changes to the FR tracker are automatically emailed to python-bugs-list@python.org. Whether anyone besides me is subscribed to that list is a question I can't answer . Fair warning: feature requests usually don't go anywhere unless somebody volunteers a patch (comprising code, doc, and test suite changes) implementing the request. If you're not in a position to do that yourself, it can be helpful to discuss what you want on comp.lang.python too (in the hope that somebody else gets inspired to do it). From jason-dated-1011483144.4be859@mastaler.com Fri Jan 11 23:32:22 2002 From: jason-dated-1011483144.4be859@mastaler.com (Jason R. Mastaler) Date: Fri, 11 Jan 2002 16:32:22 -0700 Subject: [Python-Dev] sourceforge: where should feature requests go? In-Reply-To: ("Tim Peters"'s message of "Fri, 11 Jan 2002 18:21:48 -0500") References: Message-ID: "Tim Peters" writes: > Fair warning: feature requests usually don't go anywhere unless > somebody volunteers a patch (comprising code, doc, and test suite > changes) implementing the request. Understandable. > If you're not in a position to do that yourself, it can be helpful > to discuss what you want on comp.lang.python too (in the hope that > somebody else gets inspired to do it). I do have a FR (#499529) that hasn't gone anywhere, but I did volunteer a patch, I just had a question about method naming that needed answering, so I haven't attached anything yet. 
And of course, you might not even be interested in including such a feature which was another reason to hold off. -- (TMDA (http://tmda.sourceforge.net/) (UCE intrusion prevention in Python) From martin@v.loewis.de Fri Jan 11 23:47:32 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sat, 12 Jan 2002 00:47:32 +0100 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <20020111122105.B1808@ibook.distro.conectiva> (message from Gustavo Niemeyer on Fri, 11 Jan 2002 12:21:05 -0200) References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> Message-ID: <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> > #define __init __attribute__ ((__section__ (".text.init"))) [...] > After surrounding doc strings with a macro, this will be easy to achieve. Unfortunately, not with the doc string you propose. Apparently, your macro is going to be used as char foo__doc__[] = Py_DocString("this is foo"); However, with the attribute, the resulting code should read char foo__doc__[] __attribute__((__section__("docstring")) = "this is foo"; You cannot define the macro so that it comes out as expanding to __attribute__, atleast not with that specific macro. Regards, Martin From martin@v.loewis.de Sat Jan 12 00:26:28 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sat, 12 Jan 2002 01:26:28 +0100 Subject: [Python-Dev] sourceforge: where should feature requests go? In-Reply-To: (jason-dated-1011480852.3d2412@mastaler.com) References: Message-ID: <200201120026.g0C0QSp01639@mira.informatik.hu-berlin.de> > I noticed that the sourceforge tracker has a "Feature Requests" > category, but that "Bugs" also has a "Feature Request" group. > > Which is the right place to submit new feature requests? Please use the separate tracker. The Bugs category predates the separate tracker, but it cannot be removed (unfortunately). Regards, Martin From martin@v.loewis.de Sat Jan 12 00:29:19 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sat, 12 Jan 2002 01:29:19 +0100 Subject: [Python-Dev] sourceforge: where should feature requests go? In-Reply-To: References: Message-ID: <200201120029.g0C0TJZ01656@mira.informatik.hu-berlin.de> > All new items and changes to the FR tracker are automatically emailed to > python-bugs-list@python.org. Whether anyone besides me is subscribed to > that list is a question I can't answer . Let's assume you really don't know, for a moment: the subscriber list is at http://mail.python.org/mailman/roster/python-bugs-list Regards, Martin From Anthony Baxter Sat Jan 12 00:59:46 2002 From: Anthony Baxter (Anthony Baxter) Date: Sat, 12 Jan 2002 11:59:46 +1100 Subject: [Python-Dev] Re: Q: Testing Python 2.1.2 on cygwin In-Reply-To: Message from Paul Everitt of "Fri, 11 Jan 2002 06:42:18 CDT." <3C3ECF9A.5030301@zope.com> Message-ID: <200201120059.g0C0xkC09812@mbuna.arbhome.com.au> Paul Everitt tested on cygwin, and make test got: > test_fork1 > test test_fork1 crashed -- exceptions.OSError: [Errno 11] Resource temporaril y unavailable > test_popen2 > test test_popen2 crashed -- exceptions.OSError: [Errno 11] Resource temporari ly unavailable Looks to me like cygwin's fork() support is busted. 
In addition, the build of curses failed: > build/temp.cygwin_nt-5.0-1.3.6-i686-2.1/_cursesmodule.o:/cygdrive/c/data/tmp/ Python-2.1.2c1/Modules/_cursesmodule.c:1808: more undefined references to `acs_ map' follow > collect2: ld returned 1 exit status > c:\data\tmp\Python-2.1.2c1\python.exe: *** unable to remap C:\apps\cygwin\bin \cygssl.dll to same address as parent -- 0x1A2E0000 > 0 [main] python 964 sync_with_child: child 304(0x170) died before initi alization with status code 0x1 > 38927 [main] python 964 sync_with_child: *** child state child loading dlls > c:\data\tmp\Python-2.1.2c1\python.exe: *** unable to remap C:\apps\cygwin\bin \cygssl.dll to same address as parent -- 0x1A2E0000 > 41469381 [main] python 964 sync_with_child: child 940(0x1E0) died before init ialization with status code 0x1 > 41526514 [main] python 964 sync_with_child: *** child state child loading dll s > c:\data\tmp\Python-2.1.2c1\python.exe: *** unable to remap C:\apps\cygwin\bin \cygssl.dll to same address as parent -- 0x1A2E0000 > 113407858 [main] python 964 sync_with_child: child 1396(0x2F0) died before in itialization with status code 0x1 > 113447685 [main] python 964 sync_with_child: *** child state child loading dl ls > make: [test] Error 58 (ignored) -- Anthony Baxter It's never too late to have a happy childhood. From tim.one@home.com Sat Jan 12 02:32:19 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 11 Jan 2002 21:32:19 -0500 Subject: [Python-Dev] Re: Q: Testing Python 2.1.2 on cygwin In-Reply-To: <200201120059.g0C0xkC09812@mbuna.arbhome.com.au> Message-ID: [Anthony Baxter] > Paul Everitt tested on cygwin, and make test got: There's a long section about known Cygwin problems in the main README file, mostly written by Cygwin developers. Make sure Paul followed the special Cygwin build instructions before worrying too much; the failure of curses to build won't go away regardless (see README). I believe Michael Hudson is (or was) paying more attention to Cygwin than other Python-Dev'ers. From mal@lemburg.com Sat Jan 12 12:27:38 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 12 Jan 2002 13:27:38 +0100 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> Message-ID: <3C402BBA.1040806@lemburg.com> Martin v. Loewis wrote: >>#define __init __attribute__ ((__section__ (".text.init"))) >> > [...] > >>After surrounding doc strings with a macro, this will be easy to achieve. >> > > Unfortunately, not with the doc string you propose. Apparently, your > macro is going to be used as > > char foo__doc__[] = Py_DocString("this is foo"); > > However, with the attribute, the resulting code should read > > char foo__doc__[] __attribute__((__section__("docstring")) = > "this is foo"; > > You cannot define the macro so that it comes out as expanding to > __attribute__, atleast not with that specific macro. Why don't you use macro which only takes the name of the static array and the doc-string itself as argument ? This could then be expanded to whatever needs to be done for a particular case/platform, e.g. Py_DefineDocString(foo__doc__, "foo does bar"); (I use such an approach in the mx stuff and it works great.) 
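Spelled out, a name-plus-string macro along those lines might look like this (a sketch only; Py_DefineDocString and the section name are illustrative, combining the --with-doc-strings switch with the gcc section idea from earlier in the thread):

    #ifdef WITH_DOC_STRINGS
    #  ifdef __GNUC__
         /* gcc: also collect all doc strings into one section */
    #    define Py_DefineDocString(name, str) \
             static char name[] __attribute__((__section__(".docstrings"))) = str
    #  else
    #    define Py_DefineDocString(name, str) \
             static char name[] = str
    #  endif
    #else
         /* doc strings compiled out, but the symbol still exists */
    #  define Py_DefineDocString(name, str) \
             static char name[] = ""
    #endif

    /* usage: */
    Py_DefineDocString(foo__doc__,
        "foo(bar) -> baz\n\n"
        "Does something with bar.");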
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jason@tishler.net Sat Jan 12 16:24:14 2002 From: jason@tishler.net (Jason Tishler) Date: Sat, 12 Jan 2002 11:24:14 -0500 Subject: [Python-Dev] Re: Q: Testing Python 2.1.2 on cygwin In-Reply-To: References: <200201120059.g0C0xkC09812@mbuna.arbhome.com.au> Message-ID: <20020112162414.GC1268@dothill.com> Anthony, The problems with fork() and _curses that you reported are already known. In fact, the _curses build problem is already solved. Please read the Python README or the latest Cygwin Python 2.2 README which is available at: http://www.tishler.net/jason/software/python/python-2.2.README.txt On Fri, Jan 11, 2002 at 09:32:19PM -0500, Tim Peters wrote: > [Anthony Baxter] > > Paul Everitt tested on cygwin, and make test got: > > There's a long section about known Cygwin problems in the main README file, > mostly written by Cygwin developers. Make sure Paul followed the special > Cygwin build instructions before worrying too much; The fork() problem can be worked around by building the _socket module statically: http://sources.redhat.com/ml/cygwin/2001-12/msg00542.html > the failure of curses to build won't go away regardless (see README). This problem has been solved in the latest Cygwin ncurses release: http://sources.redhat.com/ml/cygwin/2002-01/msg00473.html http://sources.redhat.com/ml/cygwin/2002-01/msg00529.html Note that you will have to explicitly ask Cygwin's setup.exe to install this ncurses version because it is still marked as test. FYI, a pre-built Python 2.2 is part of the standard Cygwin distribution: http://cygwin.com/ml/cygwin-announce/2002/msg00001.html BTW, if you are trying to build other Python versions (e.g., 2.1.2), you may find the Cygwin specific patch (i.e., CYGWIN-PATCHES/python.patch) in the Python source tarballs on the Cygwin mirrors useful to review. Jason From andymac@bullseye.apana.org.au Sat Jan 12 11:28:46 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Sat, 12 Jan 2002 22:28:46 +1100 (EDT) Subject: [Python-Dev] guidance sought: merging port related changes to Library modules Message-ID: In preparing a set of patches intended to bring the OS/2 EMX port into CVS, I have a dilemma as to how best to integrate some changes to standard library modules. As background to this request I note that EMX and Cygwin have similar philosophies and attributes, being Posix/Unixish runtime environments on OSes with PC-DOS ancestry. Both rely on the GNU toolchain for software development. As a result of feedback on the previous set of patches, I am pruning cosmetic changes and attempting to minimise the footprint of the necessary changes. The particular changes I am looking for guidance on (or BDFL pronouncement on, as the case may be) involve os.py and the functionality in ntpath.py. The approach used in the port as released in binary form was to create a module called os2path.py (probably should really be called os2emxpath.py), which replicates the functionality of ntpath.py with OS2/EMX specific changes. Most of the changes have to do with using different path separator characters, with a few other changes reflecting slightly different behavour under EMX. EMX promotes the use of '/' as the path separator rather than '\', though it works with the latter. I don't know if Cygwin promotes the same convention. 
If I were to merge os2path.py into ntpath.py (which I incline towards instinctively) I believe that using references to os.sep and os.altsep rather than explicit '\\' and '/' strings would significantly reduce the extent of conditionalisation required, but in the process introduce significant source changes into ntpath.py (although the logical changes would be much less significant). If rationalising the use of separator characters (by moving away from hard-coded strings) in ntpath.py is unattractive, then I think I'd prefer to keep os2path.py (renamed to os2emxpath.py) as is, rather than revert to the DOS standard path separators. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From guido@python.org Sat Jan 12 20:11:15 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 12 Jan 2002 15:11:15 -0500 Subject: [Python-Dev] guidance sought: merging port related changes to Library modules In-Reply-To: Your message of "Sat, 12 Jan 2002 22:28:46 +1100." References: Message-ID: <200201122011.PAA05487@cj20424-a.reston1.va.home.com> > The particular changes I am looking for guidance on (or BDFL > pronouncement on, as the case may be) involve os.py and the functionality > in ntpath.py. > > The approach used in the port as released in binary form was to create a > module called os2path.py (probably should really be called os2emxpath.py), > which replicates the functionality of ntpath.py with OS2/EMX specific > changes. > > Most of the changes have to do with using different path separator > characters, with a few other changes reflecting slightly different > behavour under EMX. EMX promotes the use of '/' as the path separator > rather than '\', though it works with the latter. I don't know if Cygwin > promotes the same convention. > > If I were to merge os2path.py into ntpath.py (which I incline towards > instinctively) I believe that using references to os.sep and os.altsep > rather than explicit '\\' and '/' strings would significantly reduce the > extent of conditionalisation required, but in the process introduce > significant source changes into ntpath.py (although the logical changes > would be much less significant). > > If rationalising the use of separator characters (by moving away from > hard-coded strings) in ntpath.py is unattractive, then I think I'd prefer > to keep os2path.py (renamed to os2emxpath.py) as is, rather than revert > to the DOS standard path separators. The various modules ntpath, posixpath, macpath etc. are not just their to support their own platform on itself. They are also there to support foreign pathname twiddling. E.g. On Windows I might have a need to munge posix paths -- I can do that by explicitly importing posixpath. Likewise the reverse. So I think changing ntpath.py to use os.set etc. would be wrong, and creating a new file os2emxpath.py is the right thing to do -- despite the endless cloning of the same code. :-( (Maybe a different way to share more code between the XXXpath modules could be devised.) --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Sat Jan 12 20:36:44 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Sat, 12 Jan 2002 21:36:44 +0100 Subject: [Python-Dev] guidance sought: merging port related changes to Library modules In-Reply-To: (message from Andrew MacIntyre on Sat, 12 Jan 2002 22:28:46 +1100 (EDT)) References: Message-ID: <200201122036.g0CKaiB01415@mira.informatik.hu-berlin.de> > If rationalising the use of separator characters (by moving away from > hard-coded strings) in ntpath.py is unattractive, then I think I'd prefer > to keep os2path.py (renamed to os2emxpath.py) as is, rather than revert > to the DOS standard path separators. I think replacing hard-coded separators by os.sep is a good thing to do. However, if you find that you cannot achieve re-use of ntpath for OS/2 by existing customization alone, please do not add conditional code into ntpath. It would be very confusing if, in ntpath.py, there is a test whether the system is OS/2. Regards, Martin From tim.one@home.com Sun Jan 13 07:39:17 2002 From: tim.one@home.com (Tim Peters) Date: Sun, 13 Jan 2002 02:39:17 -0500 Subject: [Python-Dev] guidance sought: merging port related changes to Library modules In-Reply-To: <200201122011.PAA05487@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > The various modules ntpath, posixpath, macpath etc. are not just their > to support their own platform on itself. They are also there to > support foreign pathname twiddling. E.g. On Windows I might have a > need to munge posix paths -- I can do that by explicitly importing > posixpath. Likewise the reverse. Bingo. > So I think changing ntpath.py to use os.set etc. would be wrong, and > creating a new file os2emxpath.py is the right thing to do -- despite > the endless cloning of the same code. :-( (Maybe a different way to > share more code between the XXXpath modules could be devised.) Create _commonpath.py and put shared routines there. Then a blahpath.py can do from _commonpath import f, g, h to re-export them. An excellent candidate for inclusion would be expandvars(): the different routines for that now have radically different behaviors in endcases, it's impossible to say which are bugs or features, yet the routine *should* be wholly platform-independent (no, the Mac doesn't need its own version -- when an envar isn't found, the $envar token is retained literally). Having different versions of walk() is also silly, ditto isdir(), etc. From nhodgson@bigpond.net.au Mon Jan 14 06:20:32 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Mon, 14 Jan 2002 17:20:32 +1100 Subject: [Python-Dev] PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> Message-ID: <02ff01c19cc3$92514540$0acc8490@neil> M.-A. Lemburg: > Guys, this discussion is getting somewhat out of hand. 
I believe > that no-one on python-dev is seriously following this anymore, > yet OTOH your are working on a rather important part of the Python > file API. > > I'd suggest to write up the problem and your conclusions as a > PEP for everyone to understand before actually starting to > checkin anything. OK, PEP 277 is now available from: http://python.sourceforge.net/peps/pep-0277.html Neil From martin@v.loewis.de Mon Jan 14 07:11:54 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Mon, 14 Jan 2002 08:11:54 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... In-Reply-To: <02ff01c19cc3$92514540$0acc8490@neil> (nhodgson@bigpond.net.au) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> Message-ID: <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> > OK, PEP 277 is now available from: > http://python.sourceforge.net/peps/pep-0277.html Looks very good to me, except that the listdir approach (unicode in, unicode out) should apply uniformly to all platforms; I'll provide an add-on patch to your implementation once the PEP is approved. Regards, Martin From niemeyer@conectiva.com Mon Jan 14 11:30:53 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Mon, 14 Jan 2002 09:30:53 -0200 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <3C402BBA.1040806@lemburg.com> References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> Message-ID: <20020114093053.C1325@ibook.distro.conectiva> --2/5bycvrmDh4d1IB Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable > Why don't you use macro which only takes the name of the > static array and the doc-string itself as argument ? This > could then be expanded to whatever needs to be done for > a particular case/platform, e.g. >=20 > Py_DefineDocString(foo__doc__, "foo does bar"); >=20 > (I use such an approach in the mx stuff and it works great.) Yes, it's a nice idea! I'm looking for some way to "discard" the string using a macro. Let me explain with code: [...] #define Py_DOCSTR(name, str) static char *name =3D str #ifdef WITH_DOC_STRINGS #define Py_DOCSTR_START(name) Py_DOCSTR(name,) #define Py_DOCSTR_END ; #else #define Py_DOCSTR_START(name) Py_DOCSTR(name, ""); /* Also discards what follows somehow */ #define Py_DOCSTR_END /* Stop discarding */ #endif [...] This would make it possible to do something like this: Py_DOCSTR(simple_doc, "This is a simple doc string."); =2E..and also... 
Py_DOCSTR_START(complex_doc) "This is a complex doc string" #ifndef MS_WIN16 "like the one in sysmodule.c" #endif "Something else" Py_DOCSTR_END This seems to be the most elegant way to allow these complex strings. But unfortunately, I haven't found any way so far to do this "discarding thing", besides including another "#if" in the documentation itself. Any good ideas? --=20 Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] --2/5bycvrmDh4d1IB Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQE8QsFtIlOymmZkOgwRAhCuAKCzfuR27L8hkrrUPdnvp/ACfxUozgCfYuMn n14LE+7EpRShujzh6ZZm+HA= =x9QI -----END PGP SIGNATURE----- --2/5bycvrmDh4d1IB-- From mal@lemburg.com Mon Jan 14 12:05:57 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 14 Jan 2002 13:05:57 +0100 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> Message-ID: <3C42C9A5.975FA5B8@lemburg.com> Gustavo Niemeyer wrote: > > > Why don't you use macro which only takes the name of the > > static array and the doc-string itself as argument ? This > > could then be expanded to whatever needs to be done for > > a particular case/platform, e.g. > > > > Py_DefineDocString(foo__doc__, "foo does bar"); > > > > (I use such an approach in the mx stuff and it works great.) > > Yes, it's a nice idea! > > I'm looking for some way to "discard" the string using a macro. Let me > explain with code: > > [...] > #define Py_DOCSTR(name, str) static char *name = str > #ifdef WITH_DOC_STRINGS > #define Py_DOCSTR_START(name) Py_DOCSTR(name,) > #define Py_DOCSTR_END ; > #else > #define Py_DOCSTR_START(name) Py_DOCSTR(name, ""); /* Also discards what > follows somehow */ > #define Py_DOCSTR_END /* Stop discarding */ > #endif > [...] > > This would make it possible to do something like this: > > Py_DOCSTR(simple_doc, "This is a simple doc string."); > > ...and also... > > Py_DOCSTR_START(complex_doc) > "This is a complex doc string" > #ifndef MS_WIN16 > "like the one in sysmodule.c" > #endif > "Something else" > Py_DOCSTR_END > > This seems to be the most elegant way to allow these complex strings. > But unfortunately, I haven't found any way so far to do this "discarding > thing", besides including another "#if" in the documentation itself. > > Any good ideas? Wouldn't it be much simpler to wrap the complete Py_DOCSTR() into #ifdefs ? 
BTW, I don't we'll ever need to #ifdef doc-strings for platforms; you can just as well put the information for all platforms into the doc-string -- after the recipient is a human with enough non-AI to parse the doc-string into meaningful sections ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From niemeyer@conectiva.com Mon Jan 14 12:41:46 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Mon, 14 Jan 2002 10:41:46 -0200 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <3C42C9A5.975FA5B8@lemburg.com> References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> Message-ID: <20020114104146.A2607@ibook.distro.conectiva> --huq684BweRXVnRxX Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable > Wouldn't it be much simpler to wrap the complete Py_DOCSTR()=20 > into #ifdefs ? Yes, it's going to be wrapped! I took this code out of a file I was using to show the #ifdef problem. > BTW, I don't we'll ever need to #ifdef doc-strings for platforms; This would make things pretty easy, but note that we are *already* #ifdef'ing doc-strings for platforms. Python/sysmodule.c is an example of such. > you can just as well put the information for all platforms into=20 > the doc-string -- after the recipient is a human with enough=20 > non-AI to parse the doc-string into meaningful sections ;-) Cool! Are we going to change the existent doc strings then? --=20 Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] --huq684BweRXVnRxX Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQE8QtIKIlOymmZkOgwRAvMrAKCf6bQkKyfsoUv9szN8uAAkLElRbACgrEBv 8ucR3osN1WVHIo2l76qkUV4= =bmId -----END PGP SIGNATURE----- --huq684BweRXVnRxX-- From jepler@inetnebr.com Mon Jan 14 13:41:57 2002 From: jepler@inetnebr.com (jepler@inetnebr.com) Date: Mon, 14 Jan 2002 07:41:57 -0600 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <20020114093053.C1325@ibook.distro.conectiva> References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> Message-ID: <20020114074155.A1307@unpythonic.dhs.org> The following is the solution that comes to mind for me. My other idea was creating a static char* or a static function with the char* inside it, in the hopes it would be discarded as unused, but gcc doesn't seem to do that. Seems to me that compared to this, rewriting those docstrings that are victim of preprocessor definitions already is certainly better for readability of the docstrings in the source code... Jeff Epler jepler@inetnebr.com On Mon, Jan 14, 2002 at 09:30:53AM -0200, Gustavo Niemeyer wrote: > I'm looking for some way to "discard" the string using a macro. Let me > explain with code: > > [...] 
> #define Py_DOCSTR(name, str) static char *name = str > #ifdef WITH_DOC_STRINGS > #define Py_DOCSTR_START(name) Py_DOCSTR(name,) > #define Py_DOCSTR_END ; #define Py_DOCSTR_PART(s) s > #else > #define Py_DOCSTR_START(name) Py_DOCSTR(name, ""); /* Also discards what > follows somehow */ > #define Py_DOCSTR_END /* Stop discarding */ #define Py_DOCSTR_PART(s) /* (nothing) */ > #endif > [...] > > > This would make it possible to do something like this: > > Py_DOCSTR(simple_doc, "This is a simple doc string."); > > ...and also... > > Py_DOCSTR_START(complex_doc) Py_DOCSTR_PART( "This is a complex doc string") > #ifndef MS_WIN16 Py_DOCSTR_PART( "like the one in sysmodule.c") > #endif Py_DOCSTR_PART( "Something else") > Py_DOCSTR_END From fdrake@acm.org Mon Jan 14 14:25:46 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 14 Jan 2002 09:25:46 -0500 (EST) Subject: [Python-Dev] guidance sought: merging port related changes to Library modules In-Reply-To: <200201122011.PAA05487@cj20424-a.reston1.va.home.com> References: <200201122011.PAA05487@cj20424-a.reston1.va.home.com> Message-ID: <15426.60010.318856.560347@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > The various modules ntpath, posixpath, macpath etc. are not just their > to support their own platform on itself. They are also there to Note that ntpath.abspath() relies on nt._getfullpathname(). It is not unreasonable for this particular function to require that it actually be running on NT, so I'm not going to suggest changing this. On the other hand, it means the portable portions of the module are (mostly) not tested when the regression test is run on a platform other than Windows; the ntpath.abspath() test raises an ImportError since ntpath.abspath() imports the "nt" module within the function, and the resulting ImportError causes the rest of the unit test to be skipped and regrtest.py reports that the test is skipped. I'd like to change the test so that the abspath() test is only run if the "nt" module is available: try: import nt except ImportError: pass else: tester('ntpath.abspath("C:\\")', "C:\\") Any objections? -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From sdm7g@Virginia.EDU Mon Jan 14 17:22:26 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Mon, 14 Jan 2002 12:22:26 -0500 (EST) Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict Message-ID: Since PEP 216 on string interpolation is still active, I'ld appreciate it if some of it's supporters would comment on my revised alternative solution (posted on comp.lang.python and at google thru): I didn't get any feedback on the first version that was posted -- particularly whether the syntax was acceptable, or if a 'magic string' solution was still preferred. -- Steve Majewski From andymac@bullseye.apana.org.au Mon Jan 14 09:03:24 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Mon, 14 Jan 2002 20:03:24 +1100 (EDT) Subject: [Python-Dev] guidance sought: merging port related changes to Library modules In-Reply-To: <200201122011.PAA05487@cj20424-a.reston1.va.home.com> Message-ID: On Sat, 12 Jan 2002, Guido van Rossum wrote: > The various modules ntpath, posixpath, macpath etc. are not just their > to support their own platform on itself. They are also there to > support foreign pathname twiddling. E.g. On Windows I might have a > need to munge posix paths -- I can do that by explicitly importing > posixpath. Likewise the reverse. > > So I think changing ntpath.py to use os.set etc. 
would be wrong, and > creating a new file os2emxpath.py is the right thing to do -- despite > the endless cloning of the same code. :-( (Maybe a different way to > share more code between the XXXpath modules could be devised.) I'd not considered the foreign path munging use, which I agree justifies retaining hardcoded path separators. I'll proceed on the basis of the os2emxpath.py approach. I'll pass for the time being on Tim's suggested rationalisation approach though... Thanks for the enlightment. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From mal@lemburg.com Mon Jan 14 18:43:30 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 14 Jan 2002 19:43:30 +0100 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> Message-ID: <3C4326D2.F2A82030@lemburg.com> Gustavo Niemeyer wrote: > > > Wouldn't it be much simpler to wrap the complete Py_DOCSTR() > > into #ifdefs ? > > Yes, it's going to be wrapped! I took this code out of a file I was > using to show the #ifdef problem. > > > BTW, I don't we'll ever need to #ifdef doc-strings for platforms; > > This would make things pretty easy, but note that we are *already* > #ifdef'ing doc-strings for platforms. Python/sysmodule.c is an example > of such. Hmm, I wasn't aware of such doc-strings. > > you can just as well put the information for all platforms into > > the doc-string -- after the recipient is a human with enough > > non-AI to parse the doc-string into meaningful sections ;-) > > Cool! Are we going to change the existent doc strings then? Well, can't speak for PythonLabs, but I don't see any benefit from making doc-string complicated by introducing #ifdefs. It doesn't buy us anything, IMHO. Even worse: it makes translating the doc-strings harder. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Mon Jan 14 18:47:05 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 14 Jan 2002 13:47:05 -0500 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: Your message of "Mon, 14 Jan 2002 19:43:30 +0100." <3C4326D2.F2A82030@lemburg.com> References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> Message-ID: <200201141847.NAA10894@cj20424-a.reston1.va.home.com> > Well, can't speak for PythonLabs, but I don't see any benefit > from making doc-string complicated by introducing #ifdefs. It > doesn't buy us anything, IMHO. Even worse: it makes translating > the doc-strings harder. 
If there is platform-specific functionality, the docstring should document that only on the platform where it applies. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Jan 14 19:56:05 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 14 Jan 2002 20:56:05 +0100 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> Message-ID: <3C4337D5.B54330B1@lemburg.com> Guido van Rossum wrote: > > > Well, can't speak for PythonLabs, but I don't see any benefit > > from making doc-string complicated by introducing #ifdefs. It > > doesn't buy us anything, IMHO. Even worse: it makes translating > > the doc-strings harder. > > If there is platform-specific functionality, the docstring should > document that only on the platform where it applies. Just to make sure... I was talking about something like: open__doc__ = \ "Open the file. On Windows, the MBCS encoding is assumed, "\ "on all other systems, the file name must be given in ASCII."; vs. #ifdef MS_WINDOWS open__doc__ = \ "Open the file, assuming the filename is given in the MBCS "\ "encoding."; #else open__doc__ = \ "Open the file, assuming the filename is given in ASCII."; #endif -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From paul@prescod.net Mon Jan 14 19:51:21 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 14 Jan 2002 11:51:21 -0800 Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict References: Message-ID: <3C4336B9.661681DC@prescod.net> Steven Majewski wrote: > >... > > particularly whether the syntax was acceptable, or if a 'magic string' > solution was still preferred. IMHO, string interpolation should be one of the easiest things in the language. It should be something you learn in the first half of your first day learning Python. Any extra level of logical indirection seems misplaced to me. Paul Prescod From skip@pobox.com Mon Jan 14 20:12:24 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 14 Jan 2002 14:12:24 -0600 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <3C4337D5.B54330B1@lemburg.com> References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> Message-ID: <15427.15272.171558.1993@12-248-41-177.client.attbi.com> >> If there is platform-specific functionality, the docstring should >> document that only on the platform where it applies. mal> Just to make sure... I was talking about something like: mal> open__doc__ = \ mal> "Open the file. 
On Windows, the MBCS encoding is assumed, "\ mal> "on all other systems, the file name must be given in ASCII."; +1 mal> vs. mal> #ifdef MS_WINDOWS mal> open__doc__ = \ mal> "Open the file, assuming the filename is given in the MBCS "\ mal> "encoding."; mal> #else mal> open__doc__ = \ mal> "Open the file, assuming the filename is given in ASCII."; mal> #endif -1 I agree w/ MAL. I happen to be developing an application on Linux right now, but I'm interested in where I might encounter problems when it migrates to Windows. I would much prefer the documentation make it eas(y|ier) to identify platform differences. This holds true for docstrings, because they are the most readily available documentation format. Skip From guido@python.org Mon Jan 14 20:06:52 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 14 Jan 2002 15:06:52 -0500 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: Your message of "Mon, 14 Jan 2002 20:56:05 +0100." <3C4337D5.B54330B1@lemburg.com> References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> Message-ID: <200201142006.PAA12265@cj20424-a.reston1.va.home.com> > Just to make sure... I was talking about something like: > > open__doc__ = \ > "Open the file. On Windows, the MBCS encoding is assumed, "\ > "on all other systems, the file name must be given in ASCII."; > > vs. > > #ifdef MS_WINDOWS > open__doc__ = \ > "Open the file, assuming the filename is given in the MBCS "\ > "encoding."; > #else > open__doc__ = \ > "Open the file, assuming the filename is given in ASCII."; > #endif Given the main use case for docstrings, I'd prefer the latter. The library manual should contain the "all-platforms" documentation. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jan 14 20:17:19 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 14 Jan 2002 15:17:19 -0500 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: Your message of "Mon, 14 Jan 2002 14:12:24 CST." <15427.15272.171558.1993@12-248-41-177.client.attbi.com> References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> <15427.15272.171558.1993@12-248-41-177.client.attbi.com> Message-ID: <200201142017.PAA12356@cj20424-a.reston1.va.home.com> > I agree w/ MAL. I happen to be developing an application on Linux > right now, but I'm interested in where I might encounter problems > when it migrates to Windows. I would much prefer the documentation > make it eas(y|ier) to identify platform differences. This holds > true for docstrings, because they are the most readily available > documentation format. 
But what about optional features that are only available on platform X? Do you really want those to clutter up the docstring on platforms where they aren't available? On the platform where they *are*, their docstring should have a "(Platform X only)" note. --Guido van Rossum (home page: http://www.python.org/~guido/) From jason@jorendorff.com Mon Jan 14 20:19:13 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 14:19:13 -0600 Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict In-Reply-To: <3C4336B9.661681DC@prescod.net> Message-ID: Paul Prescod wrote: > IMHO, string interpolation should be one of the easiest things in the > language. It should be something you learn in the first half of your > first day learning Python. Any extra level of logical indirection seems > misplaced to me. +1 ## Jason Orendorff http://www.jorendorff.com/ From sdm7g@Virginia.EDU Mon Jan 14 20:44:05 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Mon, 14 Jan 2002 15:44:05 -0500 (EST) Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict In-Reply-To: <3C4336B9.661681DC@prescod.net> Message-ID: On Mon, 14 Jan 2002, Paul Prescod wrote: > > particularly whether the syntax was acceptable, or if a 'magic string' > > solution was still preferred. > > IMHO, string interpolation should be one of the easiest things in the > language. It should be something you learn in the first half of your > first day learning Python. Any extra level of logical indirection seems > misplaced to me. Do you have any comments or suggestions about a substitution syntax, Paul? I think anything except PEP 216's magic initial u" for strings is able to be done with an object extension rather than a syntax change, including the substitution syntax within the magic string. I kept '%' rather than '$' because I assumed that particular char choice was a rather arbitrary part of the design patterned after Tcl or Perl, and that by keeping '%' I could do it with a dict. If a different syntax is desired, then it can be done by extending string to a magic format string object (rather than a magic string syntax). I'm not sure what you mean by logical indirection here: is that a comment on the syntax, or do you object to the idea of not implementing substitution by a language syntax change. ( But if what you mean is you want fewer chars for a double substition, that's something that can be fixed.) One reason I would prefer a "magic object" implementation, rather than a 'magic syntax' one is that, after playing around with this for a bit, I can see that there are a lot of possibilities for various substitution and template languages. A language syntax change, once accepted is cast in stone (and a new revised proposal is much less likely to be considered) while we can muck about and experiment with object extensions both before and after the get put into the standard lib. -- Steve Majewski From sdm7g@Virginia.EDU Mon Jan 14 20:49:17 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Mon, 14 Jan 2002 15:49:17 -0500 (EST) Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict In-Reply-To: Message-ID: On Mon, 14 Jan 2002, Jason Orendorff wrote: > Paul Prescod wrote: > > IMHO, string interpolation should be one of the easiest things in the > > language. It should be something you learn in the first half of your > > first day learning Python. Any extra level of logical indirection seems > > misplaced to me. > > +1 Was that +1 for PEP 216?, my alternative proposal? 
or Paul's comments? I think I agree with his comment above, but I'm not sure whether it was intended a comment on the syntax (which is probably justified), or objecting to solving the problem other than my changing the language syntax (which I don't agree with), or just a statement of principals. -- Steve From skip@pobox.com Mon Jan 14 21:04:17 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 14 Jan 2002 15:04:17 -0600 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <200201142017.PAA12356@cj20424-a.reston1.va.home.com> References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> <15427.15272.171558.1993@12-248-41-177.client.attbi.com> <200201142017.PAA12356@cj20424-a.reston1.va.home.com> Message-ID: <15427.18385.906669.456387@12-248-41-177.client.attbi.com> >> I would much prefer the documentation make it eas(y|ier) to identify >> platform differences. This holds true for docstrings, because they >> are the most readily available documentation format. Guido> But what about optional features that are only available on Guido> platform X? Do you really want those to clutter up the docstring Guido> on platforms where they aren't available? On the platform where Guido> they *are*, their docstring should have a "(Platform X only)" Guido> note. Perhaps I should take a half-step back under Guido's withering stare. That's probably why I've been feeling a chill all day... ;-) I don't think it's necessary for the docstring to contain all the excruciating detail available in the library reference manual, but I think a quick help(open) at the interpreter prompt or a docstring popped up in PyCrust or other IDE-like thing should give you an indication that there are semantic differences for that function across platforms. Ideally, these differences would only be documented at the highest level they can come into play. For example, if a class or module exhibits some platform-dependency, its docstring would indicate that, not the docstring of every one of its methods. Also, consider time.strptime. It's not always available, so the time module's docstring should mention its possible absence depending on platform. On platforms where it's not supported, putting a "platform x only" note in strptime's docstring won't help much to the confused programmer wondering where to disappeared to. Of course, it's easy for me to spout platitudes here. Adjusting to such a convention will probably add a fair amount of work to somebody's already full schedule. Skip From jason@jorendorff.com Mon Jan 14 21:32:01 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 15:32:01 -0600 Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict In-Reply-To: Message-ID: > > +1 > > Was that +1 for PEP 216?, my alternative proposal? or Paul's comments? It was a +1 for Paul's comments, both its principles and as maybe a -0.3 criticism of your alternative. No opinion on PEP 215. 
## Jason Orendorff http://www.jorendorff.com/ From paul@prescod.net Mon Jan 14 21:34:41 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 14 Jan 2002 13:34:41 -0800 Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict References: Message-ID: <3C434EF1.DE085914@prescod.net> Steven Majewski wrote: > >... > > I'm not sure what you mean by logical indirection here: is that > a comment on the syntax, or do you object to the idea of not implementing > substitution by a language syntax change. Sorry I wasn't clear. Let's say it's the second hour of our Perl/Python class. Here's Perl: $a = 5; $b = 6; print "$a $b"; Lots of yucky extra chars in that code but you can't find much negative stuff to say about the complexity of the string interpolation! Here's Python: a = 5; b = 6; print "%(a)s %(b)s" % vars() Extra indirection: What does % do? What does vars() do? What does the "s" mean? How does this use of % relate to the traditional meanings of either percentage or modulus? This is one of the two problems I would like PEP 215 to solve. The other one is to allow simple function calls and array lookups etc. to be done "inline" to avoid setting up trivial vars or building unnecessary dictionaries. If I understand your proposal correctly, I could only get the evaluation behaviour by making the "indirection" problem even worse...by adding in yet another function call (well, class construtor call), tentatively called EvalDict. Another benefit of the PEP 215 model is that something hard-coded in the syntax is much more amenable to compile time analysis. String interpolation is actually quite compatible with standard compilation techniques. You just rip the expressions out of the string, compile them to byte-code and replace them with pointers ot the evaluated results. As PEP 215 mentions, this also has advantages for reasoning about security. If I tell a new programmer to avoid the use of "eval" unless they consult with me, I'll have to tell them to avoid EvalDict also. My usual approach is to consider eval and exec to be advanced (and rarely used) features that I don't even teach new programmers. I don't know that Jython allows me today to ship a JAR without the Python parser and evaluator but I could imagine a future version that would give me that option. Widespread use of EvalDict would render that option useless. Re: $ versus %. $ is "the standard" in other languages and shells. % is the current standard in Python. $ has the advantage that it doesn't have to work around Python's current C-inspired syntax. So I guess I reluctantly favor $. Also, EvalDict should be called evaldict to match the other constructors in __builtins__. So while I understand the advantage of non-syntactic solutions, in this case I am still in favor of the syntax. Paul Prescod From sdm7g@Virginia.EDU Mon Jan 14 22:06:43 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Mon, 14 Jan 2002 17:06:43 -0500 (EST) Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: <3C434EF1.DE085914@prescod.net> Message-ID: [ Oops. Initial subject line said incorrectly PEP 216] On Mon, 14 Jan 2002, Paul Prescod wrote: > [...] As > PEP 215 mentions, this also has advantages for reasoning about security. > If I tell a new programmer to avoid the use of "eval" unless they > consult with me, I'll have to tell them to avoid EvalDict also. My usual > approach is to consider eval and exec to be advanced (and rarely used) > features that I don't even teach new programmers. 
But if you're going to allow interpolation of the results of arbitrary function into a string, it's going to be a security problem whether or not you use 'eval' to do it. My code hides the eval in the object's python code. u" strings would hide the eval in the C code. How is one more or less secure than the other. The security issue seems to be an argument for a non-language-syntax implementation, as it means that: the hidden eval's could be controlled with a restricted execution environment. ( Also the same advantages I cited to easily experiment with alternatives -- we could roll out a solution without having to tackle the security issue right away.) Also, although I agree with most of your other comments on making it simple and easy, the security issue argues against making it TOO simple. For example, I was considering making the current namespace of the call a default, so you wouldn't need globals() -- but I was worried that because of security and other issues, maybe that was too much "magic" . I think maybe how much magic is enough and how much is too much is one of the issues to discuss. Thanks for expanding on your initial comment. I think you're right that it needs to be simpler. But, for several reasons, security among them, I'm still -1 on PEP 215. -- Steve From sdm7g@Virginia.EDU Mon Jan 14 22:18:25 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Mon, 14 Jan 2002 17:18:25 -0500 (EST) Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: On Mon, 14 Jan 2002, Steven Majewski wrote: > [...] I think maybe how much magic is enough and how much is too > much is one of the issues to discuss. > > > Thanks for expanding on your initial comment. > I think you're right that it needs to be simpler. > But, for several reasons, security among them, I'm still -1 on > PEP 215. In fact, I think "too much magic" is my main objection to PEP 215. Having a magic string, which looks like it's a constant, with no operators or function calls associated with it being the implicit source of a while series of function calls and possibly unbounded computations is just hiding too much magic for me to swallow. u"$$main()" ? -- Steve From paul@prescod.net Mon Jan 14 22:20:25 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 14 Jan 2002 14:20:25 -0800 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict References: Message-ID: <3C4359A9.4686EA8B@prescod.net> Steven Majewski wrote: > >.... > > But if you're going to allow interpolation of the results of arbitrary > function into a string, it's going to be a security problem whether > or not you use 'eval' to do it. My code hides the eval in the object's > python code. u" strings would hide the eval in the C code. How is one > more or less secure than the other. I think you mean $" strings, not u" strings. Given: a = $"foo.bar: $foo.bar(abc, 5)" I can translate that *at compile time* to: a = $"foo.bar: %s" % foo.bar(abc, 5) No runtime evaluation is necessary. So I see no security issues here. On the other hand, evaldict really does have the same semantics as an eval, right? Probably it is no more or less dangerous if you only do a single level of EvalDict-ing. But once you get into multiple levels you could get into a situation where user-provided code is being evaluated. The first level of EvalDict incorporates the user-provided code into the string and the second level evaluates it. 
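Concretely -- and this is just a sketch with a made-up interpolate() helper, not the real EvalDict code -- a second substitution pass over text that already contains user input ends up eval'ing whatever the user typed:

    import re

    def interpolate(text, namespace):
        # hypothetical eval-based substitution: replace each $name or
        # ${expr} with the result of evaluating it in namespace
        def repl(match):
            expr = match.group(1) or match.group(2)
            return str(eval(expr, namespace))
        return re.sub(r"\$\{([^}]+)\}|\$(\w+)", repl, text)

    ns = {"secret_key": "spam",
          "user_input": "gimme ${secret_key}"}   # attacker-supplied text
    once = interpolate("user said: $user_input", ns)
    print(once)                   # user said: gimme ${secret_key}  -- still harmless
    print(interpolate(once, ns))  # user said: gimme spam           -- the user's code ran

One pass is no worse than % formatting; it is the second pass that quietly evaluates an expression the programmer never wrote.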
Ping's current runtime implementation does use "eval" but you could imagine an alternate implementation that actually parses the relevant parts of the string according to the Python grammar, and merely applies the appropriate semantics. It would use "." to trigger getattr, "()" to trigger apply, "[]" to trigger getitem and so forth. Then there would be no eval and thus way to eval user-provided code. Paul Prescod From skip@pobox.com Mon Jan 14 22:39:16 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 14 Jan 2002 16:39:16 -0600 Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict In-Reply-To: <3C434EF1.DE085914@prescod.net> References: <3C434EF1.DE085914@prescod.net> Message-ID: <15427.24084.13741.415408@12-248-41-177.client.attbi.com> Paul> Sorry I wasn't clear. Let's say it's the second hour of our Paul> Perl/Python class. Paul> Here's Perl: Paul> $a = 5; Paul> $b = 6; Paul> print "$a $b"; ... Paul> Here's Python: Paul> a = 5; Paul> b = 6; Paul> print "%(a)s %(b)s" % vars() So? There are some things Perl does better than Python, some things Python does better than Perl. Maybe this is a (small) notch in Perl's gun. It just doesn't seem significantly better enough to me to warrant a language change. I would have written the Python example as print a, b For the simple examples that would normally arise in an introductory programming class, I think Python's print statement works just fine. For more hairy cases, Perl probably wins. That's life. but-that's-just-me-ly, y'rs, -- Skip Montanaro (skip@pobox.com - http://www.mojam.com/) From sdm7g@Virginia.EDU Mon Jan 14 22:59:08 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Mon, 14 Jan 2002 17:59:08 -0500 (EST) Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: <3C4359A9.4686EA8B@prescod.net> Message-ID: On Mon, 14 Jan 2002, Paul Prescod wrote: > Steven Majewski wrote: > > > >.... > > > > But if you're going to allow interpolation of the results of arbitrary > > function into a string, it's going to be a security problem whether > > or not you use 'eval' to do it. My code hides the eval in the object's > > python code. u" strings would hide the eval in the C code. How is one > > more or less secure than the other. > > I think you mean $" strings, not u" strings. Given: Oops. Yes. > > a = $"foo.bar: $foo.bar(abc, 5)" > > I can translate that *at compile time* to: > > a = $"foo.bar: %s" % foo.bar(abc, 5) > > No runtime evaluation is necessary. So I see no security issues here. On > the other hand, evaldict really does have the same semantics as an eval, > right? Probably it is no more or less dangerous if you only do a single > level of EvalDict-ing. But once you get into multiple levels you could > get into a situation where user-provided code is being evaluated. The > first level of EvalDict incorporates the user-provided code into the > string and the second level evaluates it. The multiple level was an addition to the last version because that was what some people expressed a desire for in the earlier string interpolation discussion. EvalDict2 does a single level eval. ( Again: that seems to me to be an argument for several alternative object versions rather than one builtin syntax change. ) > Ping's current runtime implementation does use "eval" but you could > imagine an alternate implementation that actually parses the relevant > parts of the string according to the Python grammar, and merely applies > the appropriate semantics. It would use "." 
to trigger getattr, "()" to > trigger apply, "[]" to trigger getitem and so forth. Then there would be > no eval and thus way to eval user-provided code. The same things holds for an object implementation. eval isn't required for an implementation. But EVERY implementation of that semantics allows implicit function calls. ( I was going to say 'hidden' function calls, but I'll admit that may be provocative/argumentative.) Your point about compile time optomization holds here: yes, the builtin syntax version allows much of that analysis to be done at compile time, while the object version would need to do all of the analysis on the fly at execution. However, as I noted -- the object implementation would allow customizing a restricted environment ( which is a simple security implementation than code analysis.) And having an explicit argument for the namespace allows more control, as well as reminding you of the magic going on behind the curtains. At least if there's a security problem, you have somewhere to look for holes other than the Python C source code. If I keep an eval based implementation, I probably ought to make a restricted __builtin__ the default. -- Steve From jason@jorendorff.com Mon Jan 14 23:04:49 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 17:04:49 -0600 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: > But if you're going to allow interpolation of the results of arbitrary > function into a string, it's going to be a security problem whether > or not you use 'eval' to do it. My code hides the eval in the object's > python code. u" strings would hide the eval in the C code. How is one > more or less secure than the other. There is no security issue with PEP 215. $"$a and $b make $c" <==> ("%s and %s make %s" % (a, b, c)) These two are completely equivalent under PEP 215, and therefore equally secure. ## Jason Orendorff http://www.jorendorff.com/ From Samuele Pedroni" Message-ID: <00f601c19d50$1fa472a0$5154ca3e@newmexico> The Jython 2cts. An eval implementation means that for Jython a code using it cannot be run in a Java sand-box context, eval does not work there. > If I keep an eval based implementation, I probably ought to make > a restricted __builtin__ the default. Jython does not support CPython restricted execution. Probably never will. For what it counts I don't care having string interpolation a la Perl in Python. cheers, Samuele Pedroni. From sdm7g@Virginia.EDU Mon Jan 14 23:11:30 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Mon, 14 Jan 2002 18:11:30 -0500 (EST) Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict In-Reply-To: <15427.24084.13741.415408@12-248-41-177.client.attbi.com> Message-ID: > Paul> Sorry I wasn't clear. Let's say it's the second hour of our > Paul> Perl/Python class. > > Paul> Here's Perl: > > Paul> $a = 5; > Paul> $b = 6; > Paul> print "$a $b"; > > ... > > Paul> Here's Python: > > Paul> a = 5; > Paul> b = 6; > Paul> print "%(a)s %(b)s" % vars() > How does Perl handle it if the tokens aren't whitespace separated? Is there an optional enclosing bracket as in shell syntax ? How do you do: "%(word)sly yours" % vocabulary ? (Sorry-- I stopped Perling somewhere around version 4.) -- Steve Majewski From fdrake@acm.org Mon Jan 14 23:10:37 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Mon, 14 Jan 2002 18:10:37 -0500 (EST) Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict In-Reply-To: References: <15427.24084.13741.415408@12-248-41-177.client.attbi.com> Message-ID: <15427.25965.312076.280929@cj42289-a.reston1.va.home.com> Steven Majewski writes: > How does Perl handle it if the tokens aren't whitespace separated? > Is there an optional enclosing bracket as in shell syntax ? Yes. > How do you do: "%(word)sly yours" % vocabulary ? I've not a clue... manually scan the format string, perhaps? -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From sdm7g@Virginia.EDU Mon Jan 14 23:19:21 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Mon, 14 Jan 2002 18:19:21 -0500 (EST) Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: On Mon, 14 Jan 2002, Jason Orendorff wrote: > > But if you're going to allow interpolation of the results of arbitrary > > function into a string, it's going to be a security problem whether > > or not you use 'eval' to do it. My code hides the eval in the object's > > python code. u" strings would hide the eval in the C code. How is one > > more or less secure than the other. > > There is no security issue with PEP 215. > > $"$a and $b make $c" <==> ("%s and %s make %s" % (a, b, c)) > > These two are completely equivalent under PEP 215, and therefore > equally secure. Your right. I'm confusing PEP 215 with the discussion on PEP 215, where that feature was requested. However, if you allow array and member access as well, which Paul suggests, then you open the security problem back up unless you do some code analysis (as he also suggests) to make sure that [index] or .member doesn't perform a hidden function call ( A virus infected __getitem__ for example. ) -- Steve From jason@jorendorff.com Mon Jan 14 23:16:39 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 17:16:39 -0600 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: Would someone please explain to me what is seen as a "possible security issue" in PEP 215? Can anyone propose some real-life situation where PEP 215 causes a vulnerability, and the corresponding % syntax doesn't? ## Jason Orendorff http://www.jorendorff.com/ From nas@python.ca Mon Jan 14 23:42:13 2002 From: nas@python.ca (Neil Schemenauer) Date: Mon, 14 Jan 2002 15:42:13 -0800 Subject: [Python-Dev] Re: [Python-iterators] Python generators and try/finally.. In-Reply-To: <20020113122012.K5329@mozart.chat.net>; from jeske@chat.net on Sun, Jan 13, 2002 at 12:20:12PM -0800 References: <20020113122012.K5329@mozart.chat.net> Message-ID: <20020114154212.A2478@glacier.arctrix.com> [Cross-posted to python-dev, I'm not sure how many people are still on the python-iterators list] David Jeske wrote: > Hello, > > I just read PEP255 about Python Generators. It's a very interesting > and elegant solution to a tricky problem. > > I have a thought about allowing try/finally with some reasonable > semantics. This is definitely a wart. This problem is one of the major reasons why Ken Pitman did not want continuations in Common Lisp (Scheme predates CL and in CL try/finally is called unwind-protect). It's a hard problem. However, I think try/finally is less of a problem with generators then it is for continuations. Generators only allow you to temporarily jump up one level in the stack frame while continuations allow you to jump to essentially arbirary stack frames. 
We disallow try/finally inside a generator since there is no guarantee that the finally clause will ever be executed. The problem is localized. With continuations the problem spreads. Any try/finally block could affected. In practice, I think the current restriction is not a big problem. try/finally is allowed in code that calls generators as well as code called by generators. It is only disallowed in the body of generator itself. > The PEP says that there is no guarantee that next() will be called > again. However, there is a guaratee that either next() will be called, > or the Generator will be cleaned up. It seems reasonable to me to > build a mechanism by which, on __del__ cleanup of the Generator, an > exception is raised from the Yeild point "UnfinishedGenerator" (and > also caught by the cleanup function). This exception would trigger any > finally exception clauses which exist above the yeild. This also has > the added advantage that code can detect when a Generator does not run > to completion. > > It might even be useful to be able to flag the generator such that it > does not catch the UnfinishedGenerator exception. Although this > probably wouldn't be used often. I'm pretty sure something like this could be done but I'm not sure it's a good idea. The handling of exceptions in __del__ methods is ugly, IMHO. We should not propagate that behavior without some careful thought. I would like to see some compelling arguments as to why try/finally should be supported inside generators. Neil From nas@python.ca Mon Jan 14 23:49:18 2002 From: nas@python.ca (Neil Schemenauer) Date: Mon, 14 Jan 2002 15:49:18 -0800 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: ; from jason@jorendorff.com on Mon, Jan 14, 2002 at 05:04:49PM -0600 References: Message-ID: <20020114154918.B2478@glacier.arctrix.com> Jason Orendorff wrote: > There is no security issue with PEP 215. > > $"$a and $b make $c" <==> ("%s and %s make %s" % (a, b, c)) > > These two are completely equivalent under PEP 215, and therefore > equally secure. Not exactly. Say you have the code: secret_key = "spam" x = raw_input() print $"You entered $x" Imagine that the user enters "I'm 3l337, give me the $secret_key" as the input. Neil From sdm7g@Virginia.EDU Mon Jan 14 23:52:13 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Mon, 14 Jan 2002 18:52:13 -0500 (EST) Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: On Mon, 14 Jan 2002, Jason Orendorff wrote: > Would someone please explain to me what is seen as a "possible > security issue" in PEP 215? Can anyone propose some real-life > situation where PEP 215 causes a vulnerability, and the > corresponding % syntax doesn't? Do you mean the current '%' or my expanded example ? Any expanded version -- mine or PEP 215 introduces possible security holes. ( And I'm not even sure that the current "%" doesn't have a hole if it's used "the wrong way" ) But, as Paul said, it depends on the implementation. I said in an earlied post that I confused PEP 215 with the discussion of PEP 215, where some expanded capabilities were suggested. However, on looking at it again closer, I would say that the examples in PEP 215 contradict the Security Considerations paragraph. It has expressions in it that can't be evaluated at compile time, and any list index or member reference can, in Python, invoke a hidden function call. Any implementation is going to require some run time checks. 
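For example (the class below is made up, purely to illustrate the hidden-call point):

    class Spy:
        def __getitem__(self, key):
            # an innocent-looking index can run arbitrary code
            print("recording lookup of %r somewhere" % (key,))
            return "42"

    d = Spy()
    print("the answer is %s" % d["answer"])   # the hidden call fires here

No eval() in sight, yet code ran that the author of the format string never wrote.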
But just in case I'm seeing it all wrong: could you explain to me how PEP 215 *doesn't* have the potential of introducing a security hole ? If the current proof-of-concept implementation does use eval (as Paul stated), then there is (I believe) a security problem with that implementation. Paul has proposed some other implementation tricks, but I'm, not convinced that you can get the same semantics suggested in PEP 215's examples without requiring runtime checks. Since eval is a know security hole, I think the burden of proof is on the proponents. ( And I'm not even demanding proof -- just a convincing argument without too much hand waving and we-have-ways-of-dealing-with-that! ) -- Steve Majewski From jason@jorendorff.com Mon Jan 14 23:55:37 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 17:55:37 -0600 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: <20020114154918.B2478@glacier.arctrix.com> Message-ID: Neil Schemenauer wrote: > Jason Orendorff wrote: > > There is no security issue with PEP 215. > > > > $"$a and $b make $c" <==> ("%s and %s make %s" % (a, b, c)) > > > > These two are completely equivalent under PEP 215, and therefore > > equally secure. > > Not exactly. Say you have the code: > > secret_key = "spam" > x = raw_input() > print $"You entered $x" > > Imagine that the user enters "I'm 3l337, give me the $secret_key" as the > input. >>> import Itpl >>> import sys >>> sys.stdout = Itpl.filter() >>> >>> secret_key = "spam" >>> x = raw_input() I'm 3l337, give me the $secret_key >>> print "You entered $x" You entered I'm 3l337, give me the $secret_key >>> The substitution only happens once. ## Jason Orendorff http://www.jorendorff.com/ From tim.one@home.com Tue Jan 15 00:18:40 2002 From: tim.one@home.com (Tim Peters) Date: Mon, 14 Jan 2002 19:18:40 -0500 Subject: [Python-Dev] Re: [Python-iterators] Python generators and try/finally.. In-Reply-To: <20020114154212.A2478@glacier.arctrix.com> Message-ID: [Neil Schemenauer] > ... > In practice, I think the current restriction is not a big problem. > try/finally is allowed in code that calls generators as well as code > called by generators. It is only disallowed in the body of generator > itself. It's not that severe, Neil: the only restriction is that yield cannot appear in the try clause of a try/finally construct. try/finally can otherwise be used freely inside generators, and yield can be used anywhere inside a generator inside try/except/else, and even in a finally clause (these latter assuming the yield is not also in the try clause of an *enclosing* try/finally construct) -- just not in a try/finally's try clause. Here's the example from PEP 255 (also embedded in a test_generators.py doctest, so we know for sure it works as advertised ): >>> def f(): ... try: ... yield 1 ... try: ... yield 2 ... 1//0 ... yield 3 # never get here ... except ZeroDivisionError: ... yield 4 ... yield 5 ... raise ... except: ... yield 6 ... yield 7 # the "raise" above stops this ... except: ... yield 8 ... yield 9 ... try: ... x = 12 ... finally: ... yield 10 ... yield 11 >>> print list(f()) [1, 2, 4, 5, 8, 9, 10, 11] >>> [David Jeske] >> The PEP says that there is no guarantee that next() will be called >> again. However, there is a guaratee that either next() will be called, >> or the Generator will be cleaned up. Not so: Python doesn't guarantee destructors will get called by magic (see the discussion of __del__ in the Python Reference Manual). 
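It is exactly sketches like the following (hypothetical) one that the restriction refuses to compile, rather than pretending the finally clause is guaranteed:

    def reader(path):
        f = open(path)
        try:
            for line in f:
                yield line    # the caller may simply abandon the generator here,
        finally:
            f.close()         # so nothing guarantees this cleanup ever runs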
So best practice is to use explicit (e.g.) close() calls anyway, and if you make your generator a method of an object, its critical resources can (conveniently, even!) be exposed to other methods for explicit cleanup (or its __del__, if you absolutely must). In practice (and I've had a lot ), I have yet to be so much as midly annoyed by this restriction. So I have to echo Neil: > We should not propagate that behavior without some careful thought. I > would like to see some compelling arguments as to why try/finally > should be supported inside generators. And especially with bizarre "and 'finally' will probably get executed, but no guarantee that it will, and there's no predicting when-- or even in which thread --if it does, and if it does and 'finally' itself goes boom, we may also ignore the error" semantics. As the PEP says, all of that is too much a violation of finally's pre-generators contract to bear. you-broke-it-you-fix-it-ly y'rs - tim From jeske@chat.net Mon Jan 14 23:22:00 2002 From: jeske@chat.net (David Jeske) Date: Mon, 14 Jan 2002 15:22:00 -0800 Subject: [Python-Dev] Re: [Python-iterators] Python generators and try/finally.. In-Reply-To: References: <20020114154212.A2478@glacier.arctrix.com> Message-ID: <20020114152200.Z5329@mozart.chat.net> On Mon, Jan 14, 2002 at 07:18:40PM -0500, Tim Peters wrote: > It's not that severe, Neil: the only restriction is that yield > cannot appear in the try clause of a try/finally construct. > try/finally can otherwise be used freely inside generators, and > yield can be used anywhere inside a generator inside > try/except/else, and even in a finally clause (these latter assuming > the yield is not also in the try clause of an *enclosing* > try/finally construct) -- just not in a try/finally's try clause. Thanks for your thoughts, I'll defer this discussion until I do run across a programming problem or two where the lack of finally cleanup hurts the Generator, and then I'll bring that example to the table. Thanks again for spending the time on Generators, it looks like a truly neat and orthogonal feature. -- David Jeske (N9LCA) + http://www.chat.net/~jeske/ + jeske@chat.net From paul@prescod.net Tue Jan 15 01:33:07 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 14 Jan 2002 17:33:07 -0800 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict References: Message-ID: <3C4386D3.5CCC24AD@prescod.net> Steven Majewski wrote: > >... > > Your right. I'm confusing PEP 215 with the discussion on PEP 215, > where that feature was requested. > > However, if you allow array and member access as well, which Paul > suggests, then you open the security problem back up unless you > do some code analysis (as he also suggests) to make sure that > [index] or .member doesn't perform a hidden function call > ( A virus infected __getitem__ for example. ) If you have a virus-infected __getitem__ you are screwed regardless. We can't defend against that. The whole point is that we are never evaluating code provided by the user. "Safe" programmer-supplied literal strings are differentated at compile time from arbitrary strings. The interpolation engine only works on safe strings. Calling an overriden __getitem__ or .member is as safe as if they had done it in the way they would today: "%s" % foo.bar() Think of it as pure, compile-time syntactic sugar. If you want it to act like eval, I guess you would do this: $"$(eval('....'))...." 
which would compile to: "%s" % eval('....') Paul Prescod From jason@jorendorff.com Tue Jan 15 01:38:42 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 19:38:42 -0600 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: > But just in case I'm seeing it all wrong: could you explain > to me how PEP 215 *doesn't* have the potential of introducing > a security hole ? Gladly. Every $-string can be converted to equivalent code that uses only: a) whatever code the programmer explicitly typed in the $-string; b) str() or unicode(); and c) the + operator applied to strings. Therefore $ is exactly as secure or insecure as those three pieces. All three of these things are just as safe as the non-PEP-215 features that we're already using. Therefore $-strings do not introduce any new security hole. ## Jason Orendorff http://www.jorendorff.com/ From jason@jorendorff.com Tue Jan 15 01:46:03 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 19:46:03 -0600 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: Steven Majewski wrote: > On Mon, 14 Jan 2002, Jason Orendorff wrote: > > Would someone please explain to me what is seen as a "possible > > security issue" in PEP 215? Can anyone propose some real-life > > situation where PEP 215 causes a vulnerability, and the > > corresponding % syntax doesn't? > > Do you mean the current '%' or my expanded example ? I mean the current %. Well? ## Jason Orendorff http://www.jorendorff.com/ From nas@python.ca Tue Jan 15 01:54:53 2002 From: nas@python.ca (Neil Schemenauer) Date: Mon, 14 Jan 2002 17:54:53 -0800 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: ; from jason@jorendorff.com on Mon, Jan 14, 2002 at 05:55:37PM -0600 References: <20020114154918.B2478@glacier.arctrix.com> Message-ID: <20020114175453.A11294@glacier.arctrix.com> Jason Orendorff wrote: > The substitution only happens once. My example was not well thought out. I was thinking something more like: secret_key = "spam" user = "joe" x = "$user said: " + raw_input() print $x That wouldn't work either since $ only evaluates literals. Amazing what you learn by actually reading the PEP. Yes, I'm an idiot. After reading PEP 215 I like it a lot. The fact that $ can only apply to literals completely solves this issue. Has Guido weighed in on it yet? I didn't find anything in the mail archives from him. Neil From paul@prescod.net Tue Jan 15 02:01:52 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 14 Jan 2002 18:01:52 -0800 Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict References: <3C434EF1.DE085914@prescod.net> <15427.24084.13741.415408@12-248-41-177.client.attbi.com> Message-ID: <3C438D90.4638A1B4@prescod.net> Skip Montanaro wrote: > >... > > So? There are some things Perl does better than Python, some things Python > does better than Perl. It doesn't have anything to do with competing with Perl. It is just about learning from things that other languages do better (in this case simpler) than Python. This feature came from the Bourne shell and is also present in DOS batch, TCL, Ruby, PHP. Python's "%" is much better than nothing (which is what Javascript has) but it is still a pain. 
First you use it with positional arguments and then realize that is getting confusing so you switch to dictionary arguments and then that gets unweildy because you're just declaring new names for existing variables so you use vars(). But then you want to interpolate the result of a function call or expression. So you have to set up a one-time-use variable. PEP 215 (which I did not write!) unifies all of the use cases into one syntax that can be taught in ten minutes. The % syntax is fine for totally different use cases: printf-style formatting and interpolation of strings that might be generated at runtime. Paul Prescod From sdm7g@Virginia.EDU Tue Jan 15 02:07:24 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Mon, 14 Jan 2002 21:07:24 -0500 (EST) Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: On Mon, 14 Jan 2002, Jason Orendorff wrote: > > But just in case I'm seeing it all wrong: could you explain > > to me how PEP 215 *doesn't* have the potential of introducing > > a security hole ? > > Gladly. > > Every $-string can be converted to equivalent code that uses only: > > a) whatever code the programmer explicitly typed > in the $-string; > b) str() or unicode(); and > c) the + operator applied to strings. > But the examples in PEP 215 don't follow those restrictions. That may be the source of the confusion. Maybe someone should revise the PEP for consistency before it's considered further. -- Steve. From neal@metaslash.com Tue Jan 15 02:10:55 2002 From: neal@metaslash.com (Neal Norwitz) Date: Mon, 14 Jan 2002 21:10:55 -0500 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict References: <20020114154918.B2478@glacier.arctrix.com> <20020114175453.A11294@glacier.arctrix.com> Message-ID: <3C438FAF.7D46B50F@metaslash.com> Neil Schemenauer wrote: > > Jason Orendorff wrote: > > The substitution only happens once. > > My example was not well thought out. I was thinking something more > like: > > secret_key = "spam" > user = "joe" > x = "$user said: " + raw_input() > print $x > > That wouldn't work either since $ only evaluates literals. Amazing what > you learn by actually reading the PEP. Yes, I'm an idiot. Sorry, I haven't followed this thread real closely, but I thought someone said eval() was used under the covers. If x is eval'ed and the string is as above, I get the following in 2.1: >>> secret_key = 'spam' >>> x = raw_input('? ') ? eval("secret_key") # Is the following commented print equivalent the the line below it? ### print "You entered $x" >>> print "You entered", eval(x) You entered spam >>> print "You entered %(x)s" % locals() You entered eval("secret_key") Not sure if that's the same as what you are talking about though. Neal From sdm7g@Virginia.EDU Tue Jan 15 02:15:34 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Mon, 14 Jan 2002 21:15:34 -0500 (EST) Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict In-Reply-To: <3C438D90.4638A1B4@prescod.net> Message-ID: On Mon, 14 Jan 2002, Paul Prescod wrote: > ... > then realize that is getting confusing so you switch to dictionary > arguments and then that gets unweildy because you're just declaring new > names for existing variables so you use vars(). But then you want to > interpolate the result of a function call or expression. So you have to > set up a one-time-use variable. > > PEP 215 (which I did not write!) unifies all of the use cases into one > syntax that can be taught in ten minutes. 
The % syntax is fine for > totally different use cases: printf-style formatting and interpolation > of strings that might be generated at runtime. But Jason just said that function calls are not allowed. ( We -- actually, he listed what was allowed, and function calls were definitely not among them. ) PEP 215's examples don't agree with the limitations in it's security section, and the proposal being discussed seems to be shifting under out feet. That's the reason I got the proposals given in the previous discussion of PEP 215 and PEP 215 itself confused. -- Steve From jason@jorendorff.com Tue Jan 15 02:25:18 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 20:25:18 -0600 Subject: [Python-Dev] Suggested changes to PEP 215 Message-ID: One of the examples in PEP 215 is a bit wrong, I think. >>> print $'\$a' 5 This should output a backslash before the 5, because the string '\$a' has a backslash character in it. Also, for clarity, PEP 215 should explicitly specify that the substitution only occurs once. For example: # Existing examples >>> a, b = 5, 6 >>> print $'a = $a, b = $b' a = 5, b = 6 [...] >>> x = "$a" >>> print $'x = $x' x = $a Maybe there should also be examples demonstrating that $-strings adopt the local namespace. Also, the PEP says: ] $'a = $a, b = $b' ] ] could be compiled as though it were the expression ] ] ('a = ' + str(a) + ', b = ' + str(b)) Consider: def f(str): # The argument 'str' masks the builtin str() function. a, b = find_stuff(str) print $'a = $a, b = $b' return a, b It should be specified that $-strings do not use the local "str" and "unicode" names to find str() and unicode(); nor do they look in the current __builtins__ or the __builtin__ module. They should use the actual python C implementations of str() and unicode(). This can be implemented by putting a direct reference to str or unicode in the co_consts tuple of the code object; I don't know how else the author plans to deal with this. ## Jason Orendorff http://www.jorendorff.com/ From paul@prescod.net Tue Jan 15 02:40:20 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 14 Jan 2002 18:40:20 -0800 Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict References: Message-ID: <3C439693.7A2A3724@prescod.net> Steven Majewski wrote: > >... > > But Jason just said that function calls are not allowed. > ( We -- actually, he listed what was allowed, and function calls > were definitely not among them. ) I misread Jason's list at first myself. Jason was describing the *output* of the transformation. He said that the output of the transformation would be no more and no less than directly typed code with a) whatever code the programmer explicitly typed in the $-string; b) str() or unicode(); and "$" has the power to eval, but only to eval a literal. As described here (a string prefix rather than an operator c) the + operator applied to strings. 
"a)" embodies a whole host of things listed in the PEP: "A Python identifier optionally followed by any number of trailers, where a trailer consists of: - a dot and an identifier, - an expression enclosed in square brackets, or - an argument list enclosed in parentheses (This is exactly the pattern expressed in the Python grammar by "NAME trailer*", using the definitions in Grammar/Grammar.)" The PEP also has examples: >>> print $'References to $a: $sys.getrefcount(a)' References to 5: 15 > PEP 215's examples don't agree with the limitations in it's > security section, To summarize the security section, it says: *All of the text that is ever processed by this mechanism is textually present in the Python program at compile time*. In other words, users of the program can never submit information and have it be evaluated by this mechanism. Paul Prescod From paul@prescod.net Tue Jan 15 02:40:33 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 14 Jan 2002 18:40:33 -0800 Subject: [Python-Dev] Suggested changes to PEP 215 References: Message-ID: <3C4396A1.45FE2D5@prescod.net> Jason Orendorff wrote: > > ... > > It should be specified that $-strings do not use the local > "str" and "unicode" names to find str() and unicode(); nor > do they look in the current __builtins__ or the __builtin__ > module. They should use the actual python C implementations > of str() and unicode(). Why? Wouldn't it be better to look in __builtin__? If someone overrides str() or unicode() they may well want that behaviour to be respected in interopolations. Paul Prescod From ping@lfw.org Tue Jan 15 02:46:49 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Mon, 14 Jan 2002 20:46:49 -0600 (CST) Subject: [Python-Dev] PEP 215 does not introduce security issues In-Reply-To: <20020114175453.A11294@glacier.arctrix.com> Message-ID: On Mon, 14 Jan 2002, Neil Schemenauer wrote: > Amazing what you learn by actually reading the PEP. May i quote you on that? :) Just kidding. More seriously: there is no security issue introduced by PEP 215. I saw the concerns being raised in the previous e-mail messages on this topic, but every time i was about to compose a reply, i found that Jason Orendorff had already provided exactly the explanation i was about to give, or better. So, thank you, Jason. :) In short: PEP 215 suggests a syntactic transformation that turns $'the $quick brown $fox()' into the fully equivalent 'the %s brown %s' % (quick, fox()) The '$' prefix only applies to literals, and cannot be used as an operator in front of other expressions or variables. This issue is pointed out specifically in the PEP: '$' works like an operator and could be implemented as an operator, but that prevents the compile-time optimization and presents security issues. So, it is only allowed as a string prefix. Therefore, this transformation executes *only* code that was literally present in the original program. (An example of this transformation is given at the end of PEP 215 in the "Implementation" section.) (By the way, i myself am not yet fully convinced that a string interpolation feature is something that Python desperately needs. I do see some considerable potential for good, and so the purpose of PEP 215 was to put a concrete and plausible proposal on the table for discussion. Given that proposal, which i believe to be about as good as one could reasonably expect, we can hope to save ourselves the expense of re-arguing the same issues repeatedly, and make an informed decision about whether to add the feature. 
Among the possible drawbacks/complaints i see are: more work for automated source code tools, tougher editor syntax highlighting, too many messy string prefix characters, and the addition of yet one more Python feature to teach and document. Security, however, is not among them.) -- ?!ng From ping@lfw.org Tue Jan 15 02:51:43 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Mon, 14 Jan 2002 20:51:43 -0600 (CST) Subject: [Python-Dev] Re: Suggested changes to PEP 215 In-Reply-To: Message-ID: On Mon, 14 Jan 2002, Jason Orendorff wrote: > One of the examples in PEP 215 is a bit wrong, I think. > > >>> print $'\$a' > 5 > > This should output a backslash before the 5, because the > string '\$a' has a backslash character in it. You are correct. I'll make this change. > Also, for clarity, PEP 215 should explicitly specify > that the substitution only occurs once. [...] > Maybe there should also be examples demonstrating that $-strings > adopt the local namespace. Sure, that wouldn't hurt. More examples are a good idea. > Consider: > > def f(str): > # The argument 'str' masks the builtin str() function. > a, b = find_stuff(str) > print $'a = $a, b = $b' > return a, b > > It should be specified that $-strings do not use the local > "str" and "unicode" names to find str() and unicode() Good point. Perhaps it is better to simply describe a transformation using '%s' and '%' instead of 'str' and '+' to avoid this potential confusion altogether. -- ?!ng From jason@jorendorff.com Tue Jan 15 03:01:24 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 21:01:24 -0600 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: Steven Majewski wrote: > On Mon, 14 Jan 2002, Jason Orendorff wrote: > > > > But just in case I'm seeing it all wrong: could you explain > > > to me how PEP 215 *doesn't* have the potential of introducing > > > a security hole ? > > > > Gladly. > > > > Every $-string can be converted to equivalent code that uses only: > > > > a) whatever code the programmer explicitly typed > > in the $-string; > > b) str() or unicode(); and > > c) the + operator applied to strings. > > But the examples in PEP 215 don't follow those restrictions. I dunno, it looks like they do to me. $'a = $a, b = $b' ---> ('a = ' + str(a) + ', b = ' + str(b)) $u'uni${a}ode' ---> (u'uni' + unicode(a) + u'ode') $'\$a' ---> ('\\' + str(a)) $r'\$a' ---> ('\\' + str(a)) $'$$$a.$b' ---> ('$' + str(a) + '.' + str(b)) $'a + b = ${a + b}' ---> ('a + b = ' + str(a + b)) $'References to $a: $sys.getrefcount(a)' ---> ('References to ' + str(a) + ': ' + str(sys.getrefcount(a))) $"sys = $sys, sys = $sys.modules['sys']" ---> ('sys = ' + str(sys) + ', sys = ' + str(sys.modules['sys'])) $'BDFL = $sys.copyright.split()[4].upper()' ---> ('BDFL = ' + str(sys.copyright.split()[4].upper())) In every case, the equivalent uses a) some bits of code that the programmer explicitly typed in the $-string; b) str() or unicode(); c) and the + operator (to join the resulting strings). I guess you're thinking "but those bits of code are invoking other functions that aren't in your list". My point is, the equivalent print statement, or % expression (the existing %, not your proposed %) does the exact same thing. print $'here we go: $y maps to $x[y]' print 'here we go: %s maps to %s' % (y, x[y]) print 'here we go:', y, 'maps to', x[y] print 'here we go: ' + str(y) + ' maps to ' + str(x[y]) Is one of these less secure than the others somehow? There is no new security hole here. 
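To make the "nothing new is evaluated" claim concrete, here is a rough sketch -- not the PEP's reference implementation, and handling only the bare $name case -- of the kind of compile-time rewrite being described. Everything it emits was already literally present in the $-string:

import re

def expand(literal):
    # Illustration only: turn "here we go: $y maps to $x" into the
    # source text  'here we go: %s maps to %s' % (y, x,)
    names = re.findall(r'\$(\w+)', literal)
    template = re.sub(r'\$\w+', '%s', literal)
    return '%r %% (%s,)' % (template, ', '.join(names))

print expand('here we go: $y maps to $x')
# prints: 'here we go: %s maps to %s' % (y, x,)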
## Jason Orendorff http://www.jorendorff.com/ From ping@lfw.org Tue Jan 15 03:30:47 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Mon, 14 Jan 2002 21:30:47 -0600 (CST) Subject: [Python-Dev] Re: Suggested changes to PEP 215 In-Reply-To: Message-ID: On Mon, 14 Jan 2002, Ka-Ping Yee wrote: > Good point. Perhaps it is better to simply describe a > transformation using '%s' and '%' instead of 'str' and '+' > to avoid this potential confusion altogether. I have just realized, upon careful thought, that it would be better to make this syntactic transformation the official specification of the feature, rather than simply an implementation suggestion. The current specification is incomplete because it does not adequately handle certain corner cases: (current PEP) \ then $ $ then \ what i want >>> x = 'x41' >>> print $'\$x' ??? \x41 A \x41 >>> print $'\x24x' ??? x41 $x $x >>> y = '41' >>> print $'\x$y' ??? A SyntaxError ??? The issue is whether backslash-interpretation happens first, or dollar-interpretation happens first. The current PEP says \ first. I hope you see why i want the first case *not* to do \x interpretation and why i want the second case not to do $ interpretation. (The programmer shouldn't have to look for \x24 in her code!) The third case is a mess and should definitely be a syntax error. I'll write a new PEP. -- ?!ng From jason@jorendorff.com Tue Jan 15 03:36:57 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 21:36:57 -0600 Subject: [Python-Dev] Suggested changes to PEP 215 In-Reply-To: <3C4396A1.45FE2D5@prescod.net> Message-ID: Paul Prescod wrote: > Jason Orendorff wrote: > > ... > > It should be specified that $-strings do not use the local > > "str" and "unicode" names to find str() and unicode(); nor > > do they look in the current __builtins__ or the __builtin__ > > module. They should use the actual python C implementations > > of str() and unicode(). > > Why? Wouldn't it be better to look in __builtin__? If someone overrides > str() or unicode() they may well want that behaviour to be respected in > interopolations. I was thinking it should parallel what the other similar features already do: >>> import __builtin__ >>> __builtin__.str = 'a suffusion of yellow' >>> str 'a suffusion of yellow' >>> print 32 32 >>> print "xyz %s 123" % 4.5 xyz 4.5 123 >>> ## Jason Orendorff http://www.jorendorff.com/ From jason@jorendorff.com Tue Jan 15 03:46:38 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 21:46:38 -0600 Subject: [Python-Dev] Re: Suggested changes to PEP 215 In-Reply-To: Message-ID: Ping wrote: > > Consider: > > > > def f(str): > > # The argument 'str' masks the builtin str() function. > > a, b = find_stuff(str) > > print $'a = $a, b = $b' > > return a, b > > > > It should be specified that $-strings do not use the local > > "str" and "unicode" names to find str() and unicode() > > Good point. Perhaps it is better to simply describe a > transformation using '%s' and '%' instead of 'str' and '+' > to avoid this potential confusion altogether. I thought about this; but I don't know if there's a '%' equivalent for the unicode handling. $u'uni${a}ode' ---> (u'uni' + unicode(a) + u'ode') ---> u'uni%???ode' % (a,) I don't think %s does it. Maybe there's some format spec flag that I'm forgetting. 
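For what it's worth, the result type does come out right when the format string itself is unicode; the open question is only whether %s routes arbitrary objects through unicode() rather than str():

>>> a = 4
>>> u'uni%sode' % (a,)
u'uni4ode'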
## Jason Orendorff http://www.jorendorff.com/ From nhodgson@bigpond.net.au Tue Jan 15 03:55:37 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Tue, 15 Jan 2002 14:55:37 +1100 Subject: [Python-Dev] PEP 215 does not introduce security issues References: Message-ID: <003601c19d78$7e220680$0acc8490@neil> The PEP: > '$' works like an operator and could be implemented as an > operator, but that prevents the compile-time optimization > and presents security issues. So, it is only allowed as a > string prefix. I'd like to see the '$' prefix replaced with an ordinary character such as 'i'. '$' is currently unused in Python and so can be used for future extension either as a new operator or as the basis for new operators. Interpolation strings consume this character so it can no longer be chosen as a new operator. Neil From skip@pobox.com Tue Jan 15 03:56:47 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 14 Jan 2002 21:56:47 -0600 Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict In-Reply-To: <3C438D90.4638A1B4@prescod.net> References: <3C434EF1.DE085914@prescod.net> <15427.24084.13741.415408@12-248-41-177.client.attbi.com> <3C438D90.4638A1B4@prescod.net> Message-ID: <15427.43135.659586.314871@12-248-41-177.client.attbi.com> Paul> But then you want to interpolate the result of a function call or Paul> expression. So you have to set up a one-time-use variable. As has been demonstrated, there are several ways to tackle this problem. I first saw something headed in this direction with Zope's (actually DocumentTemplate's) MultiMapping class several years ago. It only aimed to make it easy to interpolate named parameters from several dictionaries simultaneously. Steve Majewski and others have shown how you can do this with an EvalDict type of class, so it's not like you can't do this today. The point is for something to be really worth modifying the syntax of the language I think it has to demonstrate that it's significantly better than the alternatives. The security argument is a red herring. There are enough other ways programmers can blow their feet off. If someone is naive enough to execute the moral equivalent of print raw_input() % EvalDict3() in their programs they will probably learn fairly quickly that it's a questionable programming practice. Paul> PEP 215 (which I did not write!) unifies all of the use cases into Paul> one syntax that can be taught in ten minutes. It unifies all the use cases into *two* syntaxes. The preexisting %-formatted strings aren't going away anytime soon. They are suitable for most applications, so new users would have to contend with at least being able to read, if not write, both forms of string interpolation for the forseeable future if PEP 215 is adopted. It hasn't been demonstrated to me that Steve's EvalDict or something similar couldn't be taught in a similar amount of time. It has the added advantage that it's essentially the same syntax as the current % syntax. You can use expressions where before you had to restrict yourself to names. It requires no change to the language. Just drop it into a module in the std library and away you go. In fact, coded properly (which Steve is eminently capable of doing) it would be 100% backward compatible. People running essentially any version of Python could use it. (I believe Pythonware still makes a 1.4 installer available for Windows.) 
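To make that concrete, the core of the idea is only a few lines; this is an illustrative sketch, not Steve's actual posting:

class EvalDict:
    # Evaluate each %(...)s key as a Python expression in the
    # namespaces the caller explicitly supplies.
    def __init__(self, globals, locals):
        self.globals = globals
        self.locals = locals
    def __getitem__(self, key):
        return eval(key, self.globals, self.locals)

a, b = 5, 6
print "a + b = %(a + b)s" % EvalDict(globals(), locals())
# prints: a + b = 11

The security caveat is the same as with anything eval-based: the format string on the left of the % must be a literal you wrote, not something read from the user.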
Paul> The % syntax is fine for totally different use cases: printf-style Paul> formatting and interpolation of strings that might be generated at Paul> runtime. What do you mean by "totally different"? Most examples I've seen so far have looked pretty much like print $"$a $b" which probably covers about 90% of common usage anyway. The examples in PEP-215 don't look any more different than an EvalDict-like class could comfortably handle today either. -- Skip Montanaro (skip@pobox.com - http://www.mojam.com/) From skip@pobox.com Tue Jan 15 04:03:17 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 14 Jan 2002 22:03:17 -0600 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: References: Message-ID: <15427.43525.112234.608199@12-248-41-177.client.attbi.com> $'BDFL = $sys.copyright.split()[4].upper()' ---> ('BDFL = ' + str(sys.copyright.split()[4].upper())) How to you know when to stop gobbling after seeing a dollar sign in the string? -- Skip Montanaro (skip@pobox.com - http://www.mojam.com/) From ping@lfw.org Tue Jan 15 04:04:47 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Mon, 14 Jan 2002 22:04:47 -0600 (CST) Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: <15427.43525.112234.608199@12-248-41-177.client.attbi.com> Message-ID: On Mon, 14 Jan 2002, Skip Montanaro wrote: > $'BDFL = $sys.copyright.split()[4].upper()' > ---> ('BDFL = ' + str(sys.copyright.split()[4].upper())) > > How to you know when to stop gobbling after seeing a dollar sign in the > string? Parse using the "NAME trailer*" production in Grammar/Grammar. -- ?!ng From jason@jorendorff.com Tue Jan 15 04:16:11 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 22:16:11 -0600 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: > On Mon, 14 Jan 2002, Skip Montanaro wrote: > > $'BDFL = $sys.copyright.split()[4].upper()' > > ---> ('BDFL = ' + str(sys.copyright.split()[4].upper())) > > > > How to you know when to stop gobbling after seeing a dollar sign in the > > string? > > Parse using the "NAME trailer*" production in Grammar/Grammar. Except that whitespace is significant, at least in the sample implementation: >>> i = Itpl.itpl >>> x=4 >>> y=3 >>> i("This is x: $x. This is y: $y.") # doesn't grab (x.This) 'This is x: 4. This is y: 3.' >>> i("This is x: $x.This is y: $y.") # does grab (x.This) AttributeError: 'int' object has no attribute 'This' This doesn't seem to be mentioned in the PEP. ## Jason Orendorff http://www.jorendorff.com/ From jason@jorendorff.com Tue Jan 15 04:56:56 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 22:56:56 -0600 Subject: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict In-Reply-To: Message-ID: Steve Majewski wrote: > But Jason just said that function calls are not allowed. > ( We -- actually, he listed what was allowed, and function calls > were definitely not among them. ) [...] Well, when the $-string explicitly contains the name of the function to be called, then that falls into category (a). I wrote: > a) whatever code the programmer explicitly typed > in the $-string; I hope this makes things clearer and not worse. 
:-) ## Jason Orendorff http://www.jorendorff.com/ From sdm7g@Virginia.EDU Tue Jan 15 05:15:23 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Tue, 15 Jan 2002 00:15:23 -0500 (EST) Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: On Mon, 14 Jan 2002, Jason Orendorff wrote: > Steven Majewski wrote: > > On Mon, 14 Jan 2002, Jason Orendorff wrote: > > > Would someone please explain to me what is seen as a "possible > > > security issue" in PEP 215? Can anyone propose some real-life > > > situation where PEP 215 causes a vulnerability, and the > > > corresponding % syntax doesn't? > > > > Do you mean the current '%' or my expanded example ? > > I mean the current %. > > Well? > Paul is the one who (rightly) brought up the issue of security with respect to double evaluated strings. But in addition, he seemed to be saying that you can do more with a compile time test than you can with a runtime test. I disagree with that. I think, for the same semantics, you get the same security issues. I think it's very similar to the compile time type checking vs. dynamic typing problem. (In fact, I think it reduces to the same problem.) There are clearly some advantages to doing things compile time, but you don't get more security without more restriction. -- Steve From jason@jorendorff.com Tue Jan 15 05:33:24 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 14 Jan 2002 23:33:24 -0600 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: Steven Majewski wrote: > On Mon, 14 Jan 2002, Jason Orendorff wrote: > > Steven Majewski wrote: > > > On Mon, 14 Jan 2002, Jason Orendorff wrote: > > > > Would someone please explain to me what is seen as a "possible > > > > security issue" in PEP 215? Can anyone propose some real-life > > > > situation where PEP 215 causes a vulnerability, and the > > > > corresponding % syntax doesn't? > > > > > > Do you mean the current '%' or my expanded example ? > > > > I mean the current %. > > > > Well? > > > > Paul is the one who (rightly) brought up the issue of security > with respect to double evaluated strings. But in addition, he > seemed to be saying that you can do more with a compile time > test than you can with a runtime test. I disagree with that. > > I think, for the same semantics, you get the same security > issues. I think it's very similar to the compile time type > checking vs. dynamic typing problem. (In fact, I think it > reduces to the same problem.) > > There are clearly some advantages to doing things compile time, > but you don't get more security without more restriction. As long as this "security issue" thread dies, I'm happy. ## Jason Orendorff http://www.jorendorff.com/ From paul@prescod.net Tue Jan 15 05:49:19 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 14 Jan 2002 21:49:19 -0800 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict References: Message-ID: <3C43C2DF.782716DC@prescod.net> Steven Majewski wrote: > >... > Paul is the one who (rightly) brought up the issue of security > with respect to double evaluated strings. But in addition, he > seemed to be saying that you can do more with a compile time > test than you can with a runtime test. I disagree with that. >... > I think, for the same semantics, you get the same security > issues. Sure, for the same semantics. But EvalDict doesn't have the same semantics. Even if we ignore double interpolation there is the issue of code like this: >>> def double(): ... 
user_val = raw_input("Please enter a number:") ... print "%(2*user_val)" % EvalDict >>> double() Please enter a number: 3 + (os.system("rm -rm *")) For EvalDict to have the same semantics as PEP 215 it would have to disallow interpolations on strings that were not string literals. This would make the EvalDict object somewhat different than any other object in the Python library. Plus it would require compiler support which would break compatibility with older Pythons. Paul Prescod From sdm7g@Virginia.EDU Tue Jan 15 05:56:18 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Tue, 15 Jan 2002 00:56:18 -0500 (EST) Subject: [Python-Dev] PEP 215 does not introduce security issues In-Reply-To: Message-ID: On Mon, 14 Jan 2002, Ka-Ping Yee wrote: > The '$' prefix only applies to literals, and cannot be used as > an operator in front of other expressions or variables. This > issue is pointed out specifically in the PEP: I think the term "the '$' prefix" was one of the sources of my confusion, as '$' is both a string prefix and a symbol prefix within the string. I think I read "the '$' prefix" as referreing to the second kind where you meant the first. The same goes for discussion of '$' as an operator. (This misreading was the source of the inconsistency I thought I saw between the examples and other statements.) > Therefore, this transformation executes *only* code that was > literally present in the original program. (An example of this > transformation is given at the end of PEP 215 in the > "Implementation" section.) O.K. Jason's explaination finally got thru to me: it's more clear if I think of it as a preprocessor that really doesn't add any capabilities to the language. I should think of it more like the 'r' string prefix, which is just a syntactic convenience, rather than like the 'u' string prefix, which creates a special kind of (unicode) string. ( Well, it *does* create a special kind of string in the runtime, but you can't access that string to to do anything strange in Python, because as soon as it's assigned, it gets transformed into a 'normal string' . Thinking of it as a preprocessor makes that more obvious.) > (By the way, i myself am not yet fully convinced that a string > interpolation feature is something that Python desperately needs. > I do see some considerable potential for good, and so the purpose > of PEP 215 was to put a concrete and plausible proposal on the > table for discussion. Given that proposal, which i believe to be > about as good as one could reasonably expect, we can hope to save > ourselves the expense of re-arguing the same issues repeatedly, > and make an informed decision about whether to add the feature. > > Among the possible drawbacks/complaints i see are: more work for > automated source code tools, tougher editor syntax highlighting, > too many messy string prefix characters, and the addition of yet > one more Python feature to teach and document. Security, however, > is not among them.) I'm not wild about more string prefixes, but we've already started down that road, so I can't complain too much. But, as you've already noted: it doesn't add any new capability, just new syntax. ( But it probably as justifiable as the raw string syntax. ) Although I've knocked the idea in the past, I'ld almost rather see some sort of 'macro' facility for python, than to see a bunch of special case syntax added to the language for every feature. 
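A quick interpreter check of the 'r' analogy, for the record: the prefix only changes how the literal is spelled in source, and nothing special survives to run time.

>>> r'\$a' == '\\$a'
True
>>> type(r'\$a') is type('\\$a')
True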
-- Steve From jason@jorendorff.com Tue Jan 15 06:01:36 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Tue, 15 Jan 2002 00:01:36 -0600 Subject: [Python-Dev] PEP 215 does not introduce security issues In-Reply-To: Message-ID: Steve Majewski wrote: > [...] it's more clear > if I think of it as a preprocessor that really doesn't add any > capabilities to the language. I should think of it more like > the 'r' string prefix, which is just a syntactic convenience, > rather than like the 'u' string prefix, which creates a special > kind of (unicode) string. ( Well, it *does* create a special kind > of string in the runtime, but you can't access that string to > to do anything strange in Python, because as soon as it's assigned, > it gets transformed into a 'normal string' . Thinking of it as > a preprocessor makes that more obvious.) Yep, I agree, and I'm glad we're all at least seeing PEP 215 the same way now. :-) However, I don't think it would need a special kind of string in the runtime. Thinking of it as a preprocessor, I believe it would only need to generate some Python bytecode that uses the existing str or unicode types. Now I can go back to being neutral on PEP 215. :-) ## Jason Orendorff http://www.jorendorff.com/ From sdm7g@Virginia.EDU Tue Jan 15 06:27:19 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Tue, 15 Jan 2002 01:27:19 -0500 (EST) Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: <3C43C2DF.782716DC@prescod.net> Message-ID: On Mon, 14 Jan 2002, Paul Prescod wrote: > Sure, for the same semantics. But EvalDict doesn't have the same > semantics. Even if we ignore double interpolation there is the issue of > code like this: > > > >>> def double(): > ... user_val = raw_input("Please enter a number:") > ... print "%(2*user_val)" % EvalDict > > >>> double() > Please enter a number: 3 + (os.system("rm -rm *")) > But in EvalDict you have to explicitly pass it a namespace dict. You just don't pass it one with access to os.system ( or most other os calls. ) That's why I disliked an implicit namespace. But your example suggests to me: >>> input('?: ') ?: r'raw string' 'raw string' >>> input('?: ') ?: u'unicode string' u'unicode string' >>> input('?: ') ?: $'$os.system("rm -rm *" )' I guess you need to special case that out of the compiler also. ( Are there any others lurking about ? ) -- Steve From barry@zope.com Tue Jan 15 07:04:10 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 15 Jan 2002 02:04:10 -0500 Subject: PEP 215 (was Re: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict) References: <3C439693.7A2A3724@prescod.net> Message-ID: <15427.54378.963060.448829@anthem.wooz.org> >>>>> "PP" == Paul Prescod writes: PP> He said that the output of the transformation would be no more PP> and no less than directly typed code with | a) whatever code the programmer explicitly typed | in the $-string; | b) str() or unicode(); and | "$" has the power to eval, but only to eval a literal. As | described here (a string prefix rather than an operator | c) the + operator applied to strings. 
PP> "a)" embodies a whole host of things listed in the PEP: PP> "A Python identifier optionally followed by any number of PP> trailers, where a trailer consists of: - a dot and an PP> identifier, - an expression enclosed in square brackets, or - PP> an argument list enclosed in parentheses (This is exactly the PP> pattern expressed in the Python grammar by "NAME trailer*", PP> using the definitions in Grammar/Grammar.)" Not to pick on Paul, but I'm having a hard time imagining how a newbie Python user being taught this new feature in his second hour will actually understand any of these rules. And how will you later answer their questions about why Python has both $'' literals and '' % dict interpolation when it seems like you can do basically the same task using either of them? >>>>> "KY" == Ka-Ping Yee writes: KY> In short: PEP 215 suggests a syntactic transformation that KY> turns KY> $'the $quick brown $fox()' KY> into the fully equivalent KY> 'the %s brown %s' % (quick, fox()) KY> The '$' prefix only applies to literals, and cannot be used as KY> an operator in front of other expressions or variables. This KY> issue is pointed out specifically in the PEP: [...then...] KY> Good point. Perhaps it is better to simply describe a KY> transformation using '%s' and '%' instead of 'str' and '+' KY> to avoid this potential confusion altogether. That would help . KY> (By the way, i myself am not yet fully convinced that a string KY> interpolation feature is something that Python desperately KY> needs. I am definitely not convinced that Python desperately needs PEP 215. I wonder if the same folks clamoring for it will be the same folks who raise their hands next month when asked again if they think Python is change too fast (naw, that won't happen :). How many of you use Itpl regularly? If Python were so deficient in this regard, I would expect to see a lot of hands. It's certainly easy enough to define in today's Python, a simple function call that adds only two characters to the proposal, so I don't buy that this /only/ has utility if were to apply to literals. I'm willing to accept that as applied only to literals it doesn't raise more security concerns, but it also isn't nearly as useful then IMO. And BTW, as I've told Ka-Ping before, I /am/ sympathetic to many of the ideas in this PEP and in Itpl. In fact, I have something very similar in Mailman that I use all the time[1]. Instead of $'...' I spell it _('...') which actually stands out better to me, and is only two extra characters. It's not as feature rich as PEP 215, but then about the /only/ thing I'd add would be attribute access. As it is, _('You owe me %(num)d dollars for that %(adj)s parrot') gets me there 9 times out of 10, while for the 10th bird = cage.bird state = bird.wake_up() days = int(time.time() - bird.lastmodtime) / 86400 _('That %(bird)s has been %(state)s for %(days)s') is really not much more onerous, and certainly less jarring to my eye than all those $ signs. -1 -Barry [1] I use _() ostensibly to mark translatable strings, but it has a side benefit in that it interpolates into the string named variables from the locals and globals of the calling context. It does this by using sys._getframe(1) in Python 2.1 and try/except hackery in older versions of Python. I find it quite handy, and admittedly magical, but then I'm not suggesting it become a standard Python feature. :) From barry@zope.com Tue Jan 15 07:26:19 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Tue, 15 Jan 2002 02:26:19 -0500 Subject: [Python-Dev] PEP 215 and EvalDict, yet another alternative References: Message-ID: <15427.55707.272542.866767@anthem.wooz.org> >>>>> "SM" == Steven Majewski writes: SM> Since PEP 216 on string interpolation is still active, I'ld SM> appreciate it if some of it's supporters would comment on my SM> revised alternative solution (posted on comp.lang.python and SM> at google thru): [Steve's EvalDict] For completeness, here's a simplified version of Mailman's _() function which does auto-interpolation from locals and globals of the calling context. This version works in Python 2.1 or beyond and has the i18n translation stuff stripped out. For the full deal, see http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/mailman/mailman/Mailman/i18n.py?rev=2.4&content-type=text/plain Cheers, -Barry

-------------------- snip snip --------------------dollar.py
import sys
from UserDict import UserDict
from types import StringType

class SafeDict(UserDict):
    """Dictionary which returns a default value for unknown keys."""
    def __getitem__(self, key):
        try:
            return self.data[key]
        except KeyError:
            if isinstance(key, StringType):
                return '%('+key+')s'
            else:
                return '' % `key`

def _(s):
    frame = sys._getframe(1)
    d = SafeDict(frame.f_globals.copy())
    d.update(frame.f_locals)
    return s % d

BIRD = 'parrot'

def examples(thing):
    bird = 'dead ' + BIRD
    print _('It used to be a %(BIRD)s')
    print _('But now it is a %(bird)s')
    print _('%(BIRD)s or %(bird)s?')
    print _('You are not %(morg)s, you are not %(imorg)s')
    print _('%(thing)s, %(thing)s, what is %(thing)s?')

examples(sys.argv[1])
-------------------- snip snip --------------------

% python /tmp/dollar.py brain
It used to be a parrot
But now it is a dead parrot
parrot or dead parrot?
You are not %(morg)s, you are not %(imorg)s
brain, brain, what is brain?

From paul@prescod.net Tue Jan 15 09:58:31 2002 From: paul@prescod.net (Paul Prescod) Date: Tue, 15 Jan 2002 01:58:31 -0800 Subject: PEP 215 (was Re: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict) References: <3C439693.7A2A3724@prescod.net> <15427.54378.963060.448829@anthem.wooz.org> Message-ID: <3C43FD47.664911C5@prescod.net> "Barry A. Warsaw" wrote: > >... > > Not to pick on Paul, but I'm having a hard time imagining how a newbie > Python user being taught this new feature in his second hour will > actually understand any of these rules. It's relatively simple. "You can do attribute access and function or method calls. You can wrap things in parens to do more complicated expressions."
I don't think anybody is convinced that Python desperately needs PEP AFAIK, it hasn't been touched since July 2000. How could a 10 year old language desperately need ANY syntactic sugar? If we survived until now without something then we could probably survive another few years. > I wonder if the same folks clamoring for it will be the same folks who > raise their hands next month when asked again if they think Python is > change too fast (naw, that won't happen :). Ummm. Who is clamoring for this feature? We were presented with a newer proposal to be compared with PEP 215. Some of us came to the conclusion that PEP 215 is better than the new proposal. Nobody has, AFAIK, proposed to complete or implement the PEP. > How many of you use Itpl regularly? If Python were so deficient in > this regard, I would expect to see a lot of hands. .... The hassle of an extra dependency is without a doubt greater than the hassle of working around Python in this regard. But then there are may features in today's Python that fell into that category originally. Like you could get a form of type/class unification from ExtensionClass. But who would bother to install ExtensionClass just for that? Anyhow, Mailman's code demonstrates that when the feature is provided at low cost (i.e. no dependency), people use it. > is really not much more onerous, and certainly less jarring to my eye > than all those $ signs. This from mister print >>? ;) Paul Prescod From mal@lemburg.com Tue Jan 15 10:34:04 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 15 Jan 2002 11:34:04 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> Message-ID: <3C44059C.CFC09899@lemburg.com> "Martin v. Loewis" wrote: > > > OK, PEP 277 is now available from: > > http://python.sourceforge.net/peps/pep-0277.html > > Looks very good to me, except that the listdir approach (unicode in, > unicode out) should apply uniformly to all platforms; I'll provide an > add-on patch to your implementation once the PEP is approved. +1 Some nits: The restriction when compiling Python in wide mode on Windows should be lifted: The PyUnicode_AsWideChar() API should be used to convert 4-byte Unicode to wchar_t (which is 2-byte on Windows). Why is "unicodefilenames" a function and not a constant ? I'm still in favour of a file API abstraction layer in Python, but that can be done at some later point (moving the code from the various platform specific modules into a Python/fileapi.c file). 
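To make the question concrete, with a run-time check calling code would presumably look something like the sketch below; the spelling os.unicodefilenames() is only the draft's, and the hasattr() guard is there because no released Python provides it:

import os

if hasattr(os, 'unicodefilenames') and os.unicodefilenames():
    names = os.listdir(u'.')    # unicode in, unicode out
else:
    names = os.listdir('.')     # byte strings; decode explicitly if needed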
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jack@oratrix.nl Tue Jan 15 13:20:15 2002 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 15 Jan 2002 14:20:15 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... In-Reply-To: Message by "Martin v. Loewis" , Mon, 14 Jan 2002 08:11:54 +0100 , <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> Message-ID: <20020115132015.8A0A6E8451@oratrix.oratrix.nl> > > OK, PEP 277 is now available from: > > http://python.sourceforge.net/peps/pep-0277.html > > Looks very good to me, except that the listdir approach (unicode in, > unicode out) should apply uniformly to all platforms; I'll provide an > add-on patch to your implementation once the PEP is approved. Yes, I would like this. On Mac OS X I don't have wide API's, but all calls use and return utf8 filenames. If listdir() could return Unicode I could convert the utf8 results to Unicode without setting sys.encoding. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From jack@oratrix.nl Tue Jan 15 15:29:26 2002 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 15 Jan 2002 16:29:26 +0100 Subject: [Python-Dev] Name clash with typedefs in object.h Message-ID: <20020115153050.5F587E8452@oratrix.oratrix.nl> Object.h declares various typedefs for routine pointers, and their names are not adorned with some sort of Py_ prefix. Suddenly this has started to be a problem for me on OSX (not sure why: either object.h changed or because I got a new version of the OSX devtools): object.h declares a typedef "destructor", and if that is in scope when is included this fails, which uses the name "destructor" as an argument name (for a routine pointer), and the parser gets confused. I think it's GCC that's to blame here, but still: shouldn't these names have some sort of a prefix? Alternatively I can apply a quick fix by defining "destructor" as something else just before including ... -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From guido@python.org Tue Jan 15 15:35:11 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 15 Jan 2002 10:35:11 -0500 Subject: [Python-Dev] Name clash with typedefs in object.h In-Reply-To: Your message of "Tue, 15 Jan 2002 16:29:26 +0100." <20020115153050.5F587E8452@oratrix.oratrix.nl> References: <20020115153050.5F587E8452@oratrix.oratrix.nl> Message-ID: <200201151535.KAA24828@cj20424-a.reston1.va.home.com> > Object.h declares various typedefs for routine pointers, and their names are > not adorned with some sort of Py_ prefix. > > Suddenly this has started to be a problem for me on OSX (not sure > why: either object.h changed or because I got a new version of the > OSX devtools): object.h declares a typedef "destructor", and if that > is in scope when is included this fails, which uses the > name "destructor" as an argument name (for a routine pointer), and > the parser gets confused. destructor is a very old typedef, so OSX must've changed. :-) > I think it's GCC that's to blame here, but still: shouldn't these > names have some sort of a prefix? Looking back, yes, definitely. They were overlooked by the "grand renaming" because they aren't visible to the loader. 
But hard to fix -- these typedefs are used in 3rd party extensions all over the place. > Alternatively I can apply a quick fix by defining "destructor" as > something else just before including ... That sounds like the right fix, but please do it inside a platform #ifdef. I believe typedef names are exported as gdb symbols, but CPP #defines are not. --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Tue Jan 15 17:43:56 2002 From: paul@prescod.net (Paul Prescod) Date: Tue, 15 Jan 2002 09:43:56 -0800 Subject: [Python-Dev] Utopian String Interpolation Message-ID: <3C446A5B.2E7A22CD@prescod.net> I think that if we're going to do string interpolation we might as go all of the way and have one unified string interpolation model. 1. There should be no string-prefix. Instead the string \$ should be magical in all non-raw literal strings as \x, \n etc. are. (if you want to do string interpolation on a raw string, you could do it using the method version below) >>> from __future__ import string_interp >>> a = "acos(.5) = \$(acos(.5))" Embrace the __future__! 2. There should be a transition period where literal strings containing "\$" are flagged. This is likely rare but may occur here and there. And by the way, unused \-sequences should probably be proactively reserved now instead of silently "failing" as they do today. What's the use of making "\" special if sometimes it isn't special? 3. I think that it would be clearest if any expression other than a simple variable name required "\$(parens.around.it())". But that's a minor decision. 4. Between the $-sign and the opening paren, it should be possible to put a C-style formatting specification. "pi = \$5.3f(math.pi)". There is no reason to force people to switch to a totally different language feature to get that functionality. I never use it myself but presume that scientists do! 5. The interpolation functionality is useful enough to be available for use on runtime-generated strings. But at runtime it should have a totally different syntax. Now that Python has string methods it is clear that "%" could (and IMO should) have been implemented that way: newstr = mystr.interp(variabledict, evaluate_expressions=0) By default evaluate_expressions is turned off. That means that all it does is look up variables in the dictionary and insert them into the string where it seems \$. If you want full interpretation behaviour you would flip the evaluate_expressions switch. May Guido have mercy on your soul. 6. People should be discouraged from using the "%" version. Some day far in the future it could be officially deprecated. We'll tell our children stories about the days when we modulo'd strings, tuples and dictionaries in weird and wonderful ways. Once the (admittedly long) transition period is over, we would simply have a better way to do everything we can do today. Code using the new model will be easier to read, more concise, more consistent, more like other scripting languages, abuse syntax less and use fewer logical concepts. Arguably, functions like vars(), locals() and globals() could be relegated to an "introspection" module where no newbie will ever look at them again. (okay, now I'm over-reaching) There will undoubtedly be language-change backlash. Guido will take the heat, not me. He would have to decide if it was worth the pain. I think, however, that the resulting language would be an improvement for experts and newbies alike. And as with other changes -- sooner is better than later. 
The year after next year is going to be the Year of Python so let's get our changes in before then! Paul Prescod From guido@python.org Tue Jan 15 20:04:52 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 15 Jan 2002 15:04:52 -0500 Subject: [Python-Dev] test_unicode_file.py Message-ID: <200201152004.PAA04738@cj20424-a.reston1.va.home.com> In the most recent CVS checkout on the trunk, test_unicode_file has started to fail. Traceback: Traceback (most recent call last): File "../Lib/test/test_unicode_file.py", line 61, in ? if base not in os.listdir(path): UnicodeError: ASCII decoding error: ordinal not in range(128) This is on Linux (Red Hat 6.2, still). --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@svensson.org Tue Jan 15 21:15:18 2002 From: paul@svensson.org (Paul Svensson) Date: Tue, 15 Jan 2002 16:15:18 -0500 (EST) Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: <3C446A5B.2E7A22CD@prescod.net> Message-ID: On Tue, 15 Jan 2002, Paul Prescod wrote: >I think that if we're going to do string interpolation we might as go >all of the way and have one unified string interpolation model. Nice pie in the sky; my comments inserted below. > 1. There should be no string-prefix. Instead the string \$ should be >magical in all non-raw literal strings as \x, \n etc. are. (if you want >to do string interpolation on a raw string, you could do it using the >method version below) +1 on no prefix, -0 on \$. To my eyes, \(whatever) looks much cleaner, tho I'm not sure how that would work with the evaluate_expressions flag in (5). > 2. There should be a transition period where literal strings containing >"\$" are flagged. This is likely rare but may occur here and there. And >by the way, unused \-sequences should probably be proactively reserved >now instead of silently "failing" as they do today. What's the use of >making "\" special if sometimes it isn't special? +1 on making undefined \-sequences raise SyntaxError. > 3. I think that it would be clearest if any expression other than a >simple variable name required "\$(parens.around.it())". But that's a >minor decision. +1 on parens, but see my comments to (1). > 4. Between the $-sign and the opening paren, it should be possible to >put a C-style formatting specification. > >"pi = \$5.3f(math.pi)". > >There is no reason to force people to switch to a totally different >language feature to get that functionality. I never use it myself but >presume that scientists do! Eek -- feeping creaturism. -2. The only reason to add this here is to be able to remove the % operator on strings, and I'm not convinced that is the right way to go. Anyways, this just begs to be spelled something like \%5.3f(math.pi). Printf-like format specifications without a %-character seems just weird. > 5. The interpolation functionality is useful enough to be available for >use on runtime-generated strings. But at runtime it should have a >totally different syntax. Now that Python has string methods it is clear >that "%" could (and IMO should) have been implemented that way: > >newstr = mystr.interp(variabledict, evaluate_expressions=0) > >By default evaluate_expressions is turned off. That means that all it >does is look up variables in the dictionary and insert them into the >string where it seems \$. If you want full interpretation behaviour you >would flip the evaluate_expressions switch. May Guido have mercy on your >soul. -0. Here I think is a good place to draw the line before the returns diminish too far. 
I see the major part of the usefulness of string interpolation coming from compile time usage, and that also nicely matches how all other \-sequences are handled. /Paul From martin@v.loewis.de Tue Jan 15 21:13:16 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 15 Jan 2002 22:13:16 +0100 Subject: [Python-Dev] test_unicode_file.py In-Reply-To: <200201152004.PAA04738@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Tue, 15 Jan 2002 15:04:52 -0500) References: <200201152004.PAA04738@cj20424-a.reston1.va.home.com> Message-ID: <200201152113.g0FLDGO02218@mira.informatik.hu-berlin.de> > In the most recent CVS checkout on the trunk, test_unicode_file has > started to fail. Traceback: > > Traceback (most recent call last): > File "../Lib/test/test_unicode_file.py", line 61, in ? > if base not in os.listdir(path): > UnicodeError: ASCII decoding error: ordinal not in range(128) Until PEP 277 is approved, the tests that Mark recently added is bogus: The return value of os.listdir is (currently) a list of byte strings, and you cannot (portably) compare those to a Unicode string if the byte strings contain non-ASCII characters. I'm surprised the test passed for Mark; he either has Neil's patches installed, or has set the default encoding to "mbcs" on his system. I recommend to apply the attached patch. Regards, Martin Index: test_unicode_file.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/test/test_unicode_file.py,v retrieving revision 1.3 diff -u -r1.3 test_unicode_file.py --- test_unicode_file.py 2002/01/07 02:11:43 1.3 +++ test_unicode_file.py 2002/01/15 21:06:24 @@ -55,11 +55,12 @@ print "File doesn't exist after creating it" path, base = os.path.split(os.path.abspath(TESTFN_ENCODED)) -if base not in os.listdir(path): - print "Filename did not appear in os.listdir()" -path, base = os.path.split(os.path.abspath(TESTFN_UNICODE)) -if base not in os.listdir(path): - print "Unicode filename did not appear in os.listdir()" +# Until PEP 277 is adopted, this test is not portable +# if base not in os.listdir(path): +# print "Filename did not appear in os.listdir()" +# path, base = os.path.split(os.path.abspath(TESTFN_UNICODE)) +# if base not in os.listdir(path): +# print "Unicode filename did not appear in os.listdir()" if os.path.abspath(TESTFN_ENCODED) != os.path.abspath(glob.glob(TESTFN_ENCODED)[0]): print "Filename did not appear in glob.glob()" From guido@python.org Tue Jan 15 21:21:04 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 15 Jan 2002 16:21:04 -0500 Subject: [Python-Dev] test_unicode_file.py In-Reply-To: Your message of "Tue, 15 Jan 2002 22:13:16 +0100." <200201152113.g0FLDGO02218@mira.informatik.hu-berlin.de> References: <200201152004.PAA04738@cj20424-a.reston1.va.home.com> <200201152113.g0FLDGO02218@mira.informatik.hu-berlin.de> Message-ID: <200201152121.QAA08431@cj20424-a.reston1.va.home.com> > Until PEP 277 is approved, the tests that Mark recently added is > bogus: The return value of os.listdir is (currently) a list of byte > strings, and you cannot (portably) compare those to a Unicode string > if the byte strings contain non-ASCII characters. > > I'm surprised the test passed for Mark; he either has Neil's patches > installed, or has set the default encoding to "mbcs" on his system. > > I recommend to apply the attached patch. Thanks. Done. 
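A minimal illustration of the failure Martin describes, under Python 2.2 semantics; the filename below is made up, the point is only what happens when a Unicode string is compared against the byte strings that os.listdir() currently returns:

    # What os.listdir() returns today is a list of byte strings:
    names = ['caf\xe9.txt']
    # Comparing a Unicode name against them coerces each byte string with
    # the ASCII default encoding, so any non-ASCII entry raises
    # "UnicodeError: ASCII decoding error: ordinal not in range(128)",
    # which is exactly the traceback Guido reported:
    u'caf\xe9.txt' in names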
--Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Tue Jan 15 21:24:31 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 15 Jan 2002 22:24:31 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... In-Reply-To: <3C44059C.CFC09899@lemburg.com> (mal@lemburg.com) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> Message-ID: <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> > The restriction when compiling Python in wide mode on Windows > should be lifted: The PyUnicode_AsWideChar() API should be used > to convert 4-byte Unicode to wchar_t (which is 2-byte on Windows). While I agree that this restriction ought to be removed eventually, I doubt that Python will be usable on Windows with a four-byte Unicode type in any foreseeable future. Just have a look at unicodeobject.c:PyUnicode_DecodeMBCS; it makes the assumption that a Py_UNICODE* is the same thing as a WCHAR*. That means that the "mbcs" encoding goes away on Windows if HAVE_USABLE_WCHAR_T does not hold anymore. Also, I believe most of PythonWin also assumes HAVE_USABLE_WCHAR_T (didn't check, though). > Why is "unicodefilenames" a function and not a constant ? In the Windows binary, you need a run-time check to see whether this is DOS/W9x, or NT/W2k/XP; on DOS, the Unicode API is not available (you still can pass Unicode file names to open and listdir, but they will get converted through the MBCS encoding). So it clearly is not a compile time constant. I'm still not certain what the meaning of this function is, if it means "Unicode file names are only restricted by the file system conventions", then on Unix, it may change at run-time, if a user or the application sets an UTF-8 locale, switching from the original "C" locale. Regards, Martin From nhodgson@bigpond.net.au Tue Jan 15 22:09:44 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Wed, 16 Jan 2002 09:09:44 +1100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... 
References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> Message-ID: <01e701c19e11$567f71f0$0acc8490@neil> Martin v. Loewis: > > OK, PEP 277 is now available from: > > http://python.sourceforge.net/peps/pep-0277.html > > Looks very good to me, except that the listdir approach (unicode in, > unicode out) should apply uniformly to all platforms; I'll provide an > add-on patch to your implementation once the PEP is approved. Won't this lead to a less useful result as Py_FileSystemDefaultEncoding will be NULL on, for example, Linux, so if there are names containing non-ASCII characters then it will either raise an exception or stick '?'s in the names. So it would be better to use narrow strings there as that will pass through all file names. You have probably already realised, but Windows 9x will also need a Unicode preserving listdir but it will have to encode using mbcs. Neil From guido@python.org Tue Jan 15 22:21:03 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 15 Jan 2002 17:21:03 -0500 Subject: [Python-Dev] Starting 2.1.2 final release Message-ID: <200201152221.RAA09378@cj20424-a.reston1.va.home.com> We're going to cut a 2.1.2 final release tonight. Anthony had to bow out for personal reasons, so it's the PythonLabs crew who are doing the actual release for him. In honor of Anthony's timezone (and because we're all night owls here :-), the official release date will be January 16. Please no more checkins to the release21-maint branch, except from PythonLabbers! --Guido van Rossum (home page: http://www.python.org/~guido/) From Anthony Baxter Tue Jan 15 22:39:14 2002 From: Anthony Baxter (Anthony Baxter) Date: Wed, 16 Jan 2002 09:39:14 +1100 Subject: [Python-Dev] Re: Starting 2.1.2 final release In-Reply-To: Message from Guido van Rossum of "Tue, 15 Jan 2002 17:21:03 CDT." <200201152221.RAA09378@cj20424-a.reston1.va.home.com> Message-ID: <200201152239.g0FMdEP07360@mbuna.arbhome.com.au> >>> Guido van Rossum wrote > We're going to cut a 2.1.2 final release tonight. Anthony had to bow > out for personal reasons, so it's the PythonLabs crew who are doing > the actual release for him. In honor of Anthony's timezone (and > because we're all night owls here :-), the official release date will > be January 16. Thanks for doing this - my most sincere apologies for the last minute drop-out on this - I need to find a new place to live before I head over for the python conference. Anthony From paul@prescod.net Tue Jan 15 23:12:21 2002 From: paul@prescod.net (Paul Prescod) Date: Tue, 15 Jan 2002 15:12:21 -0800 Subject: [Python-Dev] Utopian String Interpolation References: Message-ID: <3C44B755.6F2251E5@prescod.net> Paul Svensson wrote: > >.... 
> > +1 on no prefix, -0 on \$. > To my eyes, \(whatever) looks much cleaner, tho I'm not sure how > that would work with the evaluate_expressions flag in (5). An offline correspondent suggested that and also suggested perhaps \`. \` is nicely reminiscent of `abc` and it does basically the same thing, only in strings, so I kind of like it. >>> `5+3` '8' >>> "\`5 + 3` is enough" 8 is enough The downside is that larger characters like $ and % are much more clear to my eyes. Plus there is the whole apos-backtick confusion. The problem with \( is that that is likely to already be a popular string in regular expressions. >... > > 4. Between the $-sign and the opening paren, it should be possible to > >put a C-style formatting specification. > > > >"pi = \$5.3f(math.pi)". > > > >There is no reason to force people to switch to a totally different > >language feature to get that functionality. I never use it myself but > >presume that scientists do! > > Eek -- feeping creaturism. -2. The feature is already there and sometimes used. We either keep two different ways to spell interpolation or we incorporate it. > The only reason to add this here is to be able to remove the % operator > on strings, and I'm not convinced that is the right way to go. > Anyways, this just begs to be spelled something like \%5.3f(math.pi). > Printf-like format specifications without a %-character seems just weird. The offline correspondent also had this idea and I'm coming around to it. >... > -0. Here I think is a good place to draw the line before the returns > diminish too far. I see the major part of the usefulness of string > interpolation coming from compile time usage, and that also nicely matches > how all other \-sequences are handled. And what to do about templating at runtime? Modulo? string.replace? Or just don't provide that feature? Also, how to handle interpolation in raw strings? Paul Prescod
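For concreteness, a rough sketch of the kind of runtime helper being debated here; the name interp, the \$(...) marker and the evaluate switch are all invented for illustration and are not part of any actual proposal or library:

    import re

    _marker = re.compile(r'\\\$\(([^)]*)\)')    # matches \$(name) or \$(expression)

    def interp(template, namespace, evaluate=0):
        def _replace(match, namespace=namespace, evaluate=evaluate):
            expr = match.group(1)
            if evaluate:
                return str(eval(expr, {}, namespace))  # full expression mode
            return str(namespace[expr])                # plain dictionary lookup
        return _marker.sub(_replace, template)

    # interp(r"pi = \$(pi)", {'pi': 3.14159})  ->  'pi = 3.14159'

Supporting the \%5.3f(math.pi) spelling discussed above would just mean capturing an optional format specification in the same regular expression and applying it with the existing "%" machinery.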
From paul@svensson.org Wed Jan 16 00:37:56 2002 From: paul@svensson.org (Paul Svensson) Date: Tue, 15 Jan 2002 19:37:56 -0500 (EST) Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: <3C44B755.6F2251E5@prescod.net> Message-ID: On Tue, 15 Jan 2002, Paul Prescod wrote: >Paul Svensson wrote: >> >>.... >> >> +1 on no prefix, -0 on \$. >> To my eyes, \(whatever) looks much cleaner, tho I'm not sure how >> that would work with the evaluate_expressions flag in (5). > >An offline correspond suggested that and also suggested perhaps \`. \` >is nicely reminicent of `abc` and it does basically the same thing, only >in strings, so I kind of like it. > >>>> `5+3` >'8' >>>> "\`5 + 3` is enough" >8 is enough > >The downside is that larger characters like $ and % are much more clear >to my eyes. Plus there is the whole apos-backtick confusion. I thought of \` as well, but didn't suggest it, mainly for those reasons. >The problem with \( is that that is likely to already be a popular >string in regular expressions. In which case it should either be a raw string, or spelled \\(. (We _really_ need to issue syntax errors on undefined \-sequences) >>... >> > 4. Between the $-sign and the opening paren, it should be possible to >> >put a C-style formatting specification. >> > >> >"pi = \$5.3f(math.pi)". >> > >> >There is no reason to force people to switch to a totally different >> >language feature to get that functionality. I never use it myself but >> >presume that scientists do! >> >> Eek -- feeping creaturism. -2. > >The feature is already there and sometimes used. We either keep two >different ways to spell interpolation or we incorporate it. I don't think interpolation and variable formatting are similar enough to conflate in a single notation -- wasn't it the ungainliness of using the existing variable formatting to interpolate that started this thread ? >> The only reason to add this here is to be able to remove the % operator >> on strings, and I'm not convinced that is the right way to go. >> Anyways, this just begs to be spelled something like \%5.3f(math.pi). >> Printf-like format specifications without a %-character seems just weird. > >The offline correspondant also had this idea and I'm coming around to >it. I'm not particularly happy with that idea; simply mimicking the syntax it was supposed to replace, for little gain. I also think there could be some cause for confusion between \%(foo)s looking in vars() and %(foo)s using the other side of the % operator. >>... >> -0. Here I think is a good place to draw the line before the returns >> diminish too far. I see the major part of the usefulness of string >> interpolation coming from compile time usage, and that also nicely matches >> how all other \-sequences are handled. > >And do what to do templating at runtime? Modulo? string.replace? Or just >don't provide that feature? Also, how to handle interpolation in raw >strings? Since the whole point of raw strings is to _not_ touch what's inside the quotes, I don't see how string interpolation makes much sense there. As for runtime templating, a string method to replace \-sequences seems like a very straightforward idea, that shouldn't need much discussion. Call it "".eval([globals, [locals]]), to get some educational synergy from teaching all the newbies not to give unchecked user input to eval(). I still think compile-time templating would be the more common use, and thus should be the driving issue behind the design. 
/Paul From nhodgson@bigpond.net.au Wed Jan 16 01:08:34 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Wed, 16 Jan 2002 12:08:34 +1100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> Message-ID: <057101c19e2a$5217efc0$0acc8490@neil> Martin v. Loewis: > I'm still not certain what the meaning of this function is, if it > means "Unicode file names are only restricted by the file system > conventions", then on Unix, it may change at run-time, if a user or > the application sets an UTF-8 locale, switching from the original "C" > locale. The underlying motivation of the function is for code to be able to ask "Is it better to pass Unicode strings to file operations"? For me the main criterion for "better" is whether all files are accessible. It is best to determine this through a test that does not require writing or that is dependent on the user's setup, such as having a "C:" drive. Switching to a UTF-8 locale on Unix will make files inaccessible where their names contain illegal UTF-8 sequences. Neil From jason@jorendorff.com Wed Jan 16 02:53:08 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Tue, 15 Jan 2002 20:53:08 -0600 Subject: [Python-Dev] PEP_215_ (string interpolation) alternative EvalDict In-Reply-To: Message-ID: > But your example suggests to me: > > >>> input('?: ') > ?: $'$os.system("rm -rm *" )' > > I guess you need to special case that out of the compiler also. > ( Are there any others lurking about ? ) The user could just as well type ?: os.system("rm -rf *") and save some keystrokes. input() is totally insecure. Always has been. Nothing new here. ## Jason Orendorff http://www.jorendorff.com/ From guido@python.org Wed Jan 16 03:05:49 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 15 Jan 2002 22:05:49 -0500 Subject: [Python-Dev] RELEASED - Python 2.1.2 (final) Message-ID: <200201160305.WAA21292@cj20424-a.reston1.va.home.com> I've released the final version of Python 2.1.2 - a bugfix release for Python 2.1. I recommend everyone who is using Python 2.1 or 2.1.1 to upgrade to 2.1.2 -- this release fixes a few crashes. Read about it and download it here: http://www.python.org/2.1.2/ My special thanks go out to Anthony Baxter, the relentless 2.1.2 releasemeister (and for the use of his timezone so I can call this a January 16 release without having to stay up until after midnight :-). 
--Guido van Rossum (home page: http://www.python.org/~guido/) From rsc@plan9.bell-labs.com Wed Jan 16 03:26:51 2002 From: rsc@plan9.bell-labs.com (Russ Cox) Date: Tue, 15 Jan 2002 22:26:51 -0500 Subject: [Python-Dev] thread_foobar.h routines Message-ID: I'm writing thread routines for the Plan 9 port of Python. Is it correct that: PyThread_acquire_lock returns 1 on success, 0 on failure. PyThread_down_sema returns 0 on success, -1 on failure. It appears that way, but the inconsistency bothers me. Thanks. Russ From guido@python.org Wed Jan 16 03:33:08 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 15 Jan 2002 22:33:08 -0500 Subject: [Python-Dev] thread_foobar.h routines In-Reply-To: Your message of "Tue, 15 Jan 2002 22:26:51 EST." References: Message-ID: <200201160333.WAA22447@cj20424-a.reston1.va.home.com> > I'm writing thread routines for the Plan 9 port of Python. > > Is it correct that: > > PyThread_acquire_lock returns 1 on success, 0 on failure. > PyThread_down_sema returns 0 on success, -1 on failure. > > It appears that way, but the inconsistency bothers me. Me too. The PyThread_*_sema routines are not used, and I would recommend that you not bother implementing them at all. (If anyone used them, we would have heard a complaint -- in some thread implementations these return -1 for failure, in others 0. :-) We should cut these out of the sources. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Wed Jan 16 04:00:38 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 15 Jan 2002 23:00:38 -0500 (EST) Subject: [Python-Dev] thread_foobar.h routines In-Reply-To: <200201160333.WAA22447@cj20424-a.reston1.va.home.com> References: <200201160333.WAA22447@cj20424-a.reston1.va.home.com> Message-ID: <15428.64230.707183.14133@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > Me too. The PyThread_*_sema routines are not used, and I would > recommend that you not bother implementing them at all. (If anyone > used them, we would have heard a complaint -- in some thread > implementations these return -1 for failure, in others 0. :-) > > We should cut these out of the sources. I'll be glad to do this. A quick grep seems to show that this really does apply to *all* PyThread_*_sema() routines. If there are no objections, I'll have this done quickly. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From jepler@inetnebr.com Wed Jan 16 04:21:16 2002 From: jepler@inetnebr.com (jepler@inetnebr.com) Date: Tue, 15 Jan 2002 22:21:16 -0600 Subject: PEP 215 (was Re: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict) In-Reply-To: <15427.54378.963060.448829@anthem.wooz.org> References: <3C439693.7A2A3724@prescod.net> <15427.54378.963060.448829@anthem.wooz.org> Message-ID: <20020115222113.A987@unpythonic.dhs.org> On Tue, Jan 15, 2002 at 02:04:10AM -0500, Barry A. Warsaw wrote: > [1] I use _() ostensibly to mark translatable strings, but it has a > side benefit in that it interpolates into the string named variables > from the locals and globals of the calling context. It does this by > using sys._getframe(1) in Python 2.1 and try/except hackery in older > versions of Python. I find it quite handy, and admittedly magical, > but then I'm not suggesting it become a standard Python feature. :) This caught my eye. How will programs that use PEP215 for string interpolation be translatable? 
All translation systems use some method of identifying the strings in source code, then permitting mapping from the string identifiers to the real strings at runtime. With "gettext", the "string identifier" is typically the original-language string, and the marker/mapper is spelled _("string literal"). Given that short introduction, it's obvious how _("hi there, %s") % yourname works, and why _("hi there, %s" % yourname) doesn't work, but how will I use a similar scheme to translate $"hi there, $yourname" ? Obviously, _($"hi there, $yourname") won't work, because it's equivalent to the second, non-working translation example. Well, I guess we could add _ and $_ strings to Python, right? grumble-grumble'ly yours, Jeff From tim.one@home.com Wed Jan 16 05:00:44 2002 From: tim.one@home.com (Tim Peters) Date: Wed, 16 Jan 2002 00:00:44 -0500 Subject: [Python-Dev] thread_foobar.h routines In-Reply-To: <15428.64230.707183.14133@cj42289-a.reston1.va.home.com> Message-ID: [Fred L. Drake, Jr., on removing the unused PyThread_*_sema routines] > If there are no objections, I'll have this done quickly. +1, and you patch looks fine from a skim (and I'd rather fix it retroactively if necessary than bother to apply it first -- live a little, check it in, we're still pre-alpha-1 for 2.3 ). From fdrake@acm.org Wed Jan 16 05:17:36 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 16 Jan 2002 00:17:36 -0500 (EST) Subject: [Python-Dev] thread_foobar.h routines In-Reply-To: References: <15428.64230.707183.14133@cj42289-a.reston1.va.home.com> Message-ID: <15429.3312.503420.50275@cj42289-a.reston1.va.home.com> Tim Peters writes: > +1, and you patch looks fine from a skim (and I'd rather fix it > retroactively if necessary than bother to apply it first -- live a little, > check it in, we're still pre-alpha-1 for 2.3 ). Guido asked me to wait a day in case any legitimate reasons to keep those routines popped up from python-dev, otherwise it would be in already! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake@acm.org Wed Jan 16 05:22:18 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 16 Jan 2002 00:22:18 -0500 (EST) Subject: [Python-Dev] Intel C/C++ compiler evaluation version Message-ID: <15429.3594.897580.664110@cj42289-a.reston1.va.home.com> Has anyone tried the evaluation version of the Intel C/C++ compiler for Linux 32-bit platforms? They distributed a CD in the most recent version of Linux Magazine, and it appears to be available for download as well. I had trouble getting it going; the evaluation license file they sent me didn't work out of the box with the license manager that got installed. If anyone has gotten it to work, please send instructions around! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mhammond@skippinet.com.au Wed Jan 16 05:40:03 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 16 Jan 2002 16:40:03 +1100 Subject: [Python-Dev] guidance sought: merging port related changes to Library modules In-Reply-To: <15426.60010.318856.560347@cj42289-a.reston1.va.home.com> Message-ID: Fred writes: > Guido van Rossum writes: > > The various modules ntpath, posixpath, macpath etc. are not just their > > to support their own platform on itself. They are also there to > > Note that ntpath.abspath() relies on nt._getfullpathname(). It is not > unreasonable for this particular function to require that it actually > be running on NT, so I'm not going to suggest changing this. 
On the > other hand, it means the portable portions of the module are (mostly) > not tested when the regression test is run on a platform other than > Windows; the ntpath.abspath() test raises an ImportError since > ntpath.abspath() imports the "nt" module within the function, and the > resulting ImportError causes the rest of the unit test to be skipped > and regrtest.py reports that the test is skipped. > > I'd like to change the test so that the abspath() test is only run > if the "nt" module is available: Sigh - this too would be my fault :( Before _getfullpathname() was added to the 'nt' module, there was an attempt to import 'win32api', and if OK, use the equivilent function from that. When I added the new function to 'nt', I removed that import check, in the belief it would now always succeed. This was obviously a bad call ;) (FYI, that was rev 1.35 of ntpath.py) A patch that reinstates the code would be: Index: ntpath.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/ntpath.py,v retrieving revision 1.44 diff -u -r1.44 ntpath.py --- ntpath.py 2001/11/05 21:25:02 1.44 +++ ntpath.py 2002/01/16 05:35:19 @@ -457,8 +457,18 @@ # Return an absolute path. def abspath(path): """Return the absolute version of a path""" - if path: # Empty path must return current working directory. + try: from nt import _getfullpathname + except ImportError: # Not running on Windows - mock up something sensible. + global abspath + def _abspath(path): + if not isabs(path): + path = join(os.getcwd(), path) + return normpath(path) + abspath = _abspath + return _abspath(path) + + if path: # Empty path must return current working directory. try: path = _getfullpathname(path) except WindowsError: This should also solve the test case problem. Thoughts? Mark. From fdrake@acm.org Wed Jan 16 05:53:46 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 16 Jan 2002 00:53:46 -0500 (EST) Subject: [Python-Dev] guidance sought: merging port related changes to Library modules In-Reply-To: References: <15426.60010.318856.560347@cj42289-a.reston1.va.home.com> Message-ID: <15429.5482.746441.920073@cj42289-a.reston1.va.home.com> Mark Hammond writes: > Before _getfullpathname() was added to the 'nt' module, there was an attempt > to import 'win32api', and if OK, use the equivilent function from that. > When I added the new function to 'nt', I removed that import check, in the > belief it would now always succeed. This was obviously a bad call ;) (FYI, > that was rev 1.35 of ntpath.py) > > A patch that reinstates the code would be: ... > This should also solve the test case problem. I haven't tested this, but it looks OK to me. Feel free to check it in. Thanks! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From martin@v.loewis.de Wed Jan 16 07:08:33 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 16 Jan 2002 08:08:33 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... 
In-Reply-To: <01e701c19e11$567f71f0$0acc8490@neil> (nhodgson@bigpond.net.au) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <01e701c19e11$567f71f0$0acc8490@neil> Message-ID: <200201160708.g0G78Xr01736@mira.informatik.hu-berlin.de> > Won't this lead to a less useful result as Py_FileSystemDefaultEncoding > will be NULL on, for example, Linux, so if there are names containing > non-ASCII characters then it will either raise an exception or stick '?'s in > the names. So it would be better to use narrow strings there as that will > pass through all file names. On Linux, if the user has set LANG to a reasonable value, and the Python application has invoked setlocale(), Py_FileSystemDefaultEncoding will not be NULL. It still might happen that an individual file name cannot be decoded from the file system encoding, e.g. if the locale is set to UTF-8, but you have a Latin-1 file name (created by a different user). In that exceptional case, I would neither expect an exception, nor expect replacement characters in the Unicode string, but instead use a byte string *for this specific file name*. Just because there is there is the rare chance that you cannot meaningfully interpret a certain file name does not mean that all other installation have to suffer. > You have probably already realised, but Windows 9x will also need a > Unicode preserving listdir but it will have to encode using mbcs. Exactly. Unfortunately, we cannot do anything to avoid replacement characters here, since it is already Windows who will introduce them. In turn, we know that decoding from "mbcs" will always succeed. Regards, Martin From martin@v.loewis.de Wed Jan 16 07:34:12 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 16 Jan 2002 08:34:12 +0100 Subject: [Python-Dev] Intel C/C++ compiler evaluation version In-Reply-To: <15429.3594.897580.664110@cj42289-a.reston1.va.home.com> (fdrake@acm.org) References: <15429.3594.897580.664110@cj42289-a.reston1.va.home.com> Message-ID: <200201160734.g0G7YCD01815@mira.informatik.hu-berlin.de> > Has anyone tried the evaluation version of the Intel C/C++ compiler > for Linux 32-bit platforms? They distributed a CD in the most recent > version of Linux Magazine, and it appears to be available for download > as well. > > I had trouble getting it going; the evaluation license file they sent > me didn't work out of the box with the license manager that got > installed. If anyone has gotten it to work, please send instructions > around! We had no problems installing it. The compiler goes into /opt/intel/compiler50/ia32/*, the license into /opt/intel/license/l_cpp.lic. 
On the Debian system with a alien RPM installation, the RPM postinstall scripts did not execute properly, so we adjusted the configuration files ourselves (in particular, the postinstall script would have created a broken .csh file, anyway). Looking at the iccvars.csh script, make sure the following settings are correct: setenv IA32ROOT /opt/intel/compiler50/ia32 setenv INTEL_FLEXLM_LICENSE /opt/intel/licenses (iccvars.sh accordingly). I don't think we run flexlm; sourcing the appropriate settings is enough. HTH, Martin From nhodgson@bigpond.net.au Wed Jan 16 11:38:54 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Wed, 16 Jan 2002 22:38:54 +1100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> Message-ID: <018101c19e82$60c42950$0acc8490@neil> M.-A. Lemburg: > The restriction when compiling Python in wide mode on Windows > should be lifted: The PyUnicode_AsWideChar() API should be used > to convert 4-byte Unicode to wchar_t (which is 2-byte on Windows). I'd prefer not to include this as it adds complexity for little benefit but am prepared to do the implementation if it is required. Neil From barry@zope.com Wed Jan 16 13:17:54 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 16 Jan 2002 08:17:54 -0500 Subject: PEP 215 (was Re: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict) References: <3C439693.7A2A3724@prescod.net> <15427.54378.963060.448829@anthem.wooz.org> <20020115222113.A987@unpythonic.dhs.org> Message-ID: <15429.32130.692161.521257@anthem.wooz.org> >>>>> "jepler" == writes: jepler> Well, I guess we could add _ and $_ strings to Python, jepler> right? Ug. t'' strings have been discussed before w.r.t. i18n markup, but I don't like it. I think it's a mistake to proliferate string prefixes. But search the i18n-sig for more discussion on the topic. -Barry From skip@pobox.com Wed Jan 16 14:22:41 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 16 Jan 2002 08:22:41 -0600 Subject: [Python-Dev] deprecate input()? Message-ID: <15429.36017.387707.78193@12-248-41-177.client.attbi.com> I just responded to a question on c.l.py a user had about feeding empty strings to input(). While he didn't say why he called input(), I suspect he thought the semantics were more like raw_input(). In these days of widespread Internet nastiness, shouldn't input() be deprecated? -- Skip Montanaro (skip@pobox.com - http://www.mojam.com/) From mal@lemburg.com Wed Jan 16 18:22:37 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Wed, 16 Jan 2002 19:22:37 +0100 Subject: [Python-Dev] Utopian String Interpolation References: <3C446A5B.2E7A22CD@prescod.net> Message-ID: <3C45C4ED.806E64F3@lemburg.com> Paul Prescod wrote: > > I think that if we're going to do string interpolation we might as go > all of the way and have one unified string interpolation model. > > 1. There should be no string-prefix. Instead the string \$ should be > magical in all non-raw literal strings as \x, \n etc. are. (if you want > to do string interpolation on a raw string, you could do it using the > method version below) > > >>> from __future__ import string_interp > > >>> a = "acos(.5) = \$(acos(.5))" > > Embrace the __future__! -1. Too dangerous. If string interpolation makes it into the core, then please use a *new* construct. '\$' is currently interpreted as '\$' and this should not be changed (heck, just think what would happen to all the shell script snippets encoded in Python strings). BTW, why don't you wrap all this interpolation stuff into a module and then call a function to have it apply all the magic you want. If I remember correctly, someone else has already written such a module for Python. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jan 16 18:48:49 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 16 Jan 2002 19:48:49 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> Message-ID: <3C45CB11.ACB2CEE6@lemburg.com> "Martin v. Loewis" wrote: > > > The restriction when compiling Python in wide mode on Windows > > should be lifted: The PyUnicode_AsWideChar() API should be used > > to convert 4-byte Unicode to wchar_t (which is 2-byte on Windows). > > While I agree that this restriction ought to be removed eventually, I > doubt that Python will be usable on Windows with a four-byte Unicode > type in any foreseeable future. Perhaps Neil ought to copy your notes to the PEP, so that we don't forget about this issue. > Just have a look at unicodeobject.c:PyUnicode_DecodeMBCS; it makes the > assumption that a Py_UNICODE* is the same thing as a WCHAR*. That > means that the "mbcs" encoding goes away on Windows if > HAVE_USABLE_WCHAR_T does not hold anymore. > > Also, I believe most of PythonWin also assumes HAVE_USABLE_WCHAR_T > (didn't check, though). 
> > > Why is "unicodefilenames" a function and not a constant ? > > In the Windows binary, you need a run-time check to see whether this > is DOS/W9x, or NT/W2k/XP; on DOS, the Unicode API is not available > (you still can pass Unicode file names to open and listdir, but they > will get converted through the MBCS encoding). So it clearly is not a > compile time constant. I see. > I'm still not certain what the meaning of this function is, if it > means "Unicode file names are only restricted by the file system > conventions", then on Unix, it may change at run-time, if a user or > the application sets an UTF-8 locale, switching from the original "C" > locale. Doesn't it mean: "posix functions and file() can accept Unicode file names" ? That's what I thought, at least; whether they succeed or not is another question and could well be handled by run-time errors (e.g. on Unix it is not at all clear whether NFS, Samba or some other more exotic file system can handle the encoding chosen by Python or the program). Perhaps we ought to drop that function altogether and let the various file IO functions raise run-time errors instead ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jan 16 18:54:00 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 16 Jan 2002 19:54:00 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <018101c19e82$60c42950$0acc8490@neil> Message-ID: <3C45CC48.A71EEA8A@lemburg.com> Neil Hodgson wrote: > > M.-A. Lemburg: > > > The restriction when compiling Python in wide mode on Windows > > should be lifted: The PyUnicode_AsWideChar() API should be used > > to convert 4-byte Unicode to wchar_t (which is 2-byte on Windows). > > I'd prefer not to include this as it adds complexity for little benefit > but am prepared to do the implementation if it is required. Point taken, but please mention this in the PEP. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Wed Jan 16 19:09:24 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Wed, 16 Jan 2002 20:09:24 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... 
In-Reply-To: <3C45CB11.ACB2CEE6@lemburg.com> (mal@lemburg.com) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <006e01c1949c$7631d1b0$0acc8490@neil> <200201032209.g03M9vn01498@mira.informatik.hu-berlin.de> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> <3C45CB11.ACB2CEE6@lemburg.com> Message-ID: <200201161909.g0GJ9OK01822@mira.informatik.hu-berlin.de> > > I'm still not certain what the meaning of this function is, if it > > means "Unicode file names are only restricted by the file system > > conventions", then on Unix, it may change at run-time, if a user or > > the application sets an UTF-8 locale, switching from the original "C" > > locale. > > Doesn't it mean: "posix functions and file() can accept Unicode file > names" ? Neil has given his own interpretation (return true if it is *better* to pass Unicode strings than to pass byte strings). You property (accepts Unicode) is true on all Python installations since 2.2: if you pass a Unicode string, it will try the file system encoding; if that is NULL, it will try the system encoding. So on all Python systems, open(u"foo.txt","w") currently succeeds everywhere (unless Unicode was completely disabled in the port). > That's what I thought, at least; whether they succeed or not > is another question and could well be handled by run-time errors > (e.g. on Unix it is not at all clear whether NFS, Samba or some > other more exotic file system can handle the encoding chosen by > Python or the program). For NFS, it is clear - file names are null-terminated byte strings (AFAIK). For Samba, I believe it depends on the installation, specifically whether the encoding of Samba matches the one of the user. For more exotic file systems, it is not all that clear. > Perhaps we ought to drop that function altogether and let the > various file IO functions raise run-time errors instead ?! That was my suggestion as well. However, Neil points out that, on Windows, passing Unicode is sometimes better: For some files, there is no byte string file name to identify the file (if the file name is not representable in MBCS). OTOH, on Unix, some files cannot be accessed with a Unicode string, if the file name is invalid in the user's encoding. It turns out that only OS X really got it right: For each file, there is both a byte string name, and a Unicode name. Regards, Martin From paul@prescod.net Wed Jan 16 19:43:49 2002 From: paul@prescod.net (Paul Prescod) Date: Wed, 16 Jan 2002 11:43:49 -0800 Subject: [Python-Dev] Utopian String Interpolation References: <3C446A5B.2E7A22CD@prescod.net> <3C45C4ED.806E64F3@lemburg.com> Message-ID: <3C45D7F5.BF260898@prescod.net> "M.-A. Lemburg" wrote: > >... > > Embrace the __future__! > > -1. > > Too dangerous. It isn't dangerous. 
That's precisely what __future__ is for! It is no more dangerous than any other feature that uses __future__. > ... If string interpolation makes it into the core, > then please use a *new* construct. '\$' is currently interpreted > as '\$' and this should not be changed (heck, just think what would > happen to all the shell script snippets encoded in Python strings). No, this should be changed. Completely ignoring string interpolation, I am strongly in favour of changing the behaviour of the literal string parser so that unknown \-combinations raise a SyntaxError. If you don't want a backslash to be interpreted as an escape sequence start, you should use a raw string. The Python documentation and grammar already says: escapeseq ::= "\" The documentation says: "Unlike Standard , all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the string. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.)" That's a weird thing to say. What could be more helpful for debugging than a good old SyntaxError??? > BTW, why don't you wrap all this interpolation stuff into > a module and then call a function to have it apply all the > magic you want. We've been through that in this discussion already. In fact, that's how the discussion started. Paul Prescod From paul@svensson.org Wed Jan 16 20:26:23 2002 From: paul@svensson.org (Paul Svensson) Date: Wed, 16 Jan 2002 15:26:23 -0500 (EST) Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: <3C45D7F5.BF260898@prescod.net> Message-ID: On Wed, 16 Jan 2002, Paul Prescod wrote: >The documentation says: > >"Unlike Standard , all unrecognized escape sequences are left in the >string unchanged, i.e., the backslash is left in the string. (This >behavior is useful when debugging: if an escape sequence is mistyped, >the resulting output is more easily recognized as broken.)" > >That's a weird thing to say. What could be more helpful for debugging >than a good old SyntaxError??? The usefulness is relative; it's arguably easier to find the problem and fix it if the \ remains in the string than if it's simply removed (as C does, tho most compilers issue a warning). It could also be argued that you get more nutritinal value by eating only the black raisins from the cake then by eating just the golden raisins... /Paul From paul@prescod.net Wed Jan 16 20:40:39 2002 From: paul@prescod.net (Paul Prescod) Date: Wed, 16 Jan 2002 12:40:39 -0800 Subject: [Python-Dev] Utopian String Interpolation References: Message-ID: <3C45E547.8B8FDEFF@prescod.net> Paul Svensson wrote: > >... > > The usefulness is relative; it's arguably easier to find the > problem and fix it if the \ remains in the string than if it's > simply removed (as C does, tho most compilers issue a warning). Yeah, I understood that. I just don't understand why it isn't like most other things in Python. Python tends to be strict about things that are likely mistakes, rather than helping you "debug them" after passing them through silently. Paul Prescod From guido@python.org Wed Jan 16 21:31:02 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 16 Jan 2002 16:31:02 -0500 Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: Your message of "Wed, 16 Jan 2002 12:40:39 PST." <3C45E547.8B8FDEFF@prescod.net> References: <3C45E547.8B8FDEFF@prescod.net> Message-ID: <200201162131.QAA26108@cj20424-a.reston1.va.home.com> > Yeah, I understood that. 
I just don't understand why it isn't like most > other things in Python. Python tends to be strict about things that are > likely mistakes, rather than helping you "debug them" after passing them > through silently. > > Paul Prescod The "why" is that long ago Python didn't have raw strings but it did have regular expressions. I thought it would be painful to have to double all backslashes used for the regex syntax. It would be hard to change this policy now. --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Wed Jan 16 21:59:17 2002 From: paul@prescod.net (Paul Prescod) Date: Wed, 16 Jan 2002 13:59:17 -0800 Subject: [Python-Dev] Utopian String Interpolation References: <3C45E547.8B8FDEFF@prescod.net> <200201162131.QAA26108@cj20424-a.reston1.va.home.com> Message-ID: <3C45F7B5.7603F4D1@prescod.net> Guido van Rossum wrote: > >... > > The "why" is that long ago Python didn't have raw strings but it did > have regular expressions. I thought it would be painful to have to > double all backslashes used for the regex syntax. Aha. > It would be hard to change this policy now. How about an optional warning which, after a year or so, would be turned on by default, and then a year or so after that would be an error? This same issue may effect some eventual merging of literal strings and Unicode literals because \N, \u etc. are treated differently in strings than in Unicode literals. And even if literal strings and Unicode strings are never merged, \N could be useful in ordinary strings. Paul Prescod From jepler@inetnebr.com Wed Jan 16 22:09:29 2002 From: jepler@inetnebr.com (Jeff Epler) Date: Wed, 16 Jan 2002 16:09:29 -0600 Subject: PEP 215 (was Re: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict) In-Reply-To: <15429.32130.692161.521257@anthem.wooz.org> References: <3C439693.7A2A3724@prescod.net> <15427.54378.963060.448829@anthem.wooz.org> <20020115222113.A987@unpythonic.dhs.org> <15429.32130.692161.521257@anthem.wooz.org> Message-ID: <20020116160928.A473@unpythonic.dhs.org> On Wed, Jan 16, 2002 at 08:17:54AM -0500, Barry A. Warsaw wrote: > > >>>>> "jepler" == writes: > > jepler> Well, I guess we could add _ and $_ strings to Python, > jepler> right? > > Ug. t'' strings have been discussed before w.r.t. i18n markup, but I > don't like it. ... and you like $'' strings? That suggestion was intended to bring a bad taste to *everybody*'s mouth, as much as t'' alone does to yours. (Hmm, and then I might need a raw unicode interpolated translated string ... is that spelled $_ur'' or r_$u'' ?) Jeff From guido@python.org Wed Jan 16 22:19:53 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 16 Jan 2002 17:19:53 -0500 Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: Your message of "Wed, 16 Jan 2002 13:59:17 PST." <3C45F7B5.7603F4D1@prescod.net> References: <3C45E547.8B8FDEFF@prescod.net> <200201162131.QAA26108@cj20424-a.reston1.va.home.com> <3C45F7B5.7603F4D1@prescod.net> Message-ID: <200201162219.RAA26286@cj20424-a.reston1.va.home.com> > How about an optional warning which, after a year or so, would be turned > on by default, and then a year or so after that would be an error? > > This same issue may effect some eventual merging of literal strings and > Unicode literals because \N, \u etc. are treated differently in strings > than in Unicode literals. And even if literal strings and Unicode > strings are never merged, \N could be useful in ordinary strings. 
-1 I don't find this enough of a problem to invoke the heavy gun of a language change. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Wed Jan 16 22:28:18 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 16 Jan 2002 17:28:18 -0500 Subject: PEP 215 (was Re: [Python-Dev] PEP 216 (string interpolation) alternative EvalDict) References: <3C439693.7A2A3724@prescod.net> <15427.54378.963060.448829@anthem.wooz.org> <20020115222113.A987@unpythonic.dhs.org> <15429.32130.692161.521257@anthem.wooz.org> <20020116160928.A473@unpythonic.dhs.org> Message-ID: <15429.65154.514355.230711@anthem.wooz.org> On Wed, Jan 16, 2002 at 08:17:54AM -0500, Barry A. Warsaw wrote: > Ug. t'' strings have been discussed before w.r.t. i18n markup, but I > don't like it. >>>>> "JE" == Jeff Epler writes: JE> ... and you like $'' strings? No! :) JE> That suggestion was intended to bring a bad taste to JE> *everybody*'s mouth, as much as t'' alone does to yours. Ah, no wonder I've had to drink 3 sodas today. I wondered what that foul flavor was, especially since I made sure to brush my teeth this morning! JE> (Hmm, and then I might need a raw unicode interpolated JE> translated string ... is that spelled $_ur'' or r_$u'' ?) Exactly why I'm against adding more string prefixes. Remember that the _ thingie we currently recommend for gettext /isn't/ prefix proliferation. E.g.: _(u'translate this') _(ru'and this') It's just a function call with a convenient name (and even that's just a convention, of course). -Barry From paul@svensson.org Wed Jan 16 22:29:13 2002 From: paul@svensson.org (Paul Svensson) Date: Wed, 16 Jan 2002 17:29:13 -0500 (EST) Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: <200201162131.QAA26108@cj20424-a.reston1.va.home.com> Message-ID: On Wed, 16 Jan 2002, Guido van Rossum wrote: >> Yeah, I understood that. I just don't understand why it isn't like most >> other things in Python. Python tends to be strict about things that are >> likely mistakes, rather than helping you "debug them" after passing them >> through silently. >> >> Paul Prescod > >The "why" is that long ago Python didn't have raw strings but it did >have regular expressions. I thought it would be painful to have to >double all backslashes used for the regex syntax. > >It would be hard to change this policy now. Yeah, it would be like, say, changing the semantics of integer division. Sometimes it's better to do what's right than what's easy. /Paul From paul@svensson.org Wed Jan 16 22:43:48 2002 From: paul@svensson.org (Paul Svensson) Date: Wed, 16 Jan 2002 17:43:48 -0500 (EST) Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: <3C45F7B5.7603F4D1@prescod.net> Message-ID: On Wed, 16 Jan 2002, Paul Prescod wrote: >Guido van Rossum wrote: >> >>... >> >> The "why" is that long ago Python didn't have raw strings but it did >> have regular expressions. I thought it would be painful to have to >> double all backslashes used for the regex syntax. > >Aha. > >> It would be hard to change this policy now. > >How about an optional warning which, after a year or so, would be turned >on by default, and then a year or so after that would be an error? Such a warning might prove to be a useful debugging tool, even if the language never changed. Maybe it would be a useful addition to PyChecker or some similar tool ? 
foot-in-the-door-ly, /Paul From Jack.Jansen@oratrix.nl Wed Jan 16 22:56:23 2002 From: Jack.Jansen@oratrix.nl (Jack Jansen) Date: Wed, 16 Jan 2002 23:56:23 +0100 Subject: [Python-Dev] Extending types in C - help needed Message-ID: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> In the discussion on my request for an ("O@", typeobject, void **) format for PyArg_Parse and Py_BuildValue MAL suggested that I could get the same functionality by creating a type WrapperTypeObject, which would be a subtype of TypeObject with extra fields pointing to the _New() and _Convert() routines to convert Python objects from/to C pointers. This would be good enough for me, because then types wanting to participate in the wrapper protocol would subtype WrapperTypeObject in stead of TypeObject, and two global routines could return the _New and _Convert routines given the type object, and we wouldn't need yet another PyArg_Parse format specifier. However, after digging high and low I haven't been able to deduce how I would then use this WrapperType in C as the type for my extension module objects. Are there any examples? If not, could someone who understands the new inheritance scheme give me some clues as to how to do this? -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From mhammond@skippinet.com.au Thu Jan 17 00:53:37 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Thu, 17 Jan 2002 11:53:37 +1100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... In-Reply-To: <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> Message-ID: > Also, I believe most of PythonWin also assumes HAVE_USABLE_WCHAR_T > (didn't check, though). FYI, all the win32 extensions use their own Unicode API. These extensions had Unicode before Python did! These wrapper functions are abstract enough that they should be able to withstand any changes to Python's Unicode implementation quite simply - probably at the cost of extra copies and transformations in those wrappers. Mark. From guido@python.org Thu Jan 17 06:28:03 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 17 Jan 2002 01:28:03 -0500 Subject: [Python-Dev] deprecate input()? In-Reply-To: Your message of "Wed, 16 Jan 2002 08:22:41 CST." <15429.36017.387707.78193@12-248-41-177.client.attbi.com> References: <15429.36017.387707.78193@12-248-41-177.client.attbi.com> Message-ID: <200201170628.BAA28567@cj20424-a.reston1.va.home.com> > I just responded to a question on c.l.py a user had about feeding empty > strings to input(). While he didn't say why he called input(), I suspect he > thought the semantics were more like raw_input(). > > In these days of widespread Internet nastiness, shouldn't input() be > deprecated? Why? I imagine this is only used for interactive input, and then it's the computer's owner who is typing. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Jan 17 10:11:08 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 17 Jan 2002 11:11:08 +0100 Subject: [Python-Dev] Utopian String Interpolation References: <3C446A5B.2E7A22CD@prescod.net> <3C45C4ED.806E64F3@lemburg.com> <3C45D7F5.BF260898@prescod.net> Message-ID: <3C46A33C.D97C831C@lemburg.com> Paul Prescod wrote: > > "M.-A. Lemburg" wrote: > > > >... > > > Embrace the __future__! > > > > -1. > > > > Too dangerous. > > It isn't dangerous. That's precisely what __future__ is for! 
It is no > more dangerous than any other feature that uses __future__. It is. Currently Python strings are just that: immutable strings. Now, you suddenly add dynamics to then. This will cause nightmares in terms of security. Note that Python hasn't really had a need for Perl's "taint" because of this. I wouldn't want to see that change in any way. If you really need this, either use a string prefix or call a specific function which implements string interpolation. At least then things are obvious and explicit. > > ... If string interpolation makes it into the core, > > then please use a *new* construct. '\$' is currently interpreted > > as '\$' and this should not be changed (heck, just think what would > > happen to all the shell script snippets encoded in Python strings). > > No, this should be changed. Huh ? I bet RedHat and thousands of sysadmins who have switched from shell or Perl to Python would have strong objections. > Completely ignoring string interpolation, I > am strongly in favour of changing the behaviour of the literal string > parser so that unknown \-combinations raise a SyntaxError. If you don't > want a backslash to be interpreted as an escape sequence start, you > should use a raw string. > > The Python documentation and grammar already says: > > escapeseq ::= "\" > > The documentation says: > > "Unlike Standard , all unrecognized escape sequences are left in the > string unchanged, i.e., the backslash is left in the string. (This > behavior is useful when debugging: if an escape sequence is mistyped, > the resulting output is more easily recognized as broken.)" > > That's a weird thing to say. What could be more helpful for debugging > than a good old SyntaxError??? If there's nothing wrong with the escape why raise a SyntaxError ? > > BTW, why don't you wrap all this interpolation stuff into > > a module and then call a function to have it apply all the > > magic you want. > > We've been through that in this discussion already. In fact, that's how > the discussion started. I've jumped in at a rather late point. Perhaps you ought to rewind the discussion then and start discussing in a different direction :-) E.g. about the syntax to be used in the interpolation and where, when and in which context to evaluate the strings. There are so many options that I can't really see any benefit from chosing only one and hard-coding it into the language. Other users will have other requirement which are likely not to combine well with the one implementation you have in mind. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Thu Jan 17 10:19:33 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 17 Jan 2002 11:19:33 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... 
References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> <3C45CB11.ACB2CEE6@lemburg.com> <200201161909.g0GJ9OK01822@mira.informatik.hu-berlin.de> Message-ID: <3C46A535.3C579501@lemburg.com> "Martin v. Loewis" wrote: > > [unicodefilenames()] > > Perhaps we ought to drop that function altogether and let the > > various file IO functions raise run-time errors instead ?! > > That was my suggestion as well. However, Neil points out that, on > Windows, passing Unicode is sometimes better: For some files, there is > no byte string file name to identify the file (if the file name is not > representable in MBCS). OTOH, on Unix, some files cannot be accessed > with a Unicode string, if the file name is invalid in the user's > encoding. Sounds like the run-time error solution would at least "solve" the issue in terms of making it depend on the used file name and underlying OS or file system. I'd say: let the different file name based APIs try hard enough and then have them bail out if they can't handle the particular case. > It turns out that only OS X really got it right: For each file, there > is both a byte string name, and a Unicode name. I suppose this is due to the fact that Mac file systems store extended attributes (much like what OS/2 does too) along with the file -- that's a really nice way of being able to extend file system semantics on a per-file basis; much better than the Windows Registry or the MIME guess-by-extension mechanisms. Oh well. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Thu Jan 17 10:29:45 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 17 Jan 2002 11:29:45 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> Message-ID: <3C46A799.2E7AD7DB@lemburg.com> Jack Jansen wrote: > > In the discussion on my request for an ("O@", typeobject, > void **) format for PyArg_Parse and Py_BuildValue MAL suggested Thomas Heller suggested this. I am more in favour of exposing the pickle reduce API through "O@", that is have PyArgTuple_Parse() call the .__reduce__() method of the object. This will then return (factory, state_tuple) and these could then be exposed to the C function via two PyObject*. Note that there's no need for any type object magic. If this becomes a common case, it may be worthwhile to add a tp_reduce slot to type objects though. 
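As a rough illustration of the (factory, state_tuple) shape Marc-Andre describes above, here is a minimal pure-Python sketch. WindowRef, parse_wrapped and the integer handle are hypothetical stand-ins, not an existing API; a real extension type would wrap the C pointer in a PyCObject and could expose the same information through a tp_reduce slot:

    class WindowRef:
        """Hypothetical wrapper around an opaque C pointer (here just an int)."""
        def __init__(self, handle):
            self.handle = handle
        def __reduce__(self):
            # Standard pickle protocol: a callable plus the arguments
            # needed to rebuild an equivalent object.
            return (WindowRef, (self.handle,))

    def parse_wrapped(obj):
        # What an "O@"-style parser marker could do: hand the C function
        # the factory and the state tuple as two PyObject* values.
        factory, state = obj.__reduce__()
        return factory, state

    ref = WindowRef(0x1234)
    factory, state = parse_wrapped(ref)
    assert factory(*state).handle == ref.handle
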
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From nhodgson@bigpond.net.au Thu Jan 17 11:04:53 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Thu, 17 Jan 2002 22:04:53 +1100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> <3C45CB11.ACB2CEE6@lemburg.com> <200201161909.g0GJ9OK01822@mira.informatik.hu-berlin.de> <3C46A535.3C579501@lemburg.com> Message-ID: <08e201c19f46$cad5f070$0acc8490@neil> M.-A. Lemburg, regarding unicodefilenames(): > Sounds like the run-time error solution would at least "solve" > the issue in terms of making it depend on the used file name > and underlying OS or file system. It is much better to choose a technique that will always work rather than try to recover from a technique that may fail. unicodefilenames() can be dropped in favour of explicit OS and version checks but this is replacing a simple robust check with a more fragile one. unicodefilenames() will allow other environments to declare that client code will be more robust by choosing to use Unicode strings as file name arguments. This could include UTF-8 based systems such as OS X and BeOS, as well as Windows variants like CE. Neil From mal@lemburg.com Thu Jan 17 11:36:21 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 17 Jan 2002 12:36:21 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> <3C45CB11.ACB2CEE6@lemburg.com> <200201161909.g0GJ9OK01822@mira.informatik.hu-berlin.de> <3C46A535.3C579501@lemburg.com> <08e201c19f46$cad5f070$0acc8490@neil> Message-ID: <3C46B735.9C433F60@lemburg.com> Neil Hodgson wrote: > > M.-A. 
Lemburg, regarding unicodefilenames(): > > > Sounds like the run-time error solution would at least "solve" > > the issue in terms of making it depend on the used file name > > and underlying OS or file system. > > It is much better to choose a technique that will always work rather than > try to recover from a technique that may fail. Is it really ? The problem is that under some OSes it is possible to work with multiple very different file system from within a single Python program. In those cases, the unicodefilename() API wouldn't really help all that much. > unicodefilenames() can be dropped in favour of explicit OS and version > checks but this is replacing a simple robust check with a more fragile one. What kind of checks do you have in mind then ? If possible, it should be possible to pass unicodefilenames() a path to check for Unicode- capability, since on Unix (and probably Mac OS X as well), the path decides which file system get's the ioctrl calls. > unicodefilenames() will allow other environments to declare that client code > will be more robust by choosing to use Unicode strings as file name > arguments. This could include UTF-8 based systems such as OS X and BeOS, as > well as Windows variants like CE. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Thu Jan 17 11:42:21 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 17 Jan 2002 12:42:21 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... In-Reply-To: <3C46A535.3C579501@lemburg.com> (mal@lemburg.com) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <01ab01c194a9$237b6dc0$0acc8490@neil> <200201032316.g03NGTB02137@mira.informatik.hu-berlin.de> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> <3C45CB11.ACB2CEE6@lemburg.com> <200201161909.g0GJ9OK01822@mira.informatik.hu-berlin.de> <3C46A535.3C579501@lemburg.com> Message-ID: <200201171142.g0HBgLk01405@mira.informatik.hu-berlin.de> > Sounds like the run-time error solution would at least "solve" > the issue in terms of making it depend on the used file name > and underlying OS or file system. Such a solution is impossible to implement in some case. E.g. on Windows, if you use the ANSI (*A) APIs to list the directory contents, Windows will *silently* (AFAIK) give you incorrect file names, i.e. it will replace unrepresentable characters with the replacement char (QUESTION MARK). OTOH, on Unix, there is a better approach for listdir and unconvertable names: just return the byte strings to the user. > I'd say: let the different file name based APIs try hard enough > and then have them bail out if they can't handle the particular > case. 
That is a good idea. However, in case of the WinNT replacement strategy, the application may still want to know. Passing *in* Unicode objects is no issue at all: If they cannot be converted to a reasonable file name, you clearly get an exception. > > It turns out that only OS X really got it right: For each file, there > > is both a byte string name, and a Unicode name. > > I suppose this is due to the fact that Mac file systems store > extended attributes (much like what OS/2 does too) along with the > file -- that's a really nice way of being able to extend file > system semantics on a per-file basis; much better than the Windows > Registry or the MIME guess-by-extension mechanisms. I'd assume it is different: They just *define* that all local file systems they have control over use UTF-8 on disk, atleast for BSD ufs; for HFS, it might be that they 'just know' what encoding is used on an HFS partition. I doubt they use extended attributes for this, as they reportedly return UTF-8 even for file systems they've never seen before; this may be either due to static knowledge (e.g. that VFAT is UCS-2LE), or through guessing. It may be that there are also limitations and restrictions, but atleast they remove the burden from the application. Regards, Martin From martin@v.loewis.de Thu Jan 17 12:06:54 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Thu, 17 Jan 2002 13:06:54 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... In-Reply-To: <3C46B735.9C433F60@lemburg.com> (mal@lemburg.com) References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> <3C45CB11.ACB2CEE6@lemburg.com> <200201161909.g0GJ9OK01822@mira.informatik.hu-berlin.de> <3C46A535.3C579501@lemburg.com> <08e201c19f46$cad5f070$0acc8490@neil> <3C46B735.9C433F60@lemburg.com> Message-ID: <200201171206.g0HC6sa01572@mira.informatik.hu-berlin.de> > Is it really ? The problem is that under some OSes it is possible > to work with multiple very different file system from within a > single Python program. In those cases, the unicodefilename() > API wouldn't really help all that much. If you are thinking of Unix: It seems unicodefilename has to return 0 on Unix, meaning that you need to use byte-oriented file names if you want to access all files (not that you will be able to display all file names to the user, though ... there is nothing we can do to achieve *that*). > > unicodefilenames() can be dropped in favour of explicit OS and version > > checks but this is replacing a simple robust check with a more fragile one. > > What kind of checks do you have in mind then ? 
If possible, it should > be possible to pass unicodefilenames() a path to check for Unicode- > capability, since on Unix (and probably Mac OS X as well), the path > decides which file system get's the ioctrl calls. I think you are missing the point that unicodefilenames, as defined, does not take any parameters. It says either yay or nay. So it could be replaced in application code with if sys.platform == "win32": use_unicode_for_filenames = windowsversion in ['nt','w2k','xp'] elif sys.platform.startswith("darwin"): use_unicode_for_filenames = 1 else: use_unicode_for_filenames = 0 I would not use such code in my applications, nor would I ever use unicodefilenames. Instead, I would just use Unicode file names all the time, and risk that some users have problems with some files. Those users I would tell to fix their systems (i.e. use NT instead of Windows, or use a UTF-8 locale on Unix). Most users will never notice any problem (except for Neil, who likes to put funny file names on his disk :-), so this is a typical 80-20 problem here (or maybe rather 99-1). Regards, Martin From mal@lemburg.com Thu Jan 17 12:29:54 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 17 Jan 2002 13:29:54 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> <3C45CB11.ACB2CEE6@lemburg.com> <200201161909.g0GJ9OK01822@mira.informatik.hu-berlin.de> <3C46A535.3C579501@lemburg.com> <08e201c19f46$cad5f070$0acc8490@neil> <3C46B735.9C433F60@lemburg.com> <200201171206.g0HC6sa01572@mira.informatik.hu-berlin.de> Message-ID: <3C46C3C2.984F6227@lemburg.com> "Martin v. Loewis" wrote: > > > Is it really ? The problem is that under some OSes it is possible > > to work with multiple very different file system from within a > > single Python program. In those cases, the unicodefilename() > > API wouldn't really help all that much. > > If you are thinking of Unix: It seems unicodefilename has to return 0 > on Unix, meaning that you need to use byte-oriented file names if you > want to access all files (not that you will be able to display all > file names to the user, though ... there is nothing we can do to > achieve *that*). Right. I am starting to believe that unicodefilenames() doesn't really provide enough information to make it useful for cross-platform programming. > > > unicodefilenames() can be dropped in favour of explicit OS and version > > > checks but this is replacing a simple robust check with a more fragile one. > > > > What kind of checks do you have in mind then ? If possible, it should > > be possible to pass unicodefilenames() a path to check for Unicode- > > capability, since on Unix (and probably Mac OS X as well), the path > > decides which file system get's the ioctrl calls. 
> > I think you are missing the point that unicodefilenames, as defined, > does not take any parameters. It says either yay or nay. So it could > be replaced in application code with > > if sys.platform == "win32": > use_unicode_for_filenames = windowsversion in ['nt','w2k','xp'] > elif sys.platform.startswith("darwin"): > use_unicode_for_filenames = 1 > else: > use_unicode_for_filenames = 0 Sounds like this would be a good candidate for platform.py which I'll check into CVS soon. With its many platform querying APIs it should easily be possible to add a function which returns the above information based on the platform Python is running on. > I would not use such code in my applications, nor would I ever use > unicodefilenames. Instead, I would just use Unicode file names all the > time, and risk that some users have problems with some files. Those > users I would tell to fix their systems (i.e. use NT instead of > Windows, or use a UTF-8 locale on Unix). Most users will never notice > any problem (except for Neil, who likes to put funny file names on his > disk :-), so this is a typical 80-20 problem here (or maybe rather > 99-1). I doubt that you'll have any luck in trying to convince a user to switch OSes just because Python applications don't cope with native file names. The UTF-8 locale on Unix is also hard to push: e.g. existing latin-1 file names will probably stop working the minute you switch to that locale. (I always leave the setting to "C" and simply don't use locale based file names -- that way I don't run into problems; non-[a-zA-Z0-9\-\._]+ file names are a no-go for cross-platform-code if you ask me...) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Thu Jan 17 12:36:27 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 17 Jan 2002 13:36:27 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> <3C45CB11.ACB2CEE6@lemburg.com> <200201161909.g0GJ9OK01822@mira.informatik.hu-berlin.de> <3C46A535.3C579501@lemburg.com> <200201171142.g0HBgLk01405@mira.informatik.hu-berlin.de> Message-ID: <3C46C54B.72D0746A@lemburg.com> "Martin v. Loewis" wrote: > > > Sounds like the run-time error solution would at least "solve" > > the issue in terms of making it depend on the used file name > > and underlying OS or file system. > > Such a solution is impossible to implement in some case. E.g. on > Windows, if you use the ANSI (*A) APIs to list the directory contents, > Windows will *silently* (AFAIK) give you incorrect file names, i.e. 
it > will replace unrepresentable characters with the replacement char > (QUESTION MARK). Samba does the same for mounted Windows shares, BTW. > OTOH, on Unix, there is a better approach for listdir and > unconvertable names: just return the byte strings to the user. Sigh. > > I'd say: let the different file name based APIs try hard enough > > and then have them bail out if they can't handle the particular > > case. > > That is a good idea. However, in case of the WinNT replacement > strategy, the application may still want to know. > > Passing *in* Unicode objects is no issue at all: If they cannot be > converted to a reasonable file name, you clearly get an exception. True and that's good :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From paul@svensson.org Thu Jan 17 13:43:02 2002 From: paul@svensson.org (Paul Svensson) Date: Thu, 17 Jan 2002 08:43:02 -0500 (EST) Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: <3C46A33C.D97C831C@lemburg.com> Message-ID: On Thu, 17 Jan 2002, M.-A. Lemburg wrote: >Paul Prescod wrote: >> >> The documentation says: >> >> "Unlike Standard , all unrecognized escape sequences are left in the >> string unchanged, i.e., the backslash is left in the string. (This >> behavior is useful when debugging: if an escape sequence is mistyped, >> the resulting output is more easily recognized as broken.)" >> >> That's a weird thing to say. What could be more helpful for debugging >> than a good old SyntaxError??? > >If there's nothing wrong with the escape why raise a >SyntaxError ? I would certainly claim that an unrecognized escape sequence _is_ wrong. /Paul From mal@lemburg.com Thu Jan 17 14:02:11 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 17 Jan 2002 15:02:11 +0100 Subject: [Python-Dev] Utopian String Interpolation References: Message-ID: <3C46D963.2BC4F2A1@lemburg.com> Paul Svensson wrote: > > On Thu, 17 Jan 2002, M.-A. Lemburg wrote: > > >Paul Prescod wrote: > >> > >> The documentation says: > >> > >> "Unlike Standard , all unrecognized escape sequences are left in the > >> string unchanged, i.e., the backslash is left in the string. (This > >> behavior is useful when debugging: if an escape sequence is mistyped, > >> the resulting output is more easily recognized as broken.)" > >> > >> That's a weird thing to say. What could be more helpful for debugging > >> than a good old SyntaxError??? > > > >If there's nothing wrong with the escape why raise a > >SyntaxError ? > > I would certainly claim that an unrecognized escape sequence _is_ wrong. Depending on how you see it, an "unrecognized escape sequence" is not an escape sequence to begin with :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Thu Jan 17 14:15:00 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 17 Jan 2002 09:15:00 -0500 Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: Your message of "Thu, 17 Jan 2002 08:43:02 EST." References: Message-ID: <200201171415.JAA30650@cj20424-a.reston1.va.home.com> > I would certainly claim that an unrecognized escape sequence _is_ wrong. Then you are wrong. Go away and design your own language. 
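For readers keeping score, the disputed behaviour is easy to demonstrate at the interpreter prompt; this shows what the language does today, not a proposal:

    >>> len('\n')     # recognized escape: a single newline character
    1
    >>> '\d'          # unrecognized escape: the backslash stays, silently
    '\\d'
    >>> len('\d')
    2
    >>> r'\n'         # raw string: escape processing is off by design
    '\\n'
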
--Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Thu Jan 17 15:39:30 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 17 Jan 2002 10:39:30 -0500 Subject: [Python-Dev] Utopian String Interpolation References: <3C446A5B.2E7A22CD@prescod.net> <3C45C4ED.806E64F3@lemburg.com> <3C45D7F5.BF260898@prescod.net> <3C46A33C.D97C831C@lemburg.com> Message-ID: <15430.61490.271542.258065@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> It is. Currently Python strings are just that: immutable MAL> strings. Now, you suddenly add dynamics to then. This will MAL> cause nightmares in terms of security. Note that Python MAL> hasn't really had a need for Perl's "taint" because of MAL> this. I wouldn't want to see that change in any way. Bingo! MAL> I've jumped in at a rather late point. Perhaps you ought to MAL> rewind the discussion then and start discussing in a MAL> different direction :-) E.g. about the syntax to be used in MAL> the interpolation and where, when and in which context to MAL> evaluate the strings. Proponants of this feature can start by updating the PEP. -Barry From paul@svensson.org Thu Jan 17 16:32:11 2002 From: paul@svensson.org (Paul Svensson) Date: Thu, 17 Jan 2002 11:32:11 -0500 (EST) Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: <200201171415.JAA30650@cj20424-a.reston1.va.home.com> Message-ID: On Thu, 17 Jan 2002, Guido van Rossum wrote: >> I would certainly claim that an unrecognized escape sequence _is_ wrong. > >Then you are wrong. (---) Then maybe the Python Referece Manual (2.4.1) needs to be updated, since the paragraph concerning unrecognized escape sequences doesn't mention them other than being "mistyped" or "broken". (Does "mistyped" and "broken" qualify as "wrong" ?) /Paul From skip@pobox.com Thu Jan 17 17:40:19 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 17 Jan 2002 11:40:19 -0600 Subject: [Python-Dev] deprecate input()? In-Reply-To: <200201170628.BAA28567@cj20424-a.reston1.va.home.com> References: <15429.36017.387707.78193@12-248-41-177.client.attbi.com> <200201170628.BAA28567@cj20424-a.reston1.va.home.com> Message-ID: <15431.3203.415990.602525@beluga.mojam.com> >> I just responded to a question on c.l.py a user had about feeding >> empty strings to input(). While he didn't say why he called input(), >> I suspect he thought the semantics were more like raw_input(). >> >> In these days of widespread Internet nastiness, shouldn't input() be >> deprecated? Guido> Why? I imagine this is only used for interactive input, and then Guido> it's the computer's owner who is typing. Yes, but what if the program containing calls to input() get shipped to someone else's computer? It just seems to me that a) input is almost never what you want to call and that b) it would seem to a naive programmer to be the correct way to ask the user for a line of input. Skip From guido@python.org Thu Jan 17 17:49:26 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 17 Jan 2002 12:49:26 -0500 Subject: [Python-Dev] deprecate input()? In-Reply-To: Your message of "Thu, 17 Jan 2002 11:40:19 CST." <15431.3203.415990.602525@beluga.mojam.com> References: <15429.36017.387707.78193@12-248-41-177.client.attbi.com> <200201170628.BAA28567@cj20424-a.reston1.va.home.com> <15431.3203.415990.602525@beluga.mojam.com> Message-ID: <200201171749.MAA00493@cj20424-a.reston1.va.home.com> > Guido> Why? I imagine this is only used for interactive input, > Guido> and then it's the computer's owner who is typing. 
> > Yes, but what if the program containing calls to input() get shipped > to someone else's computer? It just seems to me that a) input is > almost never what you want to call and that b) it would seem to a > naive programmer to be the correct way to ask the user for a line of > input. I don't see the security problem. Can you explain a scenario where this causes a security risk? If the user of the program types something evil in the input box they screw themselves! --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@rahul.net Thu Jan 17 17:56:46 2002 From: aahz@rahul.net (Aahz Maruch) Date: Thu, 17 Jan 2002 09:56:46 -0800 (PST) Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: <200201171415.JAA30650@cj20424-a.reston1.va.home.com> from "Guido van Rossum" at Jan 17, 2002 09:15:00 AM Message-ID: <20020117175646.034B5E8C8@waltz.rahul.net> Guido van Rossum wrote: >Paul Svensson: >> >> I would certainly claim that an unrecognized escape sequence _is_ wrong. > > Then you are wrong. Go away and design your own language. Hey! That's a bit harsh. I'm not going to campaign to make unrecognized escape sequences a syntax error, but not raising a syntax error does seem to be against Python's principles. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From rsc@plan9.bell-labs.com Thu Jan 17 18:01:21 2002 From: rsc@plan9.bell-labs.com (Russ Cox) Date: Thu, 17 Jan 2002 13:01:21 -0500 Subject: [Python-Dev] deprecate input()? Message-ID: > Yes, but what if the program containing calls to input() get shipped to > someone else's computer? It just seems to me that a) input is almost never > what you want to call and that b) it would seem to a naive programmer to be > the correct way to ask the user for a line of input. Since most arbitrary lines of input generate syntax errors, wouldn't the naive programmer quickly figure out that input isn't the "read a line" function? (Unless you're trying to input numbers, I suppose.) Russ From entropiamax@jazzfree.com Thu Jan 17 18:25:46 2002 From: entropiamax@jazzfree.com (Andres Tuells) Date: Thu, 17 Jan 2002 19:25:46 +0100 Subject: [Python-Dev] Re: Stackless Python is DEAD! Long live Stackless Python References: <3C470552.3040802@tismer.com> Message-ID: <006f01c19f84$620dde20$9d76393e@integralabzenon> Thats great !!! ----- Original Message ----- From: "Christian Tismer" To: ; ; Sent: Thursday, January 17, 2002 6:09 PM Subject: Ann: Stackless Python is DEAD! Long live Stackless Python > > ####################################### > > Announcement: > > ####################################### > > > The end of an era has come: > --------------------------- > Stackless Python, in the form provided upto Python 2.0, is DEAD. > > I am abandoning the whole implementation. > > > A new era has begun: > -------------------- > A completely new implementation is in development for > Python 2.2 and up which gives you the following features: > > - There are no restrictions any longer for uthread/coroutine > switching. Switching is possible at *any* time, in *any* > context. > > - There are no significant changes to the Python core any > longer. The new patches are of minimum size, and they > will probably survive unchanged until Python 3.0 . > > - Maintenance work for Stackless Python is reduced to the > bare minimum. 
There is no longer a need to incorporate > Stackless into the standard, since there is no work to > be shared. > > - Stackless breaks its major axiom now. It is no longer > platform independent, since it *does* modify the C stack. > I will support all Intel platforms by myself. For other > platforms, I'm asking for volunteers. > > * The basic elements of Stackless are now switchable chains > of frames. We have to define an interface that turns these > chains into microthreads and coroutines. > > Everybody is invited to come to the Stackless mailing > list and discuss the layout of this new design. > Especially we need to decide about (*). > > http://starship.python.net/mailman/listinfo/stackless > > see you there - chris > > -- > Christian Tismer :^) > Mission Impossible 5oftware : Have a break! Take a ride on Python's > Kaunstr. 26 : *Starship* http://starship.python.net/ > 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ > PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF > where do you want to jump today? http://www.stackless.com/ > > > > -- > http://mail.python.org/mailman/listinfo/python-list From guido@python.org Thu Jan 17 18:25:46 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 17 Jan 2002 13:25:46 -0500 Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: Your message of "Thu, 17 Jan 2002 09:56:46 PST." <20020117175646.034B5E8C8@waltz.rahul.net> References: <20020117175646.034B5E8C8@waltz.rahul.net> Message-ID: <200201171825.NAA00602@cj20424-a.reston1.va.home.com> > >Paul Svensson: > >> > >> I would certainly claim that an unrecognized escape sequence _is_ wrong. > > > Guido van Rossum wrote: > > Then you are wrong. Go away and design your own language. > Aahz: > Hey! That's a bit harsh. I'm not going to campaign to make > unrecognized escape sequences a syntax error, but not raising a syntax > error does seem to be against Python's principles. Whatever. Who is Paul Svensson and what is he doing in python-dev? --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Thu Jan 17 18:36:59 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 17 Jan 2002 12:36:59 -0600 Subject: [Python-Dev] deprecate input()? In-Reply-To: <200201171749.MAA00493@cj20424-a.reston1.va.home.com> References: <15429.36017.387707.78193@12-248-41-177.client.attbi.com> <200201170628.BAA28567@cj20424-a.reston1.va.home.com> <15431.3203.415990.602525@beluga.mojam.com> <200201171749.MAA00493@cj20424-a.reston1.va.home.com> Message-ID: <15431.6603.470764.669139@beluga.mojam.com> Guido> Why? I imagine this is only used for interactive input, and then Guido> it's the computer's owner who is typing. >> Yes, but what if the program containing calls to input() get shipped >> to someone else's computer? It just seems to me that a) input is >> almost never what you want to call and that b) it would seem to a >> naive programmer to be the correct way to ask the user for a line of >> input. Guido> I don't see the security problem. Can you explain a scenario Guido> where this causes a security risk? If the user of the program Guido> types something evil in the input box they screw themselves! Fine. Let's drop it. Skip From nhodgson@bigpond.net.au Thu Jan 17 19:31:36 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Fri, 18 Jan 2002 06:31:36 +1100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... 
References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <020601c194b3$c85a4320$0acc8490@neil> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> <3C45CB11.ACB2CEE6@lemburg.com> <200201161909.g0GJ9OK01822@mira.informatik.hu-berlin.de> <3C46A535.3C579501@lemburg.com> <08e201c19f46$cad5f070$0acc8490@neil> <3C46B735.9C433F60@lemburg.com> Message-ID: <00c401c19f8d$941e3fa0$0acc8490@neil> M.-A. Lemburg: > Is it really ? The problem is that under some OSes it is possible > to work with multiple very different file system from within a > single Python program. In those cases, the unicodefilename() > API wouldn't really help all that much. On NT the core file system calls are Unicode based with the narrow string calls being shims on top of this. When mounting non-native file systems, NT may perform name mapping, but that name mapping is 'complete and consistent' in that it is not possible to do anything with the narrow APIs that cannot be achieved with the Unicode APIs. > > unicodefilenames() can be dropped in favour of explicit OS and version > > checks but this is replacing a simple robust check with a more fragile one. > > What kind of checks do you have in mind then ? If possible, it should > be possible to pass unicodefilenames() a path to check for Unicode- > capability, since on Unix (and probably Mac OS X as well), the path > decides which file system get's the ioctrl calls. Any platform experts know how this works on MacOS X or BeOS? Do non-native file systems get mapped to Unicode names so that UTF-8 will always work? Neil From thomas.heller@ion-tof.com Thu Jan 17 19:23:00 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 17 Jan 2002 20:23:00 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> Message-ID: <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> From: "Jack Jansen" > In the discussion on my request for an ("O@", typeobject, > void **) format for PyArg_Parse and Py_BuildValue MAL suggested (as MAL already explained, that we suggested by me) > that I could get the same functionality by creating a type > WrapperTypeObject, which would be a subtype of TypeObject with > extra fields pointing to the _New() and _Convert() routines to > convert Python objects from/to C pointers. This would be good > enough for me, because then types wanting to participate in the > wrapper protocol would subtype WrapperTypeObject in stead of > TypeObject, and two global routines could return the _New and > _Convert routines given the type object, and we wouldn't need > yet another PyArg_Parse format specifier. > > However, after digging high and low I haven't been able to > deduce how I would then use this WrapperType in C as the type > for my extension module objects. Are there any examples? 
If not, > could someone who understands the new inheritance scheme give me > some clues as to how to do this? Currently (after quite some time) I have the impression that you cannot create a subtype of PyType_Type in C because PyType_Type ends in a variable sized array, at least not in this way: struct { PyTypeObject type; ...additional fields... } WrapperType_Type; Can someone confirm this? (I have to find out what to do with the tp_members slot, which seems to be correspond to the Python level __slots__ class variable) Thomas From guido@python.org Thu Jan 17 19:51:36 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 17 Jan 2002 14:51:36 -0500 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: Your message of "Thu, 17 Jan 2002 20:23:00 +0100." <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> Message-ID: <200201171951.OAA00909@cj20424-a.reston1.va.home.com> > Currently (after quite some time) I have the impression that you > cannot create a subtype of PyType_Type in C because PyType_Type > ends in a variable sized array, at least not in this way: > > struct { > PyTypeObject type; > ...additional fields... > } WrapperType_Type; > > Can someone confirm this? Yes, alas. The type you would have to declare is 'etype', a private type in typeobject.c. --Guido van Rossum (home page: http://www.python.org/~guido/) From nhodgson@bigpond.net.au Thu Jan 17 20:07:30 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Fri, 18 Jan 2002 07:07:30 +1100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <3C357405.CC439CBE@lemburg.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> <3C45CB11.ACB2CEE6@lemburg.com> <200201161909.g0GJ9OK01822@mira.informatik.hu-berlin.de> <3C46A535.3C579501@lemburg.com> <08e201c19f46$cad5f070$0acc8490@neil> <3C46B735.9C433F60@lemburg.com> <200201171206.g0HC6sa01572@mira.informatik.hu-berlin.de> Message-ID: <016601c19f92$9a049e00$0acc8490@neil> Martin v. Loewis: > Most users will never notice > any problem (except for Neil, who likes to put funny file names on his > disk :-), so this is a typical 80-20 problem here (or maybe rather > 99-1). While Martin is referring to the rarity of having non-native file names on Windows 9x, the problem adressed by PEP 277 is real. Already this year, there have been two enquiries [from Michael Ebert and Guenter Radestock] to comp.lang.python about Unicode file name use on NT. 
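To make the PEP 277 use case concrete, here is a two-line sketch assuming the PEP's proposed semantics (not yet in the core at the time of this thread): a Unicode argument to os.listdir would go through the wide-character API on NT and yield Unicode names, while a byte-string argument keeps the current MBCS behaviour, question-mark substitutions and all. The example file names are illustrative only:

    import os

    names  = os.listdir(u".")   # e.g. [u'ab\u4e2d.txt', u'readme.txt']  (PEP 277)
    legacy = os.listdir(".")    # e.g. ['ab?.txt', 'readme.txt']         (today)
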
Neil From paul@prescod.net Thu Jan 17 20:10:27 2002 From: paul@prescod.net (Paul Prescod) Date: Thu, 17 Jan 2002 12:10:27 -0800 Subject: [Python-Dev] Utopian String Interpolation References: <3C446A5B.2E7A22CD@prescod.net> <3C45C4ED.806E64F3@lemburg.com> <3C45D7F5.BF260898@prescod.net> <3C46A33C.D97C831C@lemburg.com> Message-ID: <3C472FB3.81EAA5D5@prescod.net> "M.-A. Lemburg" wrote: > >... > > It is. Currently Python strings are just that: immutable strings. > Now, you suddenly add dynamics to then. I don't want to go through this whole thread from the beginning again. PEP 215 does not add "dynamics" to anything. In fact, PEP 215 is a more static mechanism than the current idiom. Even if we make PEP 215's behaviour the default for strings, it is still NOT DYNAMIC. >... This will cause nightmares > in terms of security. There is a thread called "PEP 215 does not introduce security issues". Please read it. Everyone involved who initially thought that PEP 215 had security issues backed down and agreed that it did not. Once again, whether there is a string prefix or not is irrelevant to this question. PEP 215's semantics are *not dynamic*. > ... Note that Python hasn't really had a need > for Perl's "taint" because of this. I wouldn't want to see that > change in any way. I am certainly not a Perl programmer but Python is also attackable through the sorts of holes that "taint" is intended to avoid. username = raw_input() os.system("cp %s.new %s.old" % (username, username)) Perl considers this "dangerous" and so it has taint. It has *nothing* to do with interpolation syntax. >... > Huh ? I bet RedHat and thousands of sysadmins who have switched > from shell or Perl to Python would have strong objections. Python has a construct called a "raw string" which is perfect for when you don't want backslashes treated specially. Paul Prescod From DavidA@ActiveState.com Thu Jan 17 17:46:23 2002 From: DavidA@ActiveState.com (David Ascher) Date: Thu, 17 Jan 2002 09:46:23 -0800 Subject: [Python-Dev] deprecate input()? References: <15429.36017.387707.78193@12-248-41-177.client.attbi.com> <200201170628.BAA28567@cj20424-a.reston1.va.home.com> Message-ID: <3C470DEF.B5AD08EE@ActiveState.com> Guido van Rossum wrote: > > > I just responded to a question on c.l.py a user had about feeding empty > > strings to input(). While he didn't say why he called input(), I suspect he > > thought the semantics were more like raw_input(). > > > > In these days of widespread Internet nastiness, shouldn't input() be > > deprecated? > > Why? I imagine this is only used for interactive input, and then it's > the computer's owner who is typing. input() can also be used effectively in interactive apps (calculators, scripting engines for GUI apps) in contexts where the users can be trusted. Not _everything_ is on the web, luckily, and not everything needs to be evildoer-proof... That doesn't mean that I think the naming choices for input() and raw_input() have withstood the test of hindsight, but few things do... --david From Jack.Jansen@oratrix.nl Thu Jan 17 20:59:13 2002 From: Jack.Jansen@oratrix.nl (Jack Jansen) Date: Thu, 17 Jan 2002 21:59:13 +0100 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: <3C46A799.2E7AD7DB@lemburg.com> Message-ID: <0F82FB1B-0B8D-11D6-B884-003065517236@oratrix.nl> On Thursday, January 17, 2002, at 11:29 AM, M.-A. 
Lemburg wrote: > Jack Jansen wrote: >> >> In the discussion on my request for an ("O@", typeobject, >> void **) format for PyArg_Parse and Py_BuildValue MAL suggested > > Thomas Heller suggested this. Oops, you're right. I should be careful not to mix up my Germans;-) > I am more in favour of > exposing the pickle reduce API through "O@", that is > have PyArgTuple_Parse() call the .__reduce__() method > of the object. This will then return (factory, state_tuple) > and these could then be exposed to the C function via two > PyObject*. You've suggested this before, but at that time I ignored it because it made absolutely no sense to me. "pickle" triggers one set of ideas for me, "reduce" triggers a different set, "factory function" yet another different set. None of these sets of ideas have the least resemblance to what I'm trying to do:-) I gave a fairly complete example (using calldll from Python to wrap a function that returns a Mac WindowObject) last week, could you explain how you would implement this with pickle, reduce and factory functions? -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From Jack.Jansen@oratrix.nl Thu Jan 17 21:03:25 2002 From: Jack.Jansen@oratrix.nl (Jack Jansen) Date: Thu, 17 Jan 2002 22:03:25 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... In-Reply-To: <200201171142.g0HBgLk01405@mira.informatik.hu-berlin.de> Message-ID: On Thursday, January 17, 2002, at 12:42 PM, Martin v. Loewis wrote: >> I suppose this is due to the fact that Mac file systems store >> extended attributes (much like what OS/2 does too) along with the >> file -- that's a really nice way of being able to extend file >> system semantics on a per-file basis; much better than the Windows >> Registry or the MIME guess-by-extension mechanisms. > > I'd assume it is different: They just *define* that all local file > systems they have control over use UTF-8 on disk, atleast for BSD ufs; > for HFS, it might be that they 'just know' what encoding is used on an > HFS partition. I doubt they use extended attributes for this, as they > reportedly return UTF-8 even for file systems they've never seen > before; this may be either due to static knowledge (e.g. that VFAT is > UCS-2LE), or through guessing. It's actually a whole lot simpler: for filesystems with an encoding that is open to interpretation the user specifies it during mount:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From Jack.Jansen@oratrix.nl Thu Jan 17 21:09:20 2002 From: Jack.Jansen@oratrix.nl (Jack Jansen) Date: Thu, 17 Jan 2002 22:09:20 +0100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... In-Reply-To: <00c401c19f8d$941e3fa0$0acc8490@neil> Message-ID: <79BF50EE-0B8E-11D6-B884-003065517236@oratrix.nl> On Thursday, January 17, 2002, at 08:31 PM, Neil Hodgson wrote: >> What kind of checks do you have in mind then ? If possible, it should >> be possible to pass unicodefilenames() a path to check for Unicode- >> capability, since on Unix (and probably Mac OS X as well), the path >> decides which file system get's the ioctrl calls. > > Any platform experts know how this works on MacOS X or BeOS? Do > non-native file systems get mapped to Unicode names so that UTF-8 will > always work? For Mac OS X: yes, that is how it is supposed to work. 
-- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From mal@lemburg.com Fri Jan 18 09:47:03 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 18 Jan 2002 10:47:03 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <0F82FB1B-0B8D-11D6-B884-003065517236@oratrix.nl> Message-ID: <3C47EF17.37CE6126@lemburg.com> Jack Jansen wrote: > > On Thursday, January 17, 2002, at 11:29 AM, M.-A. Lemburg wrote: > > > I am more in favour of > > exposing the pickle reduce API through "O@", that is > > have PyArgTuple_Parse() call the .__reduce__() method > > of the object. This will then return (factory, state_tuple) > > and these could then be exposed to the C function via two > > PyObject*. > > You've suggested this before, but at that time I ignored it > because it made absolutely no sense to me. "pickle" triggers one > set of ideas for me, "reduce" triggers a different set, "factory > function" yet another different set. None of these sets of ideas > have the least resemblance to what I'm trying to do:-) The idea is simple but extends what you are trying to achieve (I gave an example on how to use this somewhere in the "wrapper" thread). Basically, you'll just want to use the state tuple to access the underlying void* C pointer via a PyCObject which does the wrapping of the pointer. The "pickle" mechanism would store the PyCObject in the state tuple which you could then access to get at the C pointer. This may sound complicated at first, but it provides much more flexibility w/r to more complex objects, e.g. the method you have in mind only supports wrapping a single C pointer; the "pickle" mechanism can potentially handle any serializable object. > I gave a fairly complete example (using calldll from Python to > wrap a function that returns a Mac WindowObject) last week, > could you explain how you would implement this with pickle, > reduce and factory functions? Sorry, no time for that ... I've got an important business trip next week which needs to be prepared. Please bring this up again after next week. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jack@oratrix.com Fri Jan 18 10:24:01 2002 From: jack@oratrix.com (Jack Jansen) Date: Fri, 18 Jan 2002 11:24:01 +0100 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: <3C47EF17.37CE6126@lemburg.com> Message-ID: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> On Friday, January 18, 2002, at 10:47 , M.-A. Lemburg wrote: > Jack Jansen wrote: >> >> On Thursday, January 17, 2002, at 11:29 AM, M.-A. Lemburg wrote: >> >>> I am more in favour of >>> exposing the pickle reduce API through "O@", that is >>> have PyArgTuple_Parse() call the .__reduce__() method >>> of the object. This will then return (factory, state_tuple) >>> and these could then be exposed to the C function via two >>> PyObject*. >> >> You've suggested this before, but at that time I ignored it >> because it made absolutely no sense to me. "pickle" triggers one >> set of ideas for me, "reduce" triggers a different set, "factory >> function" yet another different set. None of these sets of ideas >> have the least resemblance to what I'm trying to do:-) > > The idea is simple but extends what you are trying to > achieve (I gave an example on how to use this somewhere > in the "wrapper" thread). 
Basically, you'll just want to > use the state tuple to access the underlying void* C pointer > via a PyCObject which does the wrapping of the pointer. > The "pickle" mechanism would store the PyCObject in the > state tuple which you could then access to get at the > C pointer. > I think you're missing a few points here. First of all, my objects aren't PyCObjects but other extension objects. While the main pointer in the object could be wrapped in a PyCObject there may be other information in my objects that is important, such as a pointer to the dispose routine to call on the c-pointer when the Python object reaches refcount zero (and this pointer may change over time as ownership of, say, a button is passed from Python to the system). The _New and _Convert routines will know how to get from the C pointer to the *correct* object, i.e. normally there will be only one Python object for every C object. Also, the method seems rather complicated for doing a simple thing. The only thing I really want is a way to refer to an _New or _Convert method from Python code. The most reasonable way to do that seems to be by creating a way to get from te type object (which is available in Python) to those routines. Thomas' suggestion looked very promising, and simple too, until Guido said that unfortunately it couldn't be done. Your suggestion, as far as I understand it, looks complicated and probably inefficient too (remember the code will have to go through all these hoops every time it needs to convert an object from Python to C or vice versa). Correct me if I'm wrong, From mal@lemburg.com Fri Jan 18 11:27:11 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 18 Jan 2002 12:27:11 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> Message-ID: <3C48068F.5A239D16@lemburg.com> Jack Jansen wrote: > > >> On Thursday, January 17, 2002, at 11:29 AM, M.-A. Lemburg wrote: > >> > >>> I am more in favour of > >>> exposing the pickle reduce API through "O@", that is > >>> have PyArgTuple_Parse() call the .__reduce__() method > >>> of the object. This will then return (factory, state_tuple) > >>> and these could then be exposed to the C function via two > >>> PyObject*. > >> > >> You've suggested this before, but at that time I ignored it > >> because it made absolutely no sense to me. "pickle" triggers one > >> set of ideas for me, "reduce" triggers a different set, "factory > >> function" yet another different set. None of these sets of ideas > >> have the least resemblance to what I'm trying to do:-) > > > > The idea is simple but extends what you are trying to > > achieve (I gave an example on how to use this somewhere > > in the "wrapper" thread). Basically, you'll just want to > > use the state tuple to access the underlying void* C pointer > > via a PyCObject which does the wrapping of the pointer. > > The "pickle" mechanism would store the PyCObject in the > > state tuple which you could then access to get at the > > C pointer. > > > I think you're missing a few points here. First of all, my objects > aren't PyCObjects but other extension objects. I know. The idea is that either you add a .__reduce__ method to the extension objects or register their types with a registry comparable to copyreg. 
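(For illustration, a rough Python-level sketch of what such a registry could look like. Every name below -- register_wrapper, reduce_wrapper, the plain dict -- is invented for this example; a real implementation would live in C next to copy_reg and the "O@" handling.)

    _wrapper_registry = {}

    def register_wrapper(ext_type, factory, extract):
        # factory(state) -> new wrapper object
        # extract(obj)   -> state tuple, typically holding a PyCObject
        #                   that wraps the underlying void* pointer
        _wrapper_registry[ext_type] = (factory, extract)

    def reduce_wrapper(obj):
        # What an "O@"-style format could do behind the scenes: hand the
        # C function a factory callable plus a state tuple, much like
        # obj.__reduce__() does for pickle.
        factory, extract = _wrapper_registry[type(obj)]
        return factory, extract(obj)

A wrapper module would call register_wrapper() once at init time with its _New/_Convert-style routines.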
> While the main pointer in > the object could be wrapped in a PyCObject there may be other > information in my objects that is important, such as a pointer to the > dispose routine to call on the c-pointer when the Python object reaches > refcount zero (and this pointer may change over time as ownership of, > say, a button is passed from Python to the system). Note that PyCObjects support all of this. It's not important in this context, though. The PyCObject is only used to wrap the raw pointer; the factory function then takes this pointer and creates one of your extension object out of it. > The _New and > _Convert routines will know how to get from the C pointer to the > *correct* object, i.e. normally there will be only one Python object for > every C object. That's also possible using the "pickle" approach. > Also, the method seems rather complicated for doing a simple thing. The > only thing I really want is a way to refer to an _New or _Convert method > from Python code. The most reasonable way to do that seems to be by > creating a way to get from te type object (which is available in Python) > to those routines. Thomas' suggestion looked very promising, and simple > too, until Guido said that unfortunately it couldn't be done. Your > suggestion, as far as I understand it, looks complicated and probably > inefficient too (remember the code will have to go through all these > hoops every time it needs to convert an object from Python to C or vice > versa). It is more complicated, but also more flexible. Plus it builds on techniques which are already applied in Python's pickle mechanism. Note that by adding a tp_reduce slot, the overhead of calling a Python function could be kept reasonable. Helper functions could aid in accessing the C pointer which is stored in the state tuple. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Fri Jan 18 15:09:30 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 18 Jan 2002 10:09:30 -0500 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: Your message of "Mon, 14 Jan 2002 15:04:17 CST." <15427.18385.906669.456387@12-248-41-177.client.attbi.com> References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> <15427.15272.171558.1993@12-248-41-177.client.attbi.com> <200201142017.PAA12356@cj20424-a.reston1.va.home.com> <15427.18385.906669.456387@12-248-41-177.client.attbi.com> Message-ID: <200201181509.KAA09399@cj20424-a.reston1.va.home.com> What's the current thinking about making docstrings optional? Does everybody agree on Gustavo's patch? 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=505375&group_id=5470 --Guido van Rossum (home page: http://www.python.org/~guido/) From nas@python.ca Fri Jan 18 15:15:54 2002 From: nas@python.ca (Neil Schemenauer) Date: Fri, 18 Jan 2002 07:15:54 -0800 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <200201181509.KAA09399@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Jan 18, 2002 at 10:09:30AM -0500 References: <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> <15427.15272.171558.1993@12-248-41-177.client.attbi.com> <200201142017.PAA12356@cj20424-a.reston1.va.home.com> <15427.18385.906669.456387@12-248-41-177.client.attbi.com> <200201181509.KAA09399@cj20424-a.reston1.va.home.com> Message-ID: <20020118071554.A17496@glacier.arctrix.com> Guido van Rossum wrote: > What's the current thinking about making docstrings optional? > > Does everybody agree on Gustavo's patch? 10% space saving? That doesn't seem to be worth the effort. OTOH, I'm not dealing with any platforms that are memory constrained right now. Neil From mal@lemburg.com Fri Jan 18 15:23:23 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 18 Jan 2002 16:23:23 +0100 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> <15427.15272.171558.1993@12-248-41-177.client.attbi.com> <200201142017.PAA12356@cj20424-a.reston1.va.home.com> <15427.18385.906669.456387@12-248-41-177.client.attbi.com> <200201181509.KAA09399@cj20424-a.reston1.va.home.com> Message-ID: <3C483DEB.5B9D12A6@lemburg.com> Guido van Rossum wrote: > > What's the current thinking about making docstrings optional? > > Does everybody agree on Gustavo's patch? > > http://sourceforge.net/tracker/?func=detail&atid=305470&aid=505375&group_id=5470 +1. This will help Python embedders and porters to embedded systems a lot. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From barry@zope.com Fri Jan 18 15:26:06 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Fri, 18 Jan 2002 10:26:06 -0500 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] References: <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> <15427.15272.171558.1993@12-248-41-177.client.attbi.com> <200201142017.PAA12356@cj20424-a.reston1.va.home.com> <15427.18385.906669.456387@12-248-41-177.client.attbi.com> <200201181509.KAA09399@cj20424-a.reston1.va.home.com> <20020118071554.A17496@glacier.arctrix.com> Message-ID: <15432.16014.869439.363615@anthem.wooz.org> >>>>> "NS" == Neil Schemenauer writes: >> What's the current thinking about making docstrings optional? >> Does everybody agree on Gustavo's patch? NS> 10% space saving? That doesn't seem to be worth the effort. NS> OTOH, I'm not dealing with any platforms that are memory NS> constrained right now. Personally I don't care either for the same reasons. I'll just note that what Emacs used to do (maybe it still does, I dunno), is extract all its inlined docstrings into a separate file which could be thrown away if you didn't want to pay for the bloat. All that complexity was built in a time when 300KB or so of docstrings really could make a huge difference for download times or storage resources. -Barry From mal@lemburg.com Fri Jan 18 15:42:09 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 18 Jan 2002 16:42:09 +0100 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] References: <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> <15427.15272.171558.1993@12-248-41-177.client.attbi.com> <200201142017.PAA12356@cj20424-a.reston1.va.home.com> <15427.18385.906669.456387@12-248-41-177.client.attbi.com> <200201181509.KAA09399@cj20424-a.reston1.va.home.com> <20020118071554.A17496@glacier.arctrix.com> <15432.16014.869439.363615@anthem.wooz.org> Message-ID: <3C484251.9579FEDF@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "NS" == Neil Schemenauer writes: > > >> What's the current thinking about making docstrings optional? > >> Does everybody agree on Gustavo's patch? > > NS> 10% space saving? That doesn't seem to be worth the effort. > NS> OTOH, I'm not dealing with any platforms that are memory > NS> constrained right now. > > Personally I don't care either for the same reasons. I'll just note > that what Emacs used to do (maybe it still does, I dunno), is extract > all its inlined docstrings into a separate file which could be thrown > away if you didn't want to pay for the bloat. All that complexity was > built in a time when 300KB or so of docstrings really could make a > huge difference for download times or storage resources. You should also consider the possibility of using the macros for translating the docs-strings. They are a form of markup. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From barry@zope.com Fri Jan 18 15:46:48 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Fri, 18 Jan 2002 10:46:48 -0500 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] References: <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> <15427.15272.171558.1993@12-248-41-177.client.attbi.com> <200201142017.PAA12356@cj20424-a.reston1.va.home.com> <15427.18385.906669.456387@12-248-41-177.client.attbi.com> <200201181509.KAA09399@cj20424-a.reston1.va.home.com> <20020118071554.A17496@glacier.arctrix.com> <15432.16014.869439.363615@anthem.wooz.org> <3C484251.9579FEDF@lemburg.com> Message-ID: <15432.17256.768942.156692@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> You should also consider the possibility of using the macros MAL> for translating the docs-strings. They are a form of markup. Good point! -Barry From jack@oratrix.com Fri Jan 18 16:23:30 2002 From: jack@oratrix.com (Jack Jansen) Date: Fri, 18 Jan 2002 17:23:30 +0100 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <3C483DEB.5B9D12A6@lemburg.com> Message-ID: On Friday, January 18, 2002, at 04:23 , M.-A. Lemburg wrote: > Guido van Rossum wrote: >> >> What's the current thinking about making docstrings optional? >> >> Does everybody agree on Gustavo's patch? >> >> http://sourceforge.net/tracker/?func=detail&atid=305470&aid=505375&group_id= >> 5470 > > +1. > > This will help Python embedders and porters to embedded systems > a lot. +1. Same reasoning. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From paul@prescod.net Fri Jan 18 18:38:05 2002 From: paul@prescod.net (Paul Prescod) Date: Fri, 18 Jan 2002 10:38:05 -0800 Subject: [Python-Dev] Utopian String Interpolation References: <20020117175646.034B5E8C8@waltz.rahul.net> <200201171825.NAA00602@cj20424-a.reston1.va.home.com> Message-ID: <3C486B8D.1E30C93D@prescod.net> I think that something in particular that Paul S. said got under your skin (and there was something he said that could certainly get under a person's skin). I'm pretty sure it isn't now a policy to rudely reject suggestions from people you haven't heard of! Until I went back through the thread I felt as Aahz did that your rejection was somewhat severe in tone. I think you (still) agree that people should not be afraid of (politely) stating their opinions in python-dev, even when those opinions disagree with yours. Or if there is an unspoken rule that unproven developers shouldn't be in python-dev then maybe we should just make it a spoken rule. But I'm most confident of the theory that you snapped at one person in particular because of something he said. Paul Prescod Guido van Rossum wrote: > > > >Paul Svensson: > > >> > > >> I would certainly claim that an unrecognized escape sequence _is_ wrong. > > > > > Guido van Rossum wrote: > > > Then you are wrong. Go away and design your own language. > > > Aahz: > > Hey! That's a bit harsh. I'm not going to campaign to make > > unrecognized escape sequences a syntax error, but not raising a syntax > > error does seem to be against Python's principles. > > Whatever. Who is Paul Svensson and what is he doing in python-dev? 
> > --Guido van Rossum (home page: http://www.python.org/~guido/) > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From martin@v.loewis.de Fri Jan 18 18:42:22 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 18 Jan 2002 19:42:22 +0100 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> (message from Jack Jansen on Fri, 18 Jan 2002 11:24:01 +0100) References: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> Message-ID: <200201181842.g0IIgMc01444@mira.informatik.hu-berlin.de> From thomas.heller@ion-tof.com Fri Jan 18 18:56:45 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 18 Jan 2002 19:56:45 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> <200201181842.g0IIgMc01444@mira.informatik.hu-berlin.de> Message-ID: <05fc01c1a051$f1c4db90$e000a8c0@thomasnotebook> From: "Martin v. Loewis" To: Cc: ; ; Sent: Friday, January 18, 2002 7:42 PM Subject: Re: [Python-Dev] Extending types in C - help needed > Hmm, not very much help ;-) Thomas From thomas.heller@ion-tof.com Fri Jan 18 19:01:16 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 18 Jan 2002 20:01:16 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> Message-ID: <060801c1a052$93d5a860$e000a8c0@thomasnotebook> > > Currently (after quite some time) I have the impression that you > > cannot create a subtype of PyType_Type in C because PyType_Type > > ends in a variable sized array, at least not in this way: > > > > struct { > > PyTypeObject type; > > ...additional fields... > > } WrapperType_Type; > > > > Can someone confirm this? > > Yes, alas. The type you would have to declare is 'etype', a private > type in typeobject.c. Does this mean this is the wrong route, or is it absolute impossible to create a subtype of PyType_Type in C with additional slots? Any tips about the route to take? Thanks, Thomas From martin@v.loewis.de Fri Jan 18 19:36:20 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 18 Jan 2002 20:36:20 +0100 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> (message from Jack Jansen on Fri, 18 Jan 2002 11:24:01 +0100) References: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> Message-ID: <200201181936.g0IJaKu01691@mira.informatik.hu-berlin.de> > Also, the method seems rather complicated for doing a simple thing. The > only thing I really want is a way to refer to an _New or _Convert method > from Python code. I believe the attached code implements your requirements. In particular, see PyArg_GenericCopy for an application that extracts a void* from an object through a type-safe protocol, then creates a clone of the original object through the same protocol. Both extractor and creator function are associated with the type object. 
To see this work in Python, run >>> import handle >>> x=handle.new(10) >>> x >>> y=handle.copy(x) >>> y Regards, Martin #include "Python.h" /************* Generic Converters ***************/ struct converters{ PyObject* (*create)(void*); int (*extract)(PyObject*, void**); }; char descr_string[] = "calldll converter structure"; void PyArg_AddConverters(PyTypeObject *type, struct converters* convs) { PyObject *cobj = PyCObject_FromVoidPtrAndDesc(convs, descr_string, NULL); PyDict_SetItemString(type->tp_dict, "__calldll__", cobj); Py_DECREF(cobj); } struct converters* PyArg_GetConverters(PyTypeObject *type) { PyObject *cobj; void *descr; cobj = PyObject_GetAttrString((PyObject*)type, "__calldll__"); if (!cobj) return NULL; descr = PyCObject_GetDesc(cobj); if (!descr) return NULL; if (descr != descr_string){ PyErr_SetString(PyExc_TypeError, "invalid cobj"); return NULL; } return (struct converters*)PyCObject_AsVoidPtr(cobj); } PyObject *PyArg_Create(PyTypeObject* type, void * value) { struct converters *convs = PyArg_GetConverters(type); if (!convs) return NULL; return convs->create(value); } int PyArg_Extract(PyObject* obj, void** value) { struct converters *convs = PyArg_GetConverters(obj->ob_type); if (!convs) return -1; convs->extract(obj, value); return 0; } PyObject* PyArg_GenericCopy(PyObject* obj) { void *tmp; if (PyArg_Extract(obj, &tmp)) return NULL; return PyArg_Create(obj->ob_type, tmp); } /************* End Generic Converters ***************/ typedef struct { PyObject_HEAD int handle; } HandleObject; staticforward PyTypeObject Handle_Type; #define HandleObject_Check(v) ((v)->ob_type == &Handle_Type) static HandleObject * newHandleObject(int i) { HandleObject *self; self = PyObject_New(HandleObject, &Handle_Type); if (self == NULL) return NULL; self->handle = i; return self; } /* Handle methods */ static void Handle_dealloc(HandleObject *self) { PyObject_Del(self); } /**************** Generic Converters: Handle support ***************/ static PyObject* handle_conv_new(void *s){ return (PyObject*)newHandleObject((int)s); } static int handle_conv_extract(PyObject *o, void **dest){ HandleObject *h = (HandleObject*)o; *dest = (void*)h->handle; return 0; } struct converters HandleConvs = { handle_conv_new, handle_conv_extract }; /**************** Generic Converters: Handle support ***************/ statichere PyTypeObject Handle_Type = { /* The ob_type field must be initialized in the module init function * to be portable to Windows without using C++. 
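 * (That is what inithandle() below does: it sets ob_type, calls
 * PyType_Ready(), and then PyArg_AddConverters() stores the converter
 * table in this type's tp_dict under the key "__calldll__".)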
*/ PyObject_HEAD_INIT(NULL) 0, /*ob_size*/ "handle.Handle", /*tp_name*/ sizeof(HandleObject), /*tp_basicsize*/ 0, /*tp_itemsize*/ /* methods */ (destructor)Handle_dealloc, /*tp_dealloc*/ 0, /*tp_print*/ 0, /*tp_getattr*/ 0, /*tp_setattr*/ 0, /*tp_compare*/ 0, /*tp_repr*/ 0, /*tp_as_number*/ 0, /*tp_as_sequence*/ 0, /*tp_as_mapping*/ 0, /*tp_hash*/ 0, /*tp_call*/ 0, /*tp_str*/ 0, /*tp_getattro*/ 0, /*tp_setattro*/ 0, /*tp_as_buffer*/ Py_TPFLAGS_DEFAULT, /*tp_flags*/ }; /* --------------------------------------------------------------------- */ static PyObject * xx_new(PyObject *self, PyObject *args) { HandleObject *rv; int h; if (!PyArg_ParseTuple(args, "i:new", &h)) return NULL; rv = newHandleObject(h); if ( rv == NULL ) return NULL; return (PyObject *)rv; } static PyObject * xx_copy(PyObject *self, PyObject *args) { PyObject *obj; if (!PyArg_ParseTuple(args, "O:copy", &obj)) return NULL; return PyArg_GenericCopy(obj); } static PyMethodDef xx_methods[] = { {"new", xx_new, METH_VARARGS}, {"copy", xx_copy, METH_VARARGS}, {NULL, NULL} /* sentinel */ }; DL_EXPORT(void) inithandle(void) { PyObject *m; Handle_Type.ob_type = &PyType_Type; PyType_Ready(&Handle_Type); PyArg_AddConverters(&Handle_Type, &HandleConvs); /* Create the module and add the functions */ m = Py_InitModule("handle", xx_methods); } From sdm7g@Virginia.EDU Fri Jan 18 19:52:18 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Fri, 18 Jan 2002 14:52:18 -0500 (EST) Subject: [Python-Dev] (PyMapping|PyDict|PyObject)_DelItemString [was: [Pythonmac-SIG] pyobjc.so ] In-Reply-To: Message-ID: [ Background note for cc: to python-dev: pyobjc.so builds under both python2.1.2 and python2.2. It works under 2.1.2, but under 2.2, it gives a 'Failure linking new module' error. ] Added a call to NSLinkEditError to get back more info from the error (I'll submit this as a patch to SF after I clean it up a bit.): >>> import pyobjc Traceback (most recent call last): File "", line 1, in ? ImportError: dyld: /usr/local/src/Python-2.2/python.exe Undefined symbols: _PyObject_DelItemString Failure linking new module >>> grepping for this in 2.1.2 finds nothing. In 2.2, there seems to be one occurance: grep PyObject_DelItemString */*.[ch] Include/abstract.h:#define PyMapping_DelItemString(O,K) PyObject_DelItemString((O),(K)) Searching for PyMapping_DelItemString, it looks like this changed from PyDict_DelItemString() in 2.1.2 to PyObject_DelItemString() in 2.2: dm7g% grep PyMapping_DelItemString Python-2.*/*/*.[ch] Python-2.1.2/Include/abstract.h: int PyMapping_DelItemString(PyObject *o, char *key); Python-2.1.2/Include/abstract.h:#define PyMapping_DelItemString(O,K) PyDict_DelItemString((O),(K)) Python-2.2/Include/abstract.h: int PyMapping_DelItemString(PyObject *o, char *key); Python-2.2/Include/abstract.h:#define PyMapping_DelItemString(O,K) PyObject_DelItemString((O),(K)) Is this change of name an inadvertant bug, or is it something that was intentionally changed, but incompletely? -- Steve From thomas.heller@ion-tof.com Fri Jan 18 20:06:23 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 18 Jan 2002 21:06:23 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> <200201181936.g0IJaKu01691@mira.informatik.hu-berlin.de> Message-ID: <072001c1a05b$ac8a08c0$e000a8c0@thomasnotebook> > > Also, the method seems rather complicated for doing a simple thing. The > > only thing I really want is a way to refer to an _New or _Convert method > > from Python code. 
> > I believe the attached code implements your requirements. Yes, this looks very much like what I had in mind, except that you demonstrate how to store and retrieve a C structure in the type's tp_dict. Nice intro into PyCObject! Thanks, Thomas From thomas.heller@ion-tof.com Fri Jan 18 20:23:11 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 18 Jan 2002 21:23:11 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> <200201181936.g0IJaKu01691@mira.informatik.hu-berlin.de> Message-ID: <07a001c1a05e$05170810$e000a8c0@thomasnotebook> [sorry if this is duplicated, I'm having mailer problems] > > Also, the method seems rather complicated for doing a simple thing. The > > only thing I really want is a way to refer to an _New or _Convert method > > from Python code. > > I believe the attached code implements your requirements. Yes, this looks very much like what I had in mind, except that you demonstrate how to store and retrieve a C structure in the type's tp_dict. Nice intro into PyCObject! Thanks, Thomas From martin@v.loewis.de Fri Jan 18 20:24:43 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 18 Jan 2002 21:24:43 +0100 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <200201181509.KAA09399@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Fri, 18 Jan 2002 10:09:30 -0500) References: <20020110224908.C884@ibook.distro.conectiva> <200201111334.g0BDYLh01331@mira.informatik.hu-berlin.de> <20020111122105.B1808@ibook.distro.conectiva> <200201112347.g0BNlWk01567@mira.informatik.hu-berlin.de> <3C402BBA.1040806@lemburg.com> <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> <15427.15272.171558.1993@12-248-41-177.client.attbi.com> <200201142017.PAA12356@cj20424-a.reston1.va.home.com> <15427.18385.906669.456387@12-248-41-177.client.attbi.com> <200201181509.KAA09399@cj20424-a.reston1.va.home.com> Message-ID: <200201182024.g0IKOhL01893@mira.informatik.hu-berlin.de> > What's the current thinking about making docstrings optional? > > Does everybody agree on Gustavo's patch? Looks good to me. Martin From martin@v.loewis.de Fri Jan 18 20:27:24 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 18 Jan 2002 21:27:24 +0100 Subject: [niemeyer@conectiva.com: Re: [Python-Dev] Python's footprint] In-Reply-To: <3C484251.9579FEDF@lemburg.com> (mal@lemburg.com) References: <20020114093053.C1325@ibook.distro.conectiva> <3C42C9A5.975FA5B8@lemburg.com> <20020114104146.A2607@ibook.distro.conectiva> <3C4326D2.F2A82030@lemburg.com> <200201141847.NAA10894@cj20424-a.reston1.va.home.com> <3C4337D5.B54330B1@lemburg.com> <15427.15272.171558.1993@12-248-41-177.client.attbi.com> <200201142017.PAA12356@cj20424-a.reston1.va.home.com> <15427.18385.906669.456387@12-248-41-177.client.attbi.com> <200201181509.KAA09399@cj20424-a.reston1.va.home.com> <20020118071554.A17496@glacier.arctrix.com> <15432.16014.869439.363615@anthem.wooz.org> <3C484251.9579FEDF@lemburg.com> Message-ID: <200201182027.g0IKROV01896@mira.informatik.hu-berlin.de> > You should also consider the possibility of using the macros > for translating the docs-strings. They are a form of markup. While that is true, most of the current strings are marked-up already, by means of having an __doc__ suffix. 
I have an extractor that understands this form of markup, and the Python .pot file in CVS has those strings extracted. Regards, Martin From martin@v.loewis.de Fri Jan 18 20:53:30 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 18 Jan 2002 21:53:30 +0100 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: <072001c1a05b$ac8a08c0$e000a8c0@thomasnotebook> (thomas.heller@ion-tof.com) References: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> <200201181936.g0IJaKu01691@mira.informatik.hu-berlin.de> <072001c1a05b$ac8a08c0$e000a8c0@thomasnotebook> Message-ID: <200201182053.g0IKrU001987@mira.informatik.hu-berlin.de> > Yes, this looks very much like what I had in mind, except that you > demonstrate how to store and retrieve a C structure in the type's > tp_dict. Indeed. I also think it is more appropriate than either a new metatype or a ParseTuple extension for the problem at hand (supporting arbitrary types in calldll), for the following reasons: - There may be different ways of how an object converts to a "native" type. In particular, in some cases, ParseTuple may need to return (fill out) something more complex than a void*, something that calldll cannot support by nature. - A type may need to provide various independent extensions to the standard protocols, e.g. it may provide "give me a Unicode doc string" in addition to "give me a conversion function to void*". In this case, you'd need multiple inheritance on the metatype level, something that does not reflect well in C. For Python, it is much more common not to care at all about inheritance. Instead, just access the protocol, and expect an exception if it is not supported. Also notice that this *does* make use of new-style classes: In 2.1, types did not have a tp_dict slot. Of course, the PyType_Ready call should go immediately before the place where tp_dict is accessed, and a check should be added whether tp_flags contains Py_TPFLAGS_HAVE_CLASS. Regards, Martin From guido@python.org Fri Jan 18 20:57:21 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 18 Jan 2002 15:57:21 -0500 Subject: [Python-Dev] (PyMapping|PyDict|PyObject)_DelItemString [was: [Pythonmac-SIG] pyobjc.so ] In-Reply-To: Your message of "Fri, 18 Jan 2002 14:52:18 EST." References: Message-ID: <200201182057.PAA21273@cj20424-a.reston1.va.home.com> > >>> import pyobjc > Traceback (most recent call last): > File "", line 1, in ? > ImportError: dyld: /usr/local/src/Python-2.2/python.exe Undefined symbols: > _PyObject_DelItemString > Failure linking new module > >>> > > grepping for this in 2.1.2 finds nothing. > > In 2.2, there seems to be one occurance: > > grep PyObject_DelItemString */*.[ch] > Include/abstract.h:#define PyMapping_DelItemString(O,K) PyObject_DelItemString((O),(K)) > > Searching for PyMapping_DelItemString, it looks like this changed from > PyDict_DelItemString() in 2.1.2 to PyObject_DelItemString() in 2.2: > > dm7g% grep PyMapping_DelItemString Python-2.*/*/*.[ch] > Python-2.1.2/Include/abstract.h: int PyMapping_DelItemString(PyObject *o, char *key); > Python-2.1.2/Include/abstract.h:#define PyMapping_DelItemString(O,K) PyDict_DelItemString((O),(K)) > Python-2.2/Include/abstract.h: int PyMapping_DelItemString(PyObject *o, char *key); > Python-2.2/Include/abstract.h:#define PyMapping_DelItemString(O,K) PyObject_DelItemString((O),(K)) > > > Is this change of name an inadvertant bug, or is it something that > was intentionally changed, but incompletely? The latter. 
See: http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=498915 --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Fri Jan 18 20:57:48 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 18 Jan 2002 21:57:48 +0100 Subject: [Python-Dev] (PyMapping|PyDict|PyObject)_DelItemString [was: [Pythonmac-SIG] pyobjc.so ] In-Reply-To: (message from Steven Majewski on Fri, 18 Jan 2002 14:52:18 -0500 (EST)) References: Message-ID: <200201182057.g0IKvms01998@mira.informatik.hu-berlin.de> > Is this change of name an inadvertant bug, or is it something that > was intentionally changed, but incompletely? This is bug #498915, fixed in abstract.h 2.43 and 2.42.6.1, abstract.c 2.94 and 2.93.6.1 Regards, Martin From thomas.heller@ion-tof.com Fri Jan 18 21:21:59 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 18 Jan 2002 22:21:59 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> <200201181936.g0IJaKu01691@mira.informatik.hu-berlin.de> <072001c1a05b$ac8a08c0$e000a8c0@thomasnotebook> <200201182053.g0IKrU001987@mira.informatik.hu-berlin.de> Message-ID: <094901c1a066$3c54e6f0$e000a8c0@thomasnotebook> > Also notice that this *does* make use of new-style classes: In 2.1, > types did not have a tp_dict slot. Of course, the PyType_Ready call > should go immediately before the place where tp_dict is accessed, and > a check should be added whether tp_flags contains > Py_TPFLAGS_HAVE_CLASS. Wouldn't it suffice to check for tp_dict != NULL (after the call to PyType_Ready of course)? Hm. What does Py_TPFLAGS_HAVE_CLASS mean exactly? Or, better, since TPFLAGS_DEFAULT contains TPFLAGS_HAVE_CLASS, what does it mean when Py_TPFLAGS_HAVE_CLASS is NOT in tp_flags? Does it mean that this is a 'new style' type object? Thomas From martin@v.loewis.de Fri Jan 18 21:32:23 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Fri, 18 Jan 2002 22:32:23 +0100 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: <094901c1a066$3c54e6f0$e000a8c0@thomasnotebook> (thomas.heller@ion-tof.com) References: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> <200201181936.g0IJaKu01691@mira.informatik.hu-berlin.de> <072001c1a05b$ac8a08c0$e000a8c0@thomasnotebook> <200201182053.g0IKrU001987@mira.informatik.hu-berlin.de> <094901c1a066$3c54e6f0$e000a8c0@thomasnotebook> Message-ID: <200201182132.g0ILWNf02114@mira.informatik.hu-berlin.de> > Wouldn't it suffice to check for tp_dict != NULL (after the call > to PyType_Ready of course)? No, see below (although I must admit that I wrote "Right" here first :-) > Hm. What does Py_TPFLAGS_HAVE_CLASS mean exactly? According to the documentation, it means that the underlying TypeObject structure has the necessary fields in its C declaration. > Or, better, since TPFLAGS_DEFAULT contains TPFLAGS_HAVE_CLASS, > what does it mean when Py_TPFLAGS_HAVE_CLASS is NOT in tp_flags? It means you have been loading a module from an earlier Python version, which had a different setting for TPFLAGS_DEFAULTS, and a shorter definition of the TypeObject. If you try to access tp_dict in such an object, you are accessing random memory. This may immediately crash, or only crash when you pass the pointer you got to the dictionary functions. 
Regards, Martin From guido@python.org Fri Jan 18 22:12:29 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 18 Jan 2002 17:12:29 -0500 Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: Your message of "Fri, 18 Jan 2002 10:38:05 PST." <3C486B8D.1E30C93D@prescod.net> References: <20020117175646.034B5E8C8@waltz.rahul.net> <200201171825.NAA00602@cj20424-a.reston1.va.home.com> <3C486B8D.1E30C93D@prescod.net> Message-ID: <200201182212.RAA26732@cj20424-a.reston1.va.home.com> > I think that something in particular that Paul S. said got under your > skin (and there was something he said that could certainly get under a > person's skin). I'm pretty sure it isn't now a policy to rudely reject > suggestions from people you haven't heard of! Until I went back through > the thread I felt as Aahz did that your rejection was somewhat severe in > tone. I think you (still) agree that people should not be afraid of > (politely) stating their opinions in python-dev, even when those > opinions disagree with yours. Or if there is an unspoken rule that > unproven developers shouldn't be in python-dev then maybe we should just > make it a spoken rule. But I'm most confident of the theory that you > snapped at one person in particular because of something he said. > > Paul Prescod He harped at the same issue in three consecutive message without explaining his position. --Guido van Rossum (home page: http://www.python.org/~guido/) From sdm7g@Virginia.EDU Fri Jan 18 22:49:38 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Fri, 18 Jan 2002 17:49:38 -0500 (EST) Subject: [Python-Dev] Re: several messages In-Reply-To: <200201182057.g0IKvms01998@mira.informatik.hu-berlin.de> Message-ID: On Fri, 18 Jan 2002, Guido van Rossum wrote: > http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=498915 On Fri, 18 Jan 2002, Martin v. Loewis wrote: > This is bug #498915, fixed in abstract.h 2.43 and 2.42.6.1, > abstract.c 2.94 and 2.93.6.1 Thanks. I changed it back to PyDict_... With that patch, pyobjc seems to build and work with Python-2.2 as well as 2.1.2. -- Steve. From jason@jorendorff.com Fri Jan 18 23:18:27 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Fri, 18 Jan 2002 17:18:27 -0600 Subject: [Python-Dev] Utopian String Interpolation In-Reply-To: <200201182212.RAA26732@cj20424-a.reston1.va.home.com> Message-ID: Paul Prescod: > > [...] But I'm most confident of the theory that you > > snapped at one person in particular because of something he said. Guido: > He harped at the same issue in three consecutive message without > explaining his position. Actually I was quite happy with the thread. At runtime, Python tends to complain about iffy situations, even situations that other languages might silently accept. For example: print 50 + " percent" # TypeError x = [1, 2, 3]; x.remove(4) # ValueError x = {}; print x[3] # KeyError a, b = "x,y,z,z,y".split() # ValueError x.append(1, 2) # TypeError, recently print u"\N{EURO SIGN}" # UnicodeError I'm not complaining. I like the pickiness. But the Python compiler (that is, Python's syntax) tends to be more forgiving. Examples: - Inconsistent use of tabs and spaces. (Originally handled by tabnanny.py; now an optional warning in Python itself.) - Useless or probably-useless expressions, like these: def g(f): os.environ['EDITOR'] # does nothing with value f.write(xx), f.write(yy) # should be ; not , f.close # obvious mistake (PyChecker catches the last one.) 
- Non-escaping backslashes in strings (there is a well-known reason for this one; but the reason no longer exists, in new code anyway, since 1.5.) So we catch things like this with static analysis tools like tabnanny.py, or lately PyChecker. If Guido finds any of these syntax-checks compelling enough, he can always incorporate them into Python whenever (but don't hold your breath). Again, you'll get no complaints from me on this. But I am curious. Is this apparent difference in pickiness a design choice? Or is it just harder to write picky compilers than picky libraries? Or am I seeing something that's not really there? ## Jason Orendorff http://www.jorendorff.com/ From Jack.Jansen@oratrix.nl Sat Jan 19 00:07:56 2002 From: Jack.Jansen@oratrix.nl (Jack Jansen) Date: Sat, 19 Jan 2002 01:07:56 +0100 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: <200201181936.g0IJaKu01691@mira.informatik.hu-berlin.de> Message-ID: <9709BBF2-0C70-11D6-BED2-003065517236@oratrix.nl> On Friday, January 18, 2002, at 08:36 PM, Martin v. Loewis wrote: >> Also, the method seems rather complicated for doing a simple >> thing. The >> only thing I really want is a way to refer to an _New or >> _Convert method >> from Python code. > > I believe the attached code implements your requirements. > Martin, hats off! This does exactly what I want, and it does so in a pretty generalized way. Actually in _such_ a generalized way that I think this should be documented loud and clear. Looking at it a bit more, how about storing each function pointer in a separate PyCObject, and adding general APIs somewhere in the core void PyType_SetAnnotation(PyTypeObject *tp, char *name, char *descr, void *); void *PyType_GetAnnotation(PyTypeObject *tp, char *name, char *descr); (I've picked the name annotation here, because it sort-of feels like that, another name may bring the idea across better). > -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From tim.one@home.com Sat Jan 19 00:10:26 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 18 Jan 2002 19:10:26 -0500 Subject: [Python-Dev] deprecate input()? In-Reply-To: <15431.3203.415990.602525@beluga.mojam.com> Message-ID: [Skip Montanaro] > Yes, but what if the program containing calls to input() get shipped to > someone else's computer? It just seems to me that a) input is almost > never what you want to call and that b) it would seem to a naive > programmer to be the correct way to ask the user for a line of input. One of my favorite papers for the upcoming Python Conference describes the use of Python in a CAD system for chip design. The authors had indeed used input(), and didn't know that it eval'ed expressions. The program's users discovered it first, succumbing to a natural urge to type expressions in the input fields. One of the things that made this paper a favorite is that the authors didn't whine about this: to the contrary, they were delighted to get the kudos for Guido's good intuition about what a kick-ass input() function should do. guido-never-drives-before-a-few-stiff-drinks-either-ly y'rs - tim From martin@v.loewis.de Sat Jan 19 00:28:23 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Sat, 19 Jan 2002 01:28:23 +0100 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: <9709BBF2-0C70-11D6-BED2-003065517236@oratrix.nl> (message from Jack Jansen on Sat, 19 Jan 2002 01:07:56 +0100) References: <9709BBF2-0C70-11D6-BED2-003065517236@oratrix.nl> Message-ID: <200201190028.g0J0SN102701@mira.informatik.hu-berlin.de> > Martin, hats off! > > This does exactly what I want, and it does so in a pretty > generalized way. Actually in _such_ a generalized way that I > think this should be documented loud and clear. Thanks! > Looking at it a bit more, how about storing each function > pointer in a separate PyCObject, and adding general APIs > somewhere in the core > void PyType_SetAnnotation(PyTypeObject *tp, char *name, char > *descr, void *); > void *PyType_GetAnnotation(PyTypeObject *tp, char *name, char *descr); I'll happily add that to some recipe collection. However, before generalizing it, I'd like to see more use cases. There should, atleast, be a *second* application beyond calldll (or, perhaps even beyond MacPython). Generalizing from a single use case is not good. Regards, Martin From guido@python.org Sat Jan 19 03:38:56 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 18 Jan 2002 22:38:56 -0500 Subject: [Python-Dev] When to signal an error In-Reply-To: Your message of "Fri, 18 Jan 2002 17:18:27 CST." References: Message-ID: <200201190338.WAA28421@cj20424-a.reston1.va.home.com> (I'm changing the topic :-) > At runtime, Python tends to complain about iffy situations, > even situations that other languages might silently accept. "Other languages" being Perl or JavaScript? The situations you show here would all be errors in most languages that are compiled to machine code. > For example: > > print 50 + " percent" # TypeError > x = [1, 2, 3]; x.remove(4) # ValueError > x = {}; print x[3] # KeyError > a, b = "x,y,z,z,y".split() # ValueError > x.append(1, 2) # TypeError, recently > print u"\N{EURO SIGN}" # UnicodeError > > I'm not complaining. I like the pickiness. That's why you're using Python. :-) > But the Python compiler (that is, Python's syntax) tends to be > more forgiving. Examples: > > - Inconsistent use of tabs and spaces. (Originally handled > by tabnanny.py; now an optional warning in Python itself.) > - Useless or probably-useless expressions, like these: > def g(f): > os.environ['EDITOR'] # does nothing with value > f.write(xx), f.write(yy) # should be ; not , > f.close # obvious mistake > (PyChecker catches the last one.) > - Non-escaping backslashes in strings (there is a well-known > reason for this one; but the reason no longer exists, in new > code anyway, since 1.5.) > > So we catch things like this with static analysis tools like > tabnanny.py, or lately PyChecker. If Guido finds any of these > syntax-checks compelling enough, he can always incorporate them > into Python whenever (but don't hold your breath). > > Again, you'll get no complaints from me on this. But I am > curious. Is this apparent difference in pickiness a design > choice? Or is it just harder to write picky compilers than > picky libraries? Or am I seeing something that's not really > there? There's no unifying reason why thes examples are not errors. The first and last can be considered historical raisins -- the tabs/spaces mix was considered a good thing in the days when Python only ran on Unixoid systems where nobody would think about changing the display size for tabs; we know the reason for the last. 
But it's hard to change these without inconveniencing users, and there are other ways to deal with them (like picky tools). The three examples in the second item have in common that they are syntactically expressions but are used in a statement context. The problem here that any language designer is faced with: you would want to allow expressions with an obvious side-effect, but you would want to disallow expressions that obviously have no side-effects. But where to draw the line? Traditional parsing technology such as used in Python makes it hard to be very differentiating here; a good analysis of which expressions "make sense" and which ones don't can only be done during a later pass of the compiler. I believe that evertually some PyChecker-like technology will be incorporated in the Python compiler. The same happened to C compilers: the lint program became useless once GCC incorporated the same technology. But these warnings will always have a different status than purely syntactical error: there are often cases where the user knows better (for example, sometimes an attribute reference can have a desirable side effect). --Guido van Rossum (home page: http://www.python.org/~guido/) From neal@metaslash.com Sat Jan 19 19:25:18 2002 From: neal@metaslash.com (Neal Norwitz) Date: Sat, 19 Jan 2002 14:25:18 -0500 Subject: [Python-Dev] When to signal an error References: <200201190338.WAA28421@cj20424-a.reston1.va.home.com> Message-ID: <3C49C81E.C2D2F1BD@metaslash.com> Guido van Rossum wrote: > I believe that evertually some PyChecker-like technology will be > incorporated in the Python compiler. The same happened to C > compilers: the lint program became useless once GCC incorporated the > same technology. pychecker was (and still is) an experiment to me. But I think it would be great if the lessons from pychecker could be integrated into the compiler. Currently, I think there are 2 or 3 warnings which definitely fit this class: No global found, using ++/--, and expressions with no effect as Jason described. I have posted a patch on SF to demonstrate the feasibility of expressions with no effect: https://sourceforge.net/tracker/index.php?func=detail&aid=505826&group_id=5470&atid=305470 It should be pretty easy to warn about ++ and --. No global found would probably require another pass of the code after compilation. I'd be happy to help the process of integrating warnings into the compiler, however, I'm not sure how to proceed. Should pychecker be put into the standard library (users can now do: import pychecker.checker and all modules imported are checked by installing an __import__)? Should pychecker be added as a tool? Should a PEP be written? etc. > But these warnings will always have a different status than purely > syntactical error: there are often cases where the user knows better > (for example, sometimes an attribute reference can have a desirable > side effect). I agree. Neal From jason@jorendorff.com Sat Jan 19 23:16:42 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Sat, 19 Jan 2002 17:16:42 -0600 Subject: [Python-Dev] When to signal an error In-Reply-To: <3C49C81E.C2D2F1BD@metaslash.com> Message-ID: Neal Norwitz: > Guido van Rossum: > > But these warnings will always have a different status than purely > > syntactical error: there are often cases where the user knows better > > (for example, sometimes an attribute reference can have a desirable > > side effect). > > I agree. Here's what Pychecker finds in the standard library (as of 2.2). 
In each case, the expression is intended to raise an exception if the named variable or attribute doesn't exist. Each one could be rewritten (I'm curious as to the prevailing stylistic opinions on this): === code.py (lines 217 and 221) try: sys.ps1 except AttributeError: sys.ps1 = ">>> " try: sys.ps2 except AttributeError: sys.ps2 = "... " Could be rewritten: if not hasattr(sys, 'ps1'): sys.ps1 = ">>> " if not hasattr(sys, 'ps2'): sys.ps2 = "... " === locale.py (line 721) try: LC_MESSAGES except: pass else: __all__.append("LC_MESSAGES") Could be rewritten: if globals().has_key("LC_MESSAGES"): __all__.append("LC_MESSAGES") === pickle.py (line 58) try: UnicodeType except NameError: UnicodeType = None Could be rewritten: globals().setdefault('UnicodeType', None) ## Jason Orendorff http://www.jorendorff.com/ From jason@jorendorff.com Sat Jan 19 23:34:12 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Sat, 19 Jan 2002 17:34:12 -0600 Subject: [Python-Dev] When to signal an error In-Reply-To: <200201190338.WAA28421@cj20424-a.reston1.va.home.com> Message-ID: Guido van Rossum wrote: > Jason Orendorff wrote: > > At runtime, Python tends to complain about iffy situations, > > even situations that other languages might silently accept. > > "Other languages" being Perl or JavaScript? The situations you show > here would all be errors in most languages that are compiled to > machine code. > > > For example: > > print 50 + " percent" # TypeError > > x = [1, 2, 3]; x.remove(4) # ValueError > > x = {}; print x[3] # KeyError > > a, b = "x,y,z,z,y".split() # ValueError > > x.append(1, 2) # TypeError, recently > > print u"\N{EURO SIGN}" # UnicodeError Not to bicker, but Java only manages to reject 2 of the 6, both at compile time. The other 4 silently pass through the standard library without complaint. None cause exceptions during execution. ML makes no distinction between append(1, 2) and append((1, 2)), but that's a syntax thing... C++ STL remove() doesn't complain if it doesn't find anything to remove; nor does the C++ map<>::operator[]() complain if no entry exists. > > I'm not complaining. I like the pickiness. > > That's why you're using Python. :-) (laugh) You sell yourself short, Guido. :) I would still use Python even if (50 + " percent") started evaluating to "50 percent" tomorrow. ## Jason Orendorff http://www.jorendorff.com/ From martin@v.loewis.de Sun Jan 20 00:02:10 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 20 Jan 2002 01:02:10 +0100 Subject: [Python-Dev] When to signal an error In-Reply-To: References: Message-ID: <200201200002.g0K02Ao03439@mira.informatik.hu-berlin.de> > Each one could be rewritten (I'm curious as to the prevailing > stylistic opinions on this): I think those rewrites do not improve the code, see detailed comments below. > Could be rewritten: > if not hasattr(sys, 'ps1'): > sys.ps1 = ">>> " > if not hasattr(sys, 'ps2'): > sys.ps2 = "... " Using string literals when you mean attribute names is bad style. It just helps to trick the checker. Sometimes, you cannot avoid this style, but if you can, you should. > if globals().has_key("LC_MESSAGES"): > __all__.append("LC_MESSAGES") This combines the previous issue with the usage of globals(). I find it confusing to perform function calls to check for the presence of names. > try: > UnicodeType > except NameError: > UnicodeType = None > > Could be rewritten: > globals().setdefault('UnicodeType', None) Same issue here. 
If this needs to be rewritten, I'd prefer try: from types import UnicodeType except ImportError: UnicodeType = None Somebody might also change the "from types import *" to explicitly list the set of names that are requested, when changing this fragment. Regards, Martin From guido@python.org Sun Jan 20 00:53:41 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 19 Jan 2002 19:53:41 -0500 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: Your message of "Fri, 18 Jan 2002 20:01:16 +0100." <060801c1a052$93d5a860$e000a8c0@thomasnotebook> References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> Message-ID: <200201200053.TAA30250@cj20424-a.reston1.va.home.com> > > Yes, alas. The type you would have to declare is 'etype', a private > > type in typeobject.c. > > Does this mean this is the wrong route, or is it absolute impossible > to create a subtype of PyType_Type in C with additional slots? I wish I had time to explain this, but I don't. For now, you'll have to read how types are initialized in typeobject.c -- maybe there's a way, maybe there isn't. > Any tips about the route to take? It can be done easily dynamically. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Sun Jan 20 12:11:57 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 20 Jan 2002 13:11:57 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> <200201181936.g0IJaKu01691@mira.informatik.hu-berlin.de> Message-ID: <3C4AB40D.47E3E35@lemburg.com> [Martin's PyCObject based Handle object] This seems to be very close to the __reduce__ idea I posted on this thread a couple of days ago. Why not extend it to fully support this standard Python protocol ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From Boris_Lipner Sun Jan 20 12:49:52 2002 From: Boris_Lipner (Boris_Lipner) Date: Sun, 20 Jan 2002 15:49:52 +0300 Subject: [Python-Dev] cooperation Message-ID: <8515280648.20020120154952@nm.ru> Dear Sirs, For some technical reasons we have partially lost our Data Bank of art galleries. Please, write the address of your website, so that we could continue our cooperation. Our site is http://www.gallery-a.ru/ Best regards, Boris_Lipner v2004@nm.ru From martin@v.loewis.de Sun Jan 20 19:13:22 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 20 Jan 2002 20:13:22 +0100 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: <3C4AB40D.47E3E35@lemburg.com> (mal@lemburg.com) References: <7D823B3B-0BFD-11D6-B669-0030655234CE@oratrix.com> <200201181936.g0IJaKu01691@mira.informatik.hu-berlin.de> <3C4AB40D.47E3E35@lemburg.com> Message-ID: <200201201913.g0KJDMO01304@mira.informatik.hu-berlin.de> > This seems to be very close to the __reduce__ idea I posted > on this thread a couple of days ago. Why not extend it to > fully support this standard Python protocol ? Because it is not clear, to me, what specifically the semantics of this protocol is. I wrote it to support MacOS calldll. I cannot see applicability beyond this API. 
One of the strength of OO and polymorphism is precisely that users can freely extend the protocols that their objects support, without requiring *all* objects to support the protocol. A standard protocol should be clearly useful cross-platform, for many different types, in different applications. Regards, Martin From res0peyy@verizon.net Sun Jan 20 21:48:24 2002 From: res0peyy@verizon.net (Joshua 'The List' S.) Date: Sun, 20 Jan 2002 13:48:24 -0800 Subject: [Python-Dev] (no subject) Message-ID: <20020120214832.LDFQ4908.out020.verizon.net@there> From ping@lfw.org Sun Jan 20 22:23:15 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Sun, 20 Jan 2002 16:23:15 -0600 (CST) Subject: [Python-Dev] Python and Security In-Reply-To: <3C472FB3.81EAA5D5@prescod.net> Message-ID: "M.-A. Lemburg" wrote: > ... Note that Python hasn't really had a need > for Perl's "taint" because of this. I wouldn't want to see that > change in any way. On Thu, 17 Jan 2002, Paul Prescod wrote: > I am certainly not a Perl programmer but Python is also attackable > through the sorts of holes that "taint" is intended to avoid. Paul is right on the money. Tainting is a completely separate issue. That said, however, i wonder why security rarely comes up as an issue for Python. Is it because nobody expects security properties from the language? Does anyone know how much the restricted execution feature gets used? Is there anyone here that would use a tainting feature if it existed? It would be interesting to explore the possibilities for safe distributed programming in Python. Restricted execution mode and the ability to hook __import__ seem like a pretty strong starting point, and given a suitable cryptographic comm library, it might be feasible to get from there to capability-style distributed programming. IMHO, simplicity and readability are extremely important for a secure programming language, so that gives Python a great head start. (By the way, i'm planning to be at Python 10, and hope to see many of you there. As i'm looking for ways to keep costs down, would anyone be interested in splitting the cost of a hotel room in exchange for a roommate with a strange hairstyle? I'll be there Feb 4 to 7, three nights.) -- ?!ng From martin@v.loewis.de Sun Jan 20 22:37:11 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sun, 20 Jan 2002 23:37:11 +0100 Subject: [Python-Dev] Python and Security In-Reply-To: (message from Ka-Ping Yee on Sun, 20 Jan 2002 16:23:15 -0600 (CST)) References: Message-ID: <200201202237.g0KMbBY02366@mira.informatik.hu-berlin.de> > That said, however, i wonder why security rarely comes up as an > issue for Python. Is it because nobody expects security properties > from the language? Does anyone know how much the restricted > execution feature gets used? Is there anyone here that would use > a tainting feature if it existed? In my understanding, tainting is needed if you allow data received from remote to invoke arbitrary operations. In Python, there is only a short list where this might cause a problem: - invoking exec or eval on a string of unknown origin - unpickling an arbitrary string - performing getattr with a parameter of unknown origin. Because there are so few places where tainted data may cause problems, it never is an issue: people just intuitively know to avoid them. > It would be interesting to explore the possibilities for safe > distributed programming in Python. 
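(To make the getattr item above concrete, a small hypothetical CGI-style dispatcher; the form["action"] source and all of the names are invented for the example:)

    # Risky pattern: the attribute name comes straight from the request.
    #     handler = getattr(obj, form["action"])()
    # Safer pattern: only dispatch through an explicit whitelist.
    ALLOWED_ACTIONS = {"list": "do_list", "show": "do_show"}

    def dispatch(obj, action):
        try:
            method_name = ALLOWED_ACTIONS[action]
        except KeyError:
            raise ValueError("unknown action: %r" % (action,))
        return getattr(obj, method_name)()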
Not sure what this has to do with tainting, though: if you want to execute code you receive from untrusted sources, a sandbox is closer to what you need. Regards, Martin From barry@zope.com Sun Jan 20 23:01:44 2002 From: barry@zope.com (Barry A. Warsaw) Date: Sun, 20 Jan 2002 18:01:44 -0500 Subject: [Python-Dev] Python and Security References: <200201202237.g0KMbBY02366@mira.informatik.hu-berlin.de> Message-ID: <15435.19544.899789.631148@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: | - invoking exec or eval on a string of unknown origin | - unpickling an arbitrary string | - performing getattr with a parameter of unknown origin. Don't forget os.system(), popen(), and friends, i.e. passing unsanitized strings to the shell. In my my long rusty Perl experience, this was the most common reason to use taint strings. Python OTOH really has very little need to call out to the shell; almost everything you'd want to do that way can be done in pure Python. There are some opportunties for improving string sanitization for the few instances where os.system() is necessary. Most of the security issues I've had to deal with in Mailman have been in library modules -- or the use thereof, not in the language itself. Things like vulnerabilies in Cookie.py or pickle/marshal, or cross-site scripting exploits, that kind of thing. There are also more subtle issues that would be interesting to explore, like DoS attacks with thru-the-web regular expression searching, deliberate form confuddling, and some of the ttw code execution stuff that e.g. Zope gets into. Rexec is an incomplete solution to the latter. -Barry From paul@prescod.net Sun Jan 20 23:49:58 2002 From: paul@prescod.net (Paul Prescod) Date: Sun, 20 Jan 2002 15:49:58 -0800 Subject: [Python-Dev] Re: Python and Security References: Message-ID: <3C4B57A6.702BFC36@prescod.net> Ka-Ping Yee wrote: > >... > > That said, however, i wonder why security rarely comes up as an > issue for Python. I guess you didn't read comp.lang.python this week. ;) http://www.securityfocus.com/archive/1/250580 > ... Is it because nobody expects security properties > from the language? Remember that people for a long time thought of Perl as a "CGI language". And early uses of CGI would probably have depended heavily on the Perl equivalents of "popen" and "system". Plus, those features are so easy to get at in the language. Compare: print `ls` versus: import os print os.popen("ls").read() If you were a newbie in each of these languages what are the percentage chance of you using either of these features versus the list-dir equivalent. List-dir is available in each language. > ... Does anyone know how much the restricted > execution feature gets used? I personally would not trust it because I don't know if anyone is following its progress from one version of Python to another. I also know that even languages that are designed from scratch to be safe (Java and JavaScript) have had leaky implemetations so I don't really hold out much hope for Python until I hear that someone is actively researching this. > ... Is there anyone here that would use > a tainting feature if it existed? I'd like to think I've internalized taints rules by osmosis... > (By the way, i'm planning to be at Python 10, and hope to see many > of you there. As i'm looking for ways to keep costs down, would > anyone be interested in splitting the cost of a hotel room in > exchange for a roommate with a strange hairstyle? I'll be there > Feb 4 to 7, three nights.) 
Maybe there should be a bulletin board or something for people to find each other. I think one of the Python conferences had something like that...for hotels and also to share cabs from the airport. Paul Prescod From simon@netthink.co.uk Mon Jan 21 00:11:27 2002 From: simon@netthink.co.uk (Simon Cozens) Date: Mon, 21 Jan 2002 00:11:27 +0000 Subject: [Python-Dev] Python and Security In-Reply-To: <200201202237.g0KMbBY02366@mira.informatik.hu-berlin.de> References: <200201202237.g0KMbBY02366@mira.informatik.hu-berlin.de> Message-ID: <20020121001127.GA5014@netthink.co.uk> On Sun, Jan 20, 2002 at 11:37:11PM +0100, Martin v. Loewis wrote: > In my understanding, tainting is needed if you allow data received > from remote to invoke arbitrary operations. In Python, there is only a > short list where this might cause a problem: > > - invoking exec or eval on a string of unknown origin > - unpickling an arbitrary string > - performing getattr with a parameter of unknown origin. >From a Perl point of view, tainting is there to stop data received from outside to do *anything* related to the system. This includes what you say, but goes further: - open - os.popen (in fact, most of os.*) - socket (no, really) and everything that depends on it (urllib, etc.) Since Python has rexec for this sort of thing, tainting may not be so important, but I think rexec goes too far. The idea of tainting is not to *disallow* using, say, arbitrary user input from CGI scripts as filenames - it's help the programmer segregate which pieces of data need special treatment before being passed to these kinds of functions. -- Rule the Empire through force. -- Shogun Tokugawa From aahz@rahul.net Mon Jan 21 01:38:59 2002 From: aahz@rahul.net (Aahz Maruch) Date: Sun, 20 Jan 2002 17:38:59 -0800 (PST) Subject: [Python-Dev] Python and Security In-Reply-To: <15435.19544.899789.631148@anthem.wooz.org> from "Barry A. Warsaw" at Jan 20, 2002 06:01:44 PM Message-ID: <20020121013900.62CFCE8CD@waltz.rahul.net> Barry A. Warsaw wrote: > >>>>> "MvL" == Martin v Loewis writes: > > | - invoking exec or eval on a string of unknown origin > | - unpickling an arbitrary string > | - performing getattr with a parameter of unknown origin. > > Don't forget os.system(), popen(), and friends, i.e. passing > unsanitized strings to the shell. In my my long rusty Perl > experience, this was the most common reason to use taint strings. More precisely, because Perl culture developed as a superset of shell scripts, it used to be all-too-common for Perl scripts to get their data by parsing the output of a Unix utility (instead of calling a library function directly). This necessarily spawned a subshell where malicious input could be a security problem. (When I was learning Perl, the available books often taught this programming style.) I've heard that Perl culture has changed, but the taint capability is still there because too many Perlers stick to their trusty poor habits. Pythonistas, of course, never learned bad habits. ;-) -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. 
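To make the shell-quoting point concrete, here is a minimal sketch (not taken from any of the posts above; the path handling is purely illustrative) contrasting a directory listing that goes through the shell with the pure-Python library call that Barry and Aahz are alluding to:

    import os

    def list_directory_via_shell(path):
        # Spawns /bin/sh: any metacharacters smuggled into 'path'
        # (for example "; rm -rf ~") are interpreted by the shell,
        # which is exactly the hole Perl's taint mode is meant to flag.
        return os.popen("ls " + path).read().split()

    def list_directory_in_python(path):
        # The direct library call: no subshell, no quoting problem,
        # so there is nothing left for a taint check to catch.
        return os.listdir(path)

Nothing here is specific to ls; the same contrast applies anywhere a library call can replace a pipe to an external command.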
From simon@netthink.co.uk Mon Jan 21 02:06:53 2002 From: simon@netthink.co.uk (Simon Cozens) Date: Mon, 21 Jan 2002 02:06:53 +0000 Subject: [Python-Dev] Python and Security In-Reply-To: <20020121013900.62CFCE8CD@waltz.rahul.net> References: <15435.19544.899789.631148@anthem.wooz.org> <20020121013900.62CFCE8CD@waltz.rahul.net> Message-ID: <20020121020653.GA5885@netthink.co.uk> On Sun, Jan 20, 2002 at 05:38:59PM -0800, Aahz Maruch wrote: > More precisely, because Perl culture developed as a superset of shell > scripts, it used to be all-too-common for Perl scripts to get their data > by parsing the output of a Unix utility (instead of calling a library > function directly). This necessarily spawned a subshell where malicious > input could be a security problem. Not so. This is what taint is: Taint tells you where there's some shit you want to clean up. If you ask the user for a filename to write to, taint tells you that you'd better check for leading slashes, double dots and the like before writing to it. If you're about to run an external program, taint tells you that you might not want to believe the user's idea of what $PATH ought to be. If you're getting a URL from somewhere, taint tells you that you should probably think twice before happily passing back file:///etc/shadow. And so on and so forth. None of these examples are about input to a subshell. I'm not in a position to say whether or not Python needs taint; if it had it, I probably wouldn't use the feature. But let's not misunderstand what it's for. -- Thermodynamics in a nutshell: 1st Law: You can't win. (Energy is conserved) 2nd Law: You can't break even. (Entropy) 0th Law: You can't even quit the game. (Closed systems) -- Taki Kogoma From paul@prescod.net Mon Jan 21 02:27:59 2002 From: paul@prescod.net (Paul Prescod) Date: Sun, 20 Jan 2002 18:27:59 -0800 Subject: [Python-Dev] When to signal an error References: <200201200002.g0K02Ao03439@mira.informatik.hu-berlin.de> Message-ID: <3C4B7CAF.F6D4909B@prescod.net> "Martin v. Loewis" wrote: > >... > > > Could be rewritten: > > if not hasattr(sys, 'ps1'): > > sys.ps1 = ">>> " > > if not hasattr(sys, 'ps2'): > > sys.ps2 = "... " > > Using string literals when you mean attribute names is bad style. It > just helps to trick the checker. Just for the record, I think that Jason's rewrites were clearer in every case because they said exactly what he was trying to do. "If the sys module has the attribute ps1 then ..." This is much clearer than "Get the ps1 attribute from the sys module and throw it away.". Python has a functions specifically for checking for the existance of attributes and keys. Why not use them? Plus, I think that exceptions should be (as far as possible) reserved for exceptional situations. Using them to as tests is not as compact, not as readable and not as runtime efficient. But more to the point, any of these could have been rewritten as: _junk = sys.ps1 That would shut up compiler messages without forcing you to use the haskey/hasattr style. 
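Spelled out, the three styles under discussion look like this -- a minimal sketch built around the sys.ps1 example quoted above (the ">>> " default is only illustrative):

    import sys

    # Look-before-you-leap: the hasattr rewrite Paul prefers.
    if not hasattr(sys, 'ps1'):
        sys.ps1 = ">>> "

    # The bare attribute access that a checker flags as an expression
    # with no effect, because the result is computed and thrown away.
    try:
        sys.ps1
    except AttributeError:
        sys.ps1 = ">>> "

    # The workaround mentioned just above: bind the result to a junk
    # name so the checker no longer sees a useless expression statement.
    try:
        _junk = sys.ps1
    except AttributeError:
        sys.ps1 = ">>> "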
Paul Prescod From mwh@python.net Mon Jan 21 10:24:54 2002 From: mwh@python.net (Michael Hudson) Date: 21 Jan 2002 10:24:54 +0000 Subject: [Python-Dev] When to signal an error In-Reply-To: Neal Norwitz's message of "Sat, 19 Jan 2002 14:25:18 -0500" References: <200201190338.WAA28421@cj20424-a.reston1.va.home.com> <3C49C81E.C2D2F1BD@metaslash.com> Message-ID: <2mn0z8xaq1.fsf@starship.python.net> Neal Norwitz writes: > Currently, I think there are 2 or 3 warnings which definitely fit this class: > No global found, using ++/--, and expressions with no effect as Jason > described. It would sure be nice if using a variable before assignment produced a warning at compile time. However I think this needs flow analysis and you won't catch me trying to add that to compile.c. Cheers, M. -- MGM will not get your whites whiter or your colors brighter. It will, however, sit there and look spiffy while sucking down a major honking wad of RAM. -- http://www.xiph.org/mgm/ From Samuele Pedroni" Hi. Thanks to http://www.pythonware.com/daily/ I landed in http://norvig.com/python/python.html Peter Norvig is about to supply Python versions of the algorithms with the 2nd edition of his AI: A Modern Approach. So far, so good. In the section about coding convetions he says: =A6In general, follow Guido's style conventions, =A6but I have some quirks that I prefer (although I could be talked out o= f them): ... =A6* _ instead of self as first argument to methods: def f(_, x): ... I'm perfectly aware that the 'self' thing it is just a convetion, OTOH much of the cross-programmer readability of code relies on such convention. It is good, bad or irrelevant to have such an authoritative book (although about AI not Python directly) adopting such a line-noisy convention? Maybe nobody cares, but I preferred not to let this go unnoticed. Someone who cares could try to discuss the issue or make it apparent to Mr. Norvig. Opinions? regards, Samuele Pedroni. From jeremy@alum.mit.edu Sun Jan 20 22:43:59 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Sun, 20 Jan 2002 17:43:59 -0500 Subject: [Python-Dev] When to signal an error In-Reply-To: <3C49C81E.C2D2F1BD@metaslash.com> References: <200201190338.WAA28421@cj20424-a.reston1.va.home.com> <3C49C81E.C2D2F1BD@metaslash.com> Message-ID: <15435.18479.961038.136717@gondolin.digicool.com> >>>>> "NN" == Neal Norwitz writes: NN> Guido van Rossum wrote: >> I believe that evertually some PyChecker-like technology will be >> incorporated in the Python compiler. The same happened to C >> compilers: the lint program became useless once GCC incorporated >> the same technology. NN> pychecker was (and still is) an experiment to me. But I think NN> it would be great if the lessons from pychecker could be NN> integrated into the compiler. Me, too. NN> I'd be happy to help the process of integrating warnings into NN> the compiler, however, I'm not sure how to proceed. Should NN> pychecker be put into the standard library (users can now do: NN> import pychecker.checker and all modules imported are checked by NN> installing an __import__)? Should pychecker be added as a tool? NN> Should a PEP be written? etc. How much of pychecker's work could be done by the compiler itself? I'd like to see more of the warnings generated during compilation, but agree with Michael Hudson that extending it is a lot of work. Perhaps it's time to redesign the compiler. A PEP is probably good for more than one reason. One reason is to document the warnings that are generated and the rationale for them. 
If you integrate it into the compiler, the PEP is a good place to capture some design info. Jeremy From jeremy@alum.mit.edu Sun Jan 20 22:44:39 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Sun, 20 Jan 2002 17:44:39 -0500 Subject: [Python-Dev] When to signal an error In-Reply-To: <3C49C81E.C2D2F1BD@metaslash.com> References: <200201190338.WAA28421@cj20424-a.reston1.va.home.com> <3C49C81E.C2D2F1BD@metaslash.com> Message-ID: <15435.18519.191874.661917@gondolin.digicool.com> We could talk about this at the conference. Jeremy From guido@python.org Mon Jan 21 16:07:59 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 21 Jan 2002 11:07:59 -0500 Subject: [Python-Dev] OT: style convention: self vs. _ in new Norvig's book In-Reply-To: Your message of "Mon, 21 Jan 2002 14:20:15 +0100." <001201c1a27e$5d8ec060$6d94fea9@newmexico> References: <001201c1a27e$5d8ec060$6d94fea9@newmexico> Message-ID: <200201211607.LAA17031@cj20424-a.reston1.va.home.com> > http://norvig.com/python/python.html > > Peter Norvig is about to supply > Python versions of the algorithms with > the 2nd edition of his AI: A Modern Approach. > > So far, so good. In the section about > coding convetions he says: > > ¦In general, follow Guido's style conventions, > ¦but I have some quirks that I prefer (although I could be talked out of them): > ... > ¦* _ instead of self as first argument to methods: def f(_, x): > ... > > I'm perfectly aware that the 'self' thing it is just a convetion, > OTOH much of the cross-programmer readability > of code relies on such convention. > > It is good, bad or irrelevant to have such > an authoritative book (although about AI not > Python directly) adopting such a line-noisy > convention? > > Maybe nobody cares, but I preferred not to > let this go unnoticed. Someone who cares > could try to discuss the issue or make it > apparent to Mr. Norvig. > > Opinions? > > regards, Samuele Pedroni. Peter: My apologies for butting in here without doing full research. I don't know how you reached this set of conventions, so maybe you've got a very good reason; but I don't see it on your webpage. Two of those coding conventions look really ugly to me: 2-space indents and _ for self. I think the code will look horrible! I think everyone should be able to make their own style choices, but I ask you to reconsider. If you have to reconsider one, I would beg you to use 'self' like everybody else. The _ name is already overloaded with multiple meanings in the Python community: it's a shorthand for the last evaluated expression in interactive mode, and some people use it as a dummy variable to assign uninteresting results to. Almost the entire Python community is happy with 4-space indents; if you're worried about your lines getting too long, that's usually a hint that your code can be restructured in a way that's easier on the reader's eye/mind anyway. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Mon Jan 21 16:10:04 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 21 Jan 2002 11:10:04 -0500 Subject: [Python-Dev] OT: style convention: self vs. 
_ in new Norvig's book References: <001201c1a27e$5d8ec060$6d94fea9@newmexico> <200201211607.LAA17031@cj20424-a.reston1.va.home.com> Message-ID: <15436.15708.321312.724003@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> The _ name is already overloaded with multiple meanings in GvR> the Python community: it's a shorthand for the last evaluated GvR> expression in interactive mode, and some people use it as a GvR> dummy variable to assign uninteresting results to. It's also the common name of a function in internationalized Python applications (mostly inherited from established conventions in the C world). -Barry From pnorvig@google.com Mon Jan 21 18:16:51 2002 From: pnorvig@google.com (Peter Norvig) Date: Mon, 21 Jan 2002 10:16:51 -0800 Subject: [Python-Dev] OT: style convention: self vs. _ in new Norvig's book References: <001201c1a27e$5d8ec060$6d94fea9@newmexico> <200201211607.LAA17031@cj20424-a.reston1.va.home.com> Message-ID: <3C4C5B13.FA2909A6@google.com> Wow; I didn't expect this to generate such a response. But I did post the code far before it was ready and put the "I could be talked out of it" there for a reason. So, thank you for your feedback! My reactions: 4 spaces: OK.=20 I have no strong feelings on that, and I think its just an accident of the way my emacs was configured that I started using 2 spaces. I agree that I should make it easier for other people to edit my code, so I'll switch to the default. self: OK, I'll try it.=20 My rationale was: I'm used to Java, where self is usually spelled '', and I figured '_' was the next best thing. I find it much nicer to read because 'self' is too intrusive; I want something that disappears.=20 Compare: _.x, _.y, _.z =3D x, y, z self.x, self.y, self.z =3D x, y, z Besides saving 9 characters, I find that the first line I can read at a glance, ignoring the _, while the second I have to look at more carefully. I also like the symmetry of _._ in _._private_slot. However, I recognize I'm doing this as an outsider to the language without much experience reading/writing it. If it is really true that using '_' would be seen as a change to the language and not a personal quirk, then I agree that I shouldn't do it. The first hint I had of this was when I saw something on comp.lang.python (I forget the details) suggesting that an automated tool look for methods with first argument 'self'. So I'll try 'self' for a while, and hope I learn to like it (and learn to read the second sample line above in one glance). If I don't, I'll write here and give you all another chance to innundate me with reasons why I should. -Peter PS - Getting a personal request from Guido reminds me of the time I was at a conference and John McCarthy walked up to the booth of one of the Lisp vendors and said in his usual direct fashion "I hear you have a new version. You should send me one". The booth bimbo had no idea who McCarthy was and politely suggested he pay for a copy. Then someone in the booth with a little more experience came over and said "That's ok -- it's his language, he can have whatever he wants." Guido van Rossum wrote: >=20 > > http://norvig.com/python/python.html > > > > Peter Norvig is about to supply > > Python versions of the algorithms with > > the 2nd edition of his AI: A Modern Approach. > > > > So far, so good. In the section about > > coding convetions he says: > > > > =A6In general, follow Guido's style conventions, > > =A6but I have some quirks that I prefer (although I could be talked o= ut of them): > > ... 
> > =A6* _ instead of self as first argument to methods: def f(_, x): > > ... > > > > I'm perfectly aware that the 'self' thing it is just a convetion, > > OTOH much of the cross-programmer readability > > of code relies on such convention. > > > > It is good, bad or irrelevant to have such > > an authoritative book (although about AI not > > Python directly) adopting such a line-noisy > > convention? > > > > Maybe nobody cares, but I preferred not to > > let this go unnoticed. Someone who cares > > could try to discuss the issue or make it > > apparent to Mr. Norvig. > > > > Opinions? > > > > regards, Samuele Pedroni. >=20 > Peter: >=20 > My apologies for butting in here without doing full research. I don't > know how you reached this set of conventions, so maybe you've got a > very good reason; but I don't see it on your webpage. >=20 > Two of those coding conventions look really ugly to me: 2-space > indents and _ for self. I think the code will look horrible! >=20 > I think everyone should be able to make their own style choices, but I > ask you to reconsider. If you have to reconsider one, I would beg you > to use 'self' like everybody else. The _ name is already overloaded > with multiple meanings in the Python community: it's a shorthand for > the last evaluated expression in interactive mode, and some people use > it as a dummy variable to assign uninteresting results to. >=20 > Almost the entire Python community is happy with 4-space indents; if > you're worried about your lines getting too long, that's usually a > hint that your code can be restructured in a way that's easier on the > reader's eye/mind anyway. >=20 > --Guido van Rossum (home page: http://www.python.org/~guido/) --=20 _____________________________________________________________________ Peter Norvig, Director of Machine Learning, Google, http://google.com pnorvig@google.com, Voice:650-330-0100 x1248, Fax:650-618-1499 From guido@python.org Mon Jan 21 19:02:49 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 21 Jan 2002 14:02:49 -0500 Subject: [Python-Dev] OT: style convention: self vs. _ in new Norvig's book In-Reply-To: Your message of "Mon, 21 Jan 2002 10:16:51 PST." <3C4C5B13.FA2909A6@google.com> References: <001201c1a27e$5d8ec060$6d94fea9@newmexico> <200201211607.LAA17031@cj20424-a.reston1.va.home.com> <3C4C5B13.FA2909A6@google.com> Message-ID: <200201211902.OAA22868@cj20424-a.reston1.va.home.com> > Wow; I didn't expect this to generate such a response. But I did post > the code far before it was ready and put the "I could be talked out of > it" there for a reason. So, thank you for your feedback! My reactions: You're welcome. I'm always there to save a straying stranger. :-) [snip] > self: OK, I'll try it. > > My rationale was: I'm used to Java, where self is usually spelled '', > and I figured '_' was the next best thing. I find it much nicer to read > because 'self' is too intrusive; I want something that disappears. I hear that in the Lisp world, when someone complains about the parentheses, the standard response is "once you're used to it, the parentheses disappear". So it is for Python's 'self'. :-) > Compare: > > _.x, _.y, _.z = x, y, z > self.x, self.y, self.z = x, y, z > > Besides saving 9 characters, I find that the first line I can read at a > glance, ignoring the _, while the second I have to look at more > carefully. I also like the symmetry of _._ in _._private_slot. However, > I recognize I'm doing this as an outsider to the language without much > experience reading/writing it. 
If it is really true that using '_' would > be seen as a change to the language and not a personal quirk, then I > agree that I shouldn't do it. The first hint I had of this was when I > saw something on comp.lang.python (I forget the details) suggesting that > an automated tool look for methods with first argument 'self'. So I'll > try 'self' for a while, and hope I learn to like it (and learn to read > the second sample line above in one glance). If I don't, I'll write > here and give you all another chance to innundate me with reasons why I > should. Thanks! > -Peter > > PS - Getting a personal request from Guido reminds me of the time I was > at a conference and John McCarthy walked up to the booth of one of the > Lisp vendors and said in his usual direct fashion "I hear you have a new > version. You should send me one". The booth bimbo had no idea who > McCarthy was and politely suggested he pay for a copy. Then someone in > the booth with a little more experience came over and said "That's ok -- > it's his language, he can have whatever he wants." What's a booth bimbo? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From jason@jorendorff.com Mon Jan 21 20:47:59 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 21 Jan 2002 14:47:59 -0600 Subject: [Python-Dev] OT: style convention: self vs. _ in new Norvig's book In-Reply-To: <001201c1a27e$5d8ec060$6d94fea9@newmexico> Message-ID: > ] In general, follow Guido's style conventions, > ] but I have some quirks that I prefer (although I could be talked > ] out of them): > ... > ] * _ instead of self as first argument to methods: def f(_, x): > ... I dunno; I think sample code should (a) stick rather conservatively to typical usage, apart from the concept being illustrated of course; and (b) strive for maximum readability. For Python, both principles demand that one should write: def foo(bar): if is_list(bar): return sum(map(foo, bar)) else: return [bar] instead of: def foo(bar): if is_list(bar): return sum(map(foo, bar)) else: return [bar] This may be one of those things that only makes sense if you've not a Lisp programmer. (wink) To stray from the topic: I find I only disagree with three points in Peter Norvig's enlightening table of Lisp vs. Python features. 1. That "x.slot = y" is not user-extensible. The __setattr__() method does this. 2. That Python's relative lack of control structures is necessarily worse than Lisp's abundance of them. Especially for students, I think this: if is_list(n): return foo_l(n) elif is_str(n) or is_int(n): return foo_a(n) else: raise TypeError is at least as clear, though not as brief, as this: (etypecase n (list (foo-l n)) ((or string integer) (foo-a n))) with the obligatory note in the text to the effect that "'Etypecase' is a form similar to 'case' which selects a clause based on the type..." and so on. 3. That Python doesn't support generic programming. Generic algorithms are expressed as naturally in Python as in any language I know: from operator import add def sum(items): return reduce(add, items) >>> sum([3, 4, 5]) 12 >>> sum([3, 4j, 4-2j]) (7+2j) >>> sum(["py", "th", "o", "n"]) 'python' Likewise it's natural to write functions that can operate on "any sequence", not just lists or tuples, "any file-like object", not just a real file, "any function-like object", etc. Perhaps something more specific is meant by "generic programming". Cheers, ## Jason Orendorff http://www.jorendorff.com/ From martin@v.loewis.de Mon Jan 21 21:57:40 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: Mon, 21 Jan 2002 22:57:40 +0100 Subject: [Python-Dev] OT: style convention: self vs. _ in new Norvig's book In-Reply-To: <001201c1a27e$5d8ec060$6d94fea9@newmexico> (pedronis@bluewin.ch) References: <001201c1a27e$5d8ec060$6d94fea9@newmexico> Message-ID: <200201212157.g0LLve701480@mira.informatik.hu-berlin.de> > Opinions? I dislike it, because _ is already taken for two things: for the last expression in interactive mode, and as a markup of translatable strings. Regards, Martin From pnorvig@google.com Mon Jan 21 23:22:32 2002 From: pnorvig@google.com (Peter Norvig) Date: Mon, 21 Jan 2002 15:22:32 -0800 Subject: [Python-Dev] OT: style convention: self vs. _ in new Norvig's book References: <001201c1a27e$5d8ec060$6d94fea9@newmexico> <200201211607.LAA17031@cj20424-a.reston1.va.home.com> <3C4C5B13.FA2909A6@google.com> <200201211902.OAA22868@cj20424-a.reston1.va.home.com> Message-ID: <3C4CA2B8.4951AB78@google.com> Guido van Rossum wrote: > I hear that in the Lisp world, when someone complains about the > parentheses, the standard response is "once you're used to it, the > parentheses disappear". So it is for Python's 'self'. :-) That may be a good analogy, and as I said, I'm willing to try. But I still think one character is easier to ignore than four, and that there is no compelling argument for 'self' over '_', while there is a positive reason for parens (ease of automated parsing tools). > What's a booth bimbo? :-) "It's not a sexist phenomenon as such, applying equally to the pretty young men and women who work as scenery at various booths. Universally, these people have no clue about the products they represent; instead they hand out buttons and propaganda, smile nicely, and act as props for the larger show that goes on around them." -- http://www.tidbits.com/tb-issues/TidBITS-159.html#lnk6 > > --Guido van Rossum (home page: http://www.python.org/~guido/) From jason@jorendorff.com Tue Jan 22 00:06:51 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 21 Jan 2002 18:06:51 -0600 Subject: [Python-Dev] OT: style convention: self vs. _ in new Norvig's book In-Reply-To: <3C4CA2B8.4951AB78@google.com> Message-ID: Peter Norvig wrote: > Guido van Rossum wrote: > > I hear that in the Lisp world, when someone complains about the > > parentheses, the standard response is "once you're used to it, the > > parentheses disappear". So it is for Python's 'self'. :-) > > That may be a good analogy, and as I said, I'm willing to try. It's an excellent analogy: both statements are about 1/3 true in my experience. :-) > But I still think one character is easier to ignore than four, > and that there is no compelling argument for 'self' over '_', > while there is a positive reason for parens (ease of automated > parsing tools). There is no especially compelling reason for Python to have 'self' over '_' or 'me' or '@' or ''. However, there is a compelling reason for you to choose 'self': "Prefer the standard to the offbeat." --Strunk and White ## Jason Orendorff http://www.jorendorff.com/ From Anthony Baxter Tue Jan 22 00:14:37 2002 From: Anthony Baxter (Anthony Baxter) Date: Tue, 22 Jan 2002 11:14:37 +1100 Subject: [Python-Dev] OT: style convention: self vs. _ in new Norvig's book In-Reply-To: Message from Peter Norvig of "Mon, 21 Jan 2002 15:22:32 -0800." <3C4CA2B8.4951AB78@google.com> Message-ID: <200201220014.g0M0EbW28603@mbuna.arbhome.com.au> >>> Peter Norvig wrote > That may be a good analogy, and as I said, I'm willing to try. 
But I > still think one character is easier to ignore than four, and that there > is no compelling argument for 'self' over '_', while there is a positive > reason for parens (ease of automated parsing tools). The primary arguments against '_' are that it already has meaning. I can think of three, off the top of my head. Interactive mode uses this as "result of last expression". The i18n code uses it as a function _('translate me'). Zope uses it in DTML (python) expressions as the default namespace. I'd also add the subjective argument that it's ugly, and looks far too magical and perl-like. I don't _want_ it to disappear into the background, as it's going to cause me pain if I miss it. Anthony From pnorvig@google.com Tue Jan 22 00:27:04 2002 From: pnorvig@google.com (Peter Norvig) Date: Mon, 21 Jan 2002 16:27:04 -0800 Subject: [Python-Dev] OT: style convention: self vs. _ in new Norvig's book References: Message-ID: <3C4CB1D8.B050B002@google.com> OK, OK; When both Guido and E. B. team up against me, I know I'm licked. -Peter Jason Orendorff wrote: > There is no especially compelling reason for Python to have > 'self' over '_' or 'me' or '@' or ''. > > However, there is a compelling reason for you to choose 'self': > "Prefer the standard to the offbeat." --Strunk and White From montanaro@tttech.com Tue Jan 22 14:35:15 2002 From: montanaro@tttech.com (montanaro@tttech.com) Date: Tue, 22 Jan 2002 08:35:15 -0600 Subject: [Python-Dev] Bug? is Tkinter+no threads+Windows supported? Message-ID: <15437.30883.138962.301012@dynamic2.tttech1.ttt> My client is trying to build a version of Python on Windows with Tkinter and pymalloc enabled, and threads disabled (in part because pymalloc is not thread-safe). There appears to be a bug in _tkinter.c:EventHook. It has this code: #if defined(WITH_THREAD) || defined(MS_WINDOWS) Py_BEGIN_ALLOW_THREADS PyThread_acquire_lock(tcl_lock, 1); tcl_tstate = event_tstate; result = Tcl_DoOneEvent(TCL_DONT_WAIT); tcl_tstate = NULL; PyThread_release_lock(tcl_lock); if (result == 0) Sleep(20); Py_END_ALLOW_THREADS #else result = Tcl_DoOneEvent(0); #endif It seems on the surface that the "|| defined(MS_WINDOWS)" bit should be deleted. This code dates from 1998 and comes with this log text: revision 1.72 date: 1998/06/13 13:56:28; author: guido; state: Exp; lines: +26 -6 Fixed the EventHook() code so that it also works on Windows, sort of. (The "sort of" is because it uses kbhit() to detect that the user starts typing, and then no events are processed until they hit return.) Also fixed a nasty locking bug: EventHook() is called without the Tcl lock set, so it can't use the ENTER_PYTHON and LEAVE_PYTHON macros, which manipulate both the Python and the Tcl lock. I now only acquire and release the Python lock. (Haven't tested this on Unix yet...) This suggests that Guido was (rightly) worried about the case of threading on Windows. What about a non-threaded interpreter on Windows? Skip From nas@python.ca Tue Jan 22 15:02:12 2002 From: nas@python.ca (Neil Schemenauer) Date: Tue, 22 Jan 2002 07:02:12 -0800 Subject: [Python-Dev] Bug? is Tkinter+no threads+Windows supported? 
In-Reply-To: <15437.30883.138962.301012@dynamic2.tttech1.ttt>; from montanaro@tttech.com on Tue, Jan 22, 2002 at 08:35:15AM -0600 References: <15437.30883.138962.301012@dynamic2.tttech1.ttt> Message-ID: <20020122070212.B10448@glacier.arctrix.com> montanaro@tttech.com wrote: > My client is trying to build a version of Python on Windows with Tkinter and > pymalloc enabled, and threads disabled (in part because pymalloc is not > thread-safe). Using pymalloc with threads should be safe as long as you don't have extensions that call pymalloc without the big lock held. Neil From montanaro@tttech.com Wed Jan 23 08:24:27 2002 From: montanaro@tttech.com (montanaro@tttech.com) Date: Wed, 23 Jan 2002 02:24:27 -0600 Subject: [Python-Dev] "This document is locked" message from Sourceforge? Message-ID: <15438.29499.711766.816065@dynamic2.tttech1.ttt> I just logged into Sourceforge. Now every time I visit a page, although that page displays, I also get username/password popup saying the document is locked and giving a server message of "foo". Any idea where this came from? Perhaps a test on SF they forgot to undo before putting some pages into production? Skip From mwh@python.net Wed Jan 23 10:57:23 2002 From: mwh@python.net (Michael Hudson) Date: 23 Jan 2002 10:57:23 +0000 Subject: [Python-Dev] "This document is locked" message from Sourceforge? In-Reply-To: montanaro@tttech.com's message of "Wed, 23 Jan 2002 02:24:27 -0600" References: <15438.29499.711766.816065@dynamic2.tttech1.ttt> Message-ID: <2mbsfl2v3g.fsf@starship.python.net> montanaro@tttech.com writes: > I just logged into Sourceforge. Now every time I visit a page, although > that page displays, I also get username/password popup saying the document > is locked and giving a server message of "foo". Any idea where this came > from? Perhaps a test on SF they forgot to undo before putting some pages > into production? Haven't noticed that, but sf is being nice and snappy this morning, isn't it? It seems to take five minutes for a bug report to finish displaying. Argh! Cheers, M. -- $ head -n 2 src/bash/bash-2.04/unwind_prot.c /* I can't stand it anymore! Please can't we just write the whole Unix system in lisp or something? */ -- spotted by Rich van der Hoff From sdm7g@Virginia.EDU Wed Jan 23 16:26:09 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Wed, 23 Jan 2002 11:26:09 -0500 (EST) Subject: [Python-Dev] VERBOSE and DEBUG conventions. Message-ID: Py_DebugFlag is used for debugging the Python parser. Py_VerboseFlag is used for debugging and tracing imports. (and in some places it wants Py_VerboseFlag > 1 (more than one "-v") for output) Are there any conventions on which to use for other debugging output? (Or did Guido have any particular conventions in mind when he added them? ) Right now, I'm using Py_VerboseFlag to also trigger logging of message sends in pyobjc. Stealing this flag for another use isn't a problem here because [1] the logging goes to a /tmp file, so I don't have to turn off import tracing -- the two logging streams don't get mixed together, and [2] it only functions when you import pyobjc, so it's not going to get in someone else's use. But I may need to add other debug and log output to my module and I'ld like to do it in the least suprising manner if possible. -- Steve Majewski From tim.one@home.com Wed Jan 23 17:51:58 2002 From: tim.one@home.com (Tim Peters) Date: Wed, 23 Jan 2002 12:51:58 -0500 Subject: [Python-Dev] VERBOSE and DEBUG conventions. 
In-Reply-To: Message-ID: [Steven Majewski] > Py_DebugFlag is used for debugging the Python parser. If Guido had it to do over again, I suspect he'd put that code in #ifdef Py_DEBUG blocks instead. > Py_VerboseFlag is used for debugging and tracing imports. > (and in some places it wants Py_VerboseFlag > 1 (more than one "-v") > for output) > > Are there any conventions on which to use for other debugging output? Py_VerboseFlag is for output about core activities every user of Python may want to see sometimes, and in release builds. It doesn't cover much beyond tracing imports, printing stats about memory cleanup, and some highly dubious fudging: PyThreadState_Clear(PyThreadState *tstate) { if (Py_VerboseFlag && tstate->frame != NULL) fprintf(stderr, "PyThreadState_Clear: warning: thread still has a frame\n"); (that should probably be an error instead -- or be officially blessed). > (Or did Guido have any particular conventions in mind when he added > them? ) > > Right now, I'm using Py_VerboseFlag to also trigger logging of message > sends in pyobjc. Stealing this flag for another use isn't a problem > here because [1] the logging goes to a /tmp file, so I don't have > to turn off import tracing -- the two logging streams don't get mixed > together, and [2] it only functions when you import pyobjc, so it's > not going to get in someone else's use. > > But I may need to add other debug and log output to my module and > I'ld like to do it in the least suprising manner if possible. Supply a "set debug and log options" interface for your module, and then call it . Good example: the gc module. From guido@python.org Wed Jan 23 18:01:14 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 23 Jan 2002 13:01:14 -0500 Subject: [Python-Dev] VERBOSE and DEBUG conventions. In-Reply-To: Your message of "Wed, 23 Jan 2002 12:51:58 EST." References: Message-ID: <200201231801.NAA17967@pcp742651pcs.reston01.va.comcast.net> > [Steven Majewski] > > Py_DebugFlag is used for debugging the Python parser. > > If Guido had it to do over again, I suspect he'd put that code in #ifdef > Py_DEBUG blocks instead. Yes and no. Some of it *is* already only inside #ifdef Py_DEBUG (see parser.c); but it still requires a command line flag because the output is too much to bear in a regular debugging run... --Guido van Rossum (home page: http://www.python.org/~guido/) From sdm7g@Virginia.EDU Wed Jan 23 18:06:30 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Wed, 23 Jan 2002 13:06:30 -0500 (EST) Subject: [Python-Dev] VERBOSE and DEBUG conventions. In-Reply-To: Message-ID: On Wed, 23 Jan 2002, Tim Peters wrote: > Supply a "set debug and log options" interface for your module, and then > call it . Good example: the gc module. Thanks. That mostly makes sense. Except that I needed it to be in trace/debug mode when the module initialization is being done, so I can't import the module and then set it. I suppose I could just use another environment variable: $PYOBJC_DEBUG -- then I could set debug levels. -- Steve FYI: In case you're wondering why I don't just use gdb: It's seems to be a meta level problem between the python runtime and the objective-c runtime, and I suspect the objc extensions in gdb must make use of the objc-runtime ( for 'po' - print object, for example.) because I seem to be causing another objc runtime exception the act of examining things in the debugger. 
This is not very documented in the gdb manual, so unless I'm going to wade thru the sources, I though it would be easier just to instrument the module. (and maybe Python.) From gmcm@hypernet.com Wed Jan 23 18:26:02 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 23 Jan 2002 13:26:02 -0500 Subject: [Python-Dev] VERBOSE and DEBUG conventions. In-Reply-To: References: Message-ID: <3C4EB9EA.25681.2F95B9E4@localhost> On 23 Jan 2002 at 13:06, Steven Majewski wrote: > FYI: In case you're wondering why I don't just use gdb: > It's seems to be a meta level problem between the python > runtime and the objective-c runtime, and I suspect the > objc extensions in gdb must make use of the objc-runtime ( > for 'po' - print object, for example.) because I seem to be > causing another objc runtime exception the act of examining > things in the debugger. Or perhaps chip geometries are getting small enough that simply the act of observing is enough. running-on-stale-Doritos-ly y'rs -- Gordon http://www.mcmillan-inc.com/ From sdm7g@Virginia.EDU Wed Jan 23 18:57:08 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Wed, 23 Jan 2002 13:57:08 -0500 (EST) Subject: [Python-Dev] VERBOSE and DEBUG conventions. In-Reply-To: <3C4EB9EA.25681.2F95B9E4@localhost> Message-ID: On Wed, 23 Jan 2002, Gordon McMillan wrote: > Or perhaps chip geometries are getting small enough > that simply the act of observing is enough. Well: the effect is being magnified by Class Object self reference, which I could probably avoid if objective-C had actual metaclasses. > running-on-stale-Doritos-ly y'rs Gosh. I'm impressed. I'm still running on coffee and donuts here. I usually don't start on stale Doritos until much later in the day! ( Unless I've wrapped around on an all nighter and I'm still on last night's Doritos. ) From DavidA@ActiveState.com Wed Jan 23 23:02:32 2002 From: DavidA@ActiveState.com (David Ascher) Date: Wed, 23 Jan 2002 15:02:32 -0800 Subject: [Python-Dev] largeint.h and ver.h gone from VS.NET Message-ID: <3C4F4108.48D9AEF3@activestate.com> As mentioned in: http://groups.google.com/groups?q=largeint.h&hl=en&selm=epfeXXOYBHA.2060%40tkmsftngp07&rnum=1 largeint.h is gone from the VisualStudio compiler as of the VisualStudio.NET release. Python's build currently fails without the workaround mentioned in that posting. Furthermore, the file "ver.h" used in python_nt.rc appears to be gone as well. Not sure why we needed it. Gettinr dir fo it seems to have no ill effect =). Anyone remember what it's for? I'm having sre problems in the test suite though, which have pretty wide-ranging effects. Is someone else looking at the patches needed for VS.NET, or should I keep digging? --david From DavidA@ActiveState.com Wed Jan 23 23:06:50 2002 From: DavidA@ActiveState.com (David Ascher) Date: Wed, 23 Jan 2002 15:06:50 -0800 Subject: [Python-Dev] largeint.h and ver.h gone from VS.NET References: <3C4F4108.48D9AEF3@activestate.com> Message-ID: <3C4F420A.7F26C7F5@activestate.com> Whoa. test_longexp seems to be causing the python_d process to bloat to almost 80Megs. This is with the VS.NET build. I guess I really have to get a VC6 build going now =). 
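For anyone puzzled by why that particular test balloons, the bulk of test_longexp is a single huge list display that the parser and compiler have to hold in memory all at once; roughly along these lines (the constant is illustrative, not the exact value the test uses):

    # Build the source for one list literal with tens of thousands of
    # elements, then make the parser and compiler chew on it in one go.
    REPS = 65000
    source = "[" + "2," * REPS + "]"
    result = eval(source)
    assert len(result) == REPS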
From Jack.Jansen@oratrix.nl Wed Jan 23 23:07:49 2002 From: Jack.Jansen@oratrix.nl (Jack Jansen) Date: Thu, 24 Jan 2002 00:07:49 +0100 Subject: [Python-Dev] PEP 278 - Universal newline support Message-ID: <0537F905-1056-11D6-B9C4-003065517236@oratrix.nl> Folks, there's a new PEP 278 plus an accompanying patch available on the subject of universal newline support (the ability to read and import files that use a different newline convention than what the current platform uses). Please read, apply, try, provide feedback and put me back to work:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From guido@python.org Wed Jan 23 23:14:09 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 23 Jan 2002 18:14:09 -0500 Subject: [Python-Dev] largeint.h and ver.h gone from VS.NET In-Reply-To: Your message of "Wed, 23 Jan 2002 15:06:50 PST." <3C4F420A.7F26C7F5@activestate.com> References: <3C4F4108.48D9AEF3@activestate.com> <3C4F420A.7F26C7F5@activestate.com> Message-ID: <200201232314.SAA19582@pcp742651pcs.reston01.va.comcast.net> > test_longexp seems to be causing the python_d process to bloat to almost > 80Megs. This is with the VS.NET build. I think that's expected -- test_longexp is very memory intensive, we've seen complaints about this on feeble platforms before. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Wed Jan 23 23:48:47 2002 From: tim.one@home.com (Tim Peters) Date: Wed, 23 Jan 2002 18:48:47 -0500 Subject: [Python-Dev] largeint.h and ver.h gone from VS.NET In-Reply-To: <3C4F4108.48D9AEF3@activestate.com> Message-ID: [David Ascher] > As mentioned in: > > http://groups.google.com/groups?q=largeint.h&hl=en&selm=epfeXXOYBH > A.2060%40tkmsftngp07&rnum=1 > > largeint.h is gone from the VisualStudio compiler as of the > VisualStudio.NET release. > > Python's build currently fails without the workaround mentioned in that > posting. Did they also, e.g., change the signature of QueryPerformanceCounter(), so that largeint.h isn't needed to get at the MS-specific LARGE_INTEGER typedef? Note that the workaround doesn't work unless these files are on MS's list of redistributable files (which always takes me an hour to find, and no time for that now). > Furthermore, the file "ver.h" used in python_nt.rc appears to be gone as > well. Not sure why we needed it. Gettinr dir fo it seems to have no > ill effect =). Anyone remember what it's for? Mark Hammond created all the code in question (here and above), so ActiveState should know who to hire to maintain it . Here's ver.h in its entirety (as of VC6): #ifndef RC_INVOKED #pragma message ("VER.H obsolete, including WINVER.H instead") #endif #include gettinr-dir-fo-it-indeed-ly y'rs - tim From DavidA@ActiveState.com Thu Jan 24 00:45:03 2002 From: DavidA@ActiveState.com (David Ascher) Date: Wed, 23 Jan 2002 16:45:03 -0800 Subject: [Python-Dev] largeint.h and ver.h gone from VS.NET References: Message-ID: <3C4F590E.B4EE9DC@activestate.com> > Did they also, e.g., change the signature of QueryPerformanceCounter(), so > that largeint.h isn't needed to get at the MS-specific LARGE_INTEGER > typedef? Note that the workaround doesn't work unless these files are on > MS's list of redistributable files (which always takes me an hour to find, > and no time for that now). I did not intend that the workaround would be the right way to do it long term. LARGE_INTEGER is now defined in winnt.h, which is included by windows.h. 
However, the current code does need more than just the typedef, such as LargeIntegerEqualToZero, LargeIntegerSubtract, etc. > > Furthermore, the file "ver.h" used in python_nt.rc appears to be gone as > > well. Not sure why we needed it. Gettinr dir fo it seems to have no > > ill effect =). Anyone remember what it's for? > > Mark Hammond created all the code in question (here and above), so > ActiveState should know who to hire to maintain it . Sigh. I'm not doing this on behalf of ActiveState -- there's no real need for us to move to VS.NET for most of our builds right now. I'm just playing with my new toy. --david-should-have-posted-from-my-hotmail-account?-ascher From tim.one@home.com Thu Jan 24 01:02:53 2002 From: tim.one@home.com (Tim Peters) Date: Wed, 23 Jan 2002 20:02:53 -0500 Subject: [Python-Dev] largeint.h and ver.h gone from VS.NET In-Reply-To: <3C4F590E.B4EE9DC@activestate.com> Message-ID: [David Ascher] > I did not intend that the workaround would be the right way to do it > long term. > > LARGE_INTEGER is now defined in winnt.h, which is included by > windows.h. However, the current code does need more than just the > typedef, such as LargeIntegerEqualToZero, LargeIntegerSubtract, etc. Those can be replaced with "== 0" and "-" etc -- the obvious things, at least under VC6. Don't know about .NET. >> Mark Hammond created all the code in question (here and above), so >> ActiveState should know who to hire to maintain it . > Sigh. I'm not doing this on behalf of ActiveState Neither am I . > -- there's no real need for us to move to VS.NET for most of our builds > right now. I'm just playing with my new toy. Well, then *you* know who to hire -- same thing. BTW, the #include of ver.h is gone in current CVS now. Mucking with LARGE_INTEGER awaits a volunteer. From fredrik@pythonware.com Thu Jan 24 09:10:05 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 24 Jan 2002 10:10:05 +0100 Subject: [Python-Dev] largeint.h and ver.h gone from VS.NET References: <3C4F4108.48D9AEF3@activestate.com> Message-ID: <00db01c1a4b6$eb4008d0$0900a8c0@spiff> david wrote: > I'm having sre problems in the test suite though, which have pretty > wide-ranging effects. SRE uses agressive inlining under MSVC. maybe their new optimizer is slightly broken? (not the first time, in a X.0 release) as a temporary workaround, try changing #if defined(_MSC_VER) to #if 0 && defined(_MSC_VER) if SRE works after this change, try switching on USE_INLINE. if you find a combination that works, change the MSC_VER clause to: #if defined(_MSC_VER) && _MSC_VER >= SOMETHING ... vs.net configuration #elif defined(_MSC_VER) ... msvc 5/6 configuration #elif defined(USE_INLINE) ... and mail me the patch. cheers /F From guido@python.org Thu Jan 24 15:07:34 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 24 Jan 2002 10:07:34 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Mac/scripts gensuitemodule.py,1.18,1.19 In-Reply-To: Your message of "Thu, 24 Jan 2002 09:34:40 EST." <15440.7040.836670.604742@grendel.zope.com> References: <15440.7040.836670.604742@grendel.zope.com> Message-ID: <200201241507.KAA11788@pcp742651pcs.reston01.va.comcast.net> > The keyword module has an undocumented data object kwlist which is a > list of keywords. Perhaps this should be documented and made part of > the public API? I'd want to change the list to a tuple, but that > seems harmless since it isn't already part of the API. Why make it a tuple? Out of fear someone changes it? 
Let them change it, and learn about sharing of object references! Agree it should be documented of course. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Thu Jan 24 16:07:41 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 24 Jan 2002 11:07:41 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Mac/scripts gensuitemodule.py,1.18,1.19 In-Reply-To: <200201241507.KAA11788@pcp742651pcs.reston01.va.comcast.net> References: <15440.7040.836670.604742@grendel.zope.com> <200201241507.KAA11788@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15440.12621.645554.228182@grendel.zope.com> Guido van Rossum writes: > Why make it a tuple? Out of fear someone changes it? Let them change > it, and learn about sharing of object references! Partly, and partly because it's something that should be changed anyway. Do you seriously object to changing it to a tuple??? > Agree it should be documented of course. OK. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido@python.org Thu Jan 24 16:11:23 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 24 Jan 2002 11:11:23 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Mac/scripts gensuitemodule.py,1.18,1.19 In-Reply-To: Your message of "Thu, 24 Jan 2002 11:07:41 EST." <15440.12621.645554.228182@grendel.zope.com> References: <15440.7040.836670.604742@grendel.zope.com> <200201241507.KAA11788@pcp742651pcs.reston01.va.comcast.net> <15440.12621.645554.228182@grendel.zope.com> Message-ID: <200201241611.LAA15046@pcp742651pcs.reston01.va.comcast.net> > Do you seriously object to changing it to a tuple??? Yes, I don't want to create any more show code examples that use tuples for (conceptually) arbitrary-length arrays of homogeneous data. The data type to use for those is lists. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@rahul.net Thu Jan 24 16:56:38 2002 From: aahz@rahul.net (Aahz Maruch) Date: Thu, 24 Jan 2002 08:56:38 -0800 (PST) Subject: [Python-Dev] Tuples vs. lists In-Reply-To: <200201241611.LAA15046@pcp742651pcs.reston01.va.comcast.net> from "Guido van Rossum" at Jan 24, 2002 11:11:23 AM Message-ID: <20020124165638.DE31AE8C4@waltz.rahul.net> Guido van Rossum wrote: >Fred: >> >> Do you seriously object to changing it to a tuple??? > > Yes, I don't want to create any more show code examples that use > tuples for (conceptually) arbitrary-length arrays of homogeneous > data. The data type to use for those is lists. Hrm. Even when it's something that's supposed to be immutable? I'm asking because I'm currently using a tuple for the digit list in my BCD module, and I'd like a clearer explanation of why you think that it should be a list (assuming you do). >From my viewpoint, the BCD digit string should be handled like a string; I'm only using a tuple for efficiency of storing numbers instead of characters. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From guido@python.org Thu Jan 24 17:01:51 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 24 Jan 2002 12:01:51 -0500 Subject: [Python-Dev] Tuples vs. lists In-Reply-To: Your message of "Thu, 24 Jan 2002 08:56:38 PST." <20020124165638.DE31AE8C4@waltz.rahul.net> References: <20020124165638.DE31AE8C4@waltz.rahul.net> Message-ID: <200201241701.MAA15403@pcp742651pcs.reston01.va.comcast.net> > Hrm. 
Even when it's something that's supposed to be immutable? I'm > asking because I'm currently using a tuple for the digit list in my BCD > module, and I'd like a clearer explanation of why you think that it > should be a list (assuming you do). > > From my viewpoint, the BCD digit string should be handled like a string; > I'm only using a tuple for efficiency of storing numbers instead of > characters. Can't you trust your users not to change it? --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@rahul.net Thu Jan 24 18:03:35 2002 From: aahz@rahul.net (Aahz Maruch) Date: Thu, 24 Jan 2002 10:03:35 -0800 (PST) Subject: [Python-Dev] Tuples vs. lists In-Reply-To: <200201241701.MAA15403@pcp742651pcs.reston01.va.comcast.net> from "Guido van Rossum" at Jan 24, 2002 12:01:51 PM Message-ID: <20020124180336.4FBE1E8C1@waltz.rahul.net> Guido van Rossum wrote: > Aahz: >> >> Hrm. Even when it's something that's supposed to be immutable? I'm >> asking because I'm currently using a tuple for the digit list in my BCD >> module, and I'd like a clearer explanation of why you think that it >> should be a list (assuming you do). >> >> From my viewpoint, the BCD digit string should be handled like a string; >> I'm only using a tuple for efficiency of storing numbers instead of >> characters. > > Can't you trust your users not to change it? Sure, but then I can't just copy references to the tuple when creating a copy of an instance, I'd have to copy the entire list. That's what I meant by efficiency. There are important semantic differences coming from the fact that tuples are immutable and lists are mutable, and I think that a strict heterogeneous/homogenous distinction loses that. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From guido@python.org Thu Jan 24 18:07:24 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 24 Jan 2002 13:07:24 -0500 Subject: [Python-Dev] Tuples vs. lists In-Reply-To: Your message of "Thu, 24 Jan 2002 10:03:35 PST." <20020124180336.4FBE1E8C1@waltz.rahul.net> References: <20020124180336.4FBE1E8C1@waltz.rahul.net> Message-ID: <200201241807.NAA17320@pcp742651pcs.reston01.va.comcast.net> > Sure, but then I can't just copy references to the tuple when creating a > copy of an instance, I'd have to copy the entire list. That's what I > meant by efficiency. There are important semantic differences coming > from the fact that tuples are immutable and lists are mutable, and I > think that a strict heterogeneous/homogenous distinction loses that. Well, as long as you promise not to change it, you *can* copy a reference, right? I guess I don't understand your application enough -- do you intend this to be a starting point that is modified during the program's execution, or is this a constant array? --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Fri Jan 25 01:22:47 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 24 Jan 2002 20:22:47 -0500 Subject: [Python-Dev] VERBOSE and DEBUG conventions. In-Reply-To: Message-ID: > Supply a "set debug and log options" interface for your module, and then > call it . Good example: the gc module. [Steven Majewski] > Thanks. That mostly makes sense. > Except that I needed it to be in trace/debug mode when the module > initialization is being done, so I can't import the module and then > set it. 
I suppose I could just use another environment variable: > $PYOBJC_DEBUG -- then I could set debug levels. Sure. Or split out option/logging knobs into a distinct module. > FYI: In case you're wondering why I don't just use gdb: Nope . > It's seems to be a meta level problem between the python runtime > and the objective-c runtime, and I suspect the objc extensions > in gdb must make use of the objc-runtime ( for 'po' - print object, > for example.) because I seem to be causing another objc runtime > exception the act of examining things in the debugger. > This is not very documented in the gdb manual, so unless I'm > going to wade thru the sources, I though it would be easier just > to instrument the module. (and maybe Python.) Upgrade your OS to Windows and all these time-consuming *choices* go away. Got a bug? Great! There's nowhere to report it that isn't a black hole, and you can't even think about patching the sources, so you just live with it and buy another OS next year. Except for all the bugs you have to learn to endure, it makes life much simpler . From nhodgson@bigpond.net.au Fri Jan 25 05:02:12 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Fri, 25 Jan 2002 16:02:12 +1100 Subject: [Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ... References: <15412.29829.302192.219332@12-248-41-177.client.attbi.com> <200201041046.g04AkKR05898@mira.informatik.hu-berlin.de> <016e01c19639$94c909b0$0acc8490@neil> <200201060033.g060X8c14491@mira.informatik.hu-berlin.de> <021901c19654$21f2e3f0$0acc8490@neil> <200201061214.g06CEtc01656@mira.informatik.hu-berlin.de> <036e01c1972d$dbc88a80$0acc8490@neil> <200201070728.g077SmZ01967@mira.informatik.hu-berlin.de> <003b01c1975d$e7dd3070$0acc8490@neil> <200201072317.g07NHEh01830@mira.informatik.hu-berlin.de> <3C3AC26A.D40842FB@lemburg.com> <02ff01c19cc3$92514540$0acc8490@neil> <200201140711.g0E7BsV01370@mira.informatik.hu-berlin.de> <3C44059C.CFC09899@lemburg.com> <200201152124.g0FLOV702247@mira.informatik.hu-berlin.de> <3C45CB11.ACB2CEE6@lemburg.com> <200201161909.g0GJ9OK01822@mira.informatik.hu-berlin.de> <3C46A535.3C579501@lemburg.com> <08e201c19f46$cad5f070$0acc8490@neil> <3C46B735.9C433F60@lemburg.com> <200201171206.g0HC6sa01572@mira.informatik.hu-berlin.de> <3C46C3C2.984F6227@lemburg.com> Message-ID: <06c901c1a55d$73987f40$0acc8490@neil> M.-A. Lemburg: > "Martin v. Loewis" wrote: > > ... > > if sys.platform == "win32": > > use_unicode_for_filenames = windowsversion in ['nt','w2k','xp'] > > elif sys.platform.startswith("darwin"): > > use_unicode_for_filenames = 1 > > else: > > use_unicode_for_filenames = 0 > > Sounds like this would be a good candidate for platform.py which I'll > check into CVS soon. With its many platform querying APIs it should > easily be possible to add a function which returns the above > information based on the platform Python is running on. OK. I'll remove unicodefilenames() from the PEP and my patch. Neil From tismer@tismer.com Thu Jan 17 17:09:38 2002 From: tismer@tismer.com (Christian Tismer) Date: Thu, 17 Jan 2002 18:09:38 +0100 Subject: [Python-Dev] Ann: Stackless Python is DEAD! Long live Stackless Python Message-ID: <3C470552.3040802@tismer.com> ####################################### Announcement: ####################################### The end of an era has come: --------------------------- Stackless Python, in the form provided upto Python 2.0, is DEAD. I am abandoning the whole implementation. 
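A minimal sketch of the kind of debug knob discussed a few messages up: a gc-style set_debug() entry point whose starting value is read from an environment variable such as $PYOBJC_DEBUG, so that code run during module initialization can already be traced. The module layout and names here are illustrative only, not PyObjC's actual interface.

import os

# Hypothetical knob, modelled on gc.set_debug(); PYOBJC_DEBUG is only an
# example of an environment variable to seed it from.
_debug_level = int(os.environ.get("PYOBJC_DEBUG", "0"))

def set_debug(level):
    # Callers can still change the level after import, gc-module style.
    global _debug_level
    _debug_level = level

def _trace(msg):
    # Runs during module initialization as well as afterwards.
    if _debug_level:
        print "trace:", msg

_trace("module initialization running")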
A new era has begun: -------------------- A completely new implementation is in development for Python 2.2 and up which gives you the following features: - There are no restrictions any longer for uthread/coroutine switching. Switching is possible at *any* time, in *any* context. - There are no significant changes to the Python core any longer. The new patches are of minimum size, and they will probably survive unchanged until Python 3.0 . - Maintenance work for Stackless Python is reduced to the bare minimum. There is no longer a need to incorporate Stackless into the standard, since there is no work to be shared. - Stackless breaks its major axiom now. It is no longer platform independent, since it *does* modify the C stack. I will support all Intel platforms by myself. For other platforms, I'm asking for volunteers. * The basic elements of Stackless are now switchable chains of frames. We have to define an interface that turns these chains into microthreads and coroutines. Everybody is invited to come to the Stackless mailing list and discuss the layout of this new design. Especially we need to decide about (*). http://starship.python.net/mailman/listinfo/stackless see you there - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/ From mal@lemburg.com Fri Jan 25 20:46:29 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 25 Jan 2002 21:46:29 +0100 Subject: [Python-Dev] Using LXR for Python CVS Source Code ? Message-ID: <3C51C425.358637DB@lemburg.com> Browing the Mozilla web-site I came across I nice utility which enables cross-referenced source code browsing: LXR http://lxr.mozilla.org/mozilla/source/webtools/lxr/ For example, see e.g. http://lxr.mozilla.org/mozilla/source/expat/xmlparse/hashtable.c I suppose setting this up on python.org would ease referencing Python C sources a lot and also provide a nice tool for learning to understand the internal structures of the interpreter. What do you think ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Fri Jan 25 20:49:27 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 25 Jan 2002 15:49:27 -0500 Subject: [Python-Dev] Using LXR for Python CVS Source Code ? In-Reply-To: Your message of "Fri, 25 Jan 2002 21:46:29 +0100." <3C51C425.358637DB@lemburg.com> References: <3C51C425.358637DB@lemburg.com> Message-ID: <200201252049.PAA06163@pcp742651pcs.reston01.va.comcast.net> > Browing the Mozilla web-site I came across I nice utility which > enables cross-referenced source code browsing: LXR > > http://lxr.mozilla.org/mozilla/source/webtools/lxr/ > > For example, see e.g. > > http://lxr.mozilla.org/mozilla/source/expat/xmlparse/hashtable.c > > I suppose setting this up on python.org would ease referencing > Python C sources a lot and also provide a nice tool for learning > to understand the internal structures of the interpreter. > > What do you think ? +1 Do you want access to the python.org website and CVS so you can install this yourself? --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Fri Jan 25 21:16:19 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Fri, 25 Jan 2002 22:16:19 +0100 Subject: [Python-Dev] Using LXR for Python CVS Source Code ? References: <3C51C425.358637DB@lemburg.com> <200201252049.PAA06163@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C51CB23.D7E91DD7@lemburg.com> Guido van Rossum wrote: > > > Browing the Mozilla web-site I came across I nice utility which > > enables cross-referenced source code browsing: LXR > > > > http://lxr.mozilla.org/mozilla/source/webtools/lxr/ > > > > For example, see e.g. > > > > http://lxr.mozilla.org/mozilla/source/expat/xmlparse/hashtable.c > > > > I suppose setting this up on python.org would ease referencing > > Python C sources a lot and also provide a nice tool for learning > > to understand the internal structures of the interpreter. > > > > What do you think ? > > +1 > > Do you want access to the python.org website and CVS so you can > install this yourself? I could do that, but would need some help from the admins since LXR requires Perl 5+ and Glimpse to be installed. I'll also need to modify the Apache config files and will probably have to setup a cron job which updates the indexes once a day. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From DavidA@ActiveState.com Fri Jan 25 21:26:04 2002 From: DavidA@ActiveState.com (David Ascher) Date: Fri, 25 Jan 2002 13:26:04 -0800 Subject: [Python-Dev] Using LXR for Python CVS Source Code ? References: <3C51C425.358637DB@lemburg.com> <200201252049.PAA06163@pcp742651pcs.reston01.va.comcast.net> <3C51CB23.D7E91DD7@lemburg.com> Message-ID: <3C51CD6C.3EA845ED@activestate.com> Not wishing to make a science project out of it, but you might consider the newer lxr, which uses a real database (mysql, IIRC). We've used lxr in-house for a while, it's an absolutely wonderful tool. It is quite hard to setup multiple lxr's on a single machine (at least with the 'old' lxr), be forewarned. Also, lxr doesn't really deal especially well with Python code - but for C/C++ code, it rocks. --david "M.-A. Lemburg" wrote: > > Guido van Rossum wrote: > > > > > Browing the Mozilla web-site I came across I nice utility which > > > enables cross-referenced source code browsing: LXR > > > > > > http://lxr.mozilla.org/mozilla/source/webtools/lxr/ > > > > > > For example, see e.g. > > > > > > http://lxr.mozilla.org/mozilla/source/expat/xmlparse/hashtable.c > > > > > > I suppose setting this up on python.org would ease referencing > > > Python C sources a lot and also provide a nice tool for learning > > > to understand the internal structures of the interpreter. > > > > > > What do you think ? > > > > +1 > > > > Do you want access to the python.org website and CVS so you can > > install this yourself? > > I could do that, but would need some help from the admins > since LXR requires Perl 5+ and Glimpse to be installed. I'll > also need to modify the Apache config files and will probably > have to setup a cron job which updates the indexes once a > day. 
> > -- > Marc-Andre Lemburg > CEO eGenix.com Software GmbH > ______________________________________________________________________ > Company & Consulting: http://www.egenix.com/ > Python Software: http://www.egenix.com/files/python/ > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From mal@lemburg.com Fri Jan 25 22:57:31 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 25 Jan 2002 23:57:31 +0100 Subject: [Python-Dev] Using LXR for Python CVS Source Code ? References: <3C51C425.358637DB@lemburg.com> <200201252049.PAA06163@pcp742651pcs.reston01.va.comcast.net> <3C51CB23.D7E91DD7@lemburg.com> <3C51CD6C.3EA845ED@activestate.com> Message-ID: <3C51E2DB.85961923@lemburg.com> David Ascher wrote: > > Not wishing to make a science project out of it, but you might consider > the newer lxr, which uses a real database (mysql, IIRC). > > We've used lxr in-house for a while, it's an absolutely wonderful tool. > It is quite hard to setup multiple lxr's on a single machine (at least > with the 'old' lxr), be forewarned. > > Also, lxr doesn't really deal especially well with Python code - but for > C/C++ code, it rocks. Hmm, I was planning to install the Mozilla version of LXR. I'll also look at the latest LXR version 0.9. If it does indeed use MySQL, I'd rather not go down that road -- setting up and maintaining MySQL is not exactly fun... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From skip@pobox.com Sat Jan 26 22:17:26 2002 From: skip@pobox.com (Skip Montanaro) Date: Sat, 26 Jan 2002 16:17:26 -0600 Subject: [Python-Dev] Using LXR for Python CVS Source Code ?
In-Reply-To: <3C51E2DB.85961923@lemburg.com> References: <3C51C425.358637DB@lemburg.com> <200201252049.PAA06163@pcp742651pcs.reston01.va.comcast.net> <3C51CB23.D7E91DD7@lemburg.com> <3C51CD6C.3EA845ED@activestate.com> <3C51E2DB.85961923@lemburg.com> Message-ID: <15443.10998.673581.778224@localhost.localdomain> mal> Hmm, I was planning to install the Mozilla version of LXR. I'll mal> also look at the latest LXR version 0.9. If it does indeed use mal> MySQL, I'd rather not go down that road -- setting up and mal> maintaining MySQL is not exactly fun... I find MySQL fairly straightforward to work with. (I use it on the Mojam & Musi-Cal sites.) If there's a functional difference between the new version and the old, I'd be willing to help out administering the database. Skip From andymac@bullseye.apana.org.au Sun Jan 27 10:48:59 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Sun, 27 Jan 2002 21:48:59 +1100 (EDT) Subject: [Python-Dev] updated patches for OS/2 EMX port Message-ID: Its taken longer than I'd hoped, however they're finally up for review. The updated bits have been attached to the previous patch entries in the patch manager: 435381: distutils changes http://sf.net/tracker/?func=detail&atid=305470&aid=435381&group_id=5470 450265: build files - self contained subdirectory in PC/ http://sf.net/tracker/?func=detail&atid=305470&aid=450265&group_id=5470 450266: library changes - 3 patch files covering:- - Lib/ (included os2emxpath.py as previously discussed here) - Lib/plat-os2emx/ (new subdirectory) - Lib/test/ (cope with 2 EMX limitations) http://sf.net/tracker/?func=detail&atid=305470&aid=450266&group_id=5470 450267: core changes - 4 patch files covering:- - Include/ - Modules/ (lots of changes; see below for more info) - Objects/ (see below for more info) - Python/ http://sf.net/tracker/?func=detail&atid=305470&aid=450267&group_id=5470 I hope that I got the patch links right... Particular notes wrt #450267: - the patch to Modules/import.c supports VACPP in addition to EMX. Michael Muller has trialled this patch with a VACPP build successfully. It is messy, but OS/2 isn't going to lose the 8.3 naming limit on DLLs anytime soon :-( Although truncating the DLL (PYD) name to 8 characters increases the chances of a name clash, the case-sensitive import support in the same patch alleviates it somewhat, and the fact that the "init" entrypoint is maintained will result in an import failure when there is an actual name clash. - Modules/unicodedata.c is affected by a name clash between the internally defined _getname() and an EMX routine of the same name defined in . The patch renames the internal routine to _getucname() to avoid this, but this change may not be acceptable - advice please. - Objects/stringobject.c and Objects/unicodeobject.c contain changes to handle the EMX runtime library returning "0x" as the prefix for output formatted with a "%X" format. I have tried to minimise the changes in these patches to the minimum needed for the port to function, ie I've tried to eradicate the cosmetic changes in the earlier patches, and avoid picking up unwanted files (such as Modules/Setup). Please let me know if you find any such changes I missed. The patches uploaded apply cleanly to a copy of an anonoymously checked out CVS tree as of 0527 AEST this morning (Jan 27), and have been built and regression tested on both OS/2 EMX and FreeBSD 4.4R with no unexpected test failures. 
If there are no unresolvable objections, and approval to apply these patches is granted, I propose that the patches be applied as follows:- Stage 1: the build patch (creates+populates PC/os2emx/) Stage 2: the Lib/plat-os2emx/ patch Stage 3: the Lib/ and Lib/test/ patches Stage 4: the distutils patch Stage 5: the Include/, Objects/ and Python/ patches Stage 6: the Modules/ patch I would expect to allow at least 48 hours between stages. Comments/advice on this proposal also appreciated. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From martin@v.loewis.de Sun Jan 27 20:32:44 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 27 Jan 2002 21:32:44 +0100 Subject: [Python-Dev] updated patches for OS/2 EMX port In-Reply-To: References: Message-ID: Andrew MacIntyre writes: > - Modules/unicodedata.c is affected by a name clash between the internally > defined _getname() and an EMX routine of the same name defined in > . The patch renames the internal routine to _getucname() to > avoid this, but this change may not be acceptable - advice please. My advice for renaming things because of name clashes: Always rename in a way that solves this particular problem for good, by using the Py prefix (or _Py to further indicate that this is not public API; it's a static function, anyway). Somebody may have a function _getucname somewhere, whereas it is really unlikely that people add a Py prefix to their functions (if they have been following the last 30 years of C programming). > - Objects/stringobject.c and Objects/unicodeobject.c contain changes to > handle the EMX runtime library returning "0x" as the prefix for output > formatted with a "%X" format. I'd suggest a different approach here, which does not use #ifdefs: Instead of testing for the system, test for the bug. Then, if the bug goes away, or appears on other systems as well, the code will be good. Once formatting is complete, see whether it put in the right letter, and fix that in the result buffer if the native sprintf got it wrong. If you follow this strategy, you should still add a comment indicating that this was added for OS/2, to give people an idea where that came from. Another approach would be to autoconfiscate this particular issue. I'm in general in favour of autoconf'ed bug tests instead of runtime bug tests, but people on systems without /bin/sh might feel differently. > If there are no unresolvable objections, and approval to apply these > patches is granted, I propose that the patches be applied as follows:- > > Stage 1: the build patch (creates+populates PC/os2emx/) > Stage 2: the Lib/plat-os2emx/ patch > Stage 3: the Lib/ and Lib/test/ patches > Stage 4: the distutils patch > Stage 5: the Include/, Objects/ and Python/ patches > Stage 6: the Modules/ patch > > I would expect to allow at least 48 hours between stages. > > Comments/advice on this proposal also appreciated. Sounds good to me (although I'd probably process the "uncritical", i.e. truly platform-specific parts much more quickly). Who's going to work with Andrew to integrate this stuff? Regards, Martin From aahz@rahul.net Mon Jan 28 21:21:58 2002 From: aahz@rahul.net (Aahz Maruch) Date: Mon, 28 Jan 2002 13:21:58 -0800 (PST) Subject: [Python-Dev] Tuples vs. 
lists In-Reply-To: <200201241807.NAA17320@pcp742651pcs.reston01.va.comcast.net> from "Guido van Rossum" at Jan 24, 2002 01:07:24 PM Message-ID: <20020128212158.D7315E8C3@waltz.rahul.net> Guido van Rossum wrote: > Aahz: >> >> Sure, but then I can't just copy references to the tuple when creating a >> copy of an instance, I'd have to copy the entire list. That's what I >> meant by efficiency. There are important semantic differences coming >> from the fact that tuples are immutable and lists are mutable, and I >> think that a strict heterogeneous/homogenous distinction loses that. > > Well, as long as you promise not to change it, you *can* copy a > reference, right? I guess I don't understand your application > enough -- do you intend this to be a starting point that is modified > during the program's execution, or is this a constant array? It's a constant. The BCD module is Binary Coded Decimal; instances are intended to be as immutable as strings and numbers (well, it *is* a number type). Modifying an instance is guaranteed to produce a new instance. To a large extent, I guess I feel that if a class is intended to be immutable, each of its underlying data attributes should also be immutable. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From guido@python.org Mon Jan 28 21:26:23 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 28 Jan 2002 16:26:23 -0500 Subject: [Python-Dev] Tuples vs. lists In-Reply-To: Your message of "Mon, 28 Jan 2002 13:21:58 PST." <20020128212158.D7315E8C3@waltz.rahul.net> References: <20020128212158.D7315E8C3@waltz.rahul.net> Message-ID: <200201282126.QAA30702@pcp742651pcs.reston01.va.comcast.net> > It's a constant. The BCD module is Binary Coded Decimal; instances are > intended to be as immutable as strings and numbers (well, it *is* a > number type). Modifying an instance is guaranteed to produce a new > instance. To a large extent, I guess I feel that if a class is intended > to be immutable, each of its underlying data attributes should also be > immutable. Or you could assign it to a private variable. --Guido van Rossum (home page: http://www.python.org/~guido/) From tismer@tismer.com Tue Jan 29 00:58:09 2002 From: tismer@tismer.com (Christian Tismer) Date: Tue, 29 Jan 2002 01:58:09 +0100 Subject: [Python-Dev] Ann: Stackless 2.2 pre-alpha is ready! Message-ID: <3C55F3A1.607@tismer.com> Dear Python community, Stackless Python 2.2 is alive! This is the first alpha version. It does not have any relevant changes to the interpreter. It does not have any limitation on switching. Support code for uthreads and coroutines is already implemented. And as announced, it is completely platform dependant. This version works on MS Win32 only. I'm going to support other platforms if I can find some sponsors. Let me say, it works great! There is no single problem. This technique can be applied to any software, any interpreter, provided I can support the platform. *** This is a critical phase for Stackless! *** *** I Am Asking For Corporate Sponsorships. *** I don't know how things should go on. I could turn it into a commercial product, Stackless is enabled enough for this. Or I could continue to keep it open-sourced, provided there is enough sponsorship. This decision has to be discussed in the next two weeks, after that I will decide. 
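Returning briefly to the tuple-versus-list exchange above, a minimal sketch of the sharing argument, using a hypothetical BCD-like class (illustrative only, not Aahz's actual module): because a tuple is immutable, copies of an instance can safely share one digits object, whereas a list attribute would need a defensive copy.

class Number:
    def __init__(self, digits):
        # The digits never change after creation, so store them as a tuple.
        self.digits = tuple(digits)
    def copy(self):
        other = Number(())
        other.digits = self.digits    # shared reference, nothing is copied
        return other

# With a mutable list attribute the copy would instead have to be
#     other.digits = list(self.digits)
a = Number([1, 9, 7, 0])
b = a.copy()
print b.digits is a.digits            # prints 1: the same object is shared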
Anyway: Please check it out of CVS and have a look, it is sooo small code now. cvs -d :pserver:anonymous@tismer.com:/home/cvs co stackless/src You might want to add -z9 since this is a full Python 2.2 checkout. In this state, I don't prepare a distribution. You can build Stackless from CVS. I also put a copy of my python22.dll here for testing: http://www.stackless.com/slpython22.zip It is just almost 2 percent slower on my W2k machine. The trick is to avoid stack switching as much as possible. I do it only on every 8th recursion level, which is more than what's usual. >>> def f(n): ... if n:f(n-1) ... >>> import sys >>> sys.setrecursionlimit(100000+10) >>> f(100000) >>> ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/ _______________________________________________ Stackless mailing list Stackless@www.tismer.com http://www.tismer.com/mailman/listinfo/stackless From greg@cosc.canterbury.ac.nz Tue Jan 29 05:09:10 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 29 Jan 2002 18:09:10 +1300 (NZDT) Subject: [Python-Dev] Ann: Stackless 2.2 pre-alpha is ready! In-Reply-To: <3C55F3A1.607@tismer.com> Message-ID: <200201290509.SAA18841@s454.cosc.canterbury.ac.nz> Christian Tismer : > I could turn it into a commercial product, Stackless is > enabled enough for this. Or I could continue to keep it > open-sourced, provided there is enough sponsorship. It would be disappointing if it ceased being open-source! I hope enough volunteers can be found to work on ports to other platforms. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tismer@tismer.com Tue Jan 29 12:12:16 2002 From: tismer@tismer.com (Christian Tismer) Date: Tue, 29 Jan 2002 13:12:16 +0100 Subject: [Python-Dev] Ann: Stackless 2.2 pre-alpha is ready! References: <200201290509.SAA18841@s454.cosc.canterbury.ac.nz> Message-ID: <3C5691A0.5000101@tismer.com> Greg Ewing wrote: > Christian Tismer : > > >>I could turn it into a commercial product, Stackless is >>enabled enough for this. Or I could continue to keep it >>open-sourced, provided there is enough sponsorship. >> > > It would be disappointing if it ceased being open-source! > I hope enough volunteers can be found to work on ports to > other platforms. No problem, I was just trying to get more sponsors, which in fact already exist (but not enough for a living). Stackless will stay open source, especially after it has become so few source :-) -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/ From fdrake@acm.org Tue Jan 29 16:36:16 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 29 Jan 2002 11:36:16 -0500 Subject: [Python-Dev] release22-maint branch strangeness Message-ID: <15446.53120.164519.118410@grendel.zope.com> I've noticed some strangeness with the release22-maint branch. 
I made a documentation change there this morning, and CVS gave the change a really weird version number when I checked it in. Looking further, it looks like the previous checkin for that file (Doc/tut/tut.tex) has some strangeness as well. The branching tags are also pretty whacked. This is an excerpt of the "cvs log" for the file: ------------------------------------------------------------------------ RCS file: /cvsroot/python/python/dist/src/Doc/tut/tut.tex,v Working file: tut.tex head: 1.158 branch: locks: strict access list: symbolic names: r212: 1.133.2.5 r212c1: 1.133.2.5 release22-mac: 1.156.4.1 release22-maint: 1.156.4.1.0.2 release22: 1.156.4.1 release22-branch: 1.156.0.4 release22-fork: 1.156 ------------------------------------------------------------------------ Note the revision number for release22-maint; it looks like it's a branch created from a branch created from a tag on a branch(!). All the while, I've been thinking that branches, once created, are independent (identified by the third component of the revision number for any given file). I still think they're supposed to be. Using a checkout created with the "-r release22-maint" options, I made two checkins, and the revision numbers & other metadata seem seriously strange: ------------------------------------------------------------------------ revision 1.156.4.1 date: 2001/12/21 03:48:33; author: fdrake; state: Exp; lines: +2 -2 branches: 1.156.4.1.2; Fix up some examples in the tutorial so we don't contradict our own advice on docstrings. This fixes SF bug #495601. ---------------------------- revision 1.156.4.1.2.1 date: 2002/01/29 14:54:18; author: fdrake; state: Exp; lines: +8 -1 Revise cheeseshop example so that the order of the keyword output is completely determined by the example; dict insertion order and the string hash algorithm no longer affect the output. This fixes SF bug #509281. ------------------------------------------------------------------------ For revision 1.156.4.1, note the strange branch number (1.156.4.1.2 -- too many components), and for revision 1.156.4.1.2.1 (too many components again!). The strange branch number on the first indicates that a branch was created from that revision (itself part of the release22-branch branch). Does anyone remember who created these branches? Or what commands were used to create them (using which branch/tag as the source of the working copy being used?)? This pretty much has Barry & I stumped at the moment, and we'd like to get this straightened out. The suspect branches are release22-maint, release22-mac. Thanks! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mwh@python.net Tue Jan 29 17:25:05 2002 From: mwh@python.net (Michael Hudson) Date: 29 Jan 2002 17:25:05 +0000 Subject: [Python-Dev] release22-maint branch strangeness In-Reply-To: "Fred L. Drake, Jr."'s message of "Tue, 29 Jan 2002 11:36:16 -0500" References: <15446.53120.164519.118410@grendel.zope.com> Message-ID: <2melk9nk7i.fsf@starship.python.net> "Fred L. Drake, Jr." writes: > I've noticed some strangeness with the release22-maint branch. I made > a documentation change there this morning, and CVS gave the change a > really weird version number when I checked it in. Looking further, it > looks like the previous checkin for that file (Doc/tut/tut.tex) has > some strangeness as well. The branching tags are also pretty > whacked. This is an excerpt of the "cvs log" for the file: Looks to me like the release22 tag for Doc/tut/tut.tex was set on the release22-branch, not the trunk. 
This is not what happened for, e.g. configure.in. Quite how this happened, or what (if anything) we should do about it, is another question entirely. cvs status -v is quite handy here. $ cvs status -v Doc/tut/tut.tex | head -n 20 =================================================================== File: tut.tex Status: Needs Patch Working revision: 1.157 Repository revision: 1.158 /cvsroot/python/python/dist/src/Doc/tut/tut.tex,v Sticky Tag: (none) Sticky Date: (none) Sticky Options: (none) Existing Tags: r212 (revision: 1.133.2.5) r212c1 (revision: 1.133.2.5) release22-mac (revision: 1.156.4.1) release22-maint (branch: 1.156.4.1.2) release22 (revision: 1.156.4.1) release22-branch (branch: 1.156.4) release22-fork (revision: 1.156) r22c1-mac (revision: 1.156) r22c1 (revision: 1.156) r22rc1-branch (branch: 1.156.2) $ cvs status -v configure.in | head -n 20 =================================================================== File: configure.in Status: Up-to-date Working revision: 1.289 Repository revision: 1.289 /cvsroot/python/python/dist/src/configure.in,v Sticky Tag: (none) Sticky Date: (none) Sticky Options: (none) Existing Tags: r212 (revision: 1.215.2.7) r212c1 (revision: 1.215.2.7) release22-mac (revision: 1.288) release22-maint (branch: 1.288.6) release22 (revision: 1.288) release22-branch (branch: 1.288.4) release22-fork (revision: 1.288) r22c1-mac (revision: 1.288) r22c1 (revision: 1.288) r22rc1-branch (branch: 1.288.2) Did different people create the release22 tags in different bits of the tree? Cheers, M. -- The "of course, while I have no problem with this at all, it's surely too much for a lesser being" flavor of argument always rings hollow to me. -- Tim Peters, 29 Apr 1998 From fdrake@acm.org Tue Jan 29 19:15:48 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 29 Jan 2002 14:15:48 -0500 Subject: [Python-Dev] release22-maint branch strangeness In-Reply-To: <2melk9nk7i.fsf@starship.python.net> References: <15446.53120.164519.118410@grendel.zope.com> <2melk9nk7i.fsf@starship.python.net> Message-ID: <15446.62692.281105.738748@grendel.zope.com> Michael Hudson writes: > Looks to me like the release22 tag for Doc/tut/tut.tex was set on the > release22-branch, not the trunk. Should not the branches be independent, once created? > This is not what happened for, e.g. configure.in. configure.in was not changed on the release22-branch. Take a look at Include/patchlevel.h. It doesn't look as messed up as Doc/tut/tut.tex, but something is definately wrong here as well. > Did different people create the release22 tags in different bits of > the tree? I'm not quite sure how things were handled with the -maint and -mac branches; I wonder if a branch tag was used somewhere a normal tag could have been used. I don't see it, though. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From tismer@tismer.com Tue Jan 29 19:41:45 2002 From: tismer@tismer.com (Christian Tismer) Date: Tue, 29 Jan 2002 20:41:45 +0100 Subject: [Python-Dev] Thread questionlet Message-ID: <3C56FAF9.20109@tismer.com> Dear developers, I'm still a little ignorant to real threads. In order to do the implementation of hard-wired microthreads right, I tried to understand how real threads work. My question, which I could not easily answer by reading the source is: What happens when the main thread ends? Do all threads run until they are eady too, or are they just killed away? And if they are killed, are they just removed, or do they all get an exception for cleanup? I would guess the latter, but I'm not sure. 
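For reference, a minimal sketch of the behaviour described in the replies below: a non-daemon threading.Thread is waited for at interpreter exit, unlike a worker started with the low-level thread module, which is simply dropped. Exact behaviour still varies by platform.

import threading, time

def worker():
    # This loop gets to finish because the threading module's exit
    # handler waits for non-daemon threads.
    for i in range(3):
        print "worker pass", i
        time.sleep(0.5)

t = threading.Thread(target=worker)
# t.setDaemon(1) would mark it as a daemon instead: the interpreter
# would then exit without waiting, much like thread.start_new_thread.
t.start()
print "main is done"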
When a thread ends, it may contain several levels of other C calls which might need to finalize, so I thought of a special exception for this, but didn't find such. Many thanks and sorry about my ignorance - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/ From tim.one@home.com Tue Jan 29 20:29:41 2002 From: tim.one@home.com (Tim Peters) Date: Tue, 29 Jan 2002 15:29:41 -0500 Subject: [Python-Dev] Thread questionlet In-Reply-To: <3C56FAF9.20109@tismer.com> Message-ID: [Christian Tismer] > ... > My question, which I could not easily answer by reading > the source is: > What happens when the main thread ends? Do all threads run > until they are ready too, or are they just killed away? You're walking near the edge of a very steep cliff. There are jagged rocks a kilometer below, so don't slip . It varies by OS, and even by exactly how the main thread exits. Reading OS docs doesn't really help either, because the version of threads exposed by the C libraries may differ from native OS facilities in subtle but crucial ways. > And if they are killed, are they just removed, or do > they all get an exception for cleanup? Can only be answered one platform at a time. They're not going to get a *Python*-level exception, no. Here's a simple test program:

import thread
import time

def f(i):
    while 1:
        print "thread %d about to sleep" % i
        time.sleep(0.5)

for i in range(3):
    thread.start_new_thread(f, (i,))

time.sleep(3)
print "main is done"

and a typical run on Windows:

C:\Code\python\PCbuild>\python22\python.exe tdie.py
thread 0 about to sleep
thread 1 about to sleep
thread 2 about to sleep
thread 0 about to sleep
thread 1 about to sleep
thread 2 about to sleep
thread 0 about to sleep
thread 1 about to sleep
thread 2 about to sleep
thread 0 about to sleep
thread 1 about to sleep
thread 2 about to sleep
thread 1 about to sleep
thread 0 about to sleep
thread 2 about to sleep
thread 1 about to sleep
thread 0 about to sleep
thread 2 about to sleep
thread 1 about to sleep
main is done
C:\Code\python\PCbuild>

I expect much the same on Linux (all threads die, no exceptions raised). But, IIRC, the threads would keep going on SGI despite that the main thread is history. > ... > When a thread ends, it may contain several levels of other > C calls which might need to finalize, so I thought of > a special exception for this, but didn't find such. Closing threads cleanly is the programmer's responsibility across all OSes. It can be very difficult. Python doesn't really help (or hinder). Microsoft helps in that DLLs can define a "call on thread detach" function that's automatically called when a thread detaches from the DLL, but Python doesn't exploit that. The DLL hook may not get called even if it did, depending on exactly how a thread detaches (the Big Hammer last-chance Win32 TerminateProcess/TerminateThread functions generally leave things a mess -- "TerminateThread is a dangerous function that should only be used in the most extreme cases", etc). From guido@python.org Tue Jan 29 21:46:41 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 29 Jan 2002 16:46:41 -0500 Subject: [Python-Dev] Thread questionlet In-Reply-To: Your message of "Tue, 29 Jan 2002 20:41:45 +0100."
<3C56FAF9.20109@tismer.com> References: <3C56FAF9.20109@tismer.com> Message-ID: <200201292146.QAA24412@pcp742651pcs.reston01.va.comcast.net> > My question, which I could not easily answer by reading > the source is: > What happens when the main thread ends? Do all threads run > until they are eady too, or are they just killed away? > And if they are killed, are they just removed, or do > they all get an exception for cleanup? If you're talking about the thread module, they are killed without being given notice. The threading module however waits for all non-daemon threads, using the atexit mechanism build on top of sys.exit. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jan 29 21:51:35 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 29 Jan 2002 16:51:35 -0500 Subject: [Python-Dev] release22-maint branch strangeness In-Reply-To: Your message of "Tue, 29 Jan 2002 11:36:16 EST." <15446.53120.164519.118410@grendel.zope.com> References: <15446.53120.164519.118410@grendel.zope.com> Message-ID: <200201292151.QAA24484@pcp742651pcs.reston01.va.comcast.net> > I've noticed some strangeness with the release22-maint branch. I made > a documentation change there this morning, and CVS gave the change a > really weird version number when I checked it in. Looking further, it > looks like the previous checkin for that file (Doc/tut/tut.tex) has > some strangeness as well. The branching tags are also pretty > whacked. This is an excerpt of the "cvs log" for the file: > > ------------------------------------------------------------------------ > RCS file: /cvsroot/python/python/dist/src/Doc/tut/tut.tex,v > Working file: tut.tex > head: 1.158 > branch: > locks: strict > access list: > symbolic names: > r212: 1.133.2.5 > r212c1: 1.133.2.5 > release22-mac: 1.156.4.1 > release22-maint: 1.156.4.1.0.2 > release22: 1.156.4.1 > release22-branch: 1.156.0.4 > release22-fork: 1.156 > ------------------------------------------------------------------------ > > Note the revision number for release22-maint; it looks like it's a > branch created from a branch created from a tag on a branch(!). All > the while, I've been thinking that branches, once created, are > independent (identified by the third component of the revision number > for any given file). I still think they're supposed to be. I think you must've used a tag to bvase your branch, and that tag was already on a branch. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Tue Jan 29 21:51:29 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 29 Jan 2002 15:51:29 -0600 Subject: [Python-Dev] Stevens - still best for Unix system call programming? Message-ID: <15447.6497.563586.973235@beluga.mojam.com> Sorry for the off-topic post. I'm starting in on a little project to create an analog to fopen(3) and friends that provides the illusion of large file support even on systems that don't support large files, so I'm doing more fiddling with Unix system calls than I've done in awhile, and am looking for a little hardcover help. Is Richard Stevens' "Advanced Programming in the UNIX Environment" still the _sine qua non_ in this area? Thx, Skip P.S. OPN: it will have a Python binding... From barry@zope.com Tue Jan 29 22:08:00 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 29 Jan 2002 17:08:00 -0500 Subject: [Python-Dev] Stevens - still best for Unix system call programming? 
References: <15447.6497.563586.973235@beluga.mojam.com> Message-ID: <15447.7488.861384.586661@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> Is Richard Stevens' "Advanced Programming in the UNIX SM> Environment" still the _sine qua non_ in this area? Indeed! It always seems to answer my questions accurately and in depth. -Barry From aahz@rahul.net Tue Jan 29 23:47:37 2002 From: aahz@rahul.net (Aahz Maruch) Date: Tue, 29 Jan 2002 15:47:37 -0800 (PST) Subject: [Python-Dev] Thread questionlet In-Reply-To: <3C56FAF9.20109@tismer.com> from "Christian Tismer" at Jan 29, 2002 08:41:45 PM Message-ID: <20020129234738.66F9AE8C4@waltz.rahul.net> Christian Tismer wrote: > > I'm still a little ignorant to real threads. > In order to do the implementation of hard-wired microthreads > right, I tried to understand how real threads work. Can't answer your specific question, but you might want to look at my Starship pages if you want to increase your general understanding of Python threads (there probably won't be much new to you; OTOH, it shouldn't take you long to read). http://starship.python.net/crew/aahz/ -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From tismer@tismer.com Wed Jan 30 00:21:40 2002 From: tismer@tismer.com (Christian Tismer) Date: Wed, 30 Jan 2002 01:21:40 +0100 Subject: [Python-Dev] Thread questionlet References: <20020129234738.66F9AE8C4@waltz.rahul.net> Message-ID: <3C573C94.3090305@tismer.com> Aahz Maruch wrote: > Christian Tismer wrote: > >>I'm still a little ignorant to real threads. >>In order to do the implementation of hard-wired microthreads >>right, I tried to understand how real threads work. >> > > Can't answer your specific question, but you might want to look at my > Starship pages if you want to increase your general understanding of > Python threads (there probably won't be much new to you; OTOH, it > shouldn't take you long to read). > > http://starship.python.net/crew/aahz/ Oh well, 1024 thanks, very helpful. I'm again the clueless implementor. It still feels warm and fuzzy here, although I think there is no rule that I missed to break since -dev knows about me, and now my final sacrileg... ...after all, this is kinda piecemaker, since Stackless is now orthogonal, in a way. I gave up some academic POV, in favor of something pragmatic, and finally we all get rid of a problem. Hey, I want to become a productive contributor (again?) :) thanks - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/ From tim.one@home.com Wed Jan 30 06:29:39 2002 From: tim.one@home.com (Tim Peters) Date: Wed, 30 Jan 2002 01:29:39 -0500 Subject: [Python-Dev] Stale CVS lock Message-ID: There appears to be a stale anoncvs lock in the Misc directory preventing checkins there (like NEWS); have opened an SF support request: http://sf.net/tracker/?func=detail&aid=510555&group_id=1&atid=200001 From tismer@tismer.com Wed Jan 30 10:54:14 2002 From: tismer@tismer.com (Christian Tismer) Date: Wed, 30 Jan 2002 11:54:14 +0100 Subject: [Python-Dev] Thread questionlet References: Message-ID: <3C57D0D6.8010602@tismer.com> Tim Peters wrote: > [Christian Tismer] > >>... 
>>My question, which I could not easily answer by reading >>the source is: >>What happens when the main thread ends? Do all threads run >>until they are ready too, or are they just killed away? >> > > You're walking near the edge of a very steep cliff. There are jagged rocks > a kilometer below, so don't slip . Uhmm -- I really didn't want to poke into something problematic, but obviously I have no more simple questions left. ;-) > It varies by OS, and even by exactly how the main thread exits. Reading OS > docs doesn't really help either, because the version of threads exposed by > the C libraries may differ from native OS facilities in subtle but crucial > ways. It does not sound like being designed so, more like just some way through these subtleties, without trying to solve every platform's problems. I don't try to solve this, either. But since I'm writing some kind of platform independant threads (isn't it funny? by using non-portable tricks, I get some portable threads), I'd like to think about how this world *could* look like. Maybe I have a chance to provide an (u)thread implementation which is really what people would want for real threads? >>And if they are killed, are they just removed, or do >>they all get an exception for cleanup? >> > > Can only be answered one platform at a time. They're not going to get a > *Python*-level exception, no. Here's a simple test program: [thanks for the test code] > I expect much the same on Linux (all threads die, no exceptions raised). > But, IIRC, the threads would keep going on SGI despite that the main thread > is history. So threads do force the programmer to write platform-dependant Python code. For sure nothing that Python wants, it just happens. >>... >>When a thread ends, it may contain several levels of other >>C calls which might need to finalize, so I thought of >>a special exception for this, but didn't find such. >> > > Closing threads cleanly is the programmer's responsiblity across all OSes. > It can be very difficult. Python doesn't really help (or hinder). Ok with me, this is really not trivial. (I guessed that from reading the source, but it really was not obvious. So I asked a naive question, but you know me better...) Maybe Python could try to help though an API? > Microsoft helps in that DLLs can define a "call on thread detach" function > that's automatically called when a thread detaches from the DLL, but Python > doesn't exploit that. The DLL hook may not get called even if it did, > depending on exactly how a thread detaches (the Big Hammer last-chance Win32 > TerminateProcess/TerminateThread functions generally leave things a mess -- > "TerminateThread is a dangerous function that should only be used in the > most extreme cases", etc). Now the real question: If you have the oportunity which I have: Define some threads which (mis)behave equally (un)well on every supported platform, once and forever. Would you try to mimick the median real threads behavior as they work today? Or would you try to build something consistent, cross-platform, that makes sense, that would even make sense for new revisions of the real thread modules? I think here is a chance to do a reference implementation of (u)threads since there are absolutely no OS dictated restrictions or MS added doubtful features, we can just do it right. Given that there is a suitable definition of "right", of course. The problem is that I'm not a specialist on threading, therefore I'm asking for suggestions. 
Please, what do you all think would be "right", given that you have full control of ver your "virtual OS"? contructively-but-trying-not-to-overdo - ly y'rs - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/ From mwh@python.net Wed Jan 30 14:54:30 2002 From: mwh@python.net (Michael Hudson) Date: 30 Jan 2002 14:54:30 +0000 Subject: [Python-Dev] can someone with purify run test_curses Message-ID: <2m7kpzq47t.fsf@starship.python.net> Please? It segfaults for me, in a confusing way. It only segfaults if I run it under regrtest, not directly. Cheers, M. -- CLiki pages can be edited by anybody at any time. Imagine the most fearsomely comprehensive legal disclaimer you have ever seen, and double it -- http://ww.telent.net/cliki/index From aahz@rahul.net Wed Jan 30 14:56:04 2002 From: aahz@rahul.net (Aahz Maruch) Date: Wed, 30 Jan 2002 06:56:04 -0800 (PST) Subject: [Python-Dev] Thread questionlet In-Reply-To: <3C57D0D6.8010602@tismer.com> from "Christian Tismer" at Jan 30, 2002 11:54:14 AM Message-ID: <20020130145604.AF2F0E8C4@waltz.rahul.net> Christian Tismer wrote: > > I don't try to solve this, either. But since I'm writing some kind of > platform independant threads (isn't it funny? by using non-portable > tricks, I get some portable threads), I'd like to think about how > this world *could* look like. Maybe I have a chance to provide an > (u)thread implementation which is really what people would want for > real threads? No, you don't. Real threads have one killer advantage you just can't emulate: they can parallelize I/O operations (and theoretically parallelize computations on multiple CPUs). The advantage of microthreads has been that they're lightweight, so they're good for applications that require *lots* of threads, such as simulations. I think keeping this advantage would be a Good Idea. You might want to look at Ruby, though, because it does what you're wanting to do. (I think -- I haven't touched Ruby myself.) -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From neal@metaslash.com Wed Jan 30 15:36:23 2002 From: neal@metaslash.com (Neal Norwitz) Date: Wed, 30 Jan 2002 10:36:23 -0500 Subject: [Python-Dev] can someone with purify run test_curses References: <2m7kpzq47t.fsf@starship.python.net> Message-ID: <3C5812F7.86B78FBC@metaslash.com> Michael Hudson wrote: > > Please? It segfaults for me, in a confusing way. > > It only segfaults if I run it under regrtest, not directly. What version? Current CVS? Any other special instructions? When do you need it done? Neal From mwh@python.net Wed Jan 30 15:43:19 2002 From: mwh@python.net (Michael Hudson) Date: 30 Jan 2002 15:43:19 +0000 Subject: [Python-Dev] can someone with purify run test_curses In-Reply-To: Neal Norwitz's message of "Wed, 30 Jan 2002 10:36:23 -0500" References: <2m7kpzq47t.fsf@starship.python.net> <3C5812F7.86B78FBC@metaslash.com> Message-ID: <2mhep37sko.fsf@starship.python.net> Neal Norwitz writes: > Michael Hudson wrote: > > > > Please? It segfaults for me, in a confusing way. > > > > It only segfaults if I run it under regrtest, not directly. > > What version? Current CVS? 
Any other special instructions? > When do you need it done? I don't think I do, now (see checkins). But it might be worth running test_curses --with-pymalloc, if it's not too much hassle. I'll have a look to see if there are any other Object/Mem mismatches. Cheers, M. -- Our lecture theatre has just crashed. It will currently only silently display an unexplained line-drawing of a large dog accompanied by spookily flickering lights. -- Dan Sheppard, ucam.chat (from Owen Dunn's summary of the year) From nas@python.ca Wed Jan 30 16:28:07 2002 From: nas@python.ca (Neil Schemenauer) Date: Wed, 30 Jan 2002 08:28:07 -0800 Subject: [Python-Dev] Mixing memory management APIs In-Reply-To: ; from mwh@users.sourceforge.net on Wed, Jan 30, 2002 at 07:47:36AM -0800 References: Message-ID: <20020130082807.B9393@glacier.arctrix.com> Michael Hudson wrote: > Modified Files: > _curses_panel.c > Log Message: > Oh look, another one. > > 2.2.1 candiate (he says, largely talking to himself :) > *** 192,196 **** > Py_DECREF(po->wo); > remove_lop(po); > ! PyMem_DEL(po); > } > > --- 192,196 ---- > Py_DECREF(po->wo); > remove_lop(po); > ! PyObject_DEL(po); > } I think we have to break down and do what Tim suggests. Ie make: free == PyMem_DEL == PyObject_DEL == PyObject_FREE == ... pymalloc needs to use a completely new set of APIs. The only problem I see is coming up with names. NEW, MALLOC, REALLOC, RESIZE, and DEL are all taken. Any suggestions? Neil From mwh@python.net Wed Jan 30 17:01:17 2002 From: mwh@python.net (Michael Hudson) Date: 30 Jan 2002 17:01:17 +0000 Subject: [Python-Dev] Mixing memory management APIs In-Reply-To: Neil Schemenauer's message of "Wed, 30 Jan 2002 08:28:07 -0800" References: <20020130082807.B9393@glacier.arctrix.com> Message-ID: <2mvgdj6aea.fsf@starship.python.net> Neil Schemenauer writes: > Michael Hudson wrote: > > Modified Files: > > _curses_panel.c > > Log Message: > > Oh look, another one. > > > > 2.2.1 candiate (he says, largely talking to himself :) > > > *** 192,196 **** > > Py_DECREF(po->wo); > > remove_lop(po); > > ! PyMem_DEL(po); > > } > > > > --- 192,196 ---- > > Py_DECREF(po->wo); > > remove_lop(po); > > ! PyObject_DEL(po); > > } > > I think we have to break down and do what Tim suggests. Ie make: > > free == PyMem_DEL == PyObject_DEL == PyObject_FREE == ... > > pymalloc needs to use a completely new set of APIs. The only problem I > see is coming up with names. NEW, MALLOC, REALLOC, RESIZE, and DEL are > all taken. Any suggestions? And then change all the current uses of PyObject_Del to the new API? What would that buy us? Unless I misunderstand we *have* to do something different to remove an object as opposed to freeing raw storage (GC, for example). I agree we have too many preprocessor macros, but I don't think we can have free == PyObject_DEL. I don't what we have is so bad; a helpful tip is that if you're using the _Free/_FREE/_Malloc/_REALLOC/etc interfaces, stop. That gets rid of half the problem. Cheers, M. -- We've had a lot of problems going from glibc 2.0 to glibc 2.1. People claim binary compatibility. Except for functions they don't like. 
-- Peter Van Eynde, comp.lang.lisp From mwh@python.net Wed Jan 30 17:03:56 2002 From: mwh@python.net (Michael Hudson) Date: 30 Jan 2002 17:03:56 +0000 Subject: [Python-Dev] Mixing memory management APIs In-Reply-To: Michael Hudson's message of "30 Jan 2002 17:01:17 +0000" References: <20020130082807.B9393@glacier.arctrix.com> <2mvgdj6aea.fsf@starship.python.net> Message-ID: <2m8zafkbyb.fsf@starship.python.net> Michael Hudson writes: > Neil Schemenauer writes: > > > I think we have to break down and do what Tim suggests. Ie make: > > > > free == PyMem_DEL == PyObject_DEL == PyObject_FREE == ... > > > > pymalloc needs to use a completely new set of APIs. The only problem I > > see is coming up with names. NEW, MALLOC, REALLOC, RESIZE, and DEL are > > all taken. Any suggestions? > > And then change all the current uses of PyObject_Del to the new API? > What would that buy us? Unless I misunderstand we *have* to do > something different to remove an object as opposed to freeing raw > storage (GC, for example). > > I agree we have too many preprocessor macros, but I don't think we can > have free == PyObject_DEL. No, I take that back... -- Just point your web browser at http://www.python.org/search/ and look for "program", "doesn't", "work", or "my". Whenever you find someone else whose program didn't work, don't do what they did. Repeat as needed. -- Tim Peters, on python-help, 16 Jun 1998 From aahz@rahul.net Wed Jan 30 17:13:48 2002 From: aahz@rahul.net (Aahz Maruch) Date: Wed, 30 Jan 2002 09:13:48 -0800 (PST) Subject: [Python-Dev] Mixing memory management APIs In-Reply-To: <20020130082807.B9393@glacier.arctrix.com> from "Neil Schemenauer" at Jan 30, 2002 08:28:07 AM Message-ID: <20020130171348.E8D78E8C4@waltz.rahul.net> Neil Schemenauer wrote: > > pymalloc needs to use a completely new set of APIs. The only problem I > see is coming up with names. NEW, MALLOC, REALLOC, RESIZE, and DEL are > all taken. Any suggestions? >From the Department of Redundancy Department, how about: PyMalloc_New, PyMalloc_Malloc, .... -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From nas@python.ca Wed Jan 30 17:40:50 2002 From: nas@python.ca (Neil Schemenauer) Date: Wed, 30 Jan 2002 09:40:50 -0800 Subject: [Python-Dev] Mixing memory management APIs In-Reply-To: <2mvgdj6aea.fsf@starship.python.net>; from mwh@python.net on Wed, Jan 30, 2002 at 05:01:17PM +0000 References: <20020130082807.B9393@glacier.arctrix.com> <2mvgdj6aea.fsf@starship.python.net> Message-ID: <20020130094050.B9709@glacier.arctrix.com> Michael Hudson wrote: > I agree we have too many preprocessor macros, but I don't think we can > have free == PyObject_DEL. If we don't then many extension modules will break. The example module Modules/xxmodule.c used to allocate using PyObject_New and deallocate using free(). I believe there are many modules out there that do the same (or use PyMem_Del, etc). Neil From paul@pfdubois.com Wed Jan 30 17:39:45 2002 From: paul@pfdubois.com (Paul Dubois) Date: Wed, 30 Jan 2002 09:39:45 -0800 Subject: [Python-Dev] Odd errors when catching ImportError Message-ID: <000701c1a9b5$1bccb5e0$09860cc0@CLENHAM> Please excuse me if this is in the bug list; I looked through it but the list is too long for old people. I have been running into a number of odd errors caused by code like the following. The behavior seems to be machine dependent. 
fooflag = 0 try: import foo except ImportError: fooflag = 1 I have had this result in a seg fault upon exit, and also when something like this was in file xxx.py inside a package, and the __init__.py did from xxx import fooflag I've had it tell me xxx had no attribute fooflag. I added "print fooflag" at the bottom of the file and it fixed it. That was on a DEC. On Linux it worked. I suppose I should be testing for the ability to import foo some other way but I don't know what it is. From sdm7g@Virginia.EDU Wed Jan 30 18:00:48 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Wed, 30 Jan 2002 13:00:48 -0500 (EST) Subject: [Python-Dev] next vs darwin Message-ID: I recall having the discussion but I don't quite recall the resolution: Is Next support now officially dropped from the distribution ? I have a revised dynamic loading module that strips out all of the dead branches ( as well as better error reporting ): I was going to call it dynload_darwin.c and add support to configure, but grepping thru configure I only saw darwin as triggering dynload_next.c -- it *looks* like the Next has bee dropped. Should we rename the file anyway ? ( to make it easier for folks to know where to look. ) There has also been some discussion on the pythonmac-sig list about dynamic loading. There are some other problems that this module doesn't fix yet. If someone wants to subit a better one, that's fine by me, but we REALLY need to get the better error reporting in there so we can at least find the problem. The other thing that's been discussed is adding configure support to build with the dlopen compatability libs if that is available. ( doing config with --without-dyld doesn't seem to change anything. ) -- Steve From martin@v.loewis.de Wed Jan 30 19:04:59 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 30 Jan 2002 20:04:59 +0100 Subject: [Python-Dev] next vs darwin In-Reply-To: References: Message-ID: Steven Majewski writes: > I recall having the discussion but I don't quite recall the > resolution: Is Next support now officially dropped from the > distribution ? AFAIR, yes. > Should we rename the file anyway ? ( to make it easier for > folks to know where to look. ) Yes. > The other thing that's been discussed is adding configure > support to build with the dlopen compatability libs if > that is available. Can you please explain what that would provide to module users or end users? Would there be additional modules available that otherwise wouldn't be available? If not, I don't think that this option should be provided. Regards, Martin From mal@lemburg.com Wed Jan 30 19:15:33 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 30 Jan 2002 20:15:33 +0100 Subject: [Python-Dev] Mixing memory management APIs References: <20020130082807.B9393@glacier.arctrix.com> <2mvgdj6aea.fsf@starship.python.net> <20020130094050.B9709@glacier.arctrix.com> Message-ID: <3C584655.33BE8C9D@lemburg.com> Neil Schemenauer wrote: > > Michael Hudson wrote: > > I agree we have too many preprocessor macros, but I don't think we can > > have free == PyObject_DEL. > > If we don't then many extension modules will break. The example module > Modules/xxmodule.c used to allocate using PyObject_New and deallocate > using free(). I believe there are many modules out there that do the > same (or use PyMem_Del, etc). Breaking extensions is not a good idea. After all, these make Python so much fun to work with (since most of the work is usually already done ;-). 
I do think that we should keep the differentiation between allocating raw memory buffers and space for Python objects. Even though this is not currently used, it clarifies the code somewhat, e.g. to free memory allocated for a Python object you write PyObject_FREE(), for a raw buffer you write PyMem_FREE(). Who knows... perhaps we might want to handle Python object memory blocks differently in the future (e.g. build pymalloc support right into PyObject_NEW() and PyObject_DEL()) while leaving user space memory in the realm of malloc() et al. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From rschaefe@teksystems.com Wed Jan 30 20:41:25 2002 From: rschaefe@teksystems.com (Rebecca Schaefer) Date: Wed, 30 Jan 2002 15:41:25 -0500 Subject: [Python-Dev] Python/Web developer Message-ID: <3C585A75.5831727A@teksystems.com> TEKsystems in Appleton, WI has an opening for a web developer with Python, Zope, html, SQL, UNIX, and Perl experience. It is a long term contract opportunity. Any interested candidates should email Rebecca Schaefer at rschaefe@teksystems.com Thank you, Rebecca From sdm7g@Virginia.EDU Wed Jan 30 20:42:53 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Wed, 30 Jan 2002 15:42:53 -0500 (EST) Subject: [Python-Dev] next vs darwin In-Reply-To: Message-ID: On 30 Jan 2002, Martin v. Loewis wrote: > > The other thing that's been discussed is adding configure > > support to build with the dlopen compatability libs if > > that is available. > > Can you please explain what that would provide to module users or end > users? Would there be additional modules available that otherwise > wouldn't be available? If not, I don't think that this option should > be provided. dlcompat libs are used by Apple to build Apache and some other programs. The libs are not included in Mac OSX, although the sources are available in the Darwin CVS, and an improved version is distributed on Fink and maybe other places. Since additional libs are required, I would not make that the default. ( unless, since there's already a check for libdl in config, we make it dependent on that. ) The problem is that the current dynload_next is broken, and we've had some problems replicating tests and solutions because, among other problems, of the very poor error reporting in dynload_next, everyone is starting from a differently hacked version of the 2.2 distribution. (The other variable is which modules and packages people are loading.) Reportedly, using the dlcompat libs fixes some problems for some people. Obviously, the best solution would be an even better dynload_darwin that fixes all of the problems. But in the interim, I'd like to at least get everyone debugging from the same baseline.
If there's a string objection to adding optional libdl support, I can live without that. Adding it would just make it easier for folks to test that configuration and build. Getting a less broken dynload module is probably more important. -- Steve. From sdm7g@Virginia.EDU Wed Jan 30 23:23:57 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Wed, 30 Jan 2002 18:23:57 -0500 (EST) Subject: [Python-Dev] Apple PublicSource license [was: next vs darwin] In-Reply-To: Message-ID: I did another version of dynload_darwin that took a >10 line function from the dlcompat/dlopen.c code which uses an undocumented (at least in the man pages -- there's probably comments in the Darwin source code) non-public way around the public/private namespace problem we were having with the previous version. I'm waiting for some folks on pythonmac-sig to test it and report back. I'm guessing that this solves the problem without requiring libdl. However it gets into the possible problem of including another license. Could someone who undestands these issues a bit more than I, look at this: Apple Public Source License: http://www.publicsource.apple.com/apsl/ -- Steve BTW: Here's the magic code I added from dlcompat/dlopen.c: ( On the one hand, it's fairly short and trivial. On the other, I wouldn't have had a clue about this without reading the code! ) /* * NSMakePrivateModulePublic() is not part of the public dyld API so we define * it here. The internal dyld function pointer for * __dyld_NSMakePrivateModulePublic is returned so thats all that maters to get * the functionality need to implement the dlopen() interfaces. */ static enum bool NSMakePrivateModulePublic( NSModule module) { static enum bool (*p)(NSModule module) = NULL; if(p == NULL) _dyld_func_lookup("__dyld_NSMakePrivateModulePublic", (unsigned long *)&p); if(p == NULL){ #ifdef DEBUG printf("_dyld_func_lookup of __dyld_NSMakePrivateModulePublic " "failed\n"); #endif return(FALSE); } return(p(module)); } From tim.one@home.com Wed Jan 30 23:49:19 2002 From: tim.one@home.com (Tim Peters) Date: Wed, 30 Jan 2002 18:49:19 -0500 Subject: [Python-Dev] Mixing memory management APIs In-Reply-To: <2mvgdj6aea.fsf@starship.python.net> Message-ID: [NeilS, growing older but wiser, embraces the wisdom of giving up ] [Michael Hudson] > And then change all the current uses of PyObject_Del to the new API? It would mean changing all uses of all memory macros in the core to use new macros. > What would that buy us? The possibility to move to Vladimir's malloc implementation without breaking any extension modules (none: no breakage at all). I want the Python core to use Vladimir's malloc/free, and until the fabled free-threading gets implemented, to use a version that *exploits* the GIL to eliminate malloc/free lock overhead too. We know for a fact that some major extension modules misused the existing memory API (via mismatching "get memory"->"free memory" pairs), and it's so Byzantine and ill-documented that this shouldn't be a surprise. Beyond that, I don't believe we've ever said anything about thread safety wrt the existing memory API, simply because we relied on the platform malloc/free to provide thread safety even in the worst of cases. But if the Python core switches to a gimmick that relies on the GIL, then even extensions that use the current API properly (wrt correct matching pairs) may get into huge trouble if the underlying allocator stops doing its own layer of locking. The intent with new macros would be to spell out the rules. 
Extensions that wanted to play along could switch, while extensions that ignored the issues would continue to work with the existing seven ways to spell "malloc" (and seven to spell "free"). > ... > I don't what we have is so bad; a helpful tip is that if you're using > the _Free/_FREE/_Malloc/_REALLOC/etc interfaces, stop. That gets rid > of half the problem. But only for extensions that are actively maintained by people who are keen to dig into how they've abused the current API. It's likely easier for the core to move to new macros than to fully debug even one large extension module that's been working so far by luck. a-big-hammer-wouldn't-be-called-that-if-it-weren't-big-ly y'rs - tim From tim.one@home.com Wed Jan 30 23:51:56 2002 From: tim.one@home.com (Tim Peters) Date: Wed, 30 Jan 2002 18:51:56 -0500 Subject: [Python-Dev] Mixing memory management APIs In-Reply-To: <20020130082807.B9393@glacier.arctrix.com> Message-ID: [NeilS] > pymalloc needs to use a completely new set of APIs. The only problem I > see is coming up with names. NEW, MALLOC, REALLOC, RESIZE, and DEL are > all taken. Any suggestions? I liked Aahz's suggestion to start them with "PyMalloc_" well enough. Most of us use editors with word-completion anyway . From neal@metaslash.com Thu Jan 31 00:13:58 2002 From: neal@metaslash.com (Neal Norwitz) Date: Wed, 30 Jan 2002 19:13:58 -0500 Subject: [Python-Dev] Mixing memory management APIs References: Message-ID: <3C588C46.2BF27BBE@metaslash.com> Because of Michael Hudson's request, I tried running Purify --with-pymalloc enabled. The results were a bit surprising: 13664 errors! All the errors were in unicodeobject.c. There were 3 types of errors: Free Memory Reads, Array Bounds Reads, and Unitialized Memory Reads. The line #s were in strange places (e.g., in a function declaration and accessing self->length in an if clause, after it was accessed w/o error). The line #s are primarily: unicodeobject.c:2875, and unicodeobject.c:2214. Has anyone run else used Purify and/or Insure --with-pymalloc? BTW, I test_curses fails: test test_curses crashed -- _curses.error: curs_set() returned ERR Solaris 2.8, Purify 2002. Neal -- Problems (error lines begin with =>) PyUnicode_TranslateCharmap [unicodeobject.c:2214] PyObject *PyUnicode_EncodeASCII(const Py_UNICODE *p, int size, => const char *errors) { PyObject *repr; char *s, *start; split_char [unicodeobject.c:2875] if (end > self->length) end = self->length; if (end < 0) => end += self->length; if (end < 0) end = 0; From tim.one@home.com Thu Jan 31 00:33:48 2002 From: tim.one@home.com (Tim Peters) Date: Wed, 30 Jan 2002 19:33:48 -0500 Subject: [Python-Dev] Odd errors when catching ImportError In-Reply-To: <000701c1a9b5$1bccb5e0$09860cc0@CLENHAM> Message-ID: [Paul Dubois] > ... > I have been running into a number of odd errors caused by code like the > following. The behavior seems to be machine dependent. Which version(s) of Python? (Released, current CVS, all, ...?) > fooflag = 0 > try: > import foo > except ImportError: > fooflag = 1 > > I have had this result in a seg fault upon exit, Does or does not "foo" exist? Or does it segfault both ways? Either way, run Python -vv to get a trace of what it's trying during the import attempt. The last line displayed before the segfault may be a useful clue. You may even discover you're really importing a compiled foo extension module with a hardcoded segfault in module init . 
> and also when something like this was in file xxx.py inside a package, > and the __init__.py did > > from xxx import fooflag > > I've had it tell me xxx had no attribute fooflag. I added "print > fooflag" at the bottom of the file and it fixed it. That was on a DEC. > On Linux it worked. > > I suppose I should be testing for the ability to import foo some other > way but I don't know what it is. That's "the usual" way to check imports; if it were a widespread problem under any version of Python, I expect we would have heard about it before. If you have useful followups, you should record them in a bug report on SourceForge (Python-Dev is a black hole for bug reports). From andymac@bullseye.apana.org.au Wed Jan 30 21:15:11 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Thu, 31 Jan 2002 08:15:11 +1100 (EDT) Subject: [Python-Dev] updated patches for OS/2 EMX port In-Reply-To: Message-ID: I've let this lie for a few days to see whether any other comments were forthcoming, but nothing's turned up... On 27 Jan 2002, Martin v. Loewis wrote: > Andrew MacIntyre writes: > > > - Modules/unicodedata.c is affected by a name clash between the internally > > defined _getname() and an EMX routine of the same name defined in > > . The patch renames the internal routine to _getucname() to > > avoid this, but this change may not be acceptable - advice please. > > My advice for renaming things because of name clashes: Always rename > in a way that solves this particular problem for good, by using the Py > prefix (or _Py to further indicate that this is not public API; it's a > static function, anyway). Somebody may have a function _getucname > somewhere, whereas it is really unlikely that people add a Py prefix > to their functions (if they have been following the last 30 years of C > programming). Fair enough. I was trying to minimise stylistic differences in the fix, but if using _Py_getname is the canonical solution, that's easy fixed. > > - Objects/stringobject.c and Objects/unicodeobject.c contain changes to > > handle the EMX runtime library returning "0x" as the prefix for output > > formatted with a "%X" format. > > I'd suggest a different approach here, which does not use #ifdefs: > Instead of testing for the system, test for the bug. Then, if the bug > goes away, or appears on other systems as well, the code will be good. I did it the way I did because there's already code dealing with other brokeness in this area which doesn't solve the EMX issue, and the #ifdef solution minimises the risk of EMX fixes breaking something else which I can't test. At this stage I can't see this bug being fixed in EMX :-( > Once formatting is complete, see whether it put in the right letter, > and fix that in the result buffer if the native sprintf got it wrong. > > If you follow this strategy, you should still add a comment indicating > that this was added for OS/2, to give people an idea where that came > from. Definitely a more general approach, which I'll look at in detail. > Another approach would be to autoconfiscate this particular issue. I'm > in general in favour of autoconf'ed bug tests instead of runtime bug > tests, but people on systems without /bin/sh might feel differently. While there are sh/bash shells for OS/2, they're not standard equipment. Autoconf also has a very spotty record on OS/2, although there are people trying to improve that. 
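A Python-level sketch of the "test for the bug, then repair the output" approach Martin suggests for the EMX '%#X' prefix problem. The helper name below is made up for illustration; the real fix belongs in formatint() in Objects/stringobject.c, but the shape of the check is the same.

def upper_hex(value):
    # Let the platform's formatting machinery do the work first.
    s = '%#X' % value
    # Some C runtimes (EMX is the reported case) emit a lowercase "0x"
    # prefix even for %X; detect that and repair it, rather than
    # testing for the platform by name.
    if s[:2] == '0x':
        s = '0X' + s[2:]
    return s

print(upper_hex(255))   # expect 0XFF everywhere, even where sprintf misbehaves

The same detect-and-fix step works in C immediately after the sprintf() call, which is why it needs neither an #ifdef nor an autoconf test.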
> > If there are no unresolvable objections, and approval to apply these > > patches is granted, I propose that the patches be applied as follows:- > > > > Stage 1: the build patch (creates+populates PC/os2emx/) > > Stage 2: the Lib/plat-os2emx/ patch > > Stage 3: the Lib/ and Lib/test/ patches > > Stage 4: the distutils patch > > Stage 5: the Include/, Objects/ and Python/ patches > > Stage 6: the Modules/ patch > > > > I would expect to allow at least 48 hours between stages. > > > > Comments/advice on this proposal also appreciated. > > Sounds good to me (although I'd probably process the "uncritical", > i.e. truly platform-specific parts much more quickly). Who's going to > work with Andrew to integrate this stuff? The last I heard, Guido expected I was going to commit my own patches (after review), so I was allowing time for my initial attempts to commit to be checked by regular builders/testers of the tree before getting to changes that affect non-EMX specific parts of Python. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From tim.one@home.com Thu Jan 31 04:41:13 2002 From: tim.one@home.com (Tim Peters) Date: Wed, 30 Jan 2002 23:41:13 -0500 Subject: [Python-Dev] Thread questionlet In-Reply-To: <3C57D0D6.8010602@tismer.com> Message-ID: If you ask Guido, the only reason to use threads is to do overlapped I/O. And if you come up with a good counter-example, he'll find a way to *call* it overlapped I/O, so there's no opposing him on this . That's clearly a huge reason in practice to use threads, and that reason requires using platform threads (to get true overlap). Another huge reason is to play nice with threaded libraries written in other languages, and again that requires playing along with platform threads. So what most thread users want is what Python gives them: a thin wrapper around native threads, complete with platform quirks. The threading.py module adds *some* sanity to that, providing portable APIs for some important synch primitives, and uniform thread shutdown semantics (as Guido pointed out, when you use threading.py's thread wrappers, when the main thread exits it waits for all (non-daemon) threads to quit). What people seem to ask for most often now is a way for one thread to tell another thread to stop. In practice I've always done this by having each thread poll a "time to stop?" variable from time to time. That's not what people want, though. They want a way to *force* another thread to stop, even if (e.g.) the target thread is stuck in a blocking read, or in the middle of doing an extraordinarily expensive regexp search. There simply isn't a portable way to do that. Java initially spec'ed a way to do it in its thread model, but declared that deprecated after obtaining experience with it: not only did it prove impossible to implement in all cases, but even when it worked, the thread that got killed had no way to leave global invariants in a sane state (e.g., the thread may have had any number of synch gimmicks-- like locks --in various states, and global invariants for synch gimmicks can't tolerate a participant vanishing without both extreme care and a way for a thread to declare itself unstoppable at times). So that's a mess, but that's still what people want. OTOH, they won't want it for long if they get it (just as Java ran screaming from it). I'm not sure the audience for cororoutine-style threads even overlaps. 
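A minimal sketch of the "poll a time-to-stop variable" pattern Tim describes above, using threading.Event; the names worker and stop_event are illustrative only. Note that the thread notices the request only at points where it chooses to look, which is exactly the limitation under discussion.

import threading, time

def worker(stop_event):
    while not stop_event.isSet():   # check the flag at a safe point
        time.sleep(0.05)            # stand-in for one small unit of real work

stop_event = threading.Event()
t = threading.Thread(target=worker, args=(stop_event,))
t.start()
time.sleep(0.2)
stop_event.set()                    # ask the worker to stop...
t.join()                            # ...and wait until it notices

Nothing here can interrupt a blocking read or a long regexp search, which is why the harder "force another thread to stop" request keeps coming back.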
You could try to marry both models, by running coroutine-style threads in a pool of OS threads. Then, e.g., provided you knew a potentially blocking I/O call when you saw one, you could farm it out to one of the real threads. If you can't do that, then I doubt the "real thread" crowd will have any interest in uthreads (or will, but treat them as an entirely distinct facility -- which, for their purposes, they would be). For purposes of computational parallelism (more my background than Guido's -- the idea that you might want to use a thread to avoid blocking on I/O was novel to me ), the global interpreter lock renders Python useless except for prototyping, so there's not much point digging into the hundreds of higher-level parallelism models that have been developed. IOW, uthreads are their own universe. I haven't used them, so don't know what would be useful. What do the current uthread users ask for? That's where I'd start. From mal@lemburg.com Thu Jan 31 09:45:24 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 31 Jan 2002 10:45:24 +0100 Subject: [Python-Dev] updated patches for OS/2 EMX port References: Message-ID: <3C591234.4F3B2FCE@lemburg.com> Andrew MacIntyre wrote: > > On 27 Jan 2002, Martin v. Loewis wrote: > > Andrew MacIntyre writes: > > > > > - Modules/unicodedata.c is affected by a name clash between the internally > > > defined _getname() and an EMX routine of the same name defined in > > > . The patch renames the internal routine to _getucname() to > > > avoid this, but this change may not be acceptable - advice please. > > > > My advice for renaming things because of name clashes: Always rename > > in a way that solves this particular problem for good, by using the Py > > prefix (or _Py to further indicate that this is not public API; it's a > > static function, anyway). Somebody may have a function _getucname > > somewhere, whereas it is really unlikely that people add a Py prefix > > to their functions (if they have been following the last 30 years of C > > programming). > > Fair enough. I was trying to minimise stylistic differences in the fix, > but if using _Py_getname is the canonical solution, that's easy fixed. +1 > > > - Objects/stringobject.c and Objects/unicodeobject.c contain changes to > > > handle the EMX runtime library returning "0x" as the prefix for output > > > formatted with a "%X" format. > > > > I'd suggest a different approach here, which does not use #ifdefs: > > Instead of testing for the system, test for the bug. Then, if the bug > > goes away, or appears on other systems as well, the code will be good. > > I did it the way I did because there's already code dealing with other > brokeness in this area which doesn't solve the EMX issue, and the #ifdef > solution minimises the risk of EMX fixes breaking something else which I > can't test. At this stage I can't see this bug being fixed in EMX :-( I'd go with Martin's suggestion here: there already is code in formatint() which tests for '%#X' adding '0x' or not. This code should be made to handle the special case by testing for it -- who knows: there may be other platforms where this doesn't work as expected either. BTW, could you point me to your patch for this ? 
Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mwh@python.net Thu Jan 31 10:32:29 2002 From: mwh@python.net (Michael Hudson) Date: 31 Jan 2002 10:32:29 +0000 Subject: [Python-Dev] test_curses In-Reply-To: Neal Norwitz's message of "Wed, 30 Jan 2002 19:13:58 -0500" References: <3C588C46.2BF27BBE@metaslash.com> Message-ID: <2mlmeex136.fsf_-_@starship.python.net> Neal Norwitz writes: > BTW, I test_curses fails: > test test_curses crashed -- _curses.error: curs_set() returned ERR Hmm, yes I get that too (I'd commented it out, because I thought it had something to do with the crashes, but it didn't AFAICT). It's very strange. I tried wrestling with gdb to find out how it was failing, but didn't get very far. It's hard to see how curs_set can fail. Maybe I need to build ncurses from source and link to that. Cheers, M. -- ARTHUR: Yes. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying "Beware of the Leopard". -- The Hitch-Hikers Guide to the Galaxy, Episode 1 From aahz@rahul.net Thu Jan 31 10:44:39 2002 From: aahz@rahul.net (Aahz Maruch) Date: Thu, 31 Jan 2002 02:44:39 -0800 (PST) Subject: [Python-Dev] Thread questionlet In-Reply-To: from "Tim Peters" at Jan 30, 2002 11:41:13 PM Message-ID: <20020131104439.13FA4E8D1@waltz.rahul.net> Tim Peters wrote: > > For purposes of computational parallelism (more my background than > Guido's -- the idea that you might want to use a thread to avoid > blocking on I/O was novel to me ), the global interpreter lock > renders Python useless except for prototyping, so there's not much > point digging into the hundreds of higher-level parallelism models > that have been developed. Well, maybe. I'm still hoping to prove you at least partly wrong one of these years. ;-) (The long-term plan for my BCD module is to turn it into a C extension that releases the GIL. If that's successful, I'll start working on ways to have Numeric release the GIL.) -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From mwh@python.net Thu Jan 31 11:18:30 2002 From: mwh@python.net (Michael Hudson) Date: 31 Jan 2002 11:18:30 +0000 Subject: [Python-Dev] test_curses In-Reply-To: Michael Hudson's message of "31 Jan 2002 10:32:29 +0000" References: <3C588C46.2BF27BBE@metaslash.com> <2mlmeex136.fsf_-_@starship.python.net> Message-ID: <2mpu3qlqex.fsf@starship.python.net> Michael Hudson writes: > Neal Norwitz writes: > > > BTW, I test_curses fails: > > test test_curses crashed -- _curses.error: curs_set() returned ERR > > Hmm, yes I get that too (I'd commented it out, because I thought it > had something to do with the crashes, but it didn't AFAICT). > > It's very strange. I tried wrestling with gdb to find out how it was > failing, but didn't get very far. > > It's hard to see how curs_set can fail. Maybe I need to build ncurses > from source and link to that. Heh, well I worked that one out. The terminfo database didn't contain entries for cursor visibility for the $TERM I had (xterm-color, for some forgotten reason). Setting it to xterm made test_curses.py pass for me. much-python-dev-noise-not-much-content-from-me-ly y'rs M. 
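For code that has to survive terminals like the xterm-color case above, a small defensive sketch (illustrative only, and it needs a real terminal to run): treat a failing curs_set() as a missing capability rather than a fatal error.

import curses

def curs_set_or_none(visibility):
    # curs_set() raises curses.error when the terminfo entry for $TERM
    # has no cursor-visibility capabilities.
    try:
        return curses.curs_set(visibility)
    except curses.error:
        return None

stdscr = curses.initscr()
try:
    curs_set_or_none(0)     # hide the cursor if the terminal can
    stdscr.addstr(0, 0, "press any key")
    stdscr.refresh()
    stdscr.getch()
    curs_set_or_none(1)
finally:
    curses.endwin()

The test could apply the same guard instead of letting the ERR propagate as a crash.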
-- That's why the smartest companies use Common Lisp, but lie about it so all their competitors think Lisp is slow and C++ is fast. (This rumor has, however, gotten a little out of hand. :) -- Erik Naggum, comp.lang.lisp From jeremy@alum.mit.edu Thu Jan 31 08:25:54 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 31 Jan 2002 03:25:54 -0500 Subject: [Python-Dev] opcode performance measurements Message-ID: <15448.65426.470757.274910@gondolin.digicool.com> I've made some simple measurements of how long opcodes take to execute and how long it takes to go around the mainloop, using the Pentim timestamp counter, which measures processor cycles. The results aren't particularly surprising, but they provide some empirical validation of what we've believed all along. I don't have time to go into all the gory details here, though I plan to at Spam 10 developers day next week. I put together a few Web pages that summarize the data I've collected on some simple benchmarks: http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/PerformanceMeasurements Comments and questions are welcome. I've got a little time to do more measurement and analysis before devday. Jeremy From skip@pobox.com Thu Jan 31 18:37:16 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 31 Jan 2002 12:37:16 -0600 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <15448.65426.470757.274910@gondolin.digicool.com> References: <15448.65426.470757.274910@gondolin.digicool.com> Message-ID: <15449.36572.854311.657565@beluga.mojam.com> Jeremy> I've made some simple measurements of how long opcodes take to Jeremy> execute and how long it takes to go around the mainloop ... Jeremy> Comments and questions are welcome. I've got a little time to Jeremy> do more measurement and analysis before devday. Interesting results. I've been working on my {TRACK,UNTRACK}_GLOBAL opcode implementations. I have an optimizer filter that sets up tracking for all LOAD_GLOBAL,{LOAD_ATTR}* combinations. It's still not quite working and will only be a proof of concept by devday if I do get it working, but I expect most of these expensive opcode combinations to collapse into a LOAD_FAST, with the addition of a TRACK_GLOBAL/UNTRACK_GLOBAL pair executed at function start and end, respectively. Skip From jepler@unpythonic.dhs.org Thu Jan 31 19:48:03 2002 From: jepler@unpythonic.dhs.org (Jeff Epler) Date: Thu, 31 Jan 2002 13:48:03 -0600 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <15449.36572.854311.657565@beluga.mojam.com> References: <15448.65426.470757.274910@gondolin.digicool.com> <15449.36572.854311.657565@beluga.mojam.com> Message-ID: <20020131134802.A5269@unpythonic.dhs.org> On Thu, Jan 31, 2002 at 12:37:16PM -0600, Skip Montanaro wrote: > Interesting results. I've been working on my {TRACK,UNTRACK}_GLOBAL opcode > implementations. I have an optimizer filter that sets up tracking for all > LOAD_GLOBAL,{LOAD_ATTR}* combinations. It's still not quite working and > will only be a proof of concept by devday if I do get it working, but I > expect most of these expensive opcode combinations to collapse into a > LOAD_FAST, with the addition of a TRACK_GLOBAL/UNTRACK_GLOBAL pair executed > at function start and end, respectively. Won't there be code that this slows down? 
For instance, the code generated by print "f = lambda: 0" print "def g():" print "\tif f():" # prevent optimization of 'if 0:' print "\t\tx = []" for i in range(10000): print "\t\tx.append(global_%d)" % i print "\t\treturn x" print "\treturn []" (10001 TRACK_GLOBALs, one LOAD_GLOBAL) not to mention, will it even work? TRACK_GLOBAL will have to make special note of globals that didn't exist yet when the function prologue is executed, and either not subsequently execute the load as a LOAD_FAST or else have a special value that causes the same NameError "global name 'global_666' is not defined" message, not an UnboundLocalError... The latter sounds easy enough to solve, but how do you make sure that this optimization is never a pessimization (aside from sending programmers such as myself to the retraining camps of the PSU)? Jeff PS Hey, that's remarkable .. usually people get unexpectedly cut off when they try to mentio From jeremy@alum.mit.edu Thu Jan 31 10:14:28 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 31 Jan 2002 05:14:28 -0500 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <15449.36572.854311.657565@beluga.mojam.com> References: <15448.65426.470757.274910@gondolin.digicool.com> <15449.36572.854311.657565@beluga.mojam.com> Message-ID: <15449.6404.210447.679063@gondolin.digicool.com> >>>>> "SM" == Skip Montanaro writes: SM> Interesting results. I've been working on my SM> {TRACK,UNTRACK}_GLOBAL opcode implementations. I have an SM> optimizer filter that sets up tracking for all SM> LOAD_GLOBAL,{LOAD_ATTR}* combinations. It's still not quite SM> working and will only be a proof of concept by devday if I do SM> get it working, but I expect most of these expensive opcode SM> combinations to collapse into a LOAD_FAST, with the addition of SM> a TRACK_GLOBAL/UNTRACK_GLOBAL pair executed at function start SM> and end, respectively. I won't have any implementation done at all, but should have finished the design for LOAD_FAST-style access to globals and module attributes. I also have some ideas about Python bytecode specializer that would work essentially like a JIT but generated specialized bytecode instead of machine code. Jeremy PS Skip-- Sorry the PEP isn't clear, but the only dictionary lookups that need to occur are at function creation time. MAKE_FUNCTION would need to lookup the offsets of the globals used by the functions, so that a LOAD_FAST_GLOBAL opcode would take an int argument. From jepler@unpythonic.dhs.org Thu Jan 31 20:17:25 2002 From: jepler@unpythonic.dhs.org (Jeff Epler) Date: Thu, 31 Jan 2002 14:17:25 -0600 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <15449.6404.210447.679063@gondolin.digicool.com> References: <15448.65426.470757.274910@gondolin.digicool.com> <15449.36572.854311.657565@beluga.mojam.com> <15449.6404.210447.679063@gondolin.digicool.com> Message-ID: <20020131141725.B5269@unpythonic.dhs.org> On Thu, Jan 31, 2002 at 05:14:28AM -0500, Jeremy Hylton wrote: > PS Skip-- Sorry the PEP isn't clear, but the only dictionary lookups > that need to occur are at function creation time. MAKE_FUNCTION would > need to lookup the offsets of the globals used by the functions, so > that a LOAD_FAST_GLOBAL opcode would take an int argument. So how does this work for mutually-recursive functions? def f(x): if x==1: return 1 return g(x) def g(x): return x * f(x-1) can f not optimize the load of the global g into a LOAD_FAST_GLOBAL? Jeff PS Which PEP? 
I only see 266 From skip@pobox.com Thu Jan 31 20:27:59 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 31 Jan 2002 14:27:59 -0600 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <20020131134802.A5269@unpythonic.dhs.org> References: <15448.65426.470757.274910@gondolin.digicool.com> <15449.36572.854311.657565@beluga.mojam.com> <20020131134802.A5269@unpythonic.dhs.org> Message-ID: <15449.43215.961559.895079@beluga.mojam.com> Jeff> Won't there be code that this slows down? For instance, the code Jeff> generated by ... Sure, there will be code that slows down. That's why I said what I am working on is a proof of concept. Right now, each function the optimizer operates on converts it to the equivalent of TRACK_GLOBAL x.foo TRACK_GLOBAL y.bar TRACK_GLOBAL z try: original function body using x.foo, y.bar and z finally: UNTRACK_GLOBAL z UNTRACK_GLOBAL y.bar UNTRACK_GLOBAL x.foo There are no checks for obvious potential problems at the moment. Such problems include (but are not limited to): * Only track globals that are accessed in loops. This would eliminate your corner case and should be easily handled (only work between SETUP_LOOP and its jump target). * Only track globals when there are <= 256 globals (half an oparg - the other half being an index into the fastlocals array). This would also cure your problem. * Only track globals that are valid at the start of function execution, or defer tracking setup until they are. This can generally be avoided by not tracking globals that are written during the function's execution, but other safeguards will probably be necessary to insure that it works properly. Jeff> ... how do you make sure that this optimization is never a Jeff> pessimization ... I expect in the majority of cases either my idea or Jeremy's will be a net win, especially after seeing his timing data. I'm willing to accept that in some situations the code will run slower. I'm confident they will be a small minority. Tim Peters can construct cases where dicts perform badly. Does that mean Python shouldn't have dicts? ;-) Skip From skip@pobox.com Thu Jan 31 20:29:47 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 31 Jan 2002 14:29:47 -0600 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <20020131141725.B5269@unpythonic.dhs.org> References: <15448.65426.470757.274910@gondolin.digicool.com> <15449.36572.854311.657565@beluga.mojam.com> <15449.6404.210447.679063@gondolin.digicool.com> <20020131141725.B5269@unpythonic.dhs.org> Message-ID: <15449.43323.97437.978245@beluga.mojam.com> Jeff> PS Which PEP? I only see 266 Mine is 266. Jeremy's is 267. Skip From skip@pobox.com Thu Jan 31 20:39:10 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 31 Jan 2002 14:39:10 -0600 Subject: [Python-Dev] What's up w/ CVS? Message-ID: <15449.43886.970376.594507@beluga.mojam.com> Anyone have any idea what's going on w/ SF CVS? I tried to "cvs up" earlier and it just hung. I just tried again now and got the dreaded "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!" from ssh. Skip From Samuele Pedroni" <15449.36572.854311.657565@beluga.mojam.com> <15449.6404.210447.679063@gondolin.digicool.com> Message-ID: <00c801c1aa96$2b521320$6d94fea9@newmexico> Hi. Q about PEP 267 Does the PEP mechanims adress only import a use a.x cases. How does it handle things like import a.b use a.b.x Thanks, Samuele Pedroni. 
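Neither PEP's machinery is shown here, but the hand optimization they aim to make unnecessary is easy to illustrate: hoist the global (or the dotted module attribute) into a local name once, so the hot loop does one LOAD_FAST instead of a LOAD_GLOBAL plus a LOAD_ATTR per dot.

import math

def norms_plain(vectors):
    # each iteration: LOAD_GLOBAL math, LOAD_ATTR sqrt
    return [math.sqrt(x * x + y * y) for x, y in vectors]

def norms_bound(vectors, sqrt=math.sqrt):
    # the attribute is looked up once, at function definition time
    return [sqrt(x * x + y * y) for x, y in vectors]

data = [(3.0, 4.0)] * 1000
assert norms_plain(data) == norms_bound(data)

dis.dis() on the two functions shows the difference directly; the TRACK_GLOBAL scheme of PEP 266 and the offset-based globals of PEP 267 are both about getting the second form's cost without the manual rebinding.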
From jeremy@alum.mit.edu Thu Jan 31 11:02:17 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 31 Jan 2002 06:02:17 -0500 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <20020131141725.B5269@unpythonic.dhs.org> References: <15448.65426.470757.274910@gondolin.digicool.com> <15449.36572.854311.657565@beluga.mojam.com> <15449.6404.210447.679063@gondolin.digicool.com> <20020131141725.B5269@unpythonic.dhs.org> Message-ID: <15449.9273.809378.72361@gondolin.digicool.com> >>>>> "JE" == Jeff Epler writes: JE> On Thu, Jan 31, 2002 at 05:14:28AM -0500, Jeremy Hylton wrote: >> PS Skip-- Sorry the PEP isn't clear, but the only dictionary >> lookups that need to occur are at function creation time. >> MAKE_FUNCTION would need to lookup the offsets of the globals >> used by the functions, so that a LOAD_FAST_GLOBAL opcode would >> take an int argument. JE> So how does this work for mutually-recursive functions? JE> def f(x): JE> if x==1: return 1 return g(x) JE> def g(x): return x * f(x-1) JE> can f not optimize the load of the global g into a JE> LOAD_FAST_GLOBAL? JE> PS Which PEP? I only see 266 PEP 267. (They gesture at each other.) So you've got a module with two globals f() and g(). They're stored in slots 0 and 1 of the module globals array. When f() and g() are compiled, the symbol table for the module can note the location of f() and g() and that f() and g() contain references to globals. Instead of emitting LOAD_GLOBAL "f" in g(), you can emit LOAD_GLOBAL 0 ("f"). The complication here is that a code object isn't tied to a single module. It would be possible to to exec f.func_code in some other environment where "g" was not stored in the module global array. The dictionary lookups may occur in MAKE_FUNCTION in order to verify that the code object and the module object agree on the layout of the globals array. Jeremy From tim.one@home.com Thu Jan 31 20:52:47 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 31 Jan 2002 15:52:47 -0500 Subject: [Python-Dev] What's up w/ CVS? In-Reply-To: <15449.43886.970376.594507@beluga.mojam.com> Message-ID: [Skip] > Anyone have any idea what's going on w/ SF CVS? I tried to "cvs > up" earlier and it just hung. I just tried again now and got the > dreaded "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! IT IS > POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!" from ssh. I get this too. SF is also showing other signs of flakiness today. Take a nap . From barry@zope.com Thu Jan 31 20:59:02 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 31 Jan 2002 15:59:02 -0500 Subject: [Python-Dev] What's up w/ CVS? References: <15449.43886.970376.594507@beluga.mojam.com> Message-ID: <15449.45078.573517.218139@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> Anyone have any idea what's going on w/ SF CVS? I tried to SM> "cvs up" earlier and it just hung. I just tried again now and SM> got the dreaded "WARNING: REMOTE HOST IDENTIFICATION HAS SM> CHANGED! IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING SM> NASTY!" from ssh. Same here. No word from the SF web pages. 
napping-ly y'rs, -Barry From jeremy@alum.mit.edu Thu Jan 31 11:12:26 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 31 Jan 2002 06:12:26 -0500 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <00c801c1aa96$2b521320$6d94fea9@newmexico> References: <15448.65426.470757.274910@gondolin.digicool.com> <15449.36572.854311.657565@beluga.mojam.com> <15449.6404.210447.679063@gondolin.digicool.com> <00c801c1aa96$2b521320$6d94fea9@newmexico> Message-ID: <15449.9882.393698.701265@gondolin.digicool.com> >>>>> "SP" == Samuele Pedroni writes: SP> Hi. Q about PEP 267 Does the PEP mechanims adress only SP> import a SP> use a.x SP> cases. How does it handle things like SP> import a.b SP> use a.b.x You're a smart guy, can you tell me? :-). Seriously, I haven't gotten that far. import mod.sub creates a binding for "mod" in the global namespace The compiler can detect that the import statement is a package import -- and mark "mod.sub" as a candidate for optimization. A use of "mod.sub.attr" in function should be treated just as "mod.attr". The globals array (dict-list hybrid, technically) has the publicly visible binding for "mod" but also has an internal binding for "mod.sub" and "mod.sub.attr". Every module or submodule attribute in a function gets an internal slot in the globals. The internal slot gets initialized the first time it is used and then shared by all the functions in the module. So I think this case isn't special enough to need a special case. Jeremy From skip@pobox.com Thu Jan 31 21:03:51 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 31 Jan 2002 15:03:51 -0600 Subject: [Python-Dev] distutils & stderr Message-ID: <15449.45367.46625.175691@beluga.mojam.com> If I could "cvs up" I would submit a patch, but in the meantime, is there any good reason that distutils shouldn't write its output to stderr? I'm using PyInline to execute a little bit of C code that returns some information about the system to the calling Python code. This code then sends some output to stdout. I've patched my local directory tree so that distutils sends its output to sys.stderr. Is there some overriding reason distutils messages should go to sys.stdout? BTW, Python + PyInline makes a hell of a lot easier to understand configure script... ;-) Skip From barry@zope.com Thu Jan 31 21:07:48 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 31 Jan 2002 16:07:48 -0500 Subject: [Python-Dev] distutils & stderr References: <15449.45367.46625.175691@beluga.mojam.com> Message-ID: <15449.45604.994144.291947@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> If I could "cvs up" I would submit a patch SF seems happy again. From tim.one@home.com Thu Jan 31 21:17:25 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 31 Jan 2002 16:17:25 -0500 Subject: [Python-Dev] distutils & stderr In-Reply-To: <15449.45367.46625.175691@beluga.mojam.com> Message-ID: [Skip Montanaro] > If I could "cvs up" I would submit a patch, but in the meantime, is there > any good reason that distutils shouldn't write its output to stderr? Win9X (command.com) users can't redirect stderr, and the DOS box there has a 50-line maximum output history. So stuff going to stderr is often lost forever. stdout can be redirected. I don't know whether disutils had that in mind, but it is "a reason" to leave it alone. > I'm using PyInline to execute a little bit of C code that returns some > information about the system to the calling Python code. This code then > sends some output to stdout. 
If there's a connection between this and disutils, it's not apparent from what you wrote. From skip@pobox.com Thu Jan 31 21:16:27 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 31 Jan 2002 15:16:27 -0600 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <15449.9882.393698.701265@gondolin.digicool.com> References: <15448.65426.470757.274910@gondolin.digicool.com> <15449.36572.854311.657565@beluga.mojam.com> <15449.6404.210447.679063@gondolin.digicool.com> <00c801c1aa96$2b521320$6d94fea9@newmexico> <15449.9882.393698.701265@gondolin.digicool.com> Message-ID: <15449.46123.751909.428436@beluga.mojam.com> SP> cases. How does it handle things like SP> import a.b SP> use a.b.x Jeremy> You're a smart guy, can you tell me? :-). Seriously, I haven't Jeremy> gotten that far. My stuff does handle this, as long as the first name is global. It just gobbles up all LOAD_GLOBALS and any immediately following LOAD_ATTRs. For instance, this trivial function: def f(): return distutils.core.setup compiles to: 0 LOAD_GLOBAL 0 (distutils) 3 LOAD_ATTR 1 (core) 6 LOAD_ATTR 2 (setup) 9 RETURN_VALUE 10 LOAD_CONST 0 (None) 13 RETURN_VALUE My TrackGlobalOptimizer class currently transforms this to 0 SETUP_FINALLY 11 (to 14) 3 TRACK_GLOBAL 3 (distutils.core.setup, distutils.core.setup) 6 POP_BLOCK 7 LOAD_CONST 0 (None) 10 LOAD_FAST 0 (distutils.core.setup) 13 RETURN_VALUE >> 14 UNTRACK_GLOBAL 3 (distutils.core.setup, distutils.core.setup) 17 END_FINALLY 18 LOAD_CONST 0 (None) 21 RETURN_VALUE which is obviously not an improvement because distutils.core.setup is only accessed once. As people make more use of packages, such multiple attribute loads might become more common. Skip From skip@pobox.com Thu Jan 31 21:33:37 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 31 Jan 2002 15:33:37 -0600 Subject: [Python-Dev] distutils & stderr In-Reply-To: References: <15449.45367.46625.175691@beluga.mojam.com> Message-ID: <15449.47153.267248.439654@beluga.mojam.com> Tim> Win9X (command.com) users can't redirect stderr, and the DOS box Tim> there has a 50-line maximum output history. So stuff going to Tim> stderr is often lost forever. stdout can be redirected. I don't Tim> know whether disutils had that in mind, but it is "a reason" to Tim> leave it alone. Perhaps it would be friendlier if all distutils messages were hidden in "if verbose:" statements (many already are). PyInline could then dial down the verbosity before calling distutils. >> I'm using PyInline to execute a little bit of C code that returns >> some information about the system to the calling Python code. This >> code then sends some output to stdout. Tim> If there's a connection between this and disutils, it's not Tim> apparent from what you wrote. Sorry about the missing link. PyInline uses distutils to compile the C code. How PyInline does its think doesn't really matter to me, so I'm not going to be interested in distutils' messages. Skip From tim.one@home.com Thu Jan 31 21:31:37 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 31 Jan 2002 16:31:37 -0500 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <15449.43215.961559.895079@beluga.mojam.com> Message-ID: [Skip Montanaro] > ... > Tim Peters can construct cases where dicts perform badly. Does that mean > Python shouldn't have dicts? ;-) I've thought about that hard over the years. The answer is no . 
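For the curious, one standard way to construct such a case (an illustration, not anything from the thread): give every key the same hash, and dict operations degrade toward a linear scan of the colliding entries.

class SameHash:
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return 42               # every key lands in the same probe sequence
    def __eq__(self, other):
        return self.n == other.n

d = {}
for i in range(1000):
    d[SameHash(i)] = i          # each insert re-compares against prior keys

Filling the dict this way is quadratic in the number of keys instead of roughly linear, which is the "dicts performing badly" case.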
From tim.one@home.com Thu Jan 31 21:54:00 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 31 Jan 2002 16:54:00 -0500 Subject: [Python-Dev] Thread questionlet In-Reply-To: <20020131104439.13FA4E8D1@waltz.rahul.net> Message-ID: [Tim] > For purposes of computational parallelism ... the global interpreter > lock> renders Python useless except for prototyping, so there's not much > point digging into the hundreds of higher-level parallelism models > that have been developed. [Aahz] > Well, maybe. I'm still hoping to prove you at least partly wrong one of > these years. ;-) WRT higher-level parallelism models, you already have in a small way, by your good championing of the Queue module. Queue-based approaches are a step above the morass of low-level home-grown locking protocols people routinely screw up; it's almost *hard* to screw up a Queue-based approach. The GIL issue is distinct, and it plainly stops computational parallelism from doing any good so long as we're talking about Python code. > (The long-term plan for my BCD module is to turn it into a C extension > that releases the GIL. Well, that's not Python code. It's unclear whether it will actually help: Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS aren't free, and a typical BCD calculation may be so cheap that it's a net loss to release and reacquire the GIL across one. Effective use of fine-grained parallelism usually requires something cheaper to build on, like very lightweight critical sections mediating otherwise free-running threads. > If that's successful, I'll start working on ways to have Numeric release > the GIL.) I expect that's more promising because matrix ops are much coarser-grained, but also much harder to do safely: BCD objects are immutable (IIRC), so a routine crunching one doesn't have to worry about another thread mutating it midstream if the GIL is released. A Numeric array probably does have to worry about that. From Jack.Jansen@oratrix.nl Thu Jan 31 21:55:20 2002 From: Jack.Jansen@oratrix.nl (Jack Jansen) Date: Thu, 31 Jan 2002 22:55:20 +0100 Subject: [Python-Dev] Fwd: [Pythonmac-SIG] sys.exit() functionality Message-ID: <38A3EBA8-1695-11D6-9DB9-003065517236@oratrix.nl> This discussion started on pythonmac-SIG, but someone suggested that it isn't really a MacPython-specific issue (even though the implementation will be different for MacPython from unix-Python). Any opinions? Begin forwarded message: > From: Martin Miller > Date: Wed Jan 30, 2002 08:14:13 PM Europe/Amsterdam > To: pythonmac-sig@python.org > Subject: Re: [Pythonmac-SIG] sys.exit() functionality > > On Wed, 30 Jan 2002 15:29:21 +0100, Jack Jansen wrote: >> >> On Tuesday, January 29, 2002, at 08:54 , Jon Bradley wrote: >> >>> hey all, >>> >>> In embedded Python - why does sys.exit() quit out of the application >>> that's >>> embedding the interpreter? Is there any way to trap or >>> disregard this? >>> >>> If a user creates an application with Python and runs it through the >>> embedded interpreter, calling quit or exit on the Python application >>> itself >>> is more than ok, but allowing it to force out of the parent >>> application >>> isn't. >> >> Sounds reasonable. How about a routine PyMac_SetExitFunc() that you >> could call to set your own exit function, (similar to >> PyMac_SetConsoleHandler())? MacPython would then do all it's normal >> cleanup, but at the very end call your routine in stead of exit(). 
> > With an approach like the above, wouldn't it be better to have a > platform-independent way of defining a custom exit function, > rather than > calling a Mac-only system function -- or is this whole thing only an > issue with MacPython embedding? > > Martin > > _______________________________________________ > Pythonmac-SIG maillist - Pythonmac-SIG@python.org > http://mail.python.org/mailman/listinfo/pythonmac-sig > -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From aahz@rahul.net Thu Jan 31 22:07:30 2002 From: aahz@rahul.net (Aahz Maruch) Date: Thu, 31 Jan 2002 14:07:30 -0800 (PST) Subject: [Python-Dev] opcode performance measurements In-Reply-To: <15448.65426.470757.274910@gondolin.digicool.com> from "Jeremy Hylton" at Jan 31, 2002 03:25:54 AM Message-ID: <20020131220731.5EFD1E8C6@waltz.rahul.net> Jeremy Hylton wrote: > > I've made some simple measurements of how long opcodes take to execute > and how long it takes to go around the mainloop, using the Pentim > timestamp counter, which measures processor cycles. > > The results aren't particularly surprising, but they provide some > empirical validation of what we've believed all along. I don't have > time to go into all the gory details here, though I plan to at > Spam 10 developers day next week. My suggestion WRT SET_LINENO is to encourage the use of python -O and PYTHONOPTIMIZE. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From jeremy@alum.mit.edu Thu Jan 31 12:34:28 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 31 Jan 2002 07:34:28 -0500 Subject: [Python-Dev] opcode performance measurements In-Reply-To: <20020131220731.5EFD1E8C6@waltz.rahul.net> References: <15448.65426.470757.274910@gondolin.digicool.com> <20020131220731.5EFD1E8C6@waltz.rahul.net> Message-ID: <15449.14804.850939.223697@gondolin.digicool.com> >>>>> "AM" == Aahz Maruch writes: AM> My suggestion WRT SET_LINENO is to encourage the use of python AM> -O and PYTHONOPTIMIZE. Vladimir submitted a patch long ago to dynamically recompile bytecode to add/remove SET_LINENO as needed. I find that approach much more appealing, because you don't have to pay the SET_LINENO penalty just because there's some chance you'd want to connect with a debugger. A long running server process is the prime use case; it benefits from -O but may need to be debugged. Jeremy From tim.one@home.com Thu Jan 31 22:30:22 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 31 Jan 2002 17:30:22 -0500 Subject: [Python-Dev] opcode performance measurements In-Reply-To: <20020131220731.5EFD1E8C6@waltz.rahul.net> Message-ID: [Aahz] > My suggestion WRT SET_LINENO is to encourage the use of python -O and > PYTHONOPTIMIZE. What SET_LINENO does isn't even used in normal Python operation anymore (line numbers in tracebacks are obtained via a different means, the co_lnotab member of PyCodeObjects). They're needed now only to call back to user-supplied tracing routines, and that's rarely needed. The Python debugger is the most visible example of a tool that uses the line tracing hook. There are others way to get that to work, but they require real thought and effort to implement. 
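A minimal sketch of the line tracing hook in question -- the only consumer SET_LINENO still serves. pdb is built on the same sys.settrace() interface; the function and variable names below are just for illustration.

import sys

def tracer(frame, event, arg):
    if event == 'line':
        print('line %d in %s' % (frame.f_lineno, frame.f_code.co_name))
    return tracer               # keep receiving events for this frame

def demo():
    x = 1
    y = x + 1
    return y

sys.settrace(tracer)
demo()
sys.settrace(None)

Under -O the SET_LINENO opcodes are omitted, which is why dynamically re-inserting them on demand (the patch mentioned just below) is attractive for long-running processes that only occasionally need a debugger attached.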
There's a patch on SourceForge (IIRC, from Vladimir) that may have worked at one time, but nobody has picked it up (I tried to for 2.2, but couldn't make time for it then; I don't expect to have time for it for 2.3 either, alas). From Jack.Jansen@oratrix.nl Thu Jan 31 22:35:54 2002 From: Jack.Jansen@oratrix.nl (Jack Jansen) Date: Thu, 31 Jan 2002 23:35:54 +0100 Subject: [Python-Dev] next vs darwin In-Reply-To: Message-ID: On Wednesday, January 30, 2002, at 09:42 PM, Steven Majewski wrote: > dlcompat libs are used by Apple to build Apache and some other > programs. > The libs are not included in Mac OSX, although the sources are > available > in the Darwin CVS, and an improved version is distributed on Fink and > maybe other places. Since additional libs are required, I would not > make that the default. ( unless, since there's already a check for > libdl in config, we make it dependent on that. ) > > The problem is that the current dynload_next is broken, and we've > had some problems replicating tests and solutions because, among other > problems, of the very poor error reporting in dynload_next, everyone > is starting from a differently hacked version of the 2.2 distribution. > (The other variable is which modules and packages people are loading.) > > Reportedly, using the dlcompat libs fixes some problems for > some people. I'm not too thrilled with dlcompat. First and foremost, it fixes some problems for some people but may introduce problems for others (if I understand correctly). And then there's the issue of it not being part of the base MacOSX distribution. I now have a dynload_next.c (that I'll check in tomorrow) that can behave in two ways based on a #define. With the define off it loads every extension module in a separate namespace, i.e. two independent modules can never break each other by supplying external symbols the other module expected to load from a completely different place. With the define on it loads all extension modules into the application namespace. Some people want this (despite the problems sketched above) because they have modules that refer to external symbols defined in modules that have been loaded earlier (and I assume there's magic that ensures their modules are loaded in the right order). While I think this is an accident waiting to happen [*] the latter behaviour is more-or-less the standard unix behaviour, so it should probably be supportable in some way. I prefer the new (OSX 10.1) preferred Apple way of linking plugins (which is also the common way to do so on all other non-unix platforms) where the plugin has to be linked against the application and dynamic libraries it is going to be plugged into, so none of this dynamic behaviour goes on. [*] I know of two cases where this already happened: both the curses library and the SGI gl library defined a function clear(), so you were hosed when you used both in the same Python script. And the SGI compression library contains a private version of libjpeg with no symbol renaming, so if you used the cl module and a module which linked against the normal libjpeg you were also hosed. 
-- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From jepler@unpythonic.dhs.org Thu Jan 31 22:54:52 2002 From: jepler@unpythonic.dhs.org (jepler@unpythonic.dhs.org) Date: Thu, 31 Jan 2002 16:54:52 -0600 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <15449.9273.809378.72361@gondolin.digicool.com> References: <15448.65426.470757.274910@gondolin.digicool.com> <15449.36572.854311.657565@beluga.mojam.com> <15449.6404.210447.679063@gondolin.digicool.com> <20020131141725.B5269@unpythonic.dhs.org> <15449.9273.809378.72361@gondolin.digicool.com> Message-ID: <20020131165451.A987@unpythonic.dhs.org> On Thu, Jan 31, 2002 at 06:02:17AM -0500, Jeremy Hylton wrote: > JE> can f not optimize the load of the global g into a > JE> LOAD_FAST_GLOBAL? > > So you've got a module with two globals f() and g(). They're stored > in slots 0 and 1 of the module globals array. When f() and g() are > compiled, the symbol table for the module can note the location of f() > and g() and that f() and g() contain references to globals. Instead > of emitting LOAD_GLOBAL "f" in g(), you can emit LOAD_GLOBAL 0 ("f"). But isn't what happens in this module something like LOAD_CONST MAKE_FUNCTION STORE_GLOBAL 0 (f) LOAD_CONST MAKE_FUNCTION STORE_GLOBAL 1 (g) so if you convert LOAD_GLOBAL into LOAD_FAST_GLOBAL when you MAKE_FUNCTION on code1, there is not yet a "g" in the dlict. Are you populating the "names" part of the dlict as an earlier "pass" of module compilation, then? So the optimization doesn't apply if I create the globals from within a function? (Of course, in that case it would work if I set the attributes to 'None' in the module scope, right?): def make_fg(): global f, g def f(x): pass def g(x): pass Jeff From barry@zope.com Thu Jan 31 22:52:43 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 31 Jan 2002 17:52:43 -0500 Subject: [Python-Dev] Attention Mailman list administrators Message-ID: <15449.51899.451313.87015@anthem.wooz.org> You will soon notice (if you haven't already) that your list admin passwords on mail.python.org are broken. This happened due to an upgrade of the version of Python running on that system. The old list passwords can't be recovered, so they have to be reset. List administrators can contact me to get this done. If you know the old password, send it to me and I'll reset the list to it. Otherwise, let me know and I'll generate a new password for you. Sorry for the inconvenience, -Barry From sdm7g@Virginia.EDU Thu Jan 31 22:55:22 2002 From: sdm7g@Virginia.EDU (Steven Majewski) Date: Thu, 31 Jan 2002 17:55:22 -0500 (EST) Subject: [Python-Dev] next vs darwin In-Reply-To: Message-ID: On Thu, 31 Jan 2002, Jack Jansen wrote: > I'm not too thrilled with dlcompat. First and foremost, it fixes > some problems for some people but may introduce problems for > others (if I understand correctly). And then there's the issue > of it not being part of the base MacOSX distribution. > > I now have a dynload_next.c (that I'll check in tomorrow) that > can behave in two ways based on a #define. > > With the define off it loads every extension module in a > separate namespace, i.e. two independent modules can never break > each other by supplying external symbols the other module > expected to load from a completely different place. > > With the define on it loads all extension modules into the > application namespace. [...] 
Did you see the version I posted a day or two ago: If I fixed up the #ifdef macros, you could compile that three ways (at least): Global public symbols, Private Symbols, or the dlcompat trick. ( But it uses that magic hook into the non-public API from dlcompat. ) My main requirement is the better error reporting. -- Steve From jeremy@alum.mit.edu Thu Jan 31 13:09:59 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 31 Jan 2002 08:09:59 -0500 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <20020131165451.A987@unpythonic.dhs.org> References: <15448.65426.470757.274910@gondolin.digicool.com> <15449.36572.854311.657565@beluga.mojam.com> <15449.6404.210447.679063@gondolin.digicool.com> <20020131141725.B5269@unpythonic.dhs.org> <15449.9273.809378.72361@gondolin.digicool.com> <20020131165451.A987@unpythonic.dhs.org> Message-ID: <15449.16935.134346.826927@gondolin.digicool.com> >>>>> "JE" == jepler writes: JE> But isn't what happens in this module something like JE> LOAD_CONST MAKE_FUNCTION STORE_GLOBAL 0 (f) JE> LOAD_CONST MAKE_FUNCTION STORE_GLOBAL 1 (g) JE> so if you convert LOAD_GLOBAL into LOAD_FAST_GLOBAL when you JE> MAKE_FUNCTION on code1, there is not yet a "g" in the dlict. JE> Are you populating the "names" part of the dlict as an earlier JE> "pass" of module compilation, then? Yes. The compiler can do a pretty good job of establishing all the globals in a module at compile time. When a module is loaded, the interpreter would allocate space for all the expected globals. JE> "pass" of module compilation, then? So the optimization doesn't JE> apply if I create the globals from within a function? It still applies. The function is compiled at the same time as the module, so the module symbol table can account for globals assigned to only in functions. The harder cases are much more dynamic -- exec using a module's globals, assignment to create new attributes on an imported module, etc. Example: import foo assert not hasattr(foo, 'bar') # just to illustrate the example foo.bar = 12 There's no way for the compiler to know that foo will have a bar attribute when it compiles foo. Jeremy From martin@v.loewis.de Thu Jan 31 23:00:34 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 01 Feb 2002 00:00:34 +0100 Subject: [Python-Dev] Fwd: [Pythonmac-SIG] sys.exit() functionality In-Reply-To: <38A3EBA8-1695-11D6-9DB9-003065517236@oratrix.nl> References: <38A3EBA8-1695-11D6-9DB9-003065517236@oratrix.nl> Message-ID: Jack Jansen writes: > This discussion started on pythonmac-SIG, but someone suggested that > it isn't really a MacPython-specific issue (even though the > implementation will be different for MacPython from unix-Python). > > Any opinions? I think allowing to replace Py_Exit is the right way to go. Make it a function pointer, initialized to _Py_Exit, and let the embedding context change its value (through a setter, or through direct assignment). Double-check that all callers of Py_Exit behave well when it actually does return (which currently is not the case), and don't forget to bump the API version. Regards, Martin From martin@v.loewis.de Thu Jan 31 23:09:55 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 01 Feb 2002 00:09:55 +0100 Subject: [Python-Dev] next vs darwin In-Reply-To: References: Message-ID: Jack Jansen writes: > With the define on it loads all extension modules into the application > namespace. 
Some people want this (despite the problems sketched above) > because they have modules that refer to external symbols defined in > modules that have been loaded earlier (and I assume there's magic that > ensures their modules are loaded in the right order). On Unix, this is a runtime option via sys.setdlopenflags (RTLD_GLOBAL turns on import into application namespace). Do you think you could emulate this API? > While I think this is an accident waiting to happen [*] the latter > behaviour is more-or-less the standard unix behaviour, so it should > probably be supportable in some way. It is not at all standard unix behaviour. Since Python 1.5.2, Python loads extensions with RTLD_LOCAL on systems that support it, so that each module has its own namespace. People often requested that this be changed, but we successfully managed to turn down all these requests. Eventually, somebody came up with sys.setdlopenflags; this was good enough for me. > I prefer the new (OSX 10.1) preferred Apple way of linking plugins > (which is also the common way to do so on all other non-unix > platforms) where the plugin has to be linked against the application > and dynamic libraries it is going to be plugged into, so none of > this dynamic behaviour goes on. I'm not sure linking with a libpython.so is desirable; I'm quite fond of the approach of letting the executable export symbols to the extensions. If that is possible on OS X, I'd encourage you to follow such a strategy (in unix gcc/ld, this is enabled through -Wl,--export-dynamic). > [*] I know of two cases where this already happened: both the curses > library and the SGI gl library defined a function clear(), so you > were hosed when you used both in the same Python script. On Unix, the original trigger might have been the problem with initsocket, which was also exported in an Oracle library, thus breaking Oracle (the Python symbol is now init_socket, but that does not change the principle). Regards, Martin From jeremy@alum.mit.edu Thu Jan 31 14:43:22 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 31 Jan 2002 09:43:22 -0500 Subject: [Python-Dev] distutils & stderr In-Reply-To: References: <15449.47153.267248.439654@beluga.mojam.com> Message-ID: <15449.22538.192214.765110@gondolin.digicool.com> >>>>> "TP" == Tim Peters writes: TP> [Skip] >> Sorry about the missing link. PyInline uses distutils to compile >> the C code. How PyInline does its thing doesn't really matter to >> me, so I'm not going to be interested in distutils' messages. TP> If distutils output isn't interesting to PyInline users, TP> shouldn't PyInline be changed to run setup.py with its TP> -q/--quiet option? I started a thread on similar issues on the distutils-sig mailing list a week or two ago. There's agreement that output is a problem. The code has no consistent way of generating messages or of interpreting the notion of verbose or quiet. I think the right solution is to have several levels of verbosity and have a single function or method to use for output. (Perhaps a print statement with appropriate >>.) This makes it easier to control the amount of information you get and where it gets printed to. Michael Hudson has signed up to implement it and whatever else we can pile on when he's not looking. Further discussion should probably go to the sig.
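[A minimal sketch of the kind of single output routine being described; the function name, level scheme and default stream are invented here, not what distutils actually does.]

    import sys

    VERBOSITY = 1       # 0 = quiet, 1 = normal, 2 = debug; set from -q/-v

    def announce(msg, level=1, stream=None):
        # Single choke point for output: a message is shown only when its
        # level fits under the configured verbosity, and the destination
        # stream can be redirected in one place.
        if level <= VERBOSITY:
            (stream or sys.stdout).write(msg + "\n")

    announce("building 'spam' extension")           # shown at normal verbosity
    announce("gcc -c spam.c -o spam.o", level=2)    # only shown when verbose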
Jeremy From andymac@bullseye.apana.org.au Thu Jan 31 21:13:52 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Fri, 1 Feb 2002 08:13:52 +1100 (EDT) Subject: [Python-Dev] updated patches for OS/2 EMX port In-Reply-To: <3C591234.4F3B2FCE@lemburg.com> Message-ID: On Thu, 31 Jan 2002, M.-A. Lemburg wrote: > > > > - Objects/stringobject.c and Objects/unicodeobject.c contain changes to > > > > handle the EMX runtime library returning "0x" as the prefix for output > > > > formatted with a "%X" format. > > > > > > I'd suggest a different approach here, which does not use #ifdefs: > > > Instead of testing for the system, test for the bug. Then, if the bug > > > goes away, or appears on other systems as well, the code will be good. > > > > I did it the way I did because there's already code dealing with other > > brokenness in this area which doesn't solve the EMX issue, and the #ifdef > > solution minimises the risk of EMX fixes breaking something else which I > > can't test. At this stage I can't see this bug being fixed in EMX :-( > > I'd go with Martin's suggestion here: there already is code in > formatint() which tests for '%#X' adding '0x' or not. This code > should be made to handle the special case by testing for it -- > who knows: there may be other platforms where this doesn't work > as expected either. There are sure to be other platforms that have this bogosity. I'll look into this some more. > BTW, could you point me to your patch for this ? The Objects patch in patch #450267, at http://sf.net/tracker/?func=detail&atid=305470&aid=450267&group_id=5470 -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From jeremy@alum.mit.edu Thu Jan 31 22:36:37 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 31 Jan 2002 17:36:37 -0500 Subject: [Python-Dev] distutils & stderr In-Reply-To: <20020201031143.GC9864@gerg.ca> References: <15449.47153.267248.439654@beluga.mojam.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> Message-ID: <15449.50933.89877.252954@gondolin.digicool.com> >>>>> "GW" == Greg Ward writes: GW> Oh wait: most of the low-level worker code in the Distutils GW> falls outside the main class hierarchy, so the verbose flag GW> isn't *quite* so readily available; it gets passed in to a heck GW> of a lot of functions. Crap. I wish it were so clean and simple, Greg. In a lot of places, the binary verbose flag that is stored in the main class hierarchy is compared to some var named "level". The result of that comparison is passed to functions, which ignore it and just use print. At least sometimes. Jeremy From aahz@rahul.net Wed Jan 30 23:51:30 2002 From: aahz@rahul.net (Aahz Maruch) Date: Wed, 30 Jan 2002 15:51:30 -0800 (PST) Subject: [Python-Dev] Python/Web developer In-Reply-To: <3C585A75.5831727A@teksystems.com> from "Rebecca Schaefer" at Jan 30, 2002 03:41:25 PM Message-ID: <20020130235130.765BCE8C4@waltz.rahul.net> Rebecca Schaefer wrote: > > TEKsystems in Appleton, WI has an opening for a web developer with > Python, Zope, html, SQL, UNIX, and Perl experience. It is a long-term > contract opportunity. Any interested candidates should email Rebecca > Schaefer at rschaefe@teksystems.com python-dev is the wrong place for job ads. Please use either python-list or send a message to jobs@python.org.
-- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From Samuele Pedroni References: <15449.36572.854311.657565@beluga.mojam.com> <15449.6404.210447.679063@gondolin.digicool.com> <00c801c1aa96$2b521320$6d94fea9@newmexico> <15449.9882.393698.701265@gondolin.digicool.com> Message-ID: <014001c1aa9a$94b8a960$6d94fea9@newmexico> From: Jeremy Hylton > >>>>> "SP" == Samuele Pedroni writes: > > SP> Hi. Q about PEP 267 Does the PEP mechanism address only > SP> import a > SP> use a.x > > SP> cases. How does it handle things like > SP> import a.b > SP> use a.b.x > > You're a smart guy, can you tell me? :-). Seriously, I haven't > gotten that far. > > import mod.sub > creates a binding for "mod" in the global namespace > > The compiler can detect that the import statement is a package import > -- and mark "mod.sub" as a candidate for optimization. A use of > "mod.sub.attr" in a function should be treated just as "mod.attr". > > The globals array (dict-list hybrid, technically) has the publicly > visible binding for "mod" but also has an internal binding for > "mod.sub" and "mod.sub.attr". Every module or submodule attribute in > a function gets an internal slot in the globals. The internal slot > gets initialized the first time it is used and then shared by all the > functions in the module. > > So I think this case isn't special enough to need a special case. > OK, I stated the wrong question. What happens if I do the following: import a.b def f(): print a.b.x a.g() print a.b.x f() Now a.g() changes a.b from a submodule to an object with an x attribute. Maybe this case does not make sense, but the point is that the PEP is quite vague about imported stuff. Samuele (more puzzled than smart).
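[A self-contained way to see the hazard Samuele describes, with types.ModuleType standing in for a real package on disk and all names invented: once a.g() rebinds a.b, anything that cached the object looked up for "a.b" at import time would silently go stale, while a plain attribute lookup keeps working.]

    import sys, types

    a = types.ModuleType("a")
    b = types.ModuleType("a.b")
    b.x = "from the submodule"
    a.b = b

    class NotAModule:
        x = "from the replacement object"

    def g():
        a.b = NotAModule()      # rebind package attribute a.b at run time
    a.g = g

    sys.modules["a"] = a        # mimic what "import a.b" would set up
    sys.modules["a.b"] = b

    def f():
        print(a.b.x)            # must see the current binding of a.b ...
        a.g()
        print(a.b.x)            # ... including the one created by a.g()

    f()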