From victor.stinner at haypocalc.com Sat Nov 1 01:08:09 2008
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sat, 1 Nov 2008 01:08:09 +0100
Subject: [Python-3000] close() on open(fd, closefd=False)
In-Reply-To:
References:
Message-ID: <200811010108.10006.victor.stinner@haypocalc.com>

> Right now close() doesn't do anything and you can still write
> or read after close(). This behavior is surprising to the user.
> I like to change close() to set the internal fd attribute
> to -1 (meaning close) but keep the fd open.

Let's take an example:
-------------------
passwd = open('/etc/passwd', 'rb')
readonly = open(passwd.fileno(), closefd=False)
print("readonly: {0!r}".format(readonly.readline()))
# close the readonly stream, but not passwd
readonly.close()
try:
    readonly.readline()
    print("ERROR: read() on a closed file!")
except Exception as err:
    # Expected behaviour
    pass
# passwd is not closed
print("passwd: {0!r}".format(passwd.readline()))
passwd.close()
-------------------

The current behaviour is to accept read/write on a closed file (sorry
benjamin, but it's not a feature: it's a bug :-)), and passwd.readline()
works.

I wrote a patch to implement your suggestion, crys, and it works as
expected: once the readonly stream is closed, reads on it fail, but
passwd.readline() still works. I will attach my patch to issue 4233.

Victor

From martin at v.loewis.de Sat Nov 1 13:44:08 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 01 Nov 2008 13:44:08 +0100
Subject: [Python-3000] close() on open(fd, closefd=False)
In-Reply-To:
References:
Message-ID: <490C4F18.5090307@v.loewis.de>

> The additional warnings aren't critical. But in retrospection I think
> that I made a small error during the design of the closefd feature.
> With a file descriptor number as first argument and closefd set to
> false, the file descriptor isn't closed when the file object is
> deallocated. It's also impossible to close the fd with close(). Right
> now close() doesn't do anything and you can still write or read after
> close(). This behavior is surprising to the user.

However, this is the documented behavior:

"the underlying file descriptor will be kept open when the file is closed."

and, for close

"Flush and close this stream."

It would help if the documentation of close would read

"Flush and close this stream. This method has no effect if the file is
already closed, or if closefd was False when the file object was created."

It might be useful if the closefd attribute could be reflected, so that
applications that know about it and really want to close the file
still can do that.

Also, is it really the case that close has no effect if closefd
was given? ISTM that it will still flush the file, for a
_BufferedIOMixin. This should systematically be clarified: it
should either always flush, or never (although "always flush"
seems more useful).

Also, why does the _BufferedIOMixin discard exceptions from flush?
Errors should never pass silently.

Also, why does the FileIO.close first invoke _FileIO.close, then
RawIOBase.close? IIUC, FileIO.close will close the file handle,
yet RawIOBase will attempt to flush afterwards.

> Maybe the warning could be dropped all along, too.

That sounds useful.

Regards,
Martin

From tjreedy at udel.edu Sun Nov 2 02:52:43 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 01 Nov 2008 21:52:43 -0400
Subject: [Python-3000] [ANN] Python 3 Symbol Glossary
Message-ID:

Over the years, people have complained about the difficulty of finding
the meaning of symbols used in Python syntax. So ... I wrote a

Python 3 Symbol Glossary
http://code.google.com/p/xploro/downloads/list

There are .txt and .odt versions. The first is usable, the second is
nicer if you have an OpenDocument Format viewer or editor such as
OpenOffice. There is no .html because conversion mangles the format by
deleting tabs.

From the Introduction:

The ASCII character set includes non-printable control characters
(designated below with a '^' or '\' prefix), letters and digits, and
other printable symbols. A few of the control characters and most of the
symbols are used in Python code as operators, delimiters, or other
syntactic units. Some symbols are used alone, some in multi-symbol
units, and some in both. There are separate entries for each syntactic
unit and for each different use of a unit. In total, there are nearly
100 entries for over 50 symbols and combinations.

Entries are in ASCII collating (sorting) order except that ?= entries
(where ? is a symbol) follow the one for ? (if there is one) and the
general 'op=' entry follows the one for =. The two lines after the
entry for '\r\' are entries for the invisible blank space ' '.

Most entries start with P, I, or S to indicate the syntactic unit's use
as a prefix, infix, or suffix. (These terms are here not limited to
operators.) If so, a template follows, with italicized words indicating
the type of code to be substituted in their place. Entries also have
additional explanations.

Some syntactic units are split into two subunits that enclose code.
Entries for these are the same except that two initials are used, PS or
IS, depending on whether the first subunit is a prefix or infix relative
to the entire syntactic construct.

If I missed anything or made any errors, let me know.

PSF people are free to make any use of this they wish.

Terry Jan Reedy

From lists at cheimes.de Mon Nov 3 00:35:20 2008
From: lists at cheimes.de (Christian Heimes)
Date: Mon, 03 Nov 2008 00:35:20 +0100
Subject: [Python-3000] close() on open(fd, closefd=False)
In-Reply-To: <490C4F18.5090307@v.loewis.de>
References: <490C4F18.5090307@v.loewis.de>
Message-ID: <490E3938.6070309@cheimes.de>

Martin v. Löwis wrote:
> However, this is the documented behavior:
>
> "the underlying file descriptor will be kept open when the file is closed."
>
> and, for close
>
> "Flush and close this stream."
>
> It would help if the documentation of close would read
>
> "Flush and close this stream. This method has no effect if the file is
> already closed, or if closefd was False when the file object was created."

I would use a slightly different wording

"Flush and close this stream. This method has no effect if the file is
already closed. Once the file is closed, any operation on the file (e.g.
reading or writing) will fail. Close doesn't close the internal file
descriptor if closefd was False when the file object was created."

In my opinion close() should render the file *object* disabled in all
cases. closefd should just leave the file *descriptor* open, not the
file *object*.

> It might be useful if closefd attribute could be reflected, so that
> applications that know about it and really want to close the file
> still can do that.

Agreed! My latest patch adds a read-only attribute closefd.

> Also, is it really the case that close has no effect if closefd
> was given? ISTM that it will still flush the file, for a
> _BufferedIOMixin. This should systematically be clarified: it
> should either always flush, or never (although "always flush"
> seems more useful).

I haven't looked at the code in the io module yet. Our discussion was
mostly about the _fileio C extension. To clarify the issue with an
example

>>> out = open(1, "w", closefd=False)
>>> out.write("example\n")
example
8
>>> out.close()
/usr/local/lib/python3.0/io.py:1461: RuntimeWarning: Trying to close unclosable fd!
  self.buffer.close()
>>> out.write("example\n")
example
8

I think the second out.write() call should raise an exception instead of
writing the string to fd 1.

> Also, why does the _BufferedIOMixin discard exceptions from flush?
> Errors should never pass silently.

Ack!

> Also, why does the FileIO.close first invoke _FileIO.close, then
> RawIOBase.close? IIUC, FileIO.close will close the file handle,
> yet RawIOBase will attempt to flush afterwards.

Apparently the io module hasn't been reviewed for its massive usage of
close. :/

Christian

From guido at python.org Tue Nov 4 03:08:35 2008
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Nov 2008 18:08:35 -0800
Subject: [Python-3000] close() on open(fd, closefd=False)
In-Reply-To: <490E3938.6070309@cheimes.de>
References: <490C4F18.5090307@v.loewis.de> <490E3938.6070309@cheimes.de>
Message-ID:

I see no problem in fixing this in 3.0 and 2.6.1. The current behavior
is a bug and I see no reason to expect that users would depend on it.
Sure, it might occasionally mask a bug elsewhere in their code -- but
given how esoteric this whole issue is I think it's fine to consider
this simply a bug and fix it.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From janssen at parc.com Tue Nov 4 03:25:36 2008
From: janssen at parc.com (Bill Janssen)
Date: Mon, 3 Nov 2008 18:25:36 PST
Subject: [Python-3000] close() on open(fd, closefd=False)
In-Reply-To:
References: <490C4F18.5090307@v.loewis.de> <490E3938.6070309@cheimes.de>
Message-ID: <71578.1225765536@parc.com>

Note that the whole httplib uses this behavior -- issue 1348.

Bill

From guido at python.org Tue Nov 4 03:49:16 2008
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Nov 2008 19:49:16 -0700
Subject: [Python-3000] close() on open(fd, closefd=False)
In-Reply-To: <71578.1225765536@parc.com>
References: <490C4F18.5090307@v.loewis.de> <490E3938.6070309@cheimes.de> <71578.1225765536@parc.com>
Message-ID:

Are you sure? I thought that was different -- httplib depends on the
reference count semantics of socket objects. The closefd behavior that
Christian is describing here is part of the (new in 2.6 and 3.0) io.py
module.

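For readers following the thread, the behaviour being asked for can be sketched as below. This is an illustration written against a current Python 3 io module, not the patch attached to issue 4233: close() on a wrapper opened with closefd=False is expected to disable the file *object* while leaving the *descriptor* open. The scratch file and all variable names are assumptions made for the example.

```python
import os
import tempfile

# Create a scratch file and keep the raw descriptor that mkstemp() returns.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"first line\nsecond line\n")
    os.lseek(fd, 0, os.SEEK_SET)

    # Wrap the descriptor without taking ownership of it.
    wrapper = open(fd, "rb", closefd=False)
    first = wrapper.readline()

    # Closing the wrapper should disable the file object ...
    wrapper.close()
    try:
        wrapper.readline()
        read_after_close = "allowed"
    except ValueError:
        read_after_close = "rejected"

    # ... while the descriptor itself stays usable: os.fstat() would
    # raise OSError if the descriptor had really been closed.
    os.fstat(fd)
    fd_still_open = True
finally:
    # The caller still owns the descriptor and must close it explicitly.
    os.close(fd)
    os.remove(path)

print(first, wrapper.closed, read_after_close, fd_still_open)
```

The point of the sketch is the ownership split: whoever created the descriptor remains responsible for os.close(), and the wrapper only stops accepting I/O.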
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From lists at cheimes.de Tue Nov 4 15:42:50 2008
From: lists at cheimes.de (Christian Heimes)
Date: Tue, 04 Nov 2008 15:42:50 +0100
Subject: [Python-3000] close() on open(fd, closefd=False)
In-Reply-To: <71578.1225765536@parc.com>
References: <490C4F18.5090307@v.loewis.de> <490E3938.6070309@cheimes.de> <71578.1225765536@parc.com>
Message-ID: <49105F6A.3070004@cheimes.de>

Bill Janssen wrote:
> Note that the whole httplib uses this behavior -- issue 1348.
>

Are you sure? httplib would raise lots of warnings saying something
about an unclosable fd.

Christian

From janssen at parc.com Tue Nov 4 19:13:23 2008
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Nov 2008 10:13:23 PST
Subject: [Python-3000] close() on open(fd, closefd=False)
In-Reply-To:
References: <490C4F18.5090307@v.loewis.de> <490E3938.6070309@cheimes.de> <71578.1225765536@parc.com>
Message-ID: <79439.1225822403@parc.com>

You are, of course, correct.

Bill

Guido van Rossum wrote:
> Are you sure? I thought that was different -- httplib depends on the
> reference count semantics of socket objects. The closefd behavior that
> Christian is describing here is part of the (new in 2.6 and 3.0) io.py
> module.

From szport at gmail.com Wed Nov 5 12:22:57 2008
From: szport at gmail.com (Zaur Shibzoukhov)
Date: Wed, 5 Nov 2008 14:22:57 +0300
Subject: [Python-3000] Python Object Notation (PyON)
In-Reply-To:
References: <9bfc700a0811050257p6e723c3fw86b1eb04c0436ec1@mail.gmail.com>
Message-ID:

2008/11/5 Arnaud Delobelle wrote:
> (or one could use given=dict(lst=lst, d=d))
>
> This would have two advantages:
>
> * eliminate the risk of keyword argument name collision
> * one could put all the 'given' objects in a dictionary and then
>   'pickle' expressions as needed using this method. Later pyon.loads
>   could be passed this dictionary so that the objects can be unpickled
>   correctly.
>
> I think this idea is good as it would make it possible to pickle some
> objects that contain unpicklable objects just by declaring them as
> 'given'.
>

I think it's reasonable. I will change the interface.

> Also, what happens with types? E.g.
>
>>>> pyon.dumps([int, float, str])
>
> I think it would be good if typenames were considered literals (like
> numbers and strings) so that the above returns '[int, float, str]'
> (and the same for user-defined types maybe).

Yes, pyon can dump types too.
One note: the default rule for name resolution uses
sys._getframe(1).f_globals and sys._getframe(1).f_locals. But you can
change the name resolver by writing your own.

For example:

>>> class C(object): pass
...
>>> pyon.loads("[int,bool,float,C]")
[, , , ]
>>> pyon.dumps([int,bool,float,C])
'[int,bool,float,C]'

Best regards,
Zaur

From info at orlans-amo.be Mon Nov 3 12:12:57 2008
From: info at orlans-amo.be (info at orlans-amo.be)
Date: Mon, 3 Nov 2008 03:12:57 -0800 (PST)
Subject: [Python-3000] bug in idle on rc1
Message-ID:

in run.py in Python_30\Lib\idlelib
the line:              sockthread.set_daemon(True)
has to be changed to:  sockthread.setDaemon(True)

the message was:

D:\Python_30\Lib\idlelib>python idle.py
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Python_30\lib\idlelib\run.py", line 76, in main
    sockthread.set_daemon(True)
AttributeError: 'Thread' object has no attribute 'set_daemon'

From victor.stinner at haypocalc.com Wed Nov 5 13:20:55 2008
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 5 Nov 2008 13:20:55 +0100
Subject: [Python-3000] bug in idle on rc1
In-Reply-To:
References:
Message-ID: <200811051320.55758.victor.stinner@haypocalc.com>

On Monday 03 November 2008 12:12:57 info at orlans-amo.be, you wrote:
> in run.py in Python_30\Lib\idlelib
> the line: sockthread.set_daemon(True)
> has to be changed to: sockthread.setDaemon(True)

It's already fixed in python trunk:
   http://svn.python.org/view?rev=66518&view=rev

That's why we are all waiting on barry for python 3.0rc2 :-)

Thanks for the report, but next time, please use the tracker:
   http://bugs.python.org/

--
Victor Stinner aka haypo
http://www.haypocalc.com/blog/

From barry at python.org Wed Nov 5 15:50:37 2008
From: barry at python.org (Barry Warsaw)
Date: Wed, 5 Nov 2008 09:50:37 -0500
Subject: [Python-3000] bug in idle on rc1
In-Reply-To: <200811051320.55758.victor.stinner@haypocalc.com>
References: <200811051320.55758.victor.stinner@haypocalc.com>
Message-ID: <80127E23-7C6E-4600-AC6E-EE31A156BEAB@python.org>

On Nov 5, 2008, at 7:20 AM, Victor Stinner wrote:

> That's why we are all waiting on barry for python 3.0rc2 :-)

T minus 8h10m and counting...

-B

From barry at python.org Wed Nov 5 21:02:17 2008
From: barry at python.org (Barry Warsaw)
Date: Wed, 5 Nov 2008 15:02:17 -0500
Subject: [Python-3000] BDFL pronouncement needed on issue 4211
Message-ID: <1F56DCFB-F488-4AB6-8AC0-F869A437DEE8@python.org>

Guido,

Can you please make a BDFL pronouncement on issue 4211, specifically the
backward compatibility and API break for __path__ this late in the game:

http://bugs.python.org/issue4211

If you can decide in the next 3 hours we can get the patch into 3.0rc2.
Christian's reviewed the patch and thinks it looks good, if the API
change is approved.

-Barry

From barry at python.org Wed Nov 5 21:38:23 2008
From: barry at python.org (Barry Warsaw)
Date: Wed, 5 Nov 2008 15:38:23 -0500
Subject: [Python-3000] email libraries: use byte or unicode strings?
In-Reply-To: <20081030221726.0A0636007DF@longblack.object-craft.com.au>
References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au>
Message-ID: <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org>

On Oct 30, 2008, at 6:17 PM, Andrew McNamara wrote:

> That's a trickier case, but I think it should use bytes internally.
> One of the early goals of email was that it be able to cope with
> malformed MIME - this includes incorrectly encoded messages. So I
> think it must keep a bytes representation internally.
>
> However - charset encoding is part of the MIME spec, so users have a
> reasonable expectation that the mime lib will present them with
> unicode. So the API needs to be unicode.
>
>> The latter doesn't though, and it needs a lot of work (we tried and
>> failed at pycon).
>
> Yes, it's hard. I think we're going to have to break the API.

I did make a start on a new API for email to work better with bytes and
unicode. I didn't get that far before other work intruded. My current
thinking is that you need separate APIs where appropriate to access
email content as unicode strings (or decoded data in general). For
example, normally headers and their values would be bytes, but there
would be an API to retrieve the decoded values as unicode strings.

Similarly, where get_payload() now takes a 'decode' option, there would
be a separate API for retrieving the decoded payload. This is a bit
trickier because depending on the content-type, you might want a
unicode string, or an image, or a sound file, etc.

Another tricky issue is how to set these things. We have to get in the
habit of writing

message[b'Subject'] = b'Hello'

but that's really gross, and of course message_from_string() would have
to become message_from_bytes(). Maybe the API accepts unicode strings
but only if they are ASCII?

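One way to picture the split described above (bytes as received, decoded text on request) is the sketch below. It uses the email package as it ships with current Python 3, which is an assumption relative to this thread: message_from_bytes(), decode_header() and make_header() are today's standard-library names, not the API that was being designed here, and the sample message is invented for illustration.

```python
import email
from email.header import decode_header, make_header

# A raw message as it would arrive from the network: bytes only, with a
# non-ASCII subject carried as an RFC 2047 encoded word.
raw = (
    b"From: sender@example.com\r\n"
    b"Subject: =?utf-8?q?Caf=C3=A9?=\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"Caf=C3=A9 au lait\r\n"
)

msg = email.message_from_bytes(raw)

# With the default (compat32) policy the header comes back still encoded;
# decoding it to text is a separate, explicit step.
encoded_subject = msg["Subject"]
subject = str(make_header(decode_header(encoded_subject)))

# get_payload(decode=True) undoes the transfer encoding and returns bytes;
# applying the declared charset is again a separate step.
body_bytes = msg.get_payload(decode=True)
body_text = body_bytes.decode(msg.get_content_charset()).strip()

print(repr(encoded_subject), repr(subject), repr(body_text))
```

Keeping the two steps apart mirrors the concern raised in the thread: the undecoded form stays available for malformed input, and text is produced only when the caller asks for it.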
There are lots of other problems with the email package, and while it's
made my life much better on the whole, it is definitely in need of
improvement. Unfortunately, I don't see myself having much time to
attack it in the near future. Maybe we can make it a Pycon sprint
(instead of spending all that time on the bzr experiment ;), or, if
someone else wants to lead the dirty work, I would definitely pitch in
with my thoughts on API and implementation.

-Barry

From guido at python.org Wed Nov 5 22:30:43 2008
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Nov 2008 13:30:43 -0800
Subject: [Python-3000] BDFL pronouncement needed on issue 4211
In-Reply-To: <1F56DCFB-F488-4AB6-8AC0-F869A437DEE8@python.org>
References: <1F56DCFB-F488-4AB6-8AC0-F869A437DEE8@python.org>
Message-ID:

Done -- I'm fine with this particular API change.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From v+python at g.nevcal.com Wed Nov 5 23:45:07 2008
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Wed, 05 Nov 2008 14:45:07 -0800
Subject: [Python-3000] email libraries: use byte or unicode strings?
In-Reply-To: <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org>
References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org>
Message-ID: <491221F3.4040304@g.nevcal.com>

On approximately 11/5/2008 12:38 PM, came the following characters from
the keyboard of Barry Warsaw:
> Another tricky issue is how to set these things. We have to get in the
> habit of writing
>
> message[b'Subject'] = b'Hello'
>
> but that's really gross, and of course message_from_string() would
> have to become message_from_bytes(). Maybe the API accepts unicode
> strings but only if they are ASCII?

I would find

message[b'Subject'] = b'Hello'

to be totally gross.

While RFC Email is all ASCII, except if 8bit transfer is legal, there
are internal encodings provided that permit the expression of Unicode in
nearly any component of the email, except for header identifiers. But
there are never Unicode characters in the transfer, as they always get
encoded (there can be UTF-8 byte sequences, of course, if 8bit transfer
is legal; if it is not, then even UTF-8 byte sequences must be further
encoded).

Depending on the level of email interface, there should be no interface
that cannot be expressed in terms of Unicode, plus an encoding to use
for the associated data.
Even 8-bit binary can be translated into a sequence of Unicode codepoints with the same numeric value, for example. That isn't particularly efficient, though, so providing a few interfaces that accept binary blobs to encode in various ways would be handy. Of course binary data should allow specification of an associated encoding also. I haven't looked at the details of the Python libraries yet, but it is a subject I eventually want to get familiar with, as I've written Perl scripts to read and write email, and have tweaked a couple of open source email clients a bit. The Python POP, IMAP, SMTP and NNTP libraries sound like they raise the level of abstraction a bit, and should make it even easier to read and write email. So many projects, so many ideas, but limited time :( Helping with this would be something I would really enjoy, but I'm significantly backlogged at present. Maybe I should outline what would be nice to see, before delving into the interfaces. This could be helpful if you invent a new interface; maybe some of the ideas would help avoid designs that require the above grossness. Alternately, perhaps viewing these comments as an extremely high-level set of expectations could help view the existing interfaces in a way that can achieve these goals, without major rework, even if there are a few warts. I'll speak in terms of creating and sending a message, but receiving should be similar, and simpler (because encoding choices were already made, and only need to be decoded). It would be nice to specify, at "message creation" time, the preferred types of encodings that should be used, and then not have to think about the encodings any more, and just provide Unicode at the interfaces (or binary for certain blobs). It is not clear that the message should be encoded on the fly; it might better be encoded after negotiation with the server, once it is known whether 8bit transfer is legal.
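A rough sketch of deferring the transfer encoding until the server's capabilities are known, as suggested above. The function name `encode_body`, its parameters, and the selection rule are assumptions made for illustration; line-length limits and other wire-format details are ignored.

```python
import quopri


def encode_body(text, charset, server_allows_8bit):
    """Return (content_transfer_encoding, wire_bytes) for a text body.

    Hypothetical helper: the caller supplies Unicode text plus a preferred
    charset, and the transfer encoding is chosen only at send time.
    """
    raw = text.encode(charset)
    if all(byte < 128 for byte in raw):
        return "7bit", raw            # pure ASCII needs no further encoding
    if server_allows_8bit:
        return "8bit", raw            # e.g. UTF-8 byte sequences sent as-is
    # 8bit transfer is not legal, so encode the bytes again into ASCII.
    return "quoted-printable", quopri.encodestring(raw)
```

The same Unicode body therefore produces different byte streams depending on the negotiation result, which is why keeping the message unencoded until sending can be convenient.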
Once the message is complete, it should be retrievable as a blob (perhaps pickle is appropriate, or just any of the possible email bytestreams that could be sent) that can be re-instantiated later. Once the message is sendable, the actual binary bytestream sent should be available for retrieval. This could be used as the "sent" log, if one is desired (usually is for email clients). The saved bytestream should be able to be used to re-instantiate an equivalent message object later. -- Glenn -- http://nevcal.com/ =========================== A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking From andrewm at object-craft.com.au Wed Nov 5 23:59:47 2008 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Thu, 06 Nov 2008 09:59:47 +1100 Subject: [Python-3000] email libraries: use byte or unicode strings? In-Reply-To: <491221F3.4040304@g.nevcal.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> Message-ID: <20081105225947.50E885AC03F@longblack.object-craft.com.au> >I would find > > message[b'Subject'] = b'Hello' > >to be totally gross. > >While RFC Email is all ASCII, except if 8bit transfer is legal, there >are internal encoding provided that permit the expression of Unicode in >nearly any component of the email, except for header identifiers. But >there are never Unicode characters in the transfer, as they always get >encoded (there can be UTF-8 byte sequences, of course, if 8bit transfer >is legal; if it is not, then even UTF-8 byte sequences must be further >encoded). > >Depending on the level of email interface, there should be no interface >that cannot be expressed in terms of Unicode, plus an encoding to use >for the associated data. 
Even 8-bit binary can be translated into a >sequence of Unicode codepoints with the same numeric value, for example. One significant problem is that the email module is intended to be able to work with malformed e-mail without mangling it too badly. The malformed e-mail should also make a round-trip through the email module without being further mangled. I think this requires the underlying processing to be all based on bytes, but doesn't preclude layers on top that parse the charset hints. The rules about encoding are strict, but not always followed. For instance, the headers *must* be ASCII (the header body can, however, be encoded - see rfc2047). Spammers often ignore this, and you might be inclined to say "stuff em'", but this would make the SpamBayes authors rather unhappy. One solution is to provide two sets of classes - the underlying bytes-based one, and another unicode-based one, built on top of the bytes classes, that implements the same API, but that may fail due to encoding errors. -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From v+python at g.nevcal.com Thu Nov 6 00:39:55 2008 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 05 Nov 2008 15:39:55 -0800 Subject: [Python-3000] email libraries: use byte or unicode strings? In-Reply-To: <20081105225947.50E885AC03F@longblack.object-craft.com.au> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> Message-ID: <49122ECB.5010205@g.nevcal.com> On approximately 11/5/2008 2:59 PM, came the following characters from the keyboard of Andrew McNamara: >> I would find >> >> message[b'Subject'] = b'Hello' >> >> to be totally gross. 
>> >> While RFC Email is all ASCII, except if 8bit transfer is legal, there >> are internal encoding provided that permit the expression of Unicode in >> nearly any component of the email, except for header identifiers. But >> there are never Unicode characters in the transfer, as they always get >> encoded (there can be UTF-8 byte sequences, of course, if 8bit transfer >> is legal; if it is not, then even UTF-8 byte sequences must be further >> encoded). >> >> Depending on the level of email interface, there should be no interface >> that cannot be expressed in terms of Unicode, plus an encoding to use >> for the associated data. Even 8-bit binary can be translated into a >> sequence of Unicode codepoints with the same numeric value, for example. > > One significant problem is that the email module is intended to be > able to work with malformed e-mail without mangling it too badly. The > malformed e-mail should also make a round-trip through the email module > without being further mangled. This is an interesting perspective... "stuff em" does come to mind :) But I'm not at all clear on what you mean by a round-trip through the email module. Let me see... if you are creating an email, you (1) should encode it properly (2) a round-trip is mostly meaningless, unless you send it to yourself. So you probably mean email that is received, and that you want to send on. In this case, there is already a composed/encoded form of the email in hand; it could simply be sent as is without decoding or re-encoding. That would be quite a clean round-trip! > I think this requires the underlying processing to be all based on bytes, Notice that I said _nothing_ about the underlying processing in my comments, only the API. I fully agree that some, perhaps most, of the underlying processing has to be aware of bytes, and use and manipulate bytes. > but doesn't preclude layers on top that parse the charset hints. The > rules about encoding are strict, but not always followed. 
For instance, > the headers *must* be ASCII (the header body can, however, be encoded - > see rfc2047). Indeed, the headers must be ASCII, and once encoded, the header body is also. > Spammers often ignore this, and you might be inclined to > say "stuff em'", but this would make the SpamBayes authors rather unhappy. And so it is quite possible to misinterpret the improperly encoded headers as 8-bit octets that correspond to Unicode codepoints (the so-called "Latin-1" conversion). For spam, that is certainly good enough. And round-tripping means that if APIs are not used to change a header, you use the original binary for that header. > One solution is to provide two sets of classes - the underlying > bytes-based one, and another unicode-based one, built on top of the > bytes classes, that implements the same API, but that may fail due to > encoding errors. I think you meant "decoding" errors, there? I guess I'm not terribly concerned about the readability of improperly encoded email messages, whether they are spam or ham. For the purposes of SpamBayes (which I assume is similar to SpamAssassin, only written in Python), it doesn't matter if the data is readable, only that it is recognizably similar. So a consistent mis-transliteration is as good as a correct decoding. For ham, the correspondent should be informed that there are problems with their software, so that they can upgrade or reconfigure it. And a mis-transliteration is likely the best that can be provided in that case anyway... unless the mail API provides for ignoring the incoming (incorrect or missing) encoding directives and using one provided by the API, and the client can select a few until they stumble on one that produces a readable result.
But if the mis-transliteration is done using the Latin-1 conversion to Unicode, the client, if it chooses to want to do that sort of heuristic analysis, can reencode to Latin-1, and then decode using some other encoding(s), independently of the mail APIs providing such a facility. I do hope to learn and use the Python mail APIs, and I was hoping to do that in Python 3.0 (and am sorry, but not surprised, to hear that this is an area of problems at present), and I was hoping that the interfaces that would be presented by Python 3.0 mail APIs would be in terms of Unicode, for the convenience of being abstracted away from the plethora of encodings that are defined at the mail transport layer. (Not that I don't understand those encodings, but it is something that certainly can and should be mostly hidden under the covers.) -- Glenn -- http://nevcal.com/ =========================== A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking From andrewm at object-craft.com.au Thu Nov 6 01:24:04 2008 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Thu, 06 Nov 2008 11:24:04 +1100 Subject: [Python-3000] email libraries: use byte or unicode strings? In-Reply-To: <49122ECB.5010205@g.nevcal.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> Message-ID: <20081106002404.3BB6B60004F@longblack.object-craft.com.au> >But I'm not at all clear on what you mean by a round-trip through the >email module. Let me see... if you are creating an email, you (1) >should encode it properly (2) a round-trip is mostly meaningless, unless >you send it to yourself. 
So you probably mean email that is received, >and that you want to send on. In this case, there is already a >composed/encoded form of the email in hand; it could simply be sent as >is without decoding or re-encoding. That would be quite a clean round-trip! Imagine a mail proxy of some sort (SMTP or a list manager like Mailman) - you want to be able to parse a message, maybe make some minor changes (such as adding a "Received:" header, or stripping out illegal MIME types) and then emit something that differs from the original in only the ways that you specified. Another example - imagine what a mail transport agent does with bounces: it wraps them in a MIME wrapper, but otherwise changes the structure as little as possible (because changing it would make later analysis of the bounce problematic). >Notice that I said _nothing_ about the underlying processing in my >comments, only the API. I fully agree that some, perhaps most, of the >underlying processing has to be aware of bytes, and use and manipulate >bytes. The bytes API has to be accessible - there are many contexts in which you need to work at this level. >Indeed, the headers must be ASCII, and once encoded, the header body is >also. Except when they're not. It's not uncommon in mail handling to get a valid message that doesn't conform to the specs (not just spam). You can either throw your hands up in the air and declare it irredeemably broken, or do your best to extract meaning from it. Invariably, it's the CEO's best mate who sent the malformed message, so you process it or find a new job. >And so it is quite possible to misinterpret the improperly encoded >headers as 8-bit octets that correspond to Unicode codepoints (the >so-called "Latin-1" conversion). For spam, that is certainly good >enough. And roundtripping it says that if APIs are not used to change >it, you use the original binary for that header.
Certainly, this is one approach, and users of the email module in the py3k standard lib are essentially doing this now. >> One solution is to provide two sets of classes - the underlying >> bytes-based one, and another unicode-based one, built on top of the >> bytes classes, that implements the same API, but that may fail due to >> encoding errors. > >I think you meant "decoding" errors, there? Well, yes and no. I meant that the encoding was done incorrectly. >I guess I'm not terribly concerned about the readability of improperly >encoded email messages, whether they are spam or ham. You may not be, but other users of the module are. >For ham, the correspondent should be informed that there are problems >with their software, so that they can upgrade or reconfigure it. How do you determine the correspondent if you can't parse their e-mail? 8-) >(Not that I don't understand those encodings, but it is something that >certainly can and should be mostly hidden under the covers.) You're talking about a utopian state that Unicode strives but fails to achieve. -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From stephen at xemacs.org Thu Nov 6 03:09:49 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 06 Nov 2008 11:09:49 +0900 Subject: [Python-3000] email libraries: use byte or unicode strings? 
In-Reply-To: <49122ECB.5010205@g.nevcal.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> Message-ID: <87tzalsiz6.fsf@uwakimon.sk.tsukuba.ac.jp> Glenn Linderman writes: > On approximately 11/5/2008 2:59 PM, came the following characters from > the keyboard of Andrew McNamara: > >> I would find > >> > >> message[b'Subject'] = b'Hello' > >> > >> to be totally gross. Indeed. > >> Depending on the level of email interface, there should be no interface > >> that cannot be expressed in terms of Unicode, plus an encoding to use > >> for the associated data. Even 8-bit binary can be translated into a > >> sequence of Unicode codepoints with the same numeric value, for example. Also totally gross. RFC 2821 is bytes, RFC 2822 is Unicode (in spirit, even though headers are limited to ASCII), RFC 2045-and-the- cast-of-thousands interfaces the two. We can't really get around this, IMO. > > One significant problem is that the email module is intended to be > > able to work with malformed e-mail without mangling it too badly. The > > malformed e-mail should also make a round-trip through the email module > > without being further mangled. > > This is an interesting perspective... "stuff em" does come to mind :) Not acceptable in Japan, or anywhere that Microsoft beta products are used, for that matter. (At one point Outhouse Excess betas were sending HTML *with tags in unibyte ASCII and element content in little-endian UTF-16*.) > But I'm not at all clear on what you mean by a round-trip through the > email module. Bounce messages, for example. > I guess I'm not terribly concerned about the readability of improperly > encoded email messages, whether they are spam or ham. 
I'm fine with *your* lack of concern if you don't need it, but an email module that doesn't care really is not acceptable in any of the Asian cultures; they have more characters to worry about than the Bush administration has "suspicious foreign elements". Although the various standards are far better at keeping track of their charges than the Department of Homeland Security, you still get junk in messages, and codecs are of varying quality in error-handling. If you want to restrict yourself to the Unicode-feasible layer, then it would be very cool if you would watch for any leakage of bytes or encoding-related lossage into that layer, and scream bloody murder if they do. (Eg, the APIs that handle well-formed messages should never ever raise UnicodeError or codec errors themselves.) > is an area of problems at present), and I was hoping that the interfaces > that would be presented by Python 3.0 mail APIs would be in terms of > Unicode, For the applications I guess you have in mind, they can and should be. But there is no reason why Python can't be used for RFC 2821-level bit-flicking transport protocol. I don't see a way at present to separate that level from the email module because of the Postel Principle; you can get anything in email and you have to live with it. The various API layers are going to need to cooperate closely, and given how specialized and crufty the bytes-to-Unicode relationship is, I think the lexing/parsing layer probably should be allowed to have a pretty fluid API for quite a while. There need to be two (and I would say three is better) sets of APIs: byte-oriented for handling the wire protocol, Unicode-oriented for handling well-formed messages (both presentation and composition), and (probably) a "codec" layer which handles nastiness in the transition. > for the convenience of being abstracted away from the plethora of > encodings that are defined at the mail transport layer. 
But handling those is definitely in the domain of the email module. Any attachments of documents in legacy encodings will need to deal with them explicitly in composition of Content-Type headers, etc. From barry at python.org Thu Nov 6 05:18:19 2008 From: barry at python.org (Barry Warsaw) Date: Wed, 5 Nov 2008 23:18:19 -0500 Subject: [Python-3000] No 3.0rc2 tonight Message-ID: <848C34C1-6FD6-4BE8-A134-48766BDCC073@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 There appears to be a bug in the documentation for 3.0. See issue 4266. http://bugs.python.org/issue4266 I'm sorry that I'm too tired to figure out what the basic problem is. I've made the issue a release blocker (the only one left for 3.0rc2), and we'll try again tomorrow. The branch is unfrozen. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSRJwC3EjvBPtnXfVAQLBewP7BRzkxHdOdVyG2sqxcOAI5aTlGleDIQqd rCBZ0xv6roig5ZiVtFBJ2cK5+2NqfKpaE79VTketReMKTl0bQZhf1NuCunHr0h+B x+1eW/OqD2Ff0l8bfdtry9MUuc8PBLGVJ4w9xeKW9nEDpZe1eG9YN6+fNqUh81Pc VMifCwV8OXk= =AuEq -----END PGP SIGNATURE----- From v+python at g.nevcal.com Thu Nov 6 06:27:32 2008 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 05 Nov 2008 21:27:32 -0800 Subject: [Python-3000] email libraries: use byte or unicode strings? 
In-Reply-To: <20081106002404.3BB6B60004F@longblack.object-craft.com.au> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> <20081106002404.3BB6B60004F@longblack.object-craft.com.au> Message-ID: <49128044.3000006@g.nevcal.com> On approximately 11/5/2008 4:24 PM, came the following characters from the keyboard of Andrew McNamara: >> But I'm not at all clear on what you mean by a round-trip through the >> email module. Let me see... if you are creating an email, you (1) >> should encode it properly (2) a round-trip is mostly meaningless, unless >> you send it to yourself. So you probably mean email that is received, >> and that you want to send on. In this case, there is already a >> composed/encoded form of the email in hand; it could simply be sent as >> is without decoding or re-encoding. That would be quite a clean round-trip! > > Imagine a mail proxy of some sort (SMTP or a list manager like Mailman) - > you want to be able to parse a message, maybe make some minor changes > (such as adding a "Received:" header, or stripping out illegal MIME types) > and then emit something that differs from the original in only the ways > that you specified. Sure. Add header, delete header APIs would suffice for this. The APIs could accept Unicode, but do bytes manipulations. > Another example - image what an mail transport agent does with bounces: > it wraps them in a MIME wrapper, but otherwise changes the structure > as little as possible (because that would make later analysis of the > bounce problematic). So they usually truncate the size too, to 10K or less. Enough to get all the headers. Some only send headers back. So it is no problem. 
A "retrieve headers in binary from message" API, followed by "add this chunk of binary as a MIME part" to the new bounce message under construction. The first could be replaced by "retrieve message as bytes" and "substr", as an alternative. So yes, some bytes APIs are necessary for binary MIME parts and the whole message (as I mentioned before), and there may be a few other special cases. But mostly, just Unicode. >> Notice that I said _nothing_ about the underlying processing in my >> comments, only the API. I fully agree that some, perhaps most, of the >> underlying processing has to be aware of bytes, and use and manipulate >> bytes. > > The bytes API has to be accessible - there are many contexts in which > you need to work at this level. Maybe. I named a couple, you've named another, maybe there are a few more. The only reason not to have a full bytes API is just the effort to support it... if that can reasonably be avoided, why not? But I doubt there are a lot of cases that _must_ be handled as bytes, and so if we can identify the ones that indeed must be, and supply them, the rest can be Unicode. >> Indeed, the headers must be ASCII, and once encoded, the header body is >> also. > > Except when they're not. It's not uncommon in mail handling to get a > valid message that doesn't conform to the specs (not just spam). You can > either throw your hands up in the air and declare it irredeemably broken, > or do your best to extract meaning from it. Invariably, it's the CEO's > best mate who sent the malformed message, so you process it or find a > new job. This is where you use the Latin-1 conversion. Don't throw an error when it doesn't conform, but don't go to heroic efforts to provide bytes alternatives... just convert the bytes to Unicode; given the way the mail RFCs are written, and the types of encodings used, the result is mostly readable. And if it isn't encoded, it is even more readable.
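A small illustration of the "Latin-1 conversion" described above: every byte value 0-255 maps to the Unicode code point with the same number, so the decode cannot fail and the original bytes can be recovered exactly. The header bytes below are made up for the example.

```python
# A deliberately malformed header: raw 8-bit bytes with no charset label.
raw_header = b"Subject: caf\xc3\xa9 \xff\xfe unlabelled bytes"

# Latin-1 decoding accepts any byte sequence, one code point per byte.
as_text = raw_header.decode("latin-1")

# The ASCII portion stays readable even though the rest is mis-transliterated.
assert as_text.startswith("Subject: caf")

# Encoding back with Latin-1 restores the original bytes, so a header that
# is never modified can still be emitted unchanged.
assert as_text.encode("latin-1") == raw_header
```

This lossless round trip is what allows a Unicode-facing API to carry malformed data without mangling it further.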
>> And so it is quite possible to misinterpret the improperly encoded >> headers as 8-bit octets that correspond to Unicode codepoints (the >> so-called "Latin-1" conversion). For spam, that is certainly good >> enough. And roundtripping it says that if APIs are not used to change >> it, you use the original binary for that header. > > Certainly, this is one approach, and users of the email module in the py3k > standard lib are essentially doing this now. And so how much is it a problem? What are the effects of the problem? Does providing a bytes API solve the problem, or simply punt it to the user? If it simply punts it to the user, are there significant benefits to the coder-user of obtaining the data as bytes, vs. obtaining it as bytes transliterated by the Latin-1 conversion to Unicode? If there are significant benefits to the coder-user, what are they? >>> One solution is to provide two sets of classes - the underlying >>> bytes-based one, and another unicode-based one, built on top of the >>> bytes classes, that implements the same API, but that may fail due to >>> encoding errors. >> I think you meant "decoding" errors, there? > > Well, yes and no. I meant that the encoding was done incorrectly. Sure. The encoding wasn't done correctly, or wasn't done at all. But that causes problems for the decoder, on the receiving side. >> I guess I'm not terribly concerned about the readability of improperly >> encoded email messages, whether they are spam or ham. > > You may not be, but other users of the module are. Sure, but if it isn't properly encoded, then either it is an ASCII superset, in which case the ASCII parts will be readable (at least), and so with a little human cleverness, the non-ASCII parts can be intuited. I'm not suggesting making it worse than what it already is, in bytes form; just to translate the bytes to Unicode codepoints so that they can be returned on a Unicode interface. If you return them in bytes, what would you do besides that? 
If you would guess at an encoding, and do a different decode, that can be done on the Unicode transliteration just as easily as it can on the bytes form. >> For ham, the correspondent should be informed that there are problems >> with their software, so that they can upgrade or reconfigure it. > > How do you determine the correspondent if you can't parse their e-mail? 8-) Email addresses are pretty standardized in format. Especially the Errors header and the From header. So I think the correspondent's email address will be reasonably interpretable even if their name is not, and the body of their message is not. I'm not saying all is wonderful if they didn't properly encode their message, but I think you are exaggerating the problem... you can write back to the email address, even if you can't read the message. >> (Not that I don't understand those encodings, but it is something that >> certainly can and should be mostly hidden under the covers.) > > You're talking about a utopian state that Unicode strives but fails to achieve. Messages that are properly encoded can certainly achieve the Utopian state under the covers. Messages that are not properly encoded can be assumed to be Latin-1, and converted to Unicode. They may not be perfectly readable in that state, but face it, non-Unicode email clients did exactly that, but used Latin-1 bytes directly (or some other encoding). And if you think it would be helpful to have the default conversion to Unicode use some code page other than Latin-1, such as the currently configured code page, that is a fine alternative... and again, is much what happens today when people communicate without doing the proper encoding. Two people that use the same code page can communicate in that code page, but communicating with people that use other code pages is problematical. 
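A sketch of the re-decoding idea mentioned earlier in this message: given text produced by the Latin-1 transliteration, a client can recover the bytes and try other encodings on its own. The helper name `redecode` and the candidate list are assumptions for illustration only.

```python
def redecode(transliterated, candidates=("utf-8", "shift_jis", "koi8-r")):
    """Yield (encoding, text) for each candidate that decodes without error.

    Hypothetical helper: `transliterated` is text obtained by decoding the
    original bytes as Latin-1.
    """
    raw = transliterated.encode("latin-1")    # recover the original bytes
    for encoding in candidates:
        try:
            yield encoding, raw.decode(encoding)
        except UnicodeDecodeError:
            continue                            # bytes invalid in this encoding


# UTF-8 bytes for "caf\u00e9" that were transliterated through Latin-1.
garbled = "caf\u00e9".encode("utf-8").decode("latin-1")
results = dict(redecode(garbled))
```

Several candidates may decode without error, so choosing the readable one remains a job for the client or the user; the sketch only shows that nothing is lost by starting from the Unicode transliteration.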
So no, Unicode doesn't solve the problems with buggy software, but it can be used without making the problem worse, so using it generally makes for a more convenient API. Think about the coder of the Python-based email client. Given the choice between the Unicode API and the bytes API, how are they going to choose to use one or the other? Code the application twice, once with each API? No way! Too much work! So they'll use the Unicode API for text, and the bytes APIs for binary attachments, because that is what is natural. If improperly encoded messages are received, and appropriate transliterations are made so that the bytes get converted (default code page) or passed through (Latin-1 transformation), then the data may be somewhat garbled for characters in the non-ASCII subset. But that is no different from the handling done by any 8-bit email client, nor, I suspect (a little uncertainty here), from the handling done by Python < 3.0 mail libraries. So that is not Utopian; Utopia can only be reached by following standards. But I don't see it as terrible; it is no worse than what happens today when the standards are not followed. -- Glenn -- http://nevcal.com/ =========================== A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking From v+python at g.nevcal.com Thu Nov 6 07:04:44 2008 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 05 Nov 2008 22:04:44 -0800 Subject: [Python-3000] email libraries: use byte or unicode strings?
In-Reply-To: <87tzalsiz6.fsf@uwakimon.sk.tsukuba.ac.jp> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> <87tzalsiz6.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <491288FC.8090805@g.nevcal.com> On approximately 11/5/2008 6:09 PM, came the following characters from the keyboard of Stephen J. Turnbull: > Glenn Linderman writes: > > On approximately 11/5/2008 2:59 PM, came the following characters from > > the keyboard of Andrew McNamara: > > >> I would find > > >> > > >> message[b'Subject'] = b'Hello' > > >> > > >> to be totally gross. > > Indeed. > > > >> Depending on the level of email interface, there should be no interface > > >> that cannot be expressed in terms of Unicode, plus an encoding to use > > >> for the associated data. Even 8-bit binary can be translated into a > > >> sequence of Unicode codepoints with the same numeric value, for example. > > Also totally gross. RFC 2821 is bytes, RFC 2822 is Unicode (in > spirit, even though headers are limited to ASCII), RFC 2045-and-the- > cast-of-thousands interfaces the two. We can't really get around > this, IMO. > > > > One significant problem is that the email module is intended to be > > > able to work with malformed e-mail without mangling it too badly. The > > > malformed e-mail should also make a round-trip through the email module > > > without being further mangled. > > > > This is an interesting perspective... "stuff em" does come to mind :) > > Not acceptable in Japan, or anywhere that Microsoft beta products are > used, for that matter. (At one point Outhouse Excess betas were > sending HTML *with tags in unibyte ASCII and element content in > little-endian UTF-16*.) 
So I would hope that the users of such Betas would quickly discover that they were producing garbage, report it to M$, and go back to using a release version with only the usual expectation of bugs, inconsistencies, standards violations, and security exploits, but not expect that Beta software is, or should be, fully compatible with other applications that handle proper email. Did Python's 2.x mail library handle the data that you describe? Did anyone seriously expect it to? Did Mozilla clients handle it? Can you provide a list of email clients that handled it gracefully, other than the same Outhouse Excess client that produced it? And if not, why would you expect Python's 3.0 mail library to handle it? > > But I'm not at all clear on what you mean by a round-trip through the > > email module. > > Bounce messages, for example. OK, my other reply just now described a way to handle that. > > I guess I'm not terribly concerned about the readability of improperly > > encoded email messages, whether they are spam or ham. > > I'm fine with *your* lack of concern if you don't need it, but an > email module that doesn't care really is not acceptable in any of the > Asian cultures; they have more characters to worry about than the Bush > administration has "suspicious foreign elements". Although the > various standards are far better at keeping track of their charges > than the Department of Homeland Security, you still get junk in > messages, and codecs are of varying quality in error-handling. > > If you want to restrict yourself to the Unicode-feasible layer, then > it would be very cool if you would watch for any leakage of bytes or > encoding-related lossage into that layer, and scream bloody murder if > they do. (Eg, the APIs that handle well-formed messages should never > ever raise UnicodeError or codec errors themselves.) Sure, and I'm fine with your concern about being able to reasonably handle invalid messages. I'm concerned about that too. 
But I'm not sure that the mixed single byte and double byte bug you describe above is in the realm of reasonable... even so, it could be handled by transliterating it using the Latin-1 transform; there'd be lots of gibberish, but it wouldn't create exceptions. The reader would quickly gather that the message is a mess, and be able to report that to the sender, who should know that they are using Beta software, and should try resending with a production version, and report the bug to M$. > > is an area of problems at present), and I was hoping that the interfaces > > that would be presented by Python 3.0 mail APIs would be in terms of > > Unicode, > > For the applications I guess you have in mind, they can and should > be. But there is no reason why Python can't be used for RFC > 2821-level bit-flicking transport protocol. The term "bit-flicking" is foreign to me; it does not appear in the mail RFCs. Hence, I have little clue what you are talking about here. There is no reason that RFC 2821 couldn't be implemented with a Unicode interface, as far as I can see. > I don't see a way at > present to separate that level from the email module because of the > Postel Principle; you can get anything in email and you have to live > with it. The various API layers are going to need to cooperate > closely, and given how specialized and crufty the bytes-to-Unicode > relationship is, I think the lexing/parsing layer probably should be > allowed to have a pretty fluid API for quite a while. > > There need to be two (and I would say three is better) sets of APIs: > byte-oriented for handling the wire protocol, Unicode-oriented for > handling well-formed messages (both presentation and composition), and > (probably) a "codec" layer which handles nastiness in the transition. I see no reason the wire protocol cannot be implemented with Unicode APIs. 
Granted, the wire protocol is defined in terms of bytes, but the set of legal commands and responses are in the ASCII subset; with encoding violations, the illegal commands and responses may be in the Latin-1 subset, or some other code page (default system code page?). But the API could speak Unicode, and do the appropriate translations. Or in some cases, inappropriate translations. > > for the convenience of being abstracted away from the plethora of > > encodings that are defined at the mail transport layer. > > But handling those is definitely in the domain of the email module. > Any attachments of documents in legacy encodings will need to deal > with them explicitly in composition of Content-Type headers, etc. Definitely in the domain of the email module. Not clearly necessary to expose in the API. Binary attachments being delivered as bytes, yes; a way of obtaining the whole email message in the form of its wire protocol, yes; a way of obtaining the whole set of headers in the form of its wire protocol, for use in bounce messages, yes; what else could be usefully provided as bytes, that cannot be equally well handled by returning bytes transliterated to Unicode? Please be specific; just mentioning bit-flicking, or error cases, or bad encoding sounds terrible, but provides little information as to how it can be handled via some theoretical bytes interface that cannot be handled equally as effectively (although perhaps not equally efficiently) via a transliterated Unicode data stream. -- Glenn -- http://nevcal.com/ =========================== A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking From stephen at xemacs.org Thu Nov 6 08:47:09 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 06 Nov 2008 16:47:09 +0900 Subject: [Python-3000] email libraries: use byte or unicode strings?
In-Reply-To: <491288FC.8090805@g.nevcal.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> <87tzalsiz6.fsf@uwakimon.sk.tsukuba.ac.jp> <491288FC.8090805@g.nevcal.com> Message-ID: <87ljvxs3cy.fsf@uwakimon.sk.tsukuba.ac.jp> Glenn Linderman writes: > But the API could speak Unicode, and do the appropriate translations. > Or in some cases, inappropriate translations. You've written that kind of thing three or four times by now. As far as I can see, you just don't care about any requirements beyond your own. > Please be specific; just mentioning bit-flicking, or error cases, or bad > encoding sounds terrible, but provides little information as to how it > can be handled via some theoretical bytes interface that cannot be > handled equally as effectively (although perhaps not equally > efficiently) via a transliterated Unicode data stream. I did. You missed it. Reread my example of mixing types. That is not theory, that is Emacs practice. And it's not good practice. From v+python at g.nevcal.com Thu Nov 6 08:55:38 2008 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 05 Nov 2008 23:55:38 -0800 Subject: [Python-3000] email libraries: use byte or unicode strings? 
In-Reply-To: <87ljvxs3cy.fsf@uwakimon.sk.tsukuba.ac.jp> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> <87tzalsiz6.fsf@uwakimon.sk.tsukuba.ac.jp> <491288FC.8090805@g.nevcal.com> <87ljvxs3cy.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4912A2FA.1090403@g.nevcal.com> On approximately 11/5/2008 11:47 PM, came the following characters from the keyboard of Stephen J. Turnbull: > Glenn Linderman writes: > > > But the API could speak Unicode, and do the appropriate translations. > > Or in some cases, inappropriate translations. > > You've written that kind of thing three or four times by now. As far > as I can see, you just don't care about any requirements beyond your > own. I suppose you could interpret it that way. I thought I was describing how to handle things for different cases. It would help if you could elucidate and enumerate the requirements you see, that you don't think that I see, or point to some place where they already are elucidated and enumerated, so I could learn what they are. > > Please be specific; just mentioning bit-flicking, or error cases, or bad > > encoding sounds terrible, but provides little information as to how it > > can be handled via some theoretical bytes interface that cannot be > > handled equally as effectively (although perhaps not equally > > efficiently) via a transliterated Unicode data stream. > > I did. You missed it. Reread my example of mixing types. That is > not theory, that is Emacs practice. And it's not good practice. There is no reference to the word emacs or types in any of the messages you've posted in this thread, maybe you are referring to another thread somewhere? Sorry, I'm new to this party, but I have read the whole thread... 
unless my mail reader has missed part of it. -- Glenn -- http://nevcal.com/ =========================== A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking From stephen at xemacs.org Thu Nov 6 12:59:46 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 06 Nov 2008 20:59:46 +0900 Subject: [Python-3000] email libraries: use byte or unicode strings? In-Reply-To: <4912A2FA.1090403@g.nevcal.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> <87tzalsiz6.fsf@uwakimon.sk.tsukuba.ac.jp> <491288FC.8090805@g.nevcal.com> <87ljvxs3cy.fsf@uwakimon.sk.tsukuba.ac.jp> <4912A2FA.1090403@g.nevcal.com> Message-ID: <87k5bhrrnx.fsf@uwakimon.sk.tsukuba.ac.jp> Glenn Linderman writes: > There is no reference to the word emacs or types in any of the messages > you've posted in this thread, maybe you are referring to another thread > somewhere? Sorry, I'm new to this party, but I have read the whole > thread... unless my mail reader has missed part of it. I'm sorry, you are right; the relevant message was never sent. Here it is; I've looked it over briefly and it seems intelligible, but from your point of view it may seem out of context now. Glenn Linderman writes: > This is where you use the Latin-1 conversion. Don't throw an error > when in doesn't conform, but don't go to heroic efforts to provide > bytes alternatives... just convert the bytes to Unicode, and the > way the mail RFCs are written, and the types of encodings used, it > is mostly readable. And if it isn't encoded, it is even more > readable. This is what XEmacs/Mule does. 
It's a PITA for everybody (except the Mule implementers, whose life is dramatically simplified by punting this way). For one thing, what's readable to a human being may be death to a subprogram that expects valid MIME. GNU Emacs is even worse; it does provide both a bytes-like type and a unicode-like type, but then it turns around and provides a way to "cast" unicodes to bytes and vice-versa, thus exposing implementation in an unclean (and often buggy) way. > And so how much is it a problem? What are the effects of the problem? In Emacs, the problem is that strings that are punted get concatenated with strings that are properly decoded, and when reencoding is attempted, you get garbage or a coding error. Since Mule discarded the type (punt vs. decode) information, the app loses. There's no way to recover. The apps most at risk are things like MUAs (which Emacs does well) and web browsers (which it doesn't), and even AUCTeX (a mode for handling LaTeX documents---TeX is not Unicode-aware so its error messages are frequently truncated in the middle of a UTF-8 character) and they go to great lengths to keep track of what is valid and what is not in the app. They don't always succeed. I think Emacs should be doing this for them, somehow (and I'm an XEmacs implementer, not an MUA implementer!) The situation in Python will be strongly analogous, I believe. > I'm not suggesting making it worse than what it already is, in > bytes form; just to translate the bytes to Unicode codepoints so > that they can be returned on a Unicode interface. Which *does* make it worse, unless you enforce a type difference so that punted strings can't be mixed with decoded strings without effort. That type difference may as well be bytes vs. Unicode as some subclass of Unicode vs. Unicode. "Why would you mix strings?" Well, for one example there are multiple address headers which get collected into an addressee list for purpose of constructing a reply. 
If one of the headers is broken and another is not, you get mixed mode. The same thing can happen for multilingual message bodies: they get split into a multipart with different charsets for different parts, and if one is broken but another is not, you get mixed mode. > So they'll use the Unicode API for text, and the bytes APIs for binary > attachments, because that is what is natural. Well, as I see it there won't be bytes APIs for text. The APIs will return Unicode text if they succeed, and raise an error if not. If the error is caught, the offending object will be available as bytes. > If improperly encoded messages are received, and appropriate > transliterations are made so that the bytes get converted (default code > page) or passed through (Latin-1 transformation), then the data may be > somewhat garbled for characters in the non-ASCII subset. But that is > not different than the handling done by any 8-bit email client, nor, I > suspect (a little uncertainty here) different than the handling done by > Python < 3.0 mail libraries. Which is exactly how we got to this point. Experience with GNU Mailman and other such applications indicates that the implementation in the existing Python email module needs work, and Barry Warsaw and others who have tried to work on it say that it's not that easy, and that the API may need to change to accommodate needed changes in the implementation. From ncoghlan at gmail.com Thu Nov 6 13:09:27 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 06 Nov 2008 22:09:27 +1000 Subject: [Python-3000] email libraries: use byte or unicode strings?
In-Reply-To: <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> Message-ID: <4912DE77.5040209@gmail.com> Barry Warsaw wrote: > There are lots of other problems with the email package, and while it's > made my life much better on the whole, it is definitely in need of > improvement. Unfortunately, I don't see myself having much time to > attack it in the near future. Maybe we can make it a Pycon sprint > (instead of spending all that time on the bzr experiment ;), or, if > someone else wants to lead the dirty work, I would definitely pitch in > with my thoughts on API and implementation. So here's a question (speaking as someone that has never had to go near the email module, and is unlikely to do so anytime soon): is this something that should hold up the release of Python 3.0? As I see it, there are 3 options: 1. Hold up 3.0 until you get an API for the email package that handles Unicode vs bytes issues gracefully 2. Drop the email package entirely from 3.0, iterate on a 3.0 version of it on PyPI for a while, then add the cleaned up version in 3.1 3. Keep the current version (issues and all) in 3.0, with fairly strong warnings that the API may change in 3.1 I don't know enough about the package to have an opinion on the answer, but the nature of this thread makes me feel that this is a question that needs to be asked. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Thu Nov 6 13:22:42 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 06 Nov 2008 22:22:42 +1000 Subject: [Python-3000] email libraries: use byte or unicode strings? 
In-Reply-To: <491221F3.4040304@g.nevcal.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> Message-ID: <4912E192.80105@gmail.com> Glenn Linderman wrote: > Even 8-bit binary can be translated into a > sequence of Unicode codepoints with the same numeric value, for example. No, no, no, no. Using latin-1 to tunnel binary data through Unicode just gets us straight back into the "is it text or bytes?" hell that is the 8-bit string in 2.x. It defeats the entire point of making the break between str and bytes in 3.0 in the first place. If something is potentially arbitrary binary data, we need to treat it that way and use bytes. People are just going to have to get over their aesthetic objections to the leading b on their bytes literals. Heck, be happy you don't have to write bytes(map(ord, 'literal')) as was the case in the early stages of 3.0 :) Providing a Unicode based text API over the top for the cases where handling malformed data isn't necessary may be convenient and a good idea, but it shouldn't be the only API (3.0 is already guilty of that in a few places - we shouldn't be adding more). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From andrewm at object-craft.com.au Thu Nov 6 13:39:47 2008 From: andrewm at object-craft.com.au (Andrew McNamara) Date: Thu, 06 Nov 2008 23:39:47 +1100 Subject: [Python-3000] email libraries: use byte or unicode strings? 
In-Reply-To: <4912DE77.5040209@gmail.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <4912DE77.5040209@gmail.com> Message-ID: <20081106123947.AD73F60004F@longblack.object-craft.com.au> >So here's a question (speaking as someone that has never had to go near >the email module, and is unlikely to do so anytime soon): is this >something that should hold up the release of Python 3.0? I'm not sure. I noticed the email problems because I was trying to port a web framework to py3k, and eventually ran into too many problems with the cgi module (which were due partially to email's shaky handling of Unicode). The email problems are hard, and none of us really has the time to resolve them quickly, so if the release was delayed due to email, we couldn't say when it would be "done". That doesn't seem attractive. On the other hand, ripping email out completely will break a number of other modules that rely on it for MIME and RFC822 handling. At the moment, they're limping along by casting bytes to latin1 (if I remember correctly), which works mostly. I think the only sensible answer is to leave it as is with warnings. Maybe the hypothetical new module should be called email2 or something. -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/ From barry at python.org Thu Nov 6 18:02:10 2008 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Nov 2008 12:02:10 -0500 Subject: [Python-3000] email libraries: use byte or unicode strings? 
In-Reply-To: <49122ECB.5010205@g.nevcal.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> Message-ID: <638112CE-0DA3-48C3-9EDF-05ADC11A1579@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 5, 2008, at 6:39 PM, Glenn Linderman wrote: > This is an interesting perspective... "stuff em" does come to mind :) > > But I'm not at all clear on what you mean by a round-trip through > the email module. Let me see... if you are creating an email, you > (1) should encode it properly (2) a round-trip is mostly > meaningless, unless you send it to yourself. So you probably mean > email that is received, and that you want to send on. In this case, > there is already a composed/encoded form of the email in hand; it > could simply be sent as is without decoding or re-encoding. That > would be quite a clean round-trip! There are two ways to create an email DOM. One is out of whole cloth (i.e. creating Message objects and their subclasses, then attaching them into a tree). Note that it is a "generator" whose job it is to take the DOM and produce an RFC-compliant flat textual representation. The other way to get a DOM is to parse some flat textual representation. In this case, it is a core design requirement that the parser never throws an exception, and that there is a way to record and retrieve the defects in a message. The core model objects of Message (and their MIME subclasses) and Header should treat everything internally as bytes. The edges are where you want to be able to accept varying types, but always convert to bytes internally. Edges of this system include the parser, the generator, and various setter and getter methods of Message and Header.
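The two routes into the message tree described above can be sketched with the standard library's email package roughly as follows. This is a minimal illustration using today's str-based API, not the bytes-internal design proposed in this message; the addresses and body are made-up placeholders.

```python
import email
from email.message import Message

# Route 1: build the tree "out of whole cloth", then let the generator
# flatten it into text.
msg = Message()
msg["From"] = "sender@example.com"       # hypothetical address
msg["To"] = "recipient@example.com"      # hypothetical address
msg["Subject"] = "Hello"
msg.set_payload("Just a short body.\n")
flat = msg.as_string()

# Route 2: parse a flat textual representation back into a tree.
parsed = email.message_from_string(flat)

# For a well-formed message the parser -> tree -> generator trip
# reproduces the input text.
round_tripped = parsed.as_string()
```

Here the header setters, `as_string()` and `message_from_string()` are the "edges" of the system; everything between them works on the tree.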
The current code has a strong desire to be idempotent, so that parser->DOM->generator output is exactly the same as input. Small changes to the DOM or content in between should have minimal effect. For example, if you delete a header and then add it back, the header will show up at the end of the RFC 2822 header list, but everything else about the message will be unchanged. Currently idempotency is broken for defective messages. The generator is guaranteed to produce RFC-compliant output, repairing defects like missing boundaries and such. > I guess I'm not terribly concerned about the readability of > improperly encoded email messages, whether they are spam or ham. > For the purposes of SpamBayes (which I assume is similar to > spamassassin, only written in Python), it doesn't matter if the data > is readable, only that it is recognizably similar. So a consistent > mis-transliteration is as good as a correct decoding. The key thing is that parse should never ever raise an exception. We've learned the hard way that this is the most practical thing because at the level most parsing happens, you really cannot handle any errors. > For ham, the correspondent should be informed that there are > problems with their software, so that they can upgrade or > reconfigure it. That's a practical impossibility in real-world applications, as is simply discarding malformed messages. Email sucks. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSRMjE3EjvBPtnXfVAQKMYAP/VbzETAnCegJavJ4zIB37hbWBWmp4yClY RRzdTXQQY8VxFioxlVwHaxa7AHW/xADsFEkOsm0saWnld4pbu9m00T6KccAOp3eY BbqXUixFRR6DmyiuLk+0F/cBlgnPH8y3XnlTXsEdXS2za5tW6YoyCsfTu9xGl0Qp aC7ta6xcvNk= =NgCu -----END PGP SIGNATURE----- From barry at python.org Thu Nov 6 18:06:50 2008 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Nov 2008 12:06:50 -0500 Subject: [Python-3000] email libraries: use byte or unicode strings?
In-Reply-To: <87tzalsiz6.fsf@uwakimon.sk.tsukuba.ac.jp> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> <87tzalsiz6.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 5, 2008, at 9:09 PM, Stephen J. Turnbull wrote: > There need to be two (and I would say three is better) sets of APIs: > byte-oriented for handling the wire protocol, Unicode-oriented for > handling well-formed messages (both presentation and composition), and > (probably) a "codec" layer which handles nastiness in the transition. > >> for the convenience of being abstracted away from the plethora of >> encodings that are defined at the mail transport layer. > > But handling those is definitely in the domain of the email module. > Any attachments of documents in legacy encodings will need to deal > with them explicitly in composition of Content-Type headers, etc. I think we can simplify this. Almost all of the email-like wire protocol modules handle pure bytes. nntplib, poplib, imaplib, even the http-based libraries iiuc. That's as it should be. Largely the email package should not be concerned with these, because the email package is all about the email-DOM, parsing raw "stuff" into it, manipulating it, and generating raw "stuff" out of it. 
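A rough sketch of that division of labour: a wire-protocol module hands over raw bytes, and the email package turns them into a message tree. The `wire_lines` list below is a hypothetical stand-in for what such a module returns (poplib's `retr()`, for instance, yields the message as a list of bytes lines). Note the assumption that `email.message_from_bytes()` is available; it was added to the standard library in Python 3.2, after this thread, and is not part of the 3.0 package under discussion.

```python
import email

# Hypothetical stand-in for the output of a bytes-oriented protocol module:
# one bytes object per line, line terminators already stripped.
wire_lines = [
    b"From: sender@example.com",
    b"Subject: wire format",
    b"",
    b"Body as it came off the socket.",
]

# Reassemble the raw message; the protocol layer never decodes anything.
raw = b"\r\n".join(wire_lines) + b"\r\n"

# The email package takes over from here: raw "stuff" in, message tree out.
msg = email.message_from_bytes(raw)
subject = msg["Subject"]
body = msg.get_payload()
```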
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSRMkKnEjvBPtnXfVAQJQ+wQAtm2FnphKbFSZFkpMrV9ALCwQZ78x8UpC mFzU3lHZ786Wl6fM72kjVoNl+EdDWxR5ZPcDJ4j7EtMDers7431+MD3vTazaGiJP M+uVxN6XRSSe2bhLeXbjcffHuDuefV2WZJjg50YCrpGY3s6LWcPOkUtf6AENVUFL Wt5hG6nmFxQ= =+LnA -----END PGP SIGNATURE----- From barry at python.org Thu Nov 6 18:14:16 2008 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Nov 2008 12:14:16 -0500 Subject: [Python-3000] email libraries: use byte or unicode strings? In-Reply-To: <4912DE77.5040209@gmail.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <4912DE77.5040209@gmail.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 6, 2008, at 7:09 AM, Nick Coghlan wrote: > So here's a question (speaking as someone that has never had to go > near > the email module, and is unlikely to do so anytime soon): is this > something that should hold up the release of Python 3.0? Not if you're like Guido and want to get 3.0 out this year. ;) > As I see it, there are 3 options: > 1. Hold up 3.0 until you get an API for the email package that handles > Unicode vs bytes issues gracefully > 2. Drop the email package entirely from 3.0, iterate on a 3.0 > version of > it on PyPI for a while, then add the cleaned up version in 3.1 > 3. Keep the current version (issues and all) in 3.0, with fairly > strong > warnings that the API may change in 3.1 At this point I think our only option is essentially 3, keep what we have warts and all. When the precursor to the email package was being developed (at that time, called mimelib), it was initially done as a separate package and only folded into core when it was stable and fairly widely used. For email-ng (or whatever we call it) we should follow the same guidelines. 
Eventually email-ng will be folded back into the core and will replace the current email package. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSRMl6HEjvBPtnXfVAQIw6AP8D1ie5tOyL+2nvemxE8pEHd4HrfudqTDu xMHqi7QyT/EUfEsrK1lH4wqZhE76dbDlie6yGQWL6vrAsUPvo3xEDWCOie6+18D+ TO/G2s7jXtZeMXSXJFpCmVUE+kS2B4b5OJQgdHqQlJL5CyA3PhdeRrGMSyv38WDn bjqASX5hCxI= =bDTT -----END PGP SIGNATURE----- From barry at python.org Thu Nov 6 18:17:31 2008 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Nov 2008 12:17:31 -0500 Subject: [Python-3000] email libraries: use byte or unicode strings? In-Reply-To: <4912E192.80105@gmail.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <4912E192.80105@gmail.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 6, 2008, at 7:22 AM, Nick Coghlan wrote: > Glenn Linderman wrote: >> Even 8-bit binary can be translated into a >> sequence of Unicode codepoints with the same numeric value, for >> example. > > No, no, no, no. Using latin-1 to tunnel binary data through Unicode > just > gets us straight back into the "is it text or bytes?" hell that is the > 8-bit string in 2.x. It defeats the entire point of making the break > between str and bytes in 3.0 in the first place. And I'll note that this is essentially how the email package in 3.0 cheats its way into some modicum of usability. It is teh suck, but it works (defined as "passes the tests" ;). > If something is potentially arbitrary binary data, we need to treat it > that way and use bytes. People are just going to have to get over > their > aesthetic objections to the leading b on their bytes literals. 
Heck, > be > happy you don't have to write bytes(map(ord, 'literal')) as was the > case > in the early stages of 3.0 :) > > Providing a Unicode based text API over the top for the cases where > handling malformed data isn't necessary may be convenient and a good > idea, but it shouldn't be the only API (3.0 is already guilty of > that in > a few places - we shouldn't be adding more). Right, and really it's a deeper issue. We're really only concerned with bytes vs. unicodes in headers. When talking about payloads, we get into a much more rich type hierarchy, with images, audio, byte streams, etc, etc. Message.get_payload(decode=True) doesn't know anything about that stuff, but it could. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSRMmrHEjvBPtnXfVAQLjuQQAmhi6Fz/K4MN+QBDzRgxZmX5WnSpYs2IR ZYei/S/0xxbtZbfvC0IzIeeg4BfR1SVGRYypZGWSwSOxHX08VWNKpR0QBa6oNZsm xjiW02856wK8AHAM2Lt59GHpj4qXbEFvUDjnv7/72WmUJO+yJbRPTCwUGLY5IToZ xFCftr/WWfQ= =/faa -----END PGP SIGNATURE----- From barry at python.org Thu Nov 6 18:23:03 2008 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Nov 2008 12:23:03 -0500 Subject: [Python-3000] email libraries: use byte or unicode strings? 
In-Reply-To: <491288FC.8090805@g.nevcal.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> <87tzalsiz6.fsf@uwakimon.sk.tsukuba.ac.jp> <491288FC.8090805@g.nevcal.com> Message-ID: <38AB5885-C61D-4D8E-A3B8-DEBAA0063BAF@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 6, 2008, at 1:04 AM, Glenn Linderman wrote: > So I would hope that the users of such Betas would quickly discover > that they were producing garbage, report it to M$, and go back to > using a release version with only the usual expectation of bugs, > inconsistencies, standards violations, and security exploits, but > not expect that Beta software is, or should be, fully compatible > with other applications that handle proper email. It's a nice thought, but it's completely impossible for real-world applications to ignore broken messages. "Be lenient in what you accept and strict in what you produce" is the only way you can operate, and the email package has a very strong design goal toward that tenet. > Did Python's 2.x mail library handle the data that you describe? > Did anyone seriously expect it to? Did Mozilla clients handle it? > Can you provide a list of email clients that handled it gracefully, > other than the same Outhouse Excess client that produced it? And if > not, why would you expect Python's 3.0 mail library to handle it? Yes, Python 2.x's email package handles broken messages, and email-ng must too. "Handling it" means: 1) never throw an exception 2) record defects in a usable way for upstream consumers of the message to handle. It currently also means 3) ignore idempotency for defective messages.
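A minimal sketch of points 1 and 2, assuming the parser records problems in the `defects` list on the message object, as the email package does; the sample message is invented for illustration:

```python
import email
from email import errors

# Hypothetical broken input: it claims to be multipart but declares no boundary.
raw = (
    "From: someone@example.com\n"
    "Content-Type: multipart/mixed\n"
    "\n"
    "body text without any boundary markers\n"
)

# Parsing does not raise; the problem is recorded on the message instead.
msg = email.message_from_string(raw)

# Upstream consumers can inspect the recorded defects and decide what to do.
for defect in msg.defects:
    print(type(defect).__name__)

has_boundary_defect = any(
    isinstance(d, errors.NoBoundaryInMultipartDefect) for d in msg.defects
)
```

The caller gets a usable message object either way and can choose whether a given defect matters for its purposes.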
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSRMn93EjvBPtnXfVAQINZQP/QeaDuDI9gRK7VQwpgkSCQ/i07v8Be6EP q8Xijd5NHt34wCxZVCWp+ttAH6FrrbKSUktLvI9CBVUzYPE+T5GhPC7vvVlnp3rF JsO5tJv8qFHjJi1jlwvgxQo1KXJB/kSxNyZiKXGZ9i16RGEoqXTbj+1XVgu8MONI 0EkEpD9bIq8= =a1sq -----END PGP SIGNATURE----- From foom at fuhm.net Thu Nov 6 18:41:09 2008 From: foom at fuhm.net (James Y Knight) Date: Thu, 6 Nov 2008 12:41:09 -0500 Subject: [Python-3000] email libraries: use byte or unicode strings? In-Reply-To: References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <4912DE77.5040209@gmail.com> Message-ID: <7009144E-1A85-462D-8BDB-D29A96238652@fuhm.net> On Nov 6, 2008, at 12:14 PM, Barry Warsaw wrote: >> As I see it, there are 3 options: >> 1. Hold up 3.0 until you get an API for the email package that >> handles >> Unicode vs bytes issues gracefully >> 2. Drop the email package entirely from 3.0, iterate on a 3.0 >> version of >> it on PyPI for a while, then add the cleaned up version in 3.1 >> 3. Keep the current version (issues and all) in 3.0, with fairly >> strong >> warnings that the API may change in 3.1 > > At this point I think our only option is essentially 3, keep what we > have warts and all. When the precursor to the email package was > being developed (at that time, called mimelib), it was initially > done as a separate package and only folded into core when it was > stable and fairly widely used. > > For email-ng (or whatever we call it) we should follow the same > guidelines. Eventually email-ng will be folded back into the core > and will replace the current email package. Is 3.1 in general going to allow API-breaking changes from 3.0? That's fine with me if it is: it does make some sense to allow a "second chance" to get things really right. 
But if that's not the case, wouldn't it make more sense to keep email out of the initial 3.0 release, rather than put a half-broken version in with special "we can totally change the API for the next release" dispensation? James From guido at python.org Thu Nov 6 19:15:43 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 6 Nov 2008 10:15:43 -0800 Subject: [Python-3000] email libraries: use byte or unicode strings? In-Reply-To: <7009144E-1A85-462D-8BDB-D29A96238652@fuhm.net> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <4912DE77.5040209@gmail.com> <7009144E-1A85-462D-8BDB-D29A96238652@fuhm.net> Message-ID: On Thu, Nov 6, 2008 at 9:41 AM, James Y Knight wrote: > Is 3.1 in general going to allow API-breaking changes from 3.0? That's fine > with me if it is: it does make some sense to allow a "second chance" to get > things really right. I don't want to answer this with a blanket yes or no, but it's close to no. In general I've promised people that post 3.0 we'd be striving for backwards compatibility at (at least) the same level as we did in the 2.x range. > But if that's not the case, wouldn't it make more sense to keep email out of > the initial 3.0 release, rather than put a half-broken version in with > special "we can totally change the API for the next release" dispensation? Tough call. I'm inclined to give people *something* in 3.0 with the promise we'll fix it in 3.1, rather than withholding it altogether. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Thu Nov 6 19:47:41 2008 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Nov 2008 13:47:41 -0500 Subject: [Python-3000] email libraries: use byte or unicode strings? 
In-Reply-To: References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <4912DE77.5040209@gmail.com> <7009144E-1A85-462D-8BDB-D29A96238652@fuhm.net> Message-ID: <298E4E1A-6D03-4CF0-B79A-13608404831B@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 6, 2008, at 1:15 PM, Guido van Rossum wrote: >> But if that's not the case, wouldn't it make more sense to keep >> email out of >> the initial 3.0 release, rather than put a half-broken version in >> with >> special "we can totally change the API for the next release" >> dispensation? > > Tough call. I'm inclined to give people *something* in 3.0 with the > promise we'll fix it in 3.1, rather than withholding it altogether. I think that's the right thing to do, because large parts of the API will be the same, and where ever it's possible, we should provide a migration path for the new API (e.g. DeprecationWarnings, etc.). - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSRM7zXEjvBPtnXfVAQKlMQP/YxW+AWdFb83NC9mpL3uBZrNkEygKlcp6 IoyehmucOfCmPGp8dwCkw/BP9qCoKXkFyCnMbIuLOhbyzYfPsPD822voGjeLNb2O bYPMoMSOdlUPJaV4trdGd3RR7KIYAwhXymWW1MxnkyfDZ1mNyRRJyR3SMPJiLZoL /MDfrcchcGQ= =z39Z -----END PGP SIGNATURE----- From v+python at g.nevcal.com Thu Nov 6 20:25:55 2008 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 06 Nov 2008 11:25:55 -0800 Subject: [Python-3000] email libraries: use byte or unicode strings? 
In-Reply-To: <87k5bhrrnx.fsf@uwakimon.sk.tsukuba.ac.jp> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> <87tzalsiz6.fsf@uwakimon.sk.tsukuba.ac.jp> <491288FC.8090805@g.nevcal.com> <87ljvxs3cy.fsf@uwakimon.sk.tsukuba.ac.jp> <4912A2FA.1090403@g.nevcal.com> <87k5bhrrnx.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <491344C3.3080701@g.nevcal.com> On approximately 11/6/2008 3:59 AM, came the following characters from the keyboard of Stephen J. Turnbull: > Glenn Linderman writes: > > > There is no reference to the word emacs or types in any of the messages > > you've posted in this thread, maybe you are referring to another thread > > somewhere? Sorry, I'm new to this party, but I have read the whole > > thread... unless my mail reader has missed part of it. > > I'm sorry, you are right; the relevant message was never sent. Here > it is; I've looked it over briefly and it seems intelligible, but from > your point of view it may seem out of context now. Stuff happens. Apology accepted. The goal here isn't to make points or play one-up, the goal is to figure out if making a more complex interface (having both bytes and Unicode interfaces) is beneficial to life. I'm certain that I don't see all the issues yet; but if the issues can be stated clearly, and the alternative solutions outlined, then I would get educated, which is good for me, but perhaps annoying for you. Progress gets made faster if we stay out of the flame-fanning. I've read the other responses received to date, but choose to compose my response to this message, as it is the most meaty. The others discuss only particular (interesting) details. Summary of issues is at the end. 
Skip directly to the summary before reading the interspersed comments, if you wish. Search for "summarize". Comment on general data handling. It is good to follow the rules, of course, but not everyone does. When they don't, it is not clear a program can cure the problem by itself. 1) If the data is already corrupted by using the wrong encoding, potentially it could be reversed if the proper encoding could be intuited. 1a) If it is returned as bytes, then once the proper encoding is intuited, the data can be decoded properly into Unicode. 1b) If it is returned as Latin-1 decoded Unicode, then once the proper encoding is intuited, the Unicode data can be reencoded as bytes using Latin-1 (this is a fully reversible, no data loss reencoding), and then decoded properly into Unicode. The hard part here is intuiting the proper encoding; 1b is less efficient than 1a, but no less possible. Intuiting the proper encoding is most likely done by human choice (iterating over: try this encoding, does it look better?) 2) If the data is already corrupted by using multiple encodings when only one is claimed, then again it could be reversed if the proper encodings, as well as the boundaries between them, could be intuited. The same parts a) and b) apply as in #1, but extremely complexified by the boundary selections. Again it seems that human choice is required. Select a range of text, and try displaying it in a different encoding to see if it makes more sense. For both 1 & 2, the user interaction is much more time consuming than the 3-stage decoding, encoding, and redecoding process, I would expect. More below. > Glenn Linderman writes: > > > This is where you use the Latin-1 conversion. Don't throw an error > > when in doesn't conform, but don't go to heroic efforts to provide > > bytes alternatives... just convert the bytes to Unicode, and the > > way the mail RFCs are written, and the types of encodings used, it > > is mostly readable. 
And if it isn't encoded, it is even more > > readable. > > This is what XEmacs/Mule does. It's a PITA for everybody (except the > Mule implementers, whose life is dramatically simplified by punting > this way). For one thing, what's readable to a human being may be > death to a subprogram that expects valid MIME. GNU Emacs is even > worse; it does provide both a bytes-like type and a unicode-like type, > but then it turns around and provides a way to "cast" unicodes to > bytes and vice-versa, thus exposing implementation in an unclean (and > often buggy) way. > > > And so how much is it a problem? What are the effects of the problem? > > In Emacs, the problem is that strings that are punted get concatenated > with strings that are properly decoded, and when reencoding is > attempted, you get garbage or a coding error. Uh-huh. Garbage (wrongly decoded, then re-encoded), I would expect. Coding errors, I would not, since Latin-1 codepoints are certainly reencodable to Unicode (creating legal-looking garbage out of originally illegal garbage). Can you give me an example of a coding error, or is this just FUD? > Since Mule discarded > the type (punt vs. decode) information, the app loses. This is precisely the problem that was faced for "fake unicode file handling" that was the topic of a thread a few weeks ago. While the Latin-1 transform (or UTF-8b, or others mentioned there), can provide a round-trip decode/encode, it is only useful and usable if the knowledge that the transform was performed is retained. The choice there was to have a binary interface, and build a Unicode interface on top of it that can't see the binaries that do not conform to UTF-8. The problem there is that existing programs expect to be able to manipulate file names as text, but existing operating systems provide bytes interfaces. > There's no way to recover. Not automatically. Point 2) above addresses this.
It would require human intelligence to attempt to recover, and even the human would find it extremely painstaking to assist in the recovery process. > The apps most at risk are things like MUAs (which Emacs > does well) and web browsers (which it doesn't), and even AUCTeX (a > mode for handling LaTeX documents---TeX is not Unicode-aware so its > error messages are frequently truncated in the middle of a UTF-8 > character) and they go to great lengths to keep track of what is valid > and what is not in the app. They don't always succeed. I think Emacs > should be doing this for them, somehow (and I'm an XEmacs implementer, > not an MUA implementer!) So your belief that Emacs should be doing this for them somehow is nice; perhaps it should. However, it doesn't sound like you have a solution for emacs... How should it keep track? How is it helpful? If TeX is not Unicode aware, what is it doing dealing with UTF-8 data? Or is it dealing with Latin-1 transformed UTF-8 garbage? > The situation in Python will be strongly analogous, I believe. And so are you proposing that a binary interface to the data, rather than a Unicode interface to the Latin-1 transformed data, will be more usable by the Python solution that might be able to be similar to the Emacs solution, that hasn't been figured out yet? Once the boundaries and encoding have been lost by the original buggy MUA that has injected the data into the email message, only human intelligence has a chance of recreating the original message in all cases, and even then it may take more than one human to achieve it. There may be cases where heuristics can be applied, when human intelligence figures out the type of bugs in the original MUA, and can recognize patterns that allow it to rediscover the boundaries. This is unlikely to work in all cases, but could perhaps work in some cases.
Even in the cases where it can work with some measurable success, I claim that the heuristics could be coded based on the Latin-1 transformed Unicode equally effectively as based on the bytes. > > I'm not suggesting making it worse than what it already is, in > > bytes form; just to translate the bytes to Unicode codepoints so > > that they can be returned on a Unicode interface. > > Which *does* make it worse, unless you enforce a type difference so > that punted strings can't be mixed with decoded strings without > effort. That type difference may as well be bytes vs. Unicode as some > subclass of Unicode vs. Unicode. 138 is still 138 whether it is a byte or a Unicode codepoint. Yes, concatenating stuff that is transformed with stuff that is properly decoded would be stupid. Enforcing a type difference is purely an application thing, though. Each piece of data retrieved would have a consistent decoding provided... either the proper decoding as specified in the message, or the Latin-1 or current code page decode if no encoding is specified. Either is reversible if the application doesn't like the results, and wants to try a different encoding. The APIs could have optional parameters and results that specify the encoding to use, or the encoding that was used, to decode the results. If the app wishes to keep that separate, and convert it to a different type to help it stay separate that is the app's privilege. If the app wishes to concatenate with other data, that is the app's choice (and having the interface define a bunch of different types for different decodings wouldn't really help the ignorant app, which would simply convert the different types back to strings and then concatenate, or the smart app, which could do its own type encapsulations if it thinks that would help). > "Why would you mix strings?" Well, for one example there are multiple > address headers which get collected into an addressee list for purpose > of constructing a reply. 
If one of the headers is broken and another > is not, you get mixed mode. Sure. Now you have mixed mode. Try to send the reply message... if the email address part is OK, then it gets sent, with a gibberish name. If the email address part is not OK, that destination bounces. Now what? Seriously, what else could be done? You could try a bunch of different encodings to attempt to resolve the broken email address or name... requires human intelligence to decide which is correct... when the bounce message comes, the human will get involved. If the bounce message doesn't come, then all is well (problem only affected the name part, not the email address part). > The same thing can happen for > multilingual message bodies: they get split into a multipart with > different charsets for different parts, and if one is broken but > another is not, you get mixed mode. First, if the multilingual message bodies are known to be multilingual when they are encoded, and are in different multiparts, what are the chances that an application that knows to correctly keep the multilingual parts separate is dumb enough to encode one correctly and one incorrectly? Is this a real scenario? What software/version does this? If it is a real scenario, it still requires human intelligence to resolve... to choose different encodings, and decide which one "looks right". Since it is in separate parts, the boundaries are not lost, so this is case 1 above. If the boundaries are lost, the human can direct the program to go back to the original message, which still has its boundaries, and start over from there, with different encodings, if the app wants to be smart enough to provide such features. You might write such an app just for fun; I might or might not, depending on if someone pays me, or I have other incentive. Given boundaries, it is case 1) above. If the boundaries are lost, it is case 2). How is it easier if the bytes are preserved, vs translated via Latin-1 to a Unicode string?
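To make cases 1a) and 1b) concrete, here is a small sketch. The sample text and the choice of KOI8-R as the undeclared encoding are assumptions for illustration only; the point is just that both routes recover the same text once the encoding has been guessed:

```python
# All 256 byte values survive a Latin-1 decode/encode round trip unchanged.
all_bytes = bytes(range(256))
assert all_bytes.decode("latin-1").encode("latin-1") == all_bytes

# Hypothetical text whose real encoding (KOI8-R here) was never declared.
original = "Привет"
wire = original.encode("koi8-r")

# Case 1a: the library hands back bytes; decode once the encoding is guessed.
via_bytes = wire.decode("koi8-r")

# Case 1b: the library hands back Latin-1 transliterated text. It looks like
# gibberish, but no information is lost, so the transform can be undone first.
punted = wire.decode("latin-1")
via_text = punted.encode("latin-1").decode("koi8-r")
```

Route 1b costs one extra encode step, and it only works if the application still knows that `punted` went through the Latin-1 transform rather than a real decode.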
> > So they'll use the Unicode API for text, and the bytes APIs for binary > > attachments, because that is what is natural. > > Well, as I see it there won't be bytes APIs for text. The APIs will > return Unicode text if they succeed, and raise an error if not. If > the error is caught, the offending object will be available as bytes. Sure; I'd proposed a way to get a whole messages as bytes for archiving, logging, message store, etc. I'd proposed a way to get a particular MIME part as bytes for binary parts. You seem to be proposing a way to get text MIME parts as binary if they fail to decode. I have no particular problem with the API providing that ability. I have a specific question here: what encodings, when the attempt is made to decode to Unicode, will ever fail? For 8-bit encodings, the answer is none. You may get gibberish, but not a failure, because every 8-bit encoding has every byte value used, and Unicode contains all those characters. So you've mentioned Asian encodings, and certainly these could fail to convert to Unicode if the decoder finds inappropriate sequences. I don't know enough about all the multi-byte encodings to know if all of them can fail, or if applying a particular decoding might produce gibberish, but not fail. The ones I know about use a particular range of characters to represent "first byte" of a pair, but what I don't know is whether any byte can follow the first byte, or if only certain bytes can follow the first byte. I do that for some multi-byte encodings, the first byte can be followed by second bytes in the ASCII range; I don't know if it is illegal to be followed by another byte in the "first byte" range. Certainly there could be 2-byte pairs that don't have an associated character, although I don't know that that exists for any particular encoding. Can you cite a particular multi-byte encoding that has byte sequences that are illegal, and can be used to detect failure? 
Or can failure only be detected by the human determining that it is gibberish? > > If improperly encoded messages are received, and appropriate > > transliterations are made so that the bytes get converted (default code > > page) or passed through (Latin-1 transformation), then the data may be > > somewhat garbled for characters in the non-ASCII subset. But that is > > not different than the handling done by any 8-bit email client, nor, I > > suspect (a little uncertainty here) different than the handling done by > > Python < 3.0 mail libraries. > > Which is exactly how we got to this point. Experience with GNU > Mailman and other such applications indicates that the implementation > in the existing Python email module needs work, and Barry Warsaw and > others who have tried to work on it say that it's not that easy, and > that the API may need to change to accommodate needed changes in the > implementation. So let me try to summarize. I could have reached some inappropriate issues or conclusions. I'm willing to be corrected. But I'd much prefer to be corrected by specific cases that can be detected and corrected via a bytes interface that cannot be detected and corrected by using a bytes-transliterated-to-Unicode interface, complete with specific encodings that are used, properly or improperly, to arrive at the case, and specific APIs that must be changed to achieve the goal. A) An attempt to decode text to Unicode may fail. A1) doesn't apply to 8-bit encodings. A2) doesn't apply to some multi-byte encodings. A3) applies to UTF-8. A4) may apply to some other multi-byte encodings. B) User sees gibberish because of decoding problems. What can be done? Can the app provide features to help? Do any of the features depend on API features? Let's assume that the app wants to help, and provides features. User must also get involved, because the app/API can't tell the difference between gibberish and valid text.
B1) User can see a map of the components of the email, and their encodings, and whether they were provided by the email message, or were the default for the app. User chooses a different decoding for a component, and the app reprocesses that component. API requirement: a way for the user/app to specify an override to the decoding for a component. B2) User chooses binary for a particular component. App reprocesses the component, and asks what file to store the binary in. API requirement: a way for the user/app to specify an override to the decoding for a component. I've now looked briefly at the email module APIs. They seem quite flexible to me. I don't know what happens under the covers. It seems that the API is already set up flexibly enough to handle both bytes and Unicode!!! Perhaps it is just the implementation that should be adjusted. (I'm not saying that might not be too big a job for 3.0, I haven't read the code.) It seems that get_/set_payload might want to be able to return/accept either string or bytes, depending on the other parameters involved. Let's talk again about creation of messages first. If a string is supplied, it is Unicode. The encoding parameter describes what encoding should be applied to convert the message to wire-protocol bytes. The data should be saved as Unicode until the request is made to convert it to wire protocol, so that set_charset can be called a few dozen times if desired (not clear why that would be done, though) to change the encoding. Perhaps it is appropriate to verify that the encoding can happen without using the substitution character, or perhaps that should be the user's responsibility. This choice should be documented. If bytes are supplied, an encoding must also be supplied. The data should be saved in this encoding until the request is made to convert it to wire-protocol. This encoding should be used if possible, otherwise converted to an encoding that is acceptable to the wire protocol. 
Perhaps it is appropriate to verify that the translation, if necessary, can happen without using the substitution character, or perhaps that should be the user's responsibility. This choice should be documented. It seems that charset None implies ASCII, for historical reasons; perhaps that can be overloaded to alternately mean binary, as the handling would be roughly the same, but perhaps a new 'binary' charset should be created to make it clear that charset changes don't make sense, and to reject attempts to convert binary data to character data. For an incoming message, the wire-protocol format should be used as the primary data store. Cached pointers and lengths to various MIME parts and subparts (individual headers, body, preamble, epilogue) would be appropriate. get_ operations would find the data, and interpret it according to the current (defaults to message content, overridden by set_ operations) charset and encoding. Requesting a Unicode charset would imply decoding the part to Unicode from the current charset and would return a string; requesting other character sets would imply converting from the message charset to the specified charset and returning bytes; requesting binary (or possibly 'None', see above) would return the wire-protocol bytes unchanged. Then the application could do what it wants to attempt to decode that data to text using other encodings (i.e. not starting the conversion from the encoding declared explicitly or implicitly in the message part). The as_string() method becomes a misnomer in Python 3.0; since it is Python 3.0, that can be changed, no? It should become as_wire_protocol, and would default to returning bytes of binary data, which is what the wire-protocol APIs need. A variety that returns the bytes as Unicode codepoints could be implemented, for the purpose of "View source" type operations on the wire-protocol form... but that would and should only be a direct Latin-1 transliteration to Unicode. 
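The get_ semantics I describe for incoming messages could be sketched roughly as below. `get_part` and its parameters are hypothetical names made up for this illustration, not anything in the email package, and the sample bytes are invented:

```python
def get_part(wire, declared_charset, requested="unicode"):
    """Hypothetical accessor; `declared_charset` can be overridden by the caller."""
    if requested == "binary":
        return wire                       # wire-protocol bytes, unchanged
    text = wire.decode(declared_charset)  # interpret using the current charset
    if requested == "unicode":
        return text                       # a string
    return text.encode(requested)         # bytes in the requested charset


# Invented sample part, declared as ISO-8859-1 in the message.
wire = "na\u00efve caf\u00e9".encode("iso-8859-1")

as_text = get_part(wire, "iso-8859-1")            # str
as_utf8 = get_part(wire, "iso-8859-1", "utf-8")   # bytes, converted
as_raw = get_part(wire, "iso-8859-1", "binary")   # bytes, untouched

# The "View source" variant: a direct Latin-1 transliteration of the raw bytes.
view_source = as_raw.decode("latin-1")
```

Because the wire bytes stay the primary store, a wrong guess costs nothing: the app just calls again with a different `declared_charset`.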
Now that I've looked at the API, I don't see why it should be changed significantly for Python 3.0. I have no clue how much of the guts would have to be changed to achieve the equivalent of what I described above. I do believe that what I outlined above would use the present API to achieve both the "I want Unicode only" philosophy that you ascribe to me, and the "I want to do bit-flipping" (whatever that means) philosophy that you claimed for yourself. Setting headers via the msg['Subject'] syntax to Unicode values is no problem. Just make sure that they get converted to ASCII encoded properly at the end. msg['Subject'] and msg[b'Subject'] could be made equivalent, but I'd never use the latter, it has an annoying b character to distract from the meaning. The syntax should permit the use of Unicode, in other words, but: * to encode non-ASCII data with full control over what parts get encoded and how, the Header API is still appropriate * as an alternative, the API could be extended to include a default header encoding * Strings supplied via the msg['Subject'] = 'some string' interface, are handled as follows: if 'some string' is in the ASCII subset, no problem. If not, and if the default header encoding has not been set, then an exception is raised. Otherwise, the default header encoding is used to encode the Unicode string as necessary. So I see the API as quite robust, although its current implementation may not be as described above, and I can't scope the effort to achieve the above. I'd like to see a "headers_as_wire_protocol" API added for generating bounce messages. It is easy enough to extract from as_wire_protocol, but common enough to be useful, methinks, and avoids allocating space for a huge message just to get its headers. What specific problems are perceived, that the present API can't handle? Are there areas in which it behaves differently than I outline above? If so, is my outline an improvement, or confusing, and why? Are there other issues? 
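The header rule I outlined above might look something like this. `encode_subject` and `default_header_charset` are hypothetical names for the proposed behaviour; only `Header` and `decode_header` come from the existing email.header module:

```python
from email.header import Header, decode_header


def encode_subject(value, default_header_charset=None):
    """Hypothetical helper illustrating the proposed rule for str headers."""
    try:
        value.encode("ascii")
        return value                      # ASCII subset: no problem
    except UnicodeEncodeError:
        pass
    if default_header_charset is None:
        # No default header encoding has been set: refuse rather than guess.
        raise ValueError("non-ASCII header and no default header encoding set")
    # Otherwise use the default header encoding to produce an encoded word.
    return Header(value, default_header_charset).encode()


plain = encode_subject("hello")
encoded = encode_subject("Grüße", "utf-8")

# Reading it back: decode_header yields (bytes, charset) for an encoded word.
raw, charset = decode_header(encoded)[0]
decoded = raw.decode(charset)
```

Anyone needing control over which parts get encoded and how would still go through the Header API directly.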
Barry said: > Yes, Python 2.x's email package handles broken messages, and email-ng > must too. "Handling it" means: > > 1) never throw an exception > 2) record defects in a usable way for upstream consumers of the message > to handle > > it currently also means > > 3) ignore idempotency for defective messages. I'm not sure what "ignore idempotency" means in this context... If the above outline is perceived as a useful set of semantics for the 3.0 email library, I might be able to find a little time (don't tell my wife) to help work on them, assuming that they are mostly implemented in Python and/or C. But I'd need a bit of hand-holding to get started, since I haven't yet figured out how to compile my own Python. -- Glenn -- http://nevcal.com/ =========================== A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking From v+python at g.nevcal.com Thu Nov 6 20:41:23 2008 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 06 Nov 2008 11:41:23 -0800 Subject: [Python-3000] email libraries: use byte or unicode strings? In-Reply-To: <638112CE-0DA3-48C3-9EDF-05ADC11A1579@python.org> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> <638112CE-0DA3-48C3-9EDF-05ADC11A1579@python.org> Message-ID: <49134863.6010800@g.nevcal.com> sorry, this one scrolled off the top, and I didn't read it before sending my other reply. On approximately 11/6/2008 9:02 AM, came the following characters from the keyboard of Barry Warsaw: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Nov 5, 2008, at 6:39 PM, Glenn Linderman wrote: > >> This is an interesting perspective... 
"stuff em" does come to mind :) >> >> But I'm not at all clear on what you mean by a round-trip through the >> email module. Let me see... if you are creating an email, you (1) >> should encode it properly (2) a round-trip is mostly meaningless, >> unless you send it to yourself. So you probably mean email that is >> received, and that you want to send on. In this case, there is >> already a composed/encoded form of the email in hand; it could simply >> be sent as is without decoding or re-encoding. That would be quite a >> clean round-trip! > > There are two ways to create an email DOM. One is out of whole cloth > (i.e. creating Message objects and their subclasses, then attaching them > into a tree). Note that it is a "generator" whose job it is to take the > DOM and produce an RFC-compliant flat textural representation. I grok this one; but think that for the generator, keeping things in Unicode until the last minute could be useful. Maybe not as useful as converting immediately to bytes, though, to reduce the amount of duplicated code. > The other way to get a DOM is to parse some flat textual > representation. In this case, it is a core design requirement that the > parser never throws an exception, and that there is a way to record and > retrieve the defects in a message. Sure, this makes sense. My other message suggested keeping the message flat, and using cached pointers and lengths. Of course, editing with such a technique could be a problem, because the pointers would have to be updated. A MIME-mimicking tree of flat subchunks comes to mind... > The core model objects of Message (and their MIME subclasses) and Header > should treat everything internally as bytes. The edges are where you > want to be able to accept varying types, but always convert to bytes > internally. Edges of this system include the parser, the generator, and > various setter and getter methods of Message and Header. 
> > The current code has a strong desire to be idempotent, so that > parser->DOM->generator output is exactly the same as input. Small > changes to the DOM or content in between should have minimal effect. > For example, if you delete a header and then add it back, the header > will show up at the end of the RFC 2822 header list, but everything else > about the message will be unchanged. Ah, this is your definition of idempotent! Which is what I expected, but wasn't sure. This is reasonable. One _could_ even convince the header to show up in the original spot, if you keep a NULL header placeholder around for deleted headers.... that would vanish only when regenerating. > Currently idempotency is broken for defective messages. The generator > is guaranteed to produce RFC-compliant output, repairing defects like > missing boundaries and such. So it seems you are happy with this level of "fixing" things? >> I guess I'm not terribly concerned about the readability of improperly >> encoded email messages, whether they are spam or ham. For the >> purposes of SpamBayes (which I assume is similar to spamassassin, only >> written in Python), it doesn't matter if the data is readable, only >> that it is recognizably similar. So a consistent mis-transliteration >> is as good a a correct decoding. > > The key thing is that parse should never ever raise an exception. We've > learned the hard way that this is the most practical thing because at > the level most parsing happens, you really cannot handle any errors. So you don't have a goal to make mangled, multi-character encodings suddenly be readable via the email lib? Only to provide the data in raw form, so that Mr. Turnbull can implement that on top, in emacs? >> For ham, the correspondent should be informed that there are problems >> with their software, so that they can upgrade or reconfigure it. > > That's a practical impossibility in real-world applications, as is > simply discarding malformed messages. Email sucks. 
I agree it is impossible to do that automatically. But if a correspondent suddenly gets broken software, I attempt to inform them of that... and as long as their email address comes through, I can... And I don't think I've ever proposed discarding malformed messages; just transliterating them in some way that (drum roll) doesn't cause exceptions... Sorry I wrote a bit before looking at the API, which is more robust than I expected, from Mr. Turnbull's writings. I am curious what the list of API deficiencies that have been determined are... is there a list somewhere? My summary tried to be a start on that, or an augmentation. Seems I tried to get to bug# last night, but the 'net wasn't responsive. Can't find the number now, in a quick look through the messages in this thread. -- Glenn -- http://nevcal.com/ =========================== A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking From barry at python.org Fri Nov 7 04:53:35 2008 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Nov 2008 22:53:35 -0500 Subject: [Python-3000] RELEASED Python 3.0rc2 Message-ID: <60B04061-65DA-4F44-8396-02F4FF0B4B47@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On behalf of the Python development team and the Python community, I am happy to announce the second release candidate for Python 3.0. This is a release candidate, so while it is not suitable for production environments, we strongly encourage you to download and test this release on your software. We expect only critical bugs to be fixed between now and the final release, currently planned for 03- Dec-2008. 
If you find things broken or incorrect, please submit bug reports at http://bugs.python.org For more information and downloadable distributions, see the Python 3.0 website: http://www.python.org/download/releases/3.0/ See PEP 361 for release schedule details: http://www.python.org/dev/peps/pep-0361/ Enjoy, - -Barry Barry Warsaw barry at python.org Python 2.6/3.0 Release Manager (on behalf of the entire python-dev team) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSRO7wHEjvBPtnXfVAQIYrQP+Lynpa/p7VMY/YxJyjxiBI0bvOATPIKbE jqu9ZFwXlO19+G4bFiAoYviY5UFYPm3TpbMoso2qNoJsJFLSt4d+AycWWcaXKd08 vpifsxVoWvdLCLZtT7ioMBJh/juu+Pchf2o2l+PHm5aWlLvq/24uu8YKbpSKKbr9 K4gB4ecYd3A= =3UPl -----END PGP SIGNATURE----- From stephen at xemacs.org Fri Nov 7 05:58:04 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 07 Nov 2008 13:58:04 +0900 Subject: [Python-3000] email libraries: use byte or unicode strings? In-Reply-To: <491344C3.3080701@g.nevcal.com> References: <200810281612.54570.victor.stinner@haypocalc.com> <20081029091259.7153ec82@resist.wooz.org> <20081030221726.0A0636007DF@longblack.object-craft.com.au> <2B7A0223-2FFE-416F-8AE1-7082CA2453AB@python.org> <491221F3.4040304@g.nevcal.com> <20081105225947.50E885AC03F@longblack.object-craft.com.au> <49122ECB.5010205@g.nevcal.com> <87tzalsiz6.fsf@uwakimon.sk.tsukuba.ac.jp> <491288FC.8090805@g.nevcal.com> <87ljvxs3cy.fsf@uwakimon.sk.tsukuba.ac.jp> <4912A2FA.1090403@g.nevcal.com> <87k5bhrrnx.fsf@uwakimon.sk.tsukuba.ac.jp> <491344C3.3080701@g.nevcal.com> Message-ID: <87bpwsrv37.fsf@uwakimon.sk.tsukuba.ac.jp> I think we should move the discussion of the pragmatics of the email module to the email-sig list (as Barry is already doing). But this is probably my last post in this discussion until Nov 14 or so, I'm not sure I'll be connected while I'm in Shanghai. 
Due to travel prep, I don't have time to go into detail but two comments: > 1b) If it is returned as Latin-1 decoded Unicode, then once the proper > encoding is intuited, the Unicode data can be reencoded as bytes > using Latin-1 (this is a fully reversible, no data loss > reencoding), and then decoded properly into Unicode. This is true, as written. But it's an answer to the wrong question. What to *do with* broken data is the app's decision, it's the app's responsibility. IMO the *email* module's responsibility is to inform the app that it couldn't decode in a conforming way, and provide the raw data in case the app thinks it can do better. Barry says that it's desirable that the parser *not* raise exceptions. In that case, returning bytes where unicodes are expected is a way to accomplish all the desiderata. > I've now looked briefly at the email module APIs. They seem quite > flexible to me. I don't know what happens under the covers. It seems > that the API is already set up flexibly enough to handle both bytes and > Unicode!!! Perhaps it is just the implementation that should be > adjusted. Well, yes, that's what everybody is hoping for. I agree with Barry's assessment that at most minor, backward compatible changes should do the trick, so that including email in Python 3.0 is OK IMO. However, Barry has already said that he has looked at trying to fix some of the known issues, and he's not sure it can be done without an API break. 
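The reversibility claim in the quoted point 1b can be checked with a short sketch. The sample text and the choice of UTF-8 as the "real" charset are assumptions made purely for illustration; nothing here is taken from the email package itself.

```python
# Sketch only: the sample bytes and the "real" charset (UTF-8) are assumptions.
raw = "Grüße, 世界".encode("utf-8")  # bytes as received, charset not yet known

# Step 1: decode as Latin-1. Each byte 0x00-0xFF maps to exactly one code
# point, so this cannot raise, but the resulting text is mojibake.
placeholder = raw.decode("latin-1")

# Step 2: once the real charset has been guessed, undo step 1 to get the
# original bytes back, then decode them properly.
recovered_bytes = placeholder.encode("latin-1")
text = recovered_bytes.decode("utf-8")

# The round trip is lossless for every possible byte value.
all_bytes = bytes(range(256))
round_trip_ok = all_bytes.decode("latin-1").encode("latin-1") == all_bytes
```

This only shows that no data is lost; whether the library should hand back such placeholder text or raw bytes is the design question being argued in the thread.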
From victor.stinner at haypocalc.com Fri Nov 7 10:53:38 2008 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 7 Nov 2008 10:53:38 +0100 Subject: [Python-3000] [Python-Dev] RELEASED Python 3.0rc2 In-Reply-To: <60B04061-65DA-4F44-8396-02F4FF0B4B47@python.org> References: <60B04061-65DA-4F44-8396-02F4FF0B4B47@python.org> Message-ID: <200811071053.38908.victor.stinner@haypocalc.com> Hi, Great job Barry and all contributors who fixed the "last" bugs ;-) On Friday 07 November 2008 04:53:35, Barry Warsaw wrote: > We expect only critical bugs to be fixed between now and the > final release, currently planned for 03-Dec-2008. The document "What's new in Python 3.0" should be updated: http://docs.python.org/dev/3.0/whatsnew/3.0.html "PEP 352: Exceptions must derive from BaseException. This is the root of the exception hierarchy." I prefer to derive from Exception to be able to use "except Exception as err: ..." which catches neither SystemExit nor KeyboardInterrupt. "PEP 3134: Exception chaining. (The __context__ feature from the PEP hasn't been implemented yet in 3.0a2.)" The feature is now implemented! "PEP 237: long renamed to int. (...) sys.maxint was also removed since the int type has no maximum value anymore." What about the new sys.maxsize constant? Oh, it's written at the bottom, "Removed sys.maxint. Use sys.maxsize." Both paragraphs should be merged. "Optimizations (...) 33% slower (...) we expect to be optimizing string and integer operations significantly before the final 3.0 release!" I don't expect "significant" changes before the final release. I worked on some patches about the int type (use base 2^30 instead of 2^15, GMP, etc.), but all patches optimize large integers (more than 1 digit, or more than 20 digits) whereas most integers in Python are very small. About str=>unicode, I also don't expect changes. One character is now 4 bytes (or 2 bytes), whereas Python2 used 1 byte. This introduces an overhead.
Most functions at C level use a conversion from byte strings (ASCII) to Unicode (e.g. PyErr_SetString). We should use wide strings directly (e.g. PyErr_SetWideString?). "Porting to Python 3.0" This section is very small and gives little information. There is nothing about 2to3 (just two references in the document). I also read somewhere that someone wrote a document explaining how to port a C extension to Python3. What about a link to the "What's new in Python 2.6" document? Most people are still using Python 2.4 or 2.5. And Python3 is Python 2.5 + + . -- Victor Stinner aka haypo http://www.haypocalc.com/blog/ From barry at python.org Fri Nov 7 14:31:59 2008 From: barry at python.org (Barry Warsaw) Date: Fri, 7 Nov 2008 08:31:59 -0500 Subject: [Python-3000] [Python-Dev] RELEASED Python 3.0rc2 In-Reply-To: <200811071053.38908.victor.stinner@haypocalc.com> References: <60B04061-65DA-4F44-8396-02F4FF0B4B47@python.org> <200811071053.38908.victor.stinner@haypocalc.com> Message-ID: <0A787651-013A-4635-BC8A-EC8C52B53654@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 7, 2008, at 4:53 AM, Victor Stinner wrote: > Hi, > > Great job Barry and all contributors who fixed the "last" bugs ;-) Thanks! > The document "What's new in Python 3.0" in should be updated: > http://docs.python.org/dev/3.0/whatsnew/3.0.html Issue 2306, assigned to Guido.
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSRRDT3EjvBPtnXfVAQJUsAP/S9rFA0V4HDp80oxycy0coWDaW2HAnHA4 ombn/+HjWtS2zTIbCkqdFfFsZ05DRDQN7LNKZVkV1sRsmzJ9fASITQP6mpUiFy/f Aq6I0Z73jHlZtINuDRI5ZaCQCrxHPM/lTdjEP3h0fGxtW0wEibr1/rep3gAJMomL 2V6y2mCOc8I= =H0+/ -----END PGP SIGNATURE----- From musiccomposition at gmail.com Fri Nov 7 18:39:48 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Fri, 7 Nov 2008 11:39:48 -0600 Subject: [Python-3000] [Python-Dev] RELEASED Python 3.0rc2 In-Reply-To: <200811071053.38908.victor.stinner@haypocalc.com> References: <60B04061-65DA-4F44-8396-02F4FF0B4B47@python.org> <200811071053.38908.victor.stinner@haypocalc.com> Message-ID: <1afaf6160811070939m3170bcd5j8613a26d71ed190a@mail.gmail.com> On Fri, Nov 7, 2008 at 3:53 AM, Victor Stinner wrote: > Hi, > > Great job Barry and all contributors who fixed the "last" bugs ;-) Which reminds me that this release's star developer award goes to Victor for his hard work on fixing up the networking libraries for Py3k! -- Cheers, Benjamin Peterson "There's nothing quite as beautiful as an oboe... except a chicken stuck in a vacuum cleaner." From steve at holdenweb.com Fri Nov 7 18:42:11 2008 From: steve at holdenweb.com (Steve Holden) Date: Fri, 07 Nov 2008 12:42:11 -0500 Subject: [Python-3000] RELEASED Python 3.0rc2 In-Reply-To: <1afaf6160811070939m3170bcd5j8613a26d71ed190a@mail.gmail.com> References: <60B04061-65DA-4F44-8396-02F4FF0B4B47@python.org> <200811071053.38908.victor.stinner@haypocalc.com> <1afaf6160811070939m3170bcd5j8613a26d71ed190a@mail.gmail.com> Message-ID: <49147DF3.8010709@holdenweb.com> Benjamin Peterson wrote: > On Fri, Nov 7, 2008 at 3:53 AM, Victor Stinner > wrote: >> Hi, >> >> Great job Barry and all contributors who fixed the "last" bugs ;-) > > Which reminds me that this release's star developer award goes to > Victor for his hard work on fixing up the networking libraries for > Py3k! > Yay, Victor!!!! 
regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ From barry at python.org Fri Nov 7 21:24:42 2008 From: barry at python.org (Barry Warsaw) Date: Fri, 7 Nov 2008 15:24:42 -0500 Subject: [Python-3000] [Python-Dev] RELEASED Python 3.0rc2 In-Reply-To: <1afaf6160811070939m3170bcd5j8613a26d71ed190a@mail.gmail.com> References: <60B04061-65DA-4F44-8396-02F4FF0B4B47@python.org> <200811071053.38908.victor.stinner@haypocalc.com> <1afaf6160811070939m3170bcd5j8613a26d71ed190a@mail.gmail.com> Message-ID: <2655FBFB-B7BD-4567-8FF2-573626E7BD55@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 7, 2008, at 12:39 PM, Benjamin Peterson wrote: > On Fri, Nov 7, 2008 at 3:53 AM, Victor Stinner > wrote: >> Hi, >> >> Great job Barry and all contributors who fixed the "last" bugs ;-) > > Which reminds me that this release's star developer award goes to > Victor for his hard work on fixing up the networking libraries for > Py3k! Indeed, great work Victor! 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSRSkCnEjvBPtnXfVAQK7MAP/ayV9U241Xi1+s02LNphTwSNhJFDW9mbP UhN1PVINDkS6kqs8GYQ6iO5KsTi20eQxjuTsITLZzFsNuGUKXRcIzkzibWZBMR7m 3WjHC5heGwaxYmaPYmcHUFipdW8T0vYGwiNmk/TWinHx11KSCEhHHeP/Mcr/xd+9 j+VCcl+45pI= =S+vx -----END PGP SIGNATURE----- From ondrej at certik.cz Mon Nov 10 23:32:55 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Mon, 10 Nov 2008 23:32:55 +0100 Subject: [Python-3000] cyclic imports worked in py2.6, fail in 3.0 Message-ID: <85b5c3130811101432q3326635am3d7b546d5f375f27@mail.gmail.com> Hi, I am trying to get SymPy working on python 3.0, to see if there are possible bugs. I must say that I am impressed by the 2to3 tool, that is now able to translate the whole sympy! Many thanks to Benjamin for fixing several bugs very quickly in it. However, I am having some troubles with cyclic imports.
Currently the way we handle them in sympy is that we have this code at the end of each module (only in sympy core): # /cyclic/ import basic as _ _.Add = Add del _ import mul as _ _.Add = Add del _ import power as _ _.Add = Add del _ What it does is inject the classes defined in this particular module into the other modules. First problem: the 2to3 tool doesn't convert this correctly, so I had to manually convert it to: from . import basic as _ _.Add = Add del _ from . import mul as _ _.Add = Add del _ from . import power as _ _.Add = Add del _ The second problem is that it still fails to import in python3.0: $ python3.0 q.py Traceback (most recent call last): File "q.py", line 1, in import sympy File "/home/ondra/repos/sympy/sympy/__init__.py", line 16, in from sympy.core import * File "/home/ondra/repos/sympy/sympy/core/__init__.py", line 6, in from .numbers import Number, Real, Rational, Integer, igcd, ilcm, RealNumber, \ File "/home/ondra/repos/sympy/sympy/core/numbers.py", line 13, in from .power import integer_nthroot File "/home/ondra/repos/sympy/sympy/core/power.py", line 631, in from . import mul as _ File "/home/ondra/repos/sympy/sympy/core/mul.py", line 657, in from . import add as _ File "/home/ondra/repos/sympy/sympy/core/add.py", line 384, in from . import mul as _ ImportError: cannot import name mul However, this works in python2.4, 2.5 and 2.6. Notice that "from . import mul as _" worked in power.py, but failed in add.py 3 lines below. This is weird, isn't it? So my questions are: * is our "hack" supported at all? If not, how would you suggest we handle cyclic imports? Basically, we want the Add and Mul classes to be defined in separate modules, but the methods of both classes need access to the other --- so the only other option that I can see is to locally import the other module in each method, which is slow and not so clean. Another option is to import the other class into the module at runtime using some dynamic features of Python.
* if it is supposed to work, is this a bug in python3.0? Thanks, Ondrej From musiccomposition at gmail.com Tue Nov 11 04:10:57 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Mon, 10 Nov 2008 21:10:57 -0600 Subject: [Python-3000] cyclic imports worked in py2.6, fail in 3.0 In-Reply-To: <85b5c3130811101432q3326635am3d7b546d5f375f27@mail.gmail.com> References: <85b5c3130811101432q3326635am3d7b546d5f375f27@mail.gmail.com> Message-ID: <1afaf6160811101910p7c1f43e3h7c206f438782c98e@mail.gmail.com> On Mon, Nov 10, 2008 at 4:32 PM, Ondrej Certik wrote: > Hi, > > I am trying to get SymPy working on python 3.0, to see if there are > possible bugs. I must say that I am impressed by the 2to3 tool, that > is now able to translate the whole sympy! > Many thanks to Benjamin for fixing several bugs very quickly in it. Thanks for reporting them! > > However, I am having some troubles with cyclic imports. Currently the > way we handle them in sympy is that we have this code at the end of > each module (only in sympy core): > > # /cyclic/ > import basic as _ > _.Add = Add > del _ > > import mul as _ > _.Add = Add > del _ > > import power as _ > _.Add = Add > del _ > > > What it does is that it injects the classes defined in this particular > module to the other modules. First problem: 2to3 tool doesn't convert > this correctly, I had to manually conver this > to: Would you file a bug report for the 2to3 problem, please? That should be fixed. > > from . import basic as _ > _.Add = Add > del _ > > from . import mul as _ > _.Add = Add > del _ > > from . 
import power as _ > _.Add = Add > del _ > > and second problem is that it still fails to import in python3.0: > > $ python3.0 q.py > Traceback (most recent call last): > File "q.py", line 1, in > import sympy > File "/home/ondra/repos/sympy/sympy/__init__.py", line 16, in > from sympy.core import * > File "/home/ondra/repos/sympy/sympy/core/__init__.py", line 6, in > from .numbers import Number, Real, Rational, Integer, igcd, ilcm, > RealNumber, \ > File "/home/ondra/repos/sympy/sympy/core/numbers.py", line 13, in > from .power import integer_nthroot > File "/home/ondra/repos/sympy/sympy/core/power.py", line 631, in > from . import mul as _ > File "/home/ondra/repos/sympy/sympy/core/mul.py", line 657, in > from . import add as _ > File "/home/ondra/repos/sympy/sympy/core/add.py", line 384, in > from . import mul as _ > ImportError: cannot import name mul > > > However, this works in python2.4, 2.5 and 2.6. Notice, that "from . > import mul as _" worked in power.py, but failed in add.py 3 lines > below. This is weird, isn't it? Actually, if you use the relative imports with 2.6, it fails like 3.0. 3.0 is just being stricter. > > So my questions are: > > * is our "hack" supported at all? If not, how would you suggest us to > handle cyclic imports? Basically, we want Add and Mul classes to be > defined in separate modules, however the methods of both classes need > access to the other --- so the only other option that I can see is to > locally import the other module in each method, which is slow and not > so clean. Another option is to import the other class to the module at > runtime using some dynamic features of Python. First, I suggest instead of using sibling imports in your packages, you should convert to all relative imports or all absolute imports. (ie. from sympy.core import something) Instead of inserting Mul into the namespace of different modules, you do something like: from .mul import Mul at the bottom of files that use the cyclic import. 
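The bottom-of-file import suggested above can be sketched roughly as follows. The package name demo_pkg and the method bodies are hypothetical stand-ins rather than SymPy's actual code, and the sketch assumes a current Python 3; it builds a throwaway two-module package on disk only so that the example is self-contained.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical module sources; "demo_pkg", Add and Mul are illustrative names.
ADD_SRC = textwrap.dedent('''
    class Add:
        def times(self, other):
            return Mul(self, other)

    # Deferred to the bottom: Add already exists in this module by now,
    # so mul.py can import it even though add.py has not finished loading.
    from .mul import Mul
''')

MUL_SRC = textwrap.dedent('''
    class Mul:
        def __init__(self, *args):
            self.args = args

        def plus(self, other):
            return Add()

    from .add import Add
''')


def build_package(root):
    """Write the two mutually dependent modules into root/demo_pkg."""
    pkg_dir = os.path.join(root, "demo_pkg")
    os.mkdir(pkg_dir)
    for name, src in (("__init__.py", ""), ("add.py", ADD_SRC), ("mul.py", MUL_SRC)):
        with open(os.path.join(pkg_dir, name), "w") as fh:
            fh.write(src)


tmp_root = tempfile.mkdtemp()
build_package(tmp_root)
sys.path.insert(0, tmp_root)
importlib.invalidate_caches()

add = importlib.import_module("demo_pkg.add")
mul = importlib.import_module("demo_pkg.mul")
product = add.Add().times(add.Add())
```

The cycle resolves because each module defines its class before it triggers the import of its sibling, so the partially initialised module already carries the name the other side asks for.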
> > * if it is supposed to work, is this a bug in python3.0? No, Python 3.0 is just being stricter. :) You may want to test this out by using "from __future__ import absolute_import". -- Cheers, Benjamin Peterson "There's nothing quite as beautiful as an oboe... except a chicken stuck in a vacuum cleaner." From ondrej at certik.cz Tue Nov 11 13:15:00 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Tue, 11 Nov 2008 13:15:00 +0100 Subject: [Python-3000] cyclic imports worked in py2.6, fail in 3.0 In-Reply-To: <1afaf6160811101910p7c1f43e3h7c206f438782c98e@mail.gmail.com> References: <85b5c3130811101432q3326635am3d7b546d5f375f27@mail.gmail.com> <1afaf6160811101910p7c1f43e3h7c206f438782c98e@mail.gmail.com> Message-ID: <85b5c3130811110415i459fcffbsd338e810f19e27a8@mail.gmail.com> >> What it does is that it injects the classes defined in this particular >> module to the other modules. First problem: 2to3 tool doesn't convert >> this correctly, I had to manually conver this >> to: > > Would you file a bug report for the 2to3 problem, please? That should be fixed. I will. There are a lot more import conversions problem like this one. >> However, this works in python2.4, 2.5 and 2.6. Notice, that "from . >> import mul as _" worked in power.py, but failed in add.py 3 lines >> below. This is weird, isn't it? > > Actually, if you use the relative imports with 2.6, it fails like 3.0. > 3.0 is just being stricter. > >> >> So my questions are: >> >> * is our "hack" supported at all? If not, how would you suggest us to >> handle cyclic imports? Basically, we want Add and Mul classes to be >> defined in separate modules, however the methods of both classes need >> access to the other --- so the only other option that I can see is to >> locally import the other module in each method, which is slow and not >> so clean. Another option is to import the other class to the module at >> runtime using some dynamic features of Python. 
> > First, I suggest instead of using sibling imports in your packages, > you should convert to all relative imports or all absolute imports. > (ie. from sympy.core import something) The problem is that we still need to support python2.4, so the only option seems to be absolute imports. I think from the major distributions, only Debian and Gentoo still use 2.4 in their stable versions, but everyone now uses 2.5 in their unstable versions, so I guess we need to support 2.4 for at least one more year or two. > > Instead of inserting Mul into the namespace of different modules, you > do something like: > > from .mul import Mul > > at the bottom of files that use the cyclic import. Ok, I'll try to fix it this way. > >> >> * if it is supposed to work, is this a bug in python3.0? > > No, Python 3.0 is just being stricter. :) You may want to test this > out by using "from __future__ import absolute_import". Thanks for the info. So I'll first convert the imports, then run it through 2to3 again and report all problems again. I noticed, that no imports from sympy/__init__.py and all the other __init__py files were converted and they fail with python3.0. I must admit, that because I was still using python2.4, I must first learn how it works (e.g. naively adding the dot like "from .functions import *" sometimes fail) and then report back when I understand it more. Ondrej From mal at egenix.com Tue Nov 11 14:06:37 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 11 Nov 2008 14:06:37 +0100 Subject: [Python-3000] None in Comparisons Message-ID: <4919835D.5000605@egenix.com> Why was the special case for None being "smaller" than all other objects in Python removed from Python 3.0 ? (see object.c in Py2.x) There's currently a discussion on c.l.p regarding this issue (see below). It looks like a bug in Python 3.0 to me, since None is widely used as "n/a" object in Python. Should I file a bug report for this ? 
-------- Original Message -------- Subject: Re: Python 3.0 - is this true? Date: Tue, 11 Nov 2008 14:02:59 +0100 From: M.-A. Lemburg Organization: eGenix.com Software GmbH To: Steven D'Aprano CC: python-list at python.org References: <64fee417-96d0-458a-8f5c-c71147a2c3bb at w1g2000prk.googlegroups.com> <7edee5cc-a98e-4a72-880a-7e20339f9697 at i20g2000prf.googlegroups.com> On 2008-11-11 02:10, Steven D'Aprano wrote: > On Mon, 10 Nov 2008 12:51:51 +0000, Duncan Grisby wrote: > >> I have an object database written in Python. It, like Python, is >> dynamically typed. It heavily relies on being able to sort lists where >> some of the members are None. To some extent, it also sorts lists of >> other mixed types. It will be very hard to migrate this aspect of it to >> Python 3. > > No, it is "very hard" to sort *arbitrary* objects consistently. If it > appears to work in Python 2.x that's because you've been lucky to never > need to sort objects that cause it to break. If you read Duncan's email, he isn't talking about arbitrary objects at all. He's just referring to being able to sort lists that contain None elements. That's far from arbitrary and does work consistently in Python 2.x - simply because None is a singleton which is special cased in Python: None compares smaller to any other object in Python. I'm not sure why this special case was dropped in Python 3.0. None is generally used to be a place holder for a n/a-value and as such will pop up in lists on a regular basis. I think the special case for None should be readded to Python 3.0. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 11 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From solipsis at pitrou.net Tue Nov 11 14:28:54 2008 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 11 Nov 2008 13:28:54 +0000 (UTC) Subject: [Python-3000] None in Comparisons References: <4919835D.5000605@egenix.com> Message-ID: M.-A. Lemburg egenix.com> writes: > > Why was the special case for None being "smaller" than all other > objects in Python removed from Python 3.0 ? (see object.c in Py2.x) Because ordered comparisons (<, <=, >, >=) are much stricter in 3.0 than in 2.x. In practice, ordered comparisons which don't have an obvious, intuitive meaning now raise a TypeError (such as comparing a number and a string). > It looks like a bug in Python 3.0 to me, since None is widely used as > "n/a" object in Python. But why should "n/a" (or "missing", or "undefined") imply "smaller than everything else"?
I understand it might be a case of "practicality beats purity", but this is not semantically obvious and can also let bugs slip through (the very bugs that the stricter ordered comparison semantics in 3.0 are meant to make easier to detect). Also there are cases where you'll want something which is *bigger* than everything else, not smaller. (SQL seems to do such a thing with NULL, but SQL isn't exactly a good example for programming language design, is it?) If it is really useful, I think it would be cleaner and more explicit to add the Smallest and Largest constants suggested elsewhere than to reuse a very widely used constant (None) for half of the purpose. cheers Antoine. From mal at egenix.com Tue Nov 11 14:54:48 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 11 Nov 2008 14:54:48 +0100 Subject: [Python-3000] None in Comparisons In-Reply-To: References: <4919835D.5000605@egenix.com> Message-ID: <49198EA8.6040400@egenix.com> On 2008-11-11 14:28, Antoine Pitrou wrote: > M.-A. Lemburg egenix.com> writes: >> Why was the special case for None being "smaller" than all other >> objects in Python removed from Python 3.0 ? (see object.c in Py2.x) > > Because ordered comparisons (<, <=, >, >=) are much stricter in 3.0 than in 2.x. > In practice, ordered comparisons which don't have an obvious, intuitive meaning > now raise a TypeError (such as comparing a number and a string). That's fine. I'm just talking about the special case for None that has existed in Python for years - and for a good reason. >> It looks like a bug in Python 3.0 to me, since None is widely used as >> "n/a" object in Python. > > But why should "n/a" (or "missing", or "undefined") imply "smaller than > everything else"? It's just a convention based on viewing None as "nothing" or the empty set.
> I understand it might be a case of "practicality beats purity", but this is not > semantically obvious and can also let bugs slip through (the very bugs that the > stricter ordered comparison semantics in 3.0 are meant to make easier to > detect). Please note that I'm just talking about that one object, not all the other cases where comparisons between apples and oranges don't make sense :-) > Also there are cases where you'll want something which is *bigger* than > everything else, not smaller. Having None compare smaller than all other objects is just a convention, nothing more. If there's a need for an object that compares larger than any other object in Python, we could introduce another singleton for this, but I don't really see the need. > (SQL seems to do such a thing with NULL, but SQL isn't exactly a good example > for programming language design, is it?) NULLs are a fact in life, not only in SQL, but also in numerics and statistics. You often don't want a complex calculation or query to fail just because a few input values are not available. None has been used in Python for the same purpose in these application areas. > If it is really useful, I think i would be cleaner and more explicit to add the > Smallest and Largest constants suggested elsewhere, than reuse a very widely > used constant (None) for half of the purpose. Fair enough. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 11 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From barry at python.org Tue Nov 11 15:07:55 2008 From: barry at python.org (Barry Warsaw) Date: Tue, 11 Nov 2008 09:07:55 -0500 Subject: [Python-3000] None in Comparisons In-Reply-To: <49198EA8.6040400@egenix.com> References: <4919835D.5000605@egenix.com> <49198EA8.6040400@egenix.com> Message-ID: <2A0602DC-2857-4873-8382-CCE158C44FA5@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 11, 2008, at 8:54 AM, M.-A. Lemburg wrote: > On 2008-11-11 14:28, Antoine Pitrou wrote: >> M.-A. Lemburg egenix.com> writes: >>> Why was the special case for None being "smaller" than all other >>> objects in Python removed from Python 3.0 ? (see object.c in Py2.x) >> >> Because ordered comparisons (<, <=, >, >=) are much stricter in 3.0 >> than in 2.x. >> In practice, ordered comparisons which don't have an obvious, >> intuitive meaning >> now raise a TypeError (such as comparing a number and a string). > > That's fine. I'm just talking about the special case for None that > has existed in Python for years - and for a good reason. How hard is it to implement your own "missing" object which has the desired semantics? Why should something as fundamental as None have it? - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSRmRu3EjvBPtnXfVAQKfMgQAkV5Gm6xqRlJsPZs2SEy6kisLp//V1vlT vGeDbF2jL+N717o2cIU21PQVWONarCStR+u98SO5EUkAAqUcKZ8gJ5x+RN376djv fg0YsqFrgtTyFaUpfOdxc648xkbL29TK+ClX0twSG5vf+5vFs4uc4SCj7bUd6+hc xyLqUXrybwM= =YnKH -----END PGP SIGNATURE----- From mal at egenix.com Tue Nov 11 15:21:03 2008 From: mal at egenix.com (M.-A. 
Lemburg)
Date: Tue, 11 Nov 2008 15:21:03 +0100
Subject: [Python-3000] None in Comparisons
In-Reply-To: <2A0602DC-2857-4873-8382-CCE158C44FA5@python.org>
References: <4919835D.5000605@egenix.com> <49198EA8.6040400@egenix.com> <2A0602DC-2857-4873-8382-CCE158C44FA5@python.org>
Message-ID: <491994CF.9070509@egenix.com>

On 2008-11-11 15:07, Barry Warsaw wrote:
> On Nov 11, 2008, at 8:54 AM, M.-A. Lemburg wrote:
>
>> On 2008-11-11 14:28, Antoine Pitrou wrote:
>>> M.-A. Lemburg <mal at egenix.com> writes:
>>>> Why was the special case for None being "smaller" than all other
>>>> objects in Python removed from Python 3.0 ? (see object.c in Py2.x)
>>>
>>> Because ordered comparisons (<, <=, >, >=) are much stricter in 3.0
>>> than in 2.x.
>>> In practice, ordered comparisons which don't have an obvious,
>>> intuitive meaning now raise a TypeError (such as comparing a number
>>> and a string).
>
>> That's fine. I'm just talking about the special case for None that
>> has existed in Python for years - and for a good reason.
>
> How hard is it to implement your own "missing" object which has the
> desired semantics? Why should something as fundamental as None have it?

Because None is already special, has had this feature for a very long
time and there's no apparent reason to move the feature to some other
special object.

Also, having each module or project invent its own NULL-like object
will not make things better for anyone, it would only introduce new
problems.

I don't see any benefit from the removal of the special property of
None in Python 3.0.

--
Marc-Andre Lemburg
eGenix.com
From solipsis at pitrou.net Tue Nov 11 15:27:57 2008
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 11 Nov 2008 14:27:57 +0000 (UTC)
Subject: [Python-3000] None in Comparisons
References: <4919835D.5000605@egenix.com> <49198EA8.6040400@egenix.com>
Message-ID:

M.-A. Lemburg <mal at egenix.com> writes:
>
> NULLs are a fact of life, not only in SQL, but also in numerics and
> statistics. You often don't want a complex calculation or query to
> fail just because a few input values are not available.

But it only works in the case where you only do comparisons, and where
defaulting to "smaller than everything" is the right behaviour. Would
you want "None + 1" to return either 1 or None, rather than raising
TypeError?

Also, it is trivial to write a list comprehension or generator
expression that filters all None values before doing the complex
calculation or query.

Antoine.

From victor.stinner at haypocalc.com Tue Nov 11 15:38:43 2008
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 11 Nov 2008 15:38:43 +0100
Subject: [Python-3000] None in Comparisons
In-Reply-To: <491994CF.9070509@egenix.com>
References: <4919835D.5000605@egenix.com> <2A0602DC-2857-4873-8382-CCE158C44FA5@python.org> <491994CF.9070509@egenix.com>
Message-ID: <200811111538.43256.victor.stinner@haypocalc.com>

On Tuesday 11 November 2008 15:21:03, M.-A. Lemburg wrote:
> Because None is already special, has had this feature for a very
> long time (...)

Yeah, Python3 breaks compatibility by removing old dummy behaviour like
comparison between bytes and characters, or between an integer and
None ;-) I like the new behaviour, it helps to detect bugs earlier!
I hope that the -bb option will be enabled by default in Python 2.7 :-)

You can use an explicit comparison to None as a workaround for your
problem:

   (x is None) or (x < y)

--
Victor Stinner aka haypo
http://www.haypocalc.com/blog/

From mal at egenix.com Tue Nov 11 15:44:55 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 11 Nov 2008 15:44:55 +0100
Subject: [Python-3000] None in Comparisons
In-Reply-To:
References: <4919835D.5000605@egenix.com>
Message-ID: <49199A67.1050400@egenix.com>

On 2008-11-11 15:22, Daniel Stutzbach wrote:
> On Tue, Nov 11, 2008 at 7:06 AM, M.-A. Lemburg wrote:
>
>> Why was the special case for None being "smaller" than all other
>> objects in Python removed from Python 3.0 ? (see object.c in Py2.x)
>>
>
> It wasn't true in Python 2.5, either. Observe:
>
> Cashew:~/pokersleuth/tracker$ python2.5
> Python 2.5 (r25:51908, Feb 26 2007, 08:19:26)
> [GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)] on cygwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import datetime
> >>> now = datetime.datetime.now()
> >>> now < None
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: can't compare datetime.datetime to NoneType

You're right, it's a bit unfortunate that the special casing is only
applied as a fallback mechanism in Python 2.x, and that it is not
always triggered when dealing with objects implementing rich
comparisons.

However, it does work for all the basic built-in types in Python,
such as strings, numbers, lists, tuples, dicts, etc.

--
Marc-Andre Lemburg
eGenix.com
From guido at python.org Tue Nov 11 18:09:11 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 11 Nov 2008 09:09:11 -0800
Subject: [Python-3000] None in Comparisons
In-Reply-To: <49199A67.1050400@egenix.com>
References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com>
Message-ID:

We're not going to add the "feature" back that None compares smaller
than everything. It's a slippery slope that ends with all operations
involving None returning None -- I've seen a proposal made in all
earnestness requesting that None+42 == None, None() == None, and so
on. This Nonesense was wisely rejected; a whole slew of
early-error-catching would have gone out of the window. It's the same
with making None smaller than everything else. For numbers, you can
already use -inf; for other types, you'll have to invent your own
Smallest if you need it.

In short, I'll have None of it.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tjreedy at udel.edu Tue Nov 11 18:58:37 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 11 Nov 2008 12:58:37 -0500
Subject: [Python-3000] None in Comparisons
In-Reply-To: <4919835D.5000605@egenix.com>
References: <4919835D.5000605@egenix.com>
Message-ID:

M.-A. Lemburg wrote:
> Why was the special case for None being "smaller" than all other
> objects in Python removed from Python 3.0 ? (see object.c in Py2.x)

For one thing, it is only smallest when it controls the comparison.

>>> class c(object):
...   def __lt__(s,o): return True
...
>>> cc = c()
>>> cc < None
True

Of course,

>>> None < cc
True

Which means that both [cc, None] and [None, cc] are sorted (assuming
the sort only uses < comparison).
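
The point about which operand "controls" the comparison can be sketched for Python 3 as well. This is only an illustration, not code from the thread: the class names `OnlyLt` and `LtAndGt` and the helper `try_less` are hypothetical, and the `__gt__` method is an added assumption so that the reflected comparison has something to call once None no longer orders itself.

```python
class OnlyLt:
    """Hypothetical class defining only __lt__, like `c` in the session above."""

    def __lt__(self, other):
        return True


class LtAndGt(OnlyLt):
    """Hypothetical variant that also answers the reflected comparison."""

    def __gt__(self, other):
        # Python 3 calls this for `None < obj`, because None itself
        # returns NotImplemented for ordering comparisons.
        return True


def try_less(a, b):
    """Return the result of `a < b`, or the string 'TypeError' if unorderable."""
    try:
        return a < b
    except TypeError:
        return "TypeError"


only_lt, both = OnlyLt(), LtAndGt()
print(try_less(only_lt, None))  # answered by the object's __lt__
print(try_less(None, only_lt))  # nothing answers, so Python 3 raises
print(try_less(both, None))     # answered by the object's __lt__
print(try_less(None, both))     # answered by the object's reflected __gt__
```

In every case the answer comes from the user-defined class rather than from None, which is the sense in which None cannot guarantee that it is the smallest object.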
Terry Jan Reedy

From tim.peters at gmail.com Tue Nov 11 19:20:26 2008
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 11 Nov 2008 13:20:26 -0500
Subject: [Python-3000] None in Comparisons
In-Reply-To: <49198EA8.6040400@egenix.com>
References: <4919835D.5000605@egenix.com> <49198EA8.6040400@egenix.com>
Message-ID: <1f7befae0811111020k74161888ie51ddc8967e5cd81@mail.gmail.com>

[M.-A. Lemburg]
> ...
> That's fine. I'm just talking about the special case for None that
> has existed in Python for years - and for a good reason.

That's overstating it a bit ;-) In Python 1.5.1, comparisons were
changed so that objects of numeric types compared smaller than objects
of non-numeric types, and then 0 < None was true, not None < 0 (which
became true substantially later). The reason for that change is
explained in Misc/HISTORY (it was an attempt to preserve transitivity
across chains of mixed-type comparisons).

Later, during the move to rich comparisons, I was hacking the code in
the same room with Guido, and realized something special had to be
done with None.

"Hey, Guido, what should we do about mixed-type comparisons against None?"

"Hmm ... what do you think?"

"Hmm ... OK, let's make None smaller than other types."

"Why?"

"Oh, why not?"

"Good enough -- but let's not document it -- it's an arbitrary
implementation detail."

"Of course!"

In any case, we thought this was so arbitrary that we didn't hesitate
to break that, up until that time, "0 < None" /had/ been true "for
years - and for a good reason" ;-)

not-all-good-reasons-are-particularly-good-ly y'rs - tim

From mal at egenix.com Tue Nov 11 21:28:59 2008
From: mal at egenix.com (M.-A.
Lemburg)
Date: Tue, 11 Nov 2008 21:28:59 +0100
Subject: [Python-3000] None in Comparisons
In-Reply-To:
References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com>
Message-ID: <4919EB0B.2090004@egenix.com>

On 2008-11-11 18:09, Guido van Rossum wrote:
> We're not going to add the "feature" back that None compares smaller
> than everything. It's a slippery slope that ends with all operations
> involving None returning None -- I've seen a proposal made in all
> earnestness requesting that None+42 == None, None() == None, and so
> on. This Nonesense was wisely rejected; a whole slew of
> early-error-catching would have gone out of the window.

I was suggesting None of that.

> It's the same
> with making None smaller than everything else. For numbers, you can
> already use -inf; for other types, you'll have to invent your own
> Smallest if you need it.

No, that doesn't work: -inf is a perfectly valid number, None isn't.
Same for strings: '' would be a valid string that compares smaller
than all others, None isn't.

Furthermore, if you get a None value from some database or data set,
you don't want to replace that special n/a value with a valid number
or string - since you'd lose round-trip safety.

In some cases you don't even know whether the item was supposed to be
a number or e.g. a string, so there is no obvious choice for a
replacement.

Looks like data processing algorithms written for Python3 will have to
start using key functions throughout or end up requiring lots of
if-elif-elses to deal gracefully with the different combinations of
comparison failures.

Oh well, another surprise to add to the Python 3k list of surprises.
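
The key-function approach mentioned above can be sketched roughly as follows. This is an illustration under Python 3 semantics only; the helper name `none_first` and the sample data are hypothetical and do not come from the thread. The None entries stay in the data as None (so nothing is lost on a round trip); only the sort key changes.

```python
def none_first(value):
    """Sort key that orders None ahead of every other value.

    None maps to the 1-tuple (0,), real values map to (1, value).
    The leading 0/1 decides every None-versus-value comparison, so
    None itself is never compared with a number or a string.
    """
    return (0,) if value is None else (1, value)


readings = [3.5, None, 1.0, None, 2.25]
print(sorted(readings, key=none_first))
# [None, None, 1.0, 2.25, 3.5]

names = ["b", None, "a"]
print(sorted(names, key=none_first))
# [None, 'a', 'b']
```

A mirrored key, with (1,) for None and (0, value) otherwise, would place the missing entries last instead; which convention is right depends on the application.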
And here's another one:

>>> None < None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: NoneType() < NoneType()
>>> None is None
True
>>> None == None
True
>>> None > None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: NoneType() > NoneType()
>>> None != None
False

Two values that compare equal to each other (and are in fact identical),
yet cannot be compared less-than or greater-than.

This would make sense if you think of None as meaning "anything
and/or nothing", since the left side None could stand for
a different None than the right one, but then you could apply the
same logic to inf:

>>> inf = float('inf')
>>> inf < inf
False
>>> inf is inf
True
>>> inf == inf
True
>>> inf > inf
False
>>> inf != inf
False

In this case you don't get any errors.

Note all of this has to be seen from a user perspective, not
from a CPython implementor's perspective.

--
Marc-Andre Lemburg
eGenix.com

From greg.ewing at canterbury.ac.nz Tue Nov 11 23:27:51 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Nov 2008 11:27:51 +1300
Subject: [Python-3000] None in Comparisons
In-Reply-To: <4919EB0B.2090004@egenix.com>
References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com> <4919EB0B.2090004@egenix.com>
Message-ID: <491A06E7.5090608@canterbury.ac.nz>

M.-A. Lemburg wrote:
> And here's another one:
> ...
> Two values that compare equal to each other (and are in fact identical),
> yet cannot be compared less-than or greater-than.

That's not particularly surprising -- complex numbers
have been like that for a long time. The only surprise,
if any, is that more types are becoming that way.

--
Greg

From solipsis at pitrou.net Wed Nov 12 00:09:14 2008
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 11 Nov 2008 23:09:14 +0000 (UTC)
Subject: [Python-3000] None in Comparisons
References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com> <4919EB0B.2090004@egenix.com>
Message-ID:

M.-A. Lemburg <mal at egenix.com> writes:
>
> >>> None > None
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: unorderable types: NoneType() > NoneType()
> >>> None != None
> False
>
> Two values that compare equal to each other (and are in fact identical),
> yet cannot be compared less-than or greater-than.

The error message is clear: "unorderable types". Having some types
support an equivalence relation (e.g. "equality") but no intuitive
total order relation is hardly a surprise. As someone said, complex
numbers are an example of that (not only in Python, but in real life).

> This would make sense if you think of None as meaning "anything
> and/or nothing", since the left side None could stand for
> a different None than the right one, but then you could apply the
> same logic to inf:

inf is a float instance, and as such supports ordering. I don't see
how it invalidates None *not* supporting an order relation, since None
isn't a float instance and doesn't pretend to be usable as a number
(or as anything supporting ordering, for that matter).
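
The distinction drawn above, between types that support equality only and types that also support ordering, can be checked with a short sketch. It assumes Python 3 semantics; the helper name `supports_ordering` is hypothetical and only there for illustration.

```python
def supports_ordering(a, b):
    """Return True if `a < b` is defined, False if it raises TypeError."""
    try:
        a < b
    except TypeError:
        return False
    return True


z = complex(1, 2)
inf = float("inf")

# Equality works for all three kinds of value...
print(z == complex(1, 2), None == None, inf == inf)

# ...but only the float supports an ordering comparison.
print(supports_ordering(z, z))        # complex numbers: unorderable
print(supports_ordering(None, None))  # None: unorderable in Python 3
print(supports_ordering(inf, inf))    # float: orderable, and inf < inf is False
```

Under these assumptions None behaves like a complex number (equality without order) rather than like float('inf').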
From greg.ewing at canterbury.ac.nz Tue Nov 11 23:00:45 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Nov 2008 11:00:45 +1300
Subject: [Python-3000] None in Comparisons
In-Reply-To: <49198EA8.6040400@egenix.com>
References: <4919835D.5000605@egenix.com> <49198EA8.6040400@egenix.com>
Message-ID: <491A008D.3000609@canterbury.ac.nz>

M.-A. Lemburg wrote:
> On 2008-11-11 14:28, Antoine Pitrou wrote:
>
>>But why should "n/a" (or "missing", or "undefined") imply "smaller than
>>everything else"?
>
> It's just a convention based on viewing None as "nothing" or the
> empty set.

It would be possible to implement this convention in the
sort method, without making it a feature of comparisons
in general.

SQL does something similar -- while nulls sort before
everything else, the result of null < something is
null, not true.

--
Greg

From mal at egenix.com Wed Nov 12 11:01:52 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 12 Nov 2008 11:01:52 +0100
Subject: [Python-3000] None in Comparisons
In-Reply-To: <1f7befae0811111020k74161888ie51ddc8967e5cd81@mail.gmail.com>
References: <4919835D.5000605@egenix.com> <49198EA8.6040400@egenix.com> <1f7befae0811111020k74161888ie51ddc8967e5cd81@mail.gmail.com>
Message-ID: <491AA990.1090601@egenix.com>

On 2008-11-11 19:20, Tim Peters wrote:
> [M.-A. Lemburg]
>> ...
>> That's fine. I'm just talking about the special case for None that
>> has existed in Python for years - and for a good reason.
>
> That's overstating it a bit ;-) In Python 1.5.1, comparisons were
> changed so that objects of numeric types compared smaller than objects
> of non-numeric types, and then 0 < None was true, not None < 0 (which
> became true substantially later). The reason for that change is
> explained in Misc/HISTORY (it was an attempt to preserve transitivity
> across chains of mixed-type comparisons).
>
> Later, during the move to rich comparisons, I was hacking the code in
> the same room with Guido, and realized something special had to be
> done with None.
>
> "Hey, Guido, what should we do about mixed-type comparisons against None?"
>
> "Hmm ... what do you think?"
>
> "Hmm ... OK, let's make None smaller than other types."
>
> "Why?"
>
> "Oh, why not?"
>
> "Good enough -- but let's not document it -- it's an arbitrary
> implementation detail."
>
> "Of course!"
>
> In any case, we thought this was so arbitrary that we didn't hesitate
> to break that, up until that time, "0 < None" /had/ been true "for
> years - and for a good reason" ;-)
>
> not-all-good-reasons-are-particularly-good-ly y'rs - tim

Thanks for that bit of history :-)

With "good reason" I meant special casing None with respect to putting
it in a fixed place somewhere into the ordering scheme. The important
aspect is getting it in there, not the exact position it takes.

None could also compare larger than any other object, or smaller than
all objects with type names starting with a 'P' and larger than all
objects with type names starting with an 'O'.

You just need to get it in there somewhere in order to have comparisons
with None not fail with an exception.

--
Marc-Andre Lemburg
eGenix.com

From mal at egenix.com Wed Nov 12 11:44:11 2008
From: mal at egenix.com (M.-A.
Lemburg)
Date: Wed, 12 Nov 2008 11:44:11 +0100
Subject: [Python-3000] None in Comparisons
In-Reply-To:
References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com> <4919EB0B.2090004@egenix.com>
Message-ID: <491AB37B.8010408@egenix.com>

On 2008-11-12 00:09, Antoine Pitrou wrote:
> M.-A. Lemburg <mal at egenix.com> writes:
>> >>> None > None
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> TypeError: unorderable types: NoneType() > NoneType()
>> >>> None != None
>> False
>>
>> Two values that compare equal to each other (and are in fact identical),
>> yet cannot be compared less-than or greater-than.
>
> The error message is clear: "unorderable types". Having some types support an
> equivalence relation (e.g. "equality") but no intuitive total order relation is
> hardly a surprise. As someone said, complex numbers are an example of that (not
> only in Python, but in real life).

The difference is that None is a singleton, so the set of all
None type instances is {None}. You always have an intuitive total order
relation on one element sets: the identity relation.

>> This would make sense if you think of None as meaning "anything
>> and/or nothing", since the left side None could stand for
>> a different None than the right one, but then you could apply the
>> same logic to inf:
>
> inf is a float instance, and as such supports ordering. I don't see how it
> invalidates None *not* supporting an order relation, since None isn't a float
> instance and doesn't pretend to be usable as a number (or as anything supporting
> ordering, for that matter).

Right, but you're taking the view of a CPython developer. You
need to view this as a Python user.

In real (math) life, inf is a different type of number than regular
floats, ints or complex numbers and has a special meaning depending on
the context in which you use it. The relationship is much like that of
None to all other Python objects.
--
Marc-Andre Lemburg
eGenix.com

From greg.ewing at canterbury.ac.nz Wed Nov 12 12:07:09 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Nov 2008 00:07:09 +1300
Subject: [Python-3000] None in Comparisons
In-Reply-To: <491AB37B.8010408@egenix.com>
References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com> <4919EB0B.2090004@egenix.com> <491AB37B.8010408@egenix.com>
Message-ID: <491AB8DD.3010908@canterbury.ac.nz>

M.-A. Lemburg wrote:

> The difference is that None is a singleton, so the set of all
> None type instances is {None}. You always have an intuitive total order
> relation on one element sets: the identity relation.

I don't see this having much practical consequence, though,
since sorting members of a 1-element set isn't a very
useful thing to do.

--
Greg

From victor.stinner at haypocalc.com Wed Nov 12 14:36:45 2008
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 12 Nov 2008 14:36:45 +0100
Subject: [Python-3000] Status of the email package ? (or: email package and unicode)
Message-ID: <200811121436.45734.victor.stinner@haypocalc.com>

Hi,

poplib, imaplib and nntplib are fixed in Python 3.0rc2, cool.

I tested the smtplib module. It looks like .sendmail() requires
an ASCII message (7 bits).

I tried to use the email package to encode my message. But the problem
is that I'm unable to use characters that are not in the ASCII charset!
See the reported bugs at:
   http://bugs.python.org/issue4306

Before the Python 3.0 final, we have to test the email package with
unicode characters! I wrote two small patches, one includes a little
test :-)

--
Victor Stinner aka haypo
http://www.haypocalc.com/blog/

From barry at python.org Wed Nov 12 15:04:34 2008
From: barry at python.org (Barry Warsaw)
Date: Wed, 12 Nov 2008 09:04:34 -0500
Subject: [Python-3000] Status of the email package ? (or: email package and unicode)
In-Reply-To: <200811121436.45734.victor.stinner@haypocalc.com>
References: <200811121436.45734.victor.stinner@haypocalc.com>
Message-ID: <18DA96C8-2E89-4EEF-A125-C14808036550@python.org>

On Nov 12, 2008, at 8:36 AM, Victor Stinner wrote:

> poplib, imaplib and nntplib are fixed in Python 3.0rc2, cool.
>
> I tested the smtplib module. It looks like .sendmail() requires
> an ASCII message (7 bits).
>
> I tried to use the email package to encode my message. But the
> problem is that I'm unable to use characters that are not in the
> ASCII charset! See the reported bugs at:
> http://bugs.python.org/issue4306
>
> Before the Python 3.0 final, we have to test the email package with
> unicode characters! I wrote two small patches, one includes a little
> test :-)

Please make this a release blocker and I will look at it this weekend.

-Barry

From solipsis at pitrou.net Wed Nov 12 16:10:29 2008
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 12 Nov 2008 15:10:29 +0000 (UTC)
Subject: [Python-3000] None in Comparisons: None vs.
float("inf")
References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com> <4919EB0B.2090004@egenix.com> <491AB37B.8010408@egenix.com>
Message-ID:

M.-A. Lemburg <mal at egenix.com> writes:
>
> The difference is that None is a singleton, so the set of all
> None type instances is {None}. You always have an intuitive total order
> relation on one element sets: the identity relation.

But it's not what you are asking for. You are asking for None to
support ordered comparison with objects of other types, which is
completely different.

Having None be comparable with itself for ordered comparisons is
certainly possible, but it's also completely useless if None doesn't
compare with other types, which is what we are talking about.

> Right, but you're taking the view of a CPython developer. You
> need to view this as Python user.

No, I'm taking the view of a user. inf is usable as a float object and
as such supports many of the same operations as other float objects.
None obviously doesn't and doesn't claim to. I don't see how this is a
CPython-centric point of view.

> In real (math) life, inf is a different type of number than regular floats,
> ints or complex numbers and has a special meaning depending on the context
> in which you use it.

Well, in real life, inf isn't a number at all. It is a notation to
indicate the behaviour of certain sequences or functions when some of
their arguments tend to a certain "value" (either infinite, or a
singular point for which the function or sequence is not defined).

Making inf usable as a number is a way to make more accessible some
useful properties of limits, at the price of notational abuse. (*)

But there's still no analogy with None. None has nothing to do with
algebra, limits, neighbourhoods or closures. It cannot be defined as
the limit of a particular function when one of its arguments
approaches either infinity or a singular point.
None is an independent discrete value, not something at the outer
boundary of a continuous set of values.

(*) e.g.:

>>> import math
>>> f = float("inf")
>>> f * f
inf
>>> math.exp(f)
inf
>>> math.log(f)
inf
>>> math.tanh(f)
1.0
>>> math.exp(-f)
0.0
>>> 1 ** f
1.0
>>> 0 ** f
0.0

It is not complete though:

>>> 2 ** f
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: (34, 'Numerical result out of range')
>>> f ** 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: (34, 'Numerical result out of range')

Regards

Antoine.

From guido at python.org Wed Nov 12 20:19:47 2008
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Nov 2008 11:19:47 -0800
Subject: [Python-3000] None in Comparisons
In-Reply-To: <1f7befae0811111020k74161888ie51ddc8967e5cd81@mail.gmail.com>
References: <4919835D.5000605@egenix.com> <49198EA8.6040400@egenix.com> <1f7befae0811111020k74161888ie51ddc8967e5cd81@mail.gmail.com>
Message-ID:

On Tue, Nov 11, 2008 at 10:20 AM, Tim Peters wrote:
> [M.-A. Lemburg]
>> ...
>> That's fine. I'm just talking about the special case for None that
>> has existed in Python for years - and for a good reason.
>
> That's overstating it a bit ;-) In Python 1.5.1, comparisons were
> changed so that objects of numeric types compared smaller than objects
> of non-numeric types, and then 0 < None was true, not None < 0 (which
> became true substantially later). The reason for that change is
> explained in Misc/HISTORY (it was an attempt to preserve transitivity
> across chains of mixed-type comparisons).
>
> Later, during the move to rich comparisons, I was hacking the code in
> the same room with Guido, and realized something special had to be
> done with None.
>
> "Hey, Guido, what should we do about mixed-type comparisons against None?"
>
> "Hmm ... what do you think?"
>
> "Hmm ... OK, let's make None smaller than other types."
>
> "Why?"
>
> "Oh, why not?"
>
> "Good enough -- but let's not document it -- it's an arbitrary
> implementation detail."
>
> "Of course!"
>
> In any case, we thought this was so arbitrary that we didn't hesitate
> to break that, up until that time, "0 < None" /had/ been true "for
> years - and for a good reason" ;-)
>
> not-all-good-reasons-are-particularly-good-ly y'rs - tim

Hah! I can vouch that this is pretty much how it went. It's a good
thing our process has changed a bit; in today's world this would have
required a PEP, and for good reason. I would have fought the proposal
if it had been proposed as a new feature for Python 3000.

welcome-back-tim-ly y'rs,

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bruce at leapyear.org Wed Nov 12 20:52:41 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 12 Nov 2008 11:52:41 -0800
Subject: [Python-3000] None in Comparisons
In-Reply-To: <491A008D.3000609@canterbury.ac.nz>
References: <4919835D.5000605@egenix.com> <49198EA8.6040400@egenix.com> <491A008D.3000609@canterbury.ac.nz>
Message-ID:

On Tue, Nov 11, 2008 at 2:00 PM, Greg Ewing wrote:

> M.-A. Lemburg wrote:
>
>> On 2008-11-11 14:28, Antoine Pitrou wrote:
>>
>>> But why should "n/a" (or "missing", or "undefined") imply "smaller than
>>> everything else"?
>>>
>>
>> It's just a convention based on viewing None as "nothing" or the
>> empty set.
>>
>
> It would be possible to implement this convention in the
> sort method, without making it a feature of comparisons
> in general.
>

+1

None / Missing / undefined should be able to be sorted with other data.
If this requires adding an optional parameter to sort, I'm fine with
that.

Note that this works with strings today:

x = "abc"
y = "abcd"
x < y

Note that x[3] is undefined and the comparison operator (and sorting)
automatically places x before y when all other elements of x and y are
equal.
Likewise if I created a comparison method for a class I would probably
order

C(a=1) < C(a=2) < C(a=2, b=3)

I understand why you don't want to make None comparison work generally
for the < operator.

--- Bruce

From qrczak at knm.org.pl Wed Nov 12 22:13:50 2008
From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk)
Date: Wed, 12 Nov 2008 22:13:50 +0100
Subject: [Python-3000] None in Comparisons
In-Reply-To: <491A008D.3000609@canterbury.ac.nz>
References: <4919835D.5000605@egenix.com> <49198EA8.6040400@egenix.com> <491A008D.3000609@canterbury.ac.nz>
Message-ID: <3f4107910811121313p3fbd7071ue4e287be775f1b8@mail.gmail.com>

2008/11/11 Greg Ewing :

> It would be possible to implement this convention in the
> sort method, without making it a feature of comparisons
> in general.

Until someone wishes to sort a list of some objects by key, where the
keys can be (1, None) compared with (1, 3). This will be confusing
because None compared with 3 would work.

--
Marcin Kowalczyk
qrczak at knm.org.pl
http://qrnik.knm.org.pl/~qrczak/

From ncoghlan at gmail.com Wed Nov 12 22:28:56 2008
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Nov 2008 07:28:56 +1000
Subject: [Python-3000] None in Comparisons
In-Reply-To:
References: <4919835D.5000605@egenix.com> <49198EA8.6040400@egenix.com> <491A008D.3000609@canterbury.ac.nz>
Message-ID: <491B4A98.9020804@gmail.com>

Bruce Leban wrote:
>
>
> On Tue, Nov 11, 2008 at 2:00 PM, Greg Ewing
> wrote:
>
>     M.-A. Lemburg wrote:
>
>         On 2008-11-11 14:28, Antoine Pitrou wrote:
>
>             But why should "n/a" (or "missing", or "undefined") imply
>             "smaller than
>             everything else"?
>
>
>         It's just a convention based on viewing None as "nothing" or the
>         empty set.
>
>
>     It would be possible to implement this convention in the
>     sort method, without making it a feature of comparisons
>     in general.
>
>
> +1

Adding a shorthand way of filtering out or otherwise permitting "None"
entries in sorted()/list.sort() certainly has a greater chance of
acceptance than bringing back ordering comparisons to None in general.

As Marcin points out though, there would be potential issues with such
an idea that may even need a PEP to thrash out (mainly the "what
happens for containers" question that he brings up, but there may be
other pitfalls as well).

Application specific key or comparison functions with their own ideas
on how to handle None really don't sound like a bad idea to me.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From greg.ewing at canterbury.ac.nz Thu Nov 13 01:54:47 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Nov 2008 13:54:47 +1300
Subject: [Python-3000] None in Comparisons
In-Reply-To: <3f4107910811121313p3fbd7071ue4e287be775f1b8@mail.gmail.com>
References: <4919835D.5000605@egenix.com> <49198EA8.6040400@egenix.com> <491A008D.3000609@canterbury.ac.nz> <3f4107910811121313p3fbd7071ue4e287be775f1b8@mail.gmail.com>
Message-ID: <491B7AD7.1040706@canterbury.ac.nz>

Marcin 'Qrczak' Kowalczyk wrote:
> 2008/11/11 Greg Ewing :
>
>> It would be possible to implement this convention in the
>> sort method
>
> Until someone wishes to sort a list of some objects by key, where the
> keys can be (1, None) compared with (1, 3).

Yes, I thought of that shortly after posting. Lists and
tuples could be special-cased as well, although someone
is inevitably going to be surprised when their favourite
sequence type doesn't get included in the special
treatment.

However, covering lists and tuples would cover the vast
majority of use cases, I think. So perhaps the best thing
would be to provide this as an option, defaulting to false,
and with the limitations clearly documented.
If someone has a use case not covered by it, they just have to
provide their own key function.

Another approach would be to define a protocol for
comparison-for-sorting, which custom types could implement
if they wanted.

--
Greg

From mal at egenix.com Thu Nov 13 12:35:56 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 13 Nov 2008 12:35:56 +0100
Subject: [Python-3000] None in Comparisons: None vs. float("inf")
In-Reply-To:
References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com> <4919EB0B.2090004@egenix.com> <491AB37B.8010408@egenix.com>
Message-ID: <491C111C.9050208@egenix.com>

On 2008-11-12 16:10, Antoine Pitrou wrote:
> M.-A. Lemburg writes:
>> The difference is that None is a singleton, so the set of all
>> None type instances is {None}. You always have an intuitive total order
>> relation on one element sets: the identity relation.
>
> But it's not what you are asking for. You are asking for None to support ordered
> comparison with objects of other types, which is completely different.
>
> Having None be comparable with itself for ordered comparisons is certainly
> possible, but it's also completely useless if None doesn't compare with other
> types, which is what we are talking about.

I should probably have made it clearer in my posting:

Having None < None fail is a different (and far less significant)
problem. It would be solved by having None added to a consistent
Python object ordering scheme, but is not a consequence of not
having a consistent general object ordering scheme.

So far, I haven't heard a single argument for why not having None
participate in an ordering scheme is a good strategy to use, except
that it's pure.

IMHO, practicality beats purity in this special case.

Anyway, like I said: it's one more thing to add to the list of
surprises in Python 3.0.

--
Marc-Andre Lemburg
eGenix.com

From guido at python.org Thu Nov 13 16:15:14 2008
From: guido at python.org (Guido van Rossum)
Date: Thu, 13 Nov 2008 07:15:14 -0800
Subject: [Python-3000] None in Comparisons: None vs. float("inf")
In-Reply-To: <491C111C.9050208@egenix.com>
References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com> <4919EB0B.2090004@egenix.com> <491AB37B.8010408@egenix.com> <491C111C.9050208@egenix.com>
Message-ID:

On Thu, Nov 13, 2008 at 3:35 AM, M.-A. Lemburg wrote:
> Anyway, like I said: it's one more thing to add to the list of
> surprises in Python 3.0.

I'm happy to do so. I expect that over time it won't be an issue.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com Thu Nov 13 19:55:52 2008
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 13 Nov 2008 13:55:52 -0500
Subject: [Python-3000] None in Comparisons: None vs. float("inf")
In-Reply-To: <491C111C.9050208@egenix.com>
References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com> <4919EB0B.2090004@egenix.com> <491AB37B.8010408@egenix.com> <491C111C.9050208@egenix.com>
Message-ID: <1f7befae0811131055s1e5fec8eta5ce057008c5cbfe@mail.gmail.com>

[M.-A. Lemburg]
> ...
> So far, I haven't heard a single argument for why not having None
> participate in an ordering scheme is a good strategy to use, except
> that it's pure.
I've tracked down plenty of program logic errors that would have been discovered more easily if comparing None to (mostly, but among others) integers and strings had raised an exception instead of returning a meaningless true/false result. Perhaps you haven't. For those who have, the attraction to making comparisons with None refuse to return nonsense silently is both obvious and visceral. > IMHO, practicality beats purity in this special case. If hiding program logic errors is practical, sure ;-) there-is-no-behavior-no-matter-how-bizarre-someone-won't come-to-rely-on-ly y'rs - tim From bruce at leapyear.org Thu Nov 13 20:15:23 2008 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 13 Nov 2008 11:15:23 -0800 Subject: [Python-3000] None in Comparisons: None vs. float("inf") In-Reply-To: <1f7befae0811131055s1e5fec8eta5ce057008c5cbfe@mail.gmail.com> References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com> <4919EB0B.2090004@egenix.com> <491AB37B.8010408@egenix.com> <491C111C.9050208@egenix.com> <1f7befae0811131055s1e5fec8eta5ce057008c5cbfe@mail.gmail.com> Message-ID: I think the behavior of NaN in comparisons is more confusing: >>> sorted([1,nan,2]) [1, nan, 2] >>> sorted([2,nan,1]) [2, nan, 1] >>> sorted([2,None,1]) Traceback (most recent call last): File "", line 1, in sorted([2,None,1]) TypeError: unorderable types: NoneType() < int() At least the third case is clear that I shouldn't have done that. The way nan works, the results of sorting where one of the values is nan is unpredictable and useless. Yes, I know the rules about how NaN values behave in comparisons. Notwithstanding that, sorting could use a different comparison rule imposing a total ordering: -inf, ..., inf, nan as some other systems do. --- Bruce On Thu, Nov 13, 2008 at 10:55 AM, Tim Peters wrote: > [M.-A. Lemburg] > > ... 
> > So far, I haven't heard a single argument for why not having None
> > participate in an ordering scheme is a good strategy to use, except
> > that it's pure.
>
> I've tracked down plenty of program logic errors that would have been
> discovered more easily if comparing None to (mostly, but among others)
> integers and strings had raised an exception instead of returning a
> meaningless true/false result. Perhaps you haven't. For those who
> have, the attraction to making comparisons with None refuse to return
> nonsense silently is both obvious and visceral.
>
>
> > IMHO, practicality beats purity in this special case.
>
> If hiding program logic errors is practical, sure ;-)
>
> there-is-no-behavior-no-matter-how-bizarre-someone-won't
> come-to-rely-on-ly y'rs - tim
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/bruce%40leapyear.org
>

From jyasskin at gmail.com Thu Nov 13 22:01:44 2008
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Thu, 13 Nov 2008 13:01:44 -0800
Subject: [Python-3000] None in Comparisons: None vs. float("inf")
In-Reply-To:
References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com> <4919EB0B.2090004@egenix.com> <491AB37B.8010408@egenix.com> <491C111C.9050208@egenix.com> <1f7befae0811131055s1e5fec8eta5ce057008c5cbfe@mail.gmail.com>
Message-ID: <5d44f72f0811131301p50f5dd0em3dd21e925223e799@mail.gmail.com>

Be glad you're not programming in C++ then, where trying to sort NaN
can cause segfaults!

More seriously, I think using the following function as the sort key
will make sort do what you want:

  from math import isnan  # needed for the isnan() check below

  def SortNoneFirstAndNanLast(x):
      if x is None:
          return (1, x)
      if isnan(x):
          return (3, x)
      return (2, x)

No need to modify either sort() or <.
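For readers who want to try the idea, here is a slightly more defensive sketch of the same key-function approach. It is an illustration, not Jeffrey's code: the name `none_first_nan_last` is made up, the rank numbers differ, and the placeholder `0.0` replaces the original value for None and NaN so that those keys never need to compare None or NaN with anything.

```python
import math


def none_first_nan_last(x):
    """Sort key: None first, ordinary numbers in the middle, NaN last."""
    if x is None:
        return (0, 0.0)
    # Only floats can be NaN; the isinstance check keeps isnan() away
    # from values it cannot handle.
    if isinstance(x, float) and math.isnan(x):
        return (2, 0.0)
    return (1, x)


data = [2.0, float("nan"), None, 1.0, float("inf"), -1.5]
ordered = sorted(data, key=none_first_nan_last)
```

Because every key is a tuple whose first element is a small integer rank, the comparable values end up in a proper total order no matter where the NaN sat in the input.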
On Thu, Nov 13, 2008 at 11:15 AM, Bruce Leban wrote: > I think the behavior of NaN in comparisons is more confusing: >>>> sorted([1,nan,2]) > [1, nan, 2] >>>> sorted([2,nan,1]) > [2, nan, 1] >>>> sorted([2,None,1]) > Traceback (most recent call last): > File "", line 1, in > sorted([2,None,1]) > TypeError: unorderable types: NoneType() < int() > At least the third case is clear that I shouldn't have done that. The way > nan works, the results of sorting where one of the values is nan is > unpredictable and useless. > Yes, I know the rules about how NaN values behave in comparisons. > Notwithstanding that, sorting could use a different comparison rule imposing > a total ordering: -inf, ..., inf, nan as some other systems do. > --- Bruce > > On Thu, Nov 13, 2008 at 10:55 AM, Tim Peters wrote: >> >> [M.-A. Lemburg] >> > ... >> > So far, I haven't heard a single argument for why not having None >> > participate in an ordering scheme is a good strategy to use, except >> > that it's pure. >> >> I've tracked down plenty of program logic errors that would have been >> discovered more easily if comparing None to (mostly, but among others) >> integers and strings had raised an exception instead of returning a >> meaningless true/false result. Perhaps you haven't. For those who >> have, the attraction to making comparisons with None refuse to return >> nonsense silently is both obvious and visceral. >> >> >> > IMHO, practicality beats purity in this special case. 
>> >> If hiding program logic errors is practical, sure ;-) >> >> there-is-no-behavior-no-matter-how-bizarre-someone-won't >> come-to-rely-on-ly y'rs - tim >> _______________________________________________ >> Python-3000 mailing list >> Python-3000 at python.org >> http://mail.python.org/mailman/listinfo/python-3000 >> Unsubscribe: >> http://mail.python.org/mailman/options/python-3000/bruce%40leapyear.org > > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: > http://mail.python.org/mailman/options/python-3000/jyasskin%40gmail.com > > -- Namast?, Jeffrey Yasskin http://jeffrey.yasskin.info/ From bruce at leapyear.org Fri Nov 14 00:09:04 2008 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 13 Nov 2008 15:09:04 -0800 Subject: [Python-3000] NaN in sorting Message-ID: I agree that that function will fix the problem. You can also use it to fix sorting any mixed types but that's not my point either. My point is that when you have None mixed in with numbers, sorting doesn't work in a way that stops you and makes you fix it. When you have nan mixed in, it just fails randomly. The sorted function should satisfy the property that sorted[i] < sorted[j] for all i < j and it doesn't. I realize that it *can't* satisfy it for the nans because < is always false regardless. But it should at least satisfy it for the comparable values and it doesn't. --- Bruce On Thu, Nov 13, 2008 at 1:01 PM, Jeffrey Yasskin wrote: > Be glad you're not programming in C++ then, where trying to sort NaN > can cause segfaults! > > More seriously, I think using the following function as the sort key > will make sort do what you want: > > def SortNoneFirstAndNanLast(x): > if x is None: > return (1, x) > if isnan(x): > return (3, x) > return (2, x) > > No need to modify either sort() or <. 
> > On Thu, Nov 13, 2008 at 11:15 AM, Bruce Leban wrote: > > I think the behavior of NaN in comparisons is more confusing: > >>>> sorted([1,nan,2]) > > [1, nan, 2] > >>>> sorted([2,nan,1]) > > [2, nan, 1] > >>>> sorted([2,None,1]) > > Traceback (most recent call last): > > File "", line 1, in > > sorted([2,None,1]) > > TypeError: unorderable types: NoneType() < int() > > At least the third case is clear that I shouldn't have done that. The way > > nan works, the results of sorting where one of the values is nan is > > unpredictable and useless. > > Yes, I know the rules about how NaN values behave in comparisons. > > Notwithstanding that, sorting could use a different comparison rule > imposing > > a total ordering: -inf, ..., inf, nan as some other systems do. > > --- Bruce > > > > On Thu, Nov 13, 2008 at 10:55 AM, Tim Peters > wrote: > >> > >> [M.-A. Lemburg] > >> > ... > >> > So far, I haven't heard a single argument for why not having None > >> > participate in an ordering scheme is a good strategy to use, except > >> > that it's pure. > >> > >> I've tracked down plenty of program logic errors that would have been > >> discovered more easily if comparing None to (mostly, but among others) > >> integers and strings had raised an exception instead of returning a > >> meaningless true/false result. Perhaps you haven't. For those who > >> have, the attraction to making comparisons with None refuse to return > >> nonsense silently is both obvious and visceral. > >> > >> > >> > IMHO, practicality beats purity in this special case. 
> >>
> >> If hiding program logic errors is practical, sure ;-)
> >>
> >> there-is-no-behavior-no-matter-how-bizarre-someone-won't
> >> come-to-rely-on-ly y'rs - tim
>
> --
> Namasté,
> Jeffrey Yasskin
> http://jeffrey.yasskin.info/
>

From rogerb at rogerbinns.com Fri Nov 14 06:23:19 2008
From: rogerb at rogerbinns.com (Roger Binns)
Date: Thu, 13 Nov 2008 21:23:19 -0800
Subject: [Python-3000] PyObject_HEAD_INIT
Message-ID:

Something has been baffling me and is still present in py3.0 -rc2.

When initializing a (non-variable) PyTypeObject in Python 2,
PyObject_HEAD_INIT is used. The docs for Python 3 still show that:

http://docs.python.org/dev/3.0/extending/newtypes.html

However if you try to use it you get all sorts of severe warnings (it
looks like structure members don't line up). Looking through the Python
3.0 source, *all* initialization is done using
PyVarObject_HEAD_INIT(NULL, 0) instead - PyObject_HEAD_INIT is not used
at all.

The Python 2.3 source shows the latter form being used almost
exclusively so at some point someone changed a lot of code.

It would seem that common practise, the examples and the documentation
don't all match each other!

Roger

From martin at v.loewis.de Fri Nov 14 11:07:43 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 14 Nov 2008 11:07:43 +0100
Subject: [Python-3000] PyObject_HEAD_INIT
In-Reply-To:
References:
Message-ID: <491D4DEF.3050100@v.loewis.de>

Roger Binns wrote:
> The Python 2.3 source shows the latter form being used almost
> exclusively so at some point someone changed a lot of code.

That's correct. I changed it for Python 3, for PEP 3123.

> It would seem that common practise, the examples and the documentation
> don't all match each other!

I cannot parse this sentence.

Regards,
Martin

From rogerb at rogerbinns.com Fri Nov 14 11:22:01 2008
From: rogerb at rogerbinns.com (Roger Binns)
Date: Fri, 14 Nov 2008 02:22:01 -0800
Subject: [Python-3000] PyObject_HEAD_INIT
In-Reply-To: <491D4DEF.3050100@v.loewis.de>
References: <491D4DEF.3050100@v.loewis.de>
Message-ID:

Martin v. Löwis wrote:
>> It would seem that common practise, the examples and the documentation
>> don't all match each other!
>
> I cannot parse this sentence.

Py2 source:                    Uses PyObject_HEAD_INIT
Py2 code compiled under Py3:   Gives serious warnings
Py3 examples:                  Say to use PyObject_HEAD_INIT
Py3 reference:                 Says to use PyObject_HEAD_INIT
Py3 source:                    Uses PyVarObject_HEAD_INIT
PEP 3123:                      Says PyVarObject_HEAD_INIT only

Obviously the Python 3 documentation and examples need to be updated.
Also why not remove PyObject_HEAD_INIT from Python 3 headers so that if
it is used then the compile fails?

The Python 3 examples show using PyObject_HEAD_INIT:

http://docs.python.org/dev/3.0/extending/newtypes.html

The Python 3 documentation says to use PyObject_HEAD_INIT:

http://docs.python.org/dev/3.0/search.html?q=PyObject_HEAD_INIT&check_keywords=yes&area=default

There are no matches for PyVarObject_HEAD_INIT:

http://docs.python.org/dev/3.0/search.html?q=PyVarObject_HEAD_INIT

Roger

From daniel.stutzbach at gmail.com Tue Nov 11 15:22:29 2008
From: daniel.stutzbach at gmail.com (Daniel Stutzbach)
Date: Tue, 11 Nov 2008 08:22:29 -0600
Subject: [Python-3000] None in Comparisons
In-Reply-To: <4919835D.5000605@egenix.com>
References: <4919835D.5000605@egenix.com>
Message-ID:

On Tue, Nov 11, 2008 at 7:06 AM, M.-A. Lemburg wrote:

> Why was the special case for None being "smaller" than all other
> objects in Python removed from Python 3.0 ? (see object.c in Py2.x)
>

It wasn't true in Python 2.5, either. Observe:

Cashew:~/pokersleuth/tracker$ python2.5
Python 2.5 (r25:51908, Feb 26 2007, 08:19:26)
[GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> now = datetime.datetime.now()
>>> now < None
Traceback (most recent call last):
  File "", line 1, in
TypeError: can't compare datetime.datetime to NoneType

Right now to get the desired semantics, I implement custom
AlwaysLeast and/or AlwaysGreatest singletons for whatever type I'm dealing
with. It's a bit of a pain. My use cases are all along the following
lines:

class TimeSpan:
    def __init__(self):
        self.earliest = AlwaysGreatest
        self.latest = AlwaysLeast

    def update(self, v):
        self.earliest = min(self.earliest, v)
        self.latest = max(self.latest, v)

--
Daniel Stutzbach, Ph.D.
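The AlwaysLeast and AlwaysGreatest singletons are not shown in the message, so the following is only a guess at how they might be written; the private class names and the choice to spell out all four ordering methods are assumptions made for this sketch. The TimeSpan class is the one from the message.

```python
class _AlwaysGreatest:
    """Sentinel that compares greater than every other object."""

    def __lt__(self, other):
        return False

    def __le__(self, other):
        return other is self

    def __gt__(self, other):
        return other is not self

    def __ge__(self, other):
        return True

    def __repr__(self):
        return "AlwaysGreatest"


class _AlwaysLeast:
    """Sentinel that compares less than every other object."""

    def __lt__(self, other):
        return other is not self

    def __le__(self, other):
        return True

    def __gt__(self, other):
        return False

    def __ge__(self, other):
        return other is self

    def __repr__(self):
        return "AlwaysLeast"


AlwaysGreatest = _AlwaysGreatest()
AlwaysLeast = _AlwaysLeast()


class TimeSpan:
    def __init__(self):
        self.earliest = AlwaysGreatest
        self.latest = AlwaysLeast

    def update(self, v):
        # When v's own type returns NotImplemented for the comparison,
        # Python falls back to the sentinel's reflected method.
        self.earliest = min(self.earliest, v)
        self.latest = max(self.latest, v)


span = TimeSpan()
for value in (5, 2, 9):
    span.update(value)
```

All four ordering methods are defined because `min()` and `max()` evaluate `v < sentinel` and `v > sentinel`, which reach the sentinel only through its reflected `__gt__` and `__lt__`.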
http://www.barsoom.org/~agthorr

From ESmith-rowland at alionscience.com Wed Nov 12 21:46:46 2008
From: ESmith-rowland at alionscience.com (Smith-Rowland, Edward M)
Date: Wed, 12 Nov 2008 15:46:46 -0500
Subject: [Python-3000] Install python-3000 as python3
Message-ID: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com>

When I try to install python-3.0rcN I get the following warning:

running install_egg_info
Writing /usr/local/lib/python3.0/lib-dynload/Python-3.0rc2-py3.0.egg-info
* Note: not installed as 'python'.
* Use 'make fullinstall' to install as 'python'.
* However, 'make fullinstall' is discouraged,
* as it will clobber your Python 2.x installation.

Not clobbering my python-2.* install is all well and good. python-3.0 is installed as expected.

It would be nice to have a generic way to call python-3 though.

Suggestion: Use python3 as the generic install target for python-3.*
Then all subsequent python-3.* installs would get sent there (in addition to python-3.m.n)

If no python2 exists you could either go ahead and use that for python3.
Or you could leave the python just for 2.

Ed

From guido at python.org Fri Nov 14 16:24:33 2008
From: guido at python.org (Guido van Rossum)
Date: Fri, 14 Nov 2008 07:24:33 -0800
Subject: [Python-3000] Install python-3000 as python3
In-Reply-To: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com>
References: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com>
Message-ID:

No, no, no! *Eventually* Python 3.0 will just be "python". Until
then, it needs to be "python3.0".

Long ago, Red Hat installed Python 2 as "python2" instead of "python"
and it led to endless problems.
On Wed, Nov 12, 2008 at 12:46 PM, Smith-Rowland, Edward M wrote: > > When I try to install python-3.0rcN I get the following warning: > > running install_egg_info > Writing /usr/local/lib/python3.0/lib-dynload/Python-3.0rc2-py3.0.egg-info > * Note: not installed as 'python'. > * Use 'make fullinstall' to install as 'python'. > * However, 'make fullinstall' is discouraged, > * as it will clobber your Python 2.x installation. > > Not clobbering my python-2.* install is all well and good. python-3.0 is installed as expected. > > It would be nice to have a generic way to call python-3 though. > > Suggestion: Use python3 as the generic install target for python-3.* > Then all subsequent python-3.* installs would get sent there (in addition to python-3.m.n) > > If no python2 exists you could either go ahead and use that for python3. > Or you could leave the python just for 2. > > Ed > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Fri Nov 14 17:51:24 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 14 Nov 2008 17:51:24 +0100 Subject: [Python-3000] PyObject_HEAD_INIT In-Reply-To: References: <491D4DEF.3050100@v.loewis.de> Message-ID: <491DAC8C.5060701@v.loewis.de> > Obviously the Python 3 documentation and examples need to be updated. I see - please submit a bug report. > Also why not remove PyObject_HEAD_INIT from Python 3 headers so that if > it is used then the compile fails? It's still needed for non-var objects. 
Regards, Martin From guido at python.org Fri Nov 14 18:20:57 2008 From: guido at python.org (Guido van Rossum) Date: Fri, 14 Nov 2008 09:20:57 -0800 Subject: [Python-3000] Install python-3000 as python3 In-Reply-To: <893D428105AB9F49AC7A5C96A454A8B808F6C6F9@email4a.alionscience.com> References: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com> <893D428105AB9F49AC7A5C96A454A8B808F6C6F9@email4a.alionscience.com> Message-ID: On Fri, Nov 14, 2008 at 7:37 AM, Smith-Rowland, Edward M wrote: > Thank you for your reply. > I didn't mean to instigate or revive a problem ;-) > I guess when python-3.* is python we'll have python-2.* installed. > > I guess I forsee a significant amount of time when we'll need both versions overlapping. > I just would like a way to have a fixed target for python-2.* as versions get bumped there. > That's more the issue I guess. You can solve that locally however you want to. But in practice most code is tied to a particular minor version (e.g. 2.5, 2.6), not just to a major version, so your #! line should probably be #!/usr/bin/env python3.0 or whatever version you know works. > Ed > > -----Original Message----- > From: gvanrossum at gmail.com on behalf of Guido van Rossum > Sent: Fri 11/14/2008 10:24 AM > To: Smith-Rowland, Edward M > Cc: python-3000 at python.org > Subject: Re: [Python-3000] Install python-3000 as python3 > > No, no, no! *Eventually* Python 3.0 will just be "python". Until > then, it needs to be "python3.0". > > Long agon, Red Hat installed Python 2 as "python2" instead of "python" > and it led to endless problems. > > On Wed, Nov 12, 2008 at 12:46 PM, Smith-Rowland, Edward M > wrote: >> >> When I try to install python-3.0rcN I get the following warning: >> >> running install_egg_info >> Writing /usr/local/lib/python3.0/lib-dynload/Python-3.0rc2-py3.0.egg-info >> * Note: not installed as 'python'. >> * Use 'make fullinstall' to install as 'python'. 
>> * However, 'make fullinstall' is discouraged, >> * as it will clobber your Python 2.x installation. >> >> Not clobbering my python-2.* install is all well and good. python-3.0 is installed as expected. >> >> It would be nice to have a generic way to call python-3 though. >> >> Suggestion: Use python3 as the generic install target for python-3.* >> Then all subsequent python-3.* installs would get sent there (in addition to python-3.m.n) >> >> If no python2 exists you could either go ahead and use that for python3. >> Or you could leave the python just for 2. >> >> Ed >> _______________________________________________ >> Python-3000 mailing list >> Python-3000 at python.org >> http://mail.python.org/mailman/listinfo/python-3000 >> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org >> > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Fri Nov 14 18:48:39 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 14 Nov 2008 12:48:39 -0500 Subject: [Python-3000] Install python-3000 as python3 In-Reply-To: References: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com> Message-ID: Guido van Rossum wrote: > No, no, no! *Eventually* Python 3.0 will just be "python". Until > then, it needs to be "python3.0". I think you should consider changing the installation policy to installing each file with its own pythonx.y name and alias exactly one, the system default, as 'python'. Multiple installations are going to be common for several years, I suspect. Currently on Windows, python3.0 is installed as just plain 'python'. This already causes problems when I want to choose 2.5 or 3.0. In my quick start list, I have two entries which both say "Python (command line)" with the *same* icon. 
Switching to admin, I just renamed one to Python2.5..., but that changes it also in the Python25 start menu directory, so I do not know if the change will survive the upgrade to 2.5.3. Duplicate names are also a problem for the right-click-on-.py-file context menu. It would be really helpful if there was a new icon for 3.0. > Long agon, Red Hat installed Python 2 as "python2" instead of "python" > and it led to endless problems. From guido at python.org Fri Nov 14 18:56:27 2008 From: guido at python.org (Guido van Rossum) Date: Fri, 14 Nov 2008 09:56:27 -0800 Subject: [Python-3000] Install python-3000 as python3 In-Reply-To: References: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com> Message-ID: On Fri, Nov 14, 2008 at 9:48 AM, Terry Reedy wrote: > Guido van Rossum wrote: >> >> No, no, no! *Eventually* Python 3.0 will just be "python". Until >> then, it needs to be "python3.0". > > I think you should consider changing the installation policy to installing > each file with its own pythonx.y name and alias exactly one, the system > default, as 'python'. Um, we already do this, unless I misunderstand your proposal. For any 2.x (and even 1.x) version, "make altinstall" installs only pythonx.y, and "make install" creates a link to it named "python". For 3.0, "install" is equal to "altinstall" because making 3.0 the default is not realistic at this time. At some point in the future we can revisit this. > Multiple installations are going to be common for > several years, I suspect. And have been forever. > Currently on Windows, python3.0 is installed as just plain 'python'. This > already causes problems when I want to choose 2.5 or 3.0. In my quick start > list, I have two entries which both say "Python (command line)" with the > *same* icon. Switching to admin, I just renamed one to Python2.5..., but > that changes it also in the Python25 start menu directory, so I do not know > if the change will survive the upgrade to 2.5.3. 
Duplicate names are also a
> problem for the right-click-on-.py-file context menu. It would be really
> helpful if there was a new icon for 3.0.

Ah, Windows. Can you file a bug about this? I think only MvL can do
something about this.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rogerb at rogerbinns.com Fri Nov 14 19:22:01 2008
From: rogerb at rogerbinns.com (Roger Binns)
Date: Fri, 14 Nov 2008 10:22:01 -0800
Subject: [Python-3000] PyObject_HEAD_INIT
In-Reply-To: <491DAC8C.5060701@v.loewis.de>
References: <491D4DEF.3050100@v.loewis.de> <491DAC8C.5060701@v.loewis.de>
Message-ID:

Martin v. Löwis wrote:
>> Also why not remove PyObject_HEAD_INIT from Python 3 headers so that if
>> it is used then the compile fails?
>
> It's still needed for non-var objects.

Wouldn't a var object have PyVarObject_HEAD and a non-var object have
PyObject_HEAD?

Coming from Python 2, I would expect the example as currently documented
to be perfectly correct:

http://docs.python.org/dev/3.0/extending/newtypes.html

It isn't a var object and doesn't use the var form of the macros.

Roger

From rogerb at rogerbinns.com Fri Nov 14 19:24:10 2008
From: rogerb at rogerbinns.com (Roger Binns)
Date: Fri, 14 Nov 2008 10:24:10 -0800
Subject: [Python-3000] PyObject_HEAD_INIT
In-Reply-To: <491DAC8C.5060701@v.loewis.de>
References: <491D4DEF.3050100@v.loewis.de> <491DAC8C.5060701@v.loewis.de>
Message-ID:

Martin v. Löwis wrote:
>> Obviously the Python 3 documentation and examples need to be updated.
>
> I see - please submit a bug report.

Yet another site that wants another login to report bugs. So someone
else can report it.

Roger

From martin at v.loewis.de Fri Nov 14 21:22:46 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 14 Nov 2008 21:22:46 +0100
Subject: [Python-3000] PyObject_HEAD_INIT
In-Reply-To:
References: <491D4DEF.3050100@v.loewis.de> <491DAC8C.5060701@v.loewis.de>
Message-ID: <491DDE16.6080101@v.loewis.de>

Roger Binns wrote:
> Martin v. Löwis wrote:
>>> Also why not remove PyObject_HEAD_INIT from Python 3 headers so that if
>>> it is used then the compile fails?
>> It's still needed for non-var objects.
>
> Wouldn't a var object have PyVarObject_HEAD and a non-var object have
> PyObject_HEAD?

That's in the types. In the objects, you need the *_INIT macros.

Regards,
Martin

From martin at v.loewis.de Fri Nov 14 21:27:24 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 14 Nov 2008 21:27:24 +0100
Subject: [Python-3000] Install python-3000 as python3
In-Reply-To:
References: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com>
Message-ID: <491DDF2C.7080603@v.loewis.de>

> Currently on Windows, python3.0 is installed as just plain 'python'.
> This already causes problems when I want to choose 2.5 or 3.0. In my
> quick start list, I have two entries which both say "Python (command
> line)" with the *same* icon.

What is a quick start list, and how did you get Python into it?

> It would be really helpful if
> there was a new icon for 3.0.

Contributions are welcome.
Regards,
Martin

From martin at v.loewis.de Fri Nov 14 21:29:30 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 14 Nov 2008 21:29:30 +0100
Subject: [Python-3000] Install python-3000 as python3
In-Reply-To:
References: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com>
Message-ID: <491DDFAA.10205@v.loewis.de>

> Ah, Windows. Can you file a bug about this? I think only MvL can do
> something about this.

Not really. We don't install anything into a quick start list (see
my other message - I don't even know what that is).

For other icons, I can't do anything, either; my artistic abilities
are zero.

Regards,
Martin

From phd at phd.pp.ru Fri Nov 14 21:45:37 2008
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Fri, 14 Nov 2008 23:45:37 +0300
Subject: [Python-3000] Install python-3000 as python3
In-Reply-To: <491DDF2C.7080603@v.loewis.de>
References: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com> <491DDF2C.7080603@v.loewis.de>
Message-ID: <20081114204537.GA7821@phd.pp.ru>

On Fri, Nov 14, 2008 at 09:27:24PM +0100, "Martin v. Löwis" wrote:
> > Currently on Windows, python3.0 is installed as just plain 'python'.
> > This already causes problems when I want to choose 2.5 or 3.0. In my
> > quick start list, I have two entries which both say "Python (command
> > line)" with the *same* icon.
>
> What is a quick start list, and how did you get Python into it?

I think it's Windows Quick Launch bar:

http://www.google.com/search?q=windows+quick+launch
http://images.google.com/images?q=windows+quick+launch

Oleg.
--
Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.
From martin at v.loewis.de Fri Nov 14 22:12:47 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 14 Nov 2008 22:12:47 +0100 Subject: [Python-3000] Install python-3000 as python3 In-Reply-To: <20081114204537.GA7821@phd.pp.ru> References: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com> <491DDF2C.7080603@v.loewis.de> <20081114204537.GA7821@phd.pp.ru> Message-ID: <491DE9CF.3080703@v.loewis.de> > I think it's Windows Quick Launch bar: > > http://www.google.com/search?q=windows+quick+launch > http://images.google.com/images?q=windows+quick+launch I see. Then I still don't fully understand the problem. Terry apparently had to add the Python command line to the quick launch bar himself - why does he have to be admin to change the label? Also, to create an entry, in the launch bar, I ctrl-dragged-and-dropped the entry from the start menu, creating a copy in the launch bar. I can now select "Properties" from the context menu in the launch bar, and rename the entry to "Python 2.5", say, without affecting how it's named in the start menu. So why does Terry say that his renaming renamed it in two places? Finally, what is the specific problem with right-on-click-.py-file context menu wrt. duplicate names? Regards, Martin From rogerb at rogerbinns.com Fri Nov 14 22:15:59 2008 From: rogerb at rogerbinns.com (Roger Binns) Date: Fri, 14 Nov 2008 13:15:59 -0800 Subject: [Python-3000] PyObject_HEAD_INIT In-Reply-To: <491DDE16.6080101@v.loewis.de> References: <491D4DEF.3050100@v.loewis.de> <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 My confusion was because I though that the HEAD for the data structure had to use the same corresponding HEAD_INIT in the type. So for whatever reason the PyTypeObject is declared as a var object which is why the var HEAD_INIT is needed. 
It still looks like PyObject_HEAD_INIT should be removed so that people using earlier versions of Python, following the Py3 docs (before they are fixed), using older tutorials etc don't get burnt. Grepping through the py3 source shows only PyModuleDef_HEAD_INIT using PyObject_HEAD_INIT. There are no other legitimate uses! Roger -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAkkd6osACgkQmOOfHg372QSXDgCbBg/zzqmAU4GaJL2qo4aNHocq c+oAn3IqgPGCvQN8jVMjttA8h+5+MO4g =f7os -----END PGP SIGNATURE----- From greg.ewing at canterbury.ac.nz Sat Nov 15 00:22:43 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 15 Nov 2008 12:22:43 +1300 Subject: [Python-3000] PyObject_HEAD_INIT In-Reply-To: References: <491D4DEF.3050100@v.loewis.de> Message-ID: <491E0843.9060505@canterbury.ac.nz> Roger Binns wrote: > Obviously the Python 3 documentation and examples need to be updated. > Also why not remove PyObject_HEAD_INIT from Python 3 headers so that if > it is used then the compile fails? Is there some reason not to define PyObject_HEAD_INIT so that it expands into the appropriate PyVarObject_HEAD_INIT call? -- Greg From tjreedy at udel.edu Sat Nov 15 03:38:29 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 14 Nov 2008 21:38:29 -0500 Subject: [Python-3000] Install python-3000 as python3 In-Reply-To: <491DDF2C.7080603@v.loewis.de> References: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com> <491DDF2C.7080603@v.loewis.de> Message-ID: Martin v. L?wis wrote: >> Currently on Windows, python3.0 is installed as just plain 'python'. >> This already causes problems when I want to choose 2.5 or 3.0. In my >> quick start list, I have two entries which both say "Python (command >> line)" with the *same* icon. > > What is a quick start list, and how did you get Python into it? In Windows, when one click the start button on the left end of the task bar, a panel pops up with a list of frequently used programs or shortcuts on the left. 
I don't know its official name so I called it 'quick start list'.
Windows puts things there. I did not mean QuickLaunch.

On the same panel, 'All Programs' brings up shortcuts and directories
with shortcuts -- the contents of ...\StartMenu\Programs. I have
Python2.5 and Python3.0 directories. Each has several identically named
entries. Inside the directories, that is fine. But displayed out of the
directory, as when two identically named shortcuts end up on the
frequently used list, they become indistinguishable. (The items in the
list *are* the originals, as with Python lists, not copies.) As a
result, I click on one and, if it is not the one I want, click on the
other.

The same would be true for desktop shortcuts, although those *are*
copies and can be renamed.

>> It would be really helpful if
>> there was a new icon for 3.0.
>
> Contributions are welcome.

I found the 16x16 py.ico and pyc.ico in python30/dlls, but not the nicer
ones actually used for python.exe and the shortcuts, nor the larger
desktop versions. Any hints on how to get them for possible editing?

About the right-click context menu for .py and .pyc: once upon a time
there were entries to run with 2.4 or 2.5. Both said run with 'python',
but I could tell the difference because of the icons. Now there are
none, either from installing 3.0 or deleting 2.4. Even 'Open' is
disabled.

My suggestion is that if the binaries were individually named
pythonx.y, then there could be individual context menu entries 'run with
pythonx.y'. I believe that you can only tell Windows to add 'run with'
or 'edit with' entries with the program name being what it actually is.

Terry Jan Reedy

From ESmith-rowland at alionscience.com Fri Nov 14 16:37:53 2008
From: ESmith-rowland at alionscience.com (Smith-Rowland, Edward M)
Date: Fri, 14 Nov 2008 10:37:53 -0500
Subject: [Python-3000] Install python-3000 as python3
References: <893D428105AB9F49AC7A5C96A454A8B808F6C6F8@email4a.alionscience.com>
Message-ID: <893D428105AB9F49AC7A5C96A454A8B808F6C6F9@email4a.alionscience.com>

Thank you for your reply.

I didn't mean to instigate or revive a problem ;-)

I guess when python-3.* is python we'll have python-2.* installed.
I guess I foresee a significant amount of time when we'll need both
versions overlapping. I just would like a way to have a fixed target
for python-2.* as versions get bumped there. That's more the issue I
guess.

Ed

-----Original Message-----
From: gvanrossum at gmail.com on behalf of Guido van Rossum
Sent: Fri 11/14/2008 10:24 AM
To: Smith-Rowland, Edward M
Cc: python-3000 at python.org
Subject: Re: [Python-3000] Install python-3000 as python3

No, no, no! *Eventually* Python 3.0 will just be "python". Until then,
it needs to be "python3.0". Long ago, Red Hat installed Python 2 as
"python2" instead of "python" and it led to endless problems.

On Wed, Nov 12, 2008 at 12:46 PM, Smith-Rowland, Edward M wrote:
>
> When I try to install python-3.0rcN I get the following warning:
>
> running install_egg_info
> Writing /usr/local/lib/python3.0/lib-dynload/Python-3.0rc2-py3.0.egg-info
> * Note: not installed as 'python'.
> * Use 'make fullinstall' to install as 'python'.
> * However, 'make fullinstall' is discouraged,
> * as it will clobber your Python 2.x installation.
>
> Not clobbering my python-2.* install is all well and good. python-3.0 is installed as expected.
>
> It would be nice to have a generic way to call python-3 though.
> > Suggestion: Use python3 as the generic install target for python-3.* > Then all subsequent python-3.* installs would get sent there (in addition to python-3.m.n) > > If no python2 exists you could either go ahead and use that for python3. > Or you could leave the python just for 2. > > Ed > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pb2au.alchemy at gmail.com Sat Nov 15 17:46:31 2008 From: pb2au.alchemy at gmail.com (Pb2Au) Date: Sat, 15 Nov 2008 11:46:31 -0500 (Eastern Standard Time) Subject: [Python-3000] Bytes to Unicode Conversion Message-ID: <491efdaa.6105be0a.5f55.ffff9dd4@mx.google.com> An HTML attachment was scrubbed... URL: From cvrebert at gmail.com Sun Nov 16 19:31:22 2008 From: cvrebert at gmail.com (Chris Rebert) Date: Sun, 16 Nov 2008 10:31:22 -0800 Subject: [Python-3000] Bytes to Unicode Conversion In-Reply-To: <491efdaa.6105be0a.5f55.ffff9dd4@mx.google.com> References: <491efdaa.6105be0a.5f55.ffff9dd4@mx.google.com> Message-ID: <47c890dc0811161031y397d33aaqbf568983b426fa6c@mail.gmail.com> On Sat, Nov 15, 2008 at 8:46 AM, Pb2Au wrote: > Hello, > > I recently changed from Python 2.5 to Python 3.0 rc2, and have > been trying to find out how to convert byte strings (b"example") > to unicode strings ("example"). I noticed that some of these had > changed in the latest version. > > One reason for a conversion between the two is the urllib.request.urlopen() > feature, which requires the string to be unicode rather than bytes, or else > you would receive an AttributeError error about 'bytes' object having no > attribute 'timeout'. 
The read() attribute of the urllib.request.urlopen() > function returns a byte string, which means I can't parse for information > in the bytes string to use in a second urllib.request.urlopen() function > unless > it was to be converted to unicode first. > > Am I simply overlooking something, or is there a built in function for > converting bytes to unicode? It seems like a function could be created > pretty easily if it has already not, but there isn't much sense in > reinventing the wheel if the function is already there. Already exists. Has for quite a while now: the_unicode = unicode(some_bytes, "name of encoding") Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com > > Thanks for your help. > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: > http://mail.python.org/mailman/options/python-3000/cvrebert%40gmail.com > > From pb2au.alchemy at gmail.com Sun Nov 16 21:13:48 2008 From: pb2au.alchemy at gmail.com (Pb2Au) Date: Sun, 16 Nov 2008 15:13:48 -0500 (Eastern Standard Time) Subject: [Python-3000] Bytes to Unicode Conversion Message-ID: <49207fc0.4403be0a.380f.ffff9c2f@mx.google.com> An HTML attachment was scrubbed... URL: From martin at v.loewis.de Sun Nov 16 21:28:24 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sun, 16 Nov 2008 21:28:24 +0100 Subject: [Python-3000] Bytes to Unicode Conversion In-Reply-To: <49207fc0.4403be0a.380f.ffff9c2f@mx.google.com> References: <49207fc0.4403be0a.380f.ffff9c2f@mx.google.com> Message-ID: <49208268.7070305@v.loewis.de> > I know that it had worked in the version 2.5, Python 3.0 rc2 doesn't > seem to recognize it as a function. 
a) I discourage usage of unicode and str converters; consider using .encode/.decode instead b) unicode is now called str Regards, Martin From cvrebert at gmail.com Sun Nov 16 21:20:45 2008 From: cvrebert at gmail.com (Chris Rebert) Date: Sun, 16 Nov 2008 12:20:45 -0800 Subject: [Python-3000] Bytes to Unicode Conversion In-Reply-To: <49207fc0.4403be0a.380f.ffff9c2f@mx.google.com> References: <49207fc0.4403be0a.380f.ffff9c2f@mx.google.com> Message-ID: <47c890dc0811161220h7bb5f3b4y609d11b017362f16@mail.gmail.com> Ah, my bad. Should never have referred to the Python 2.6 docs. :) Replace "unicode" with "str" in my line of code and I think it should work. Cheers, Chris On Sun, Nov 16, 2008 at 12:13 PM, Pb2Au wrote: > On Sun, Nov 16, 2008 at 4:31 PM, Chris Rebert : wrote: >>> Hello, >>> >>> I recently changed from Python 2.5 to Python 3.0 rc2, and have >>> been trying to find out how to convert byte strings (b"example") >>> to unicode strings ("example"). I noticed that some of these had >>> changed in the latest version. >>> >>> One reason for a conversion between the two is the >>> urllib.request.urlopen() >>> feature, which requires the string to be unicode rather than bytes, or >>> else >>> you would receive an AttributeError error about 'bytes' object having no >>> attribute 'timeout'. The read() attribute of the urllib.request.urlopen() >>> function returns a byte string, which means I can't parse for information >>> in the bytes string to use in a second urllib.request.urlopen() function >>> unless >>> it was to be converted to unicode first. >>> >>> Am I simply overlooking something, or is there a built in function for >>> converting bytes to unicode? It seems like a function could be created >>> pretty easily if it has already not, but there isn't much sense in >>> reinventing the wheel if the function is already there. >>> >>> Thanks for your help. >> >>Already exists. 
Has for quite a while now: >> >>the_unicode = unicode(some_bytes, "name of encoding") >> >>Cheers, >>Chris >>-- >>Follow the path of the Iguana... >>http://rebertia.com > > I know that it had worked in the version 2.5, Python 3.0 rc2 doesn't > seem to recognize it as a function. > > Python 3.0rc2 (r30rc2:67141, Nov 7 2008, 11:43:46) [MSC v.1500 32 bit > (Intel)] > on win32 > Type "help", "copyright", "credits" or "license" for more information. >>>> unicode() > Traceback (most recent call last): > File "", line 1, in > NameError: name 'unicode' is not defined > > -- Follow the path of the Iguana... http://rebertia.com From ncoghlan at gmail.com Sun Nov 16 21:34:01 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 17 Nov 2008 06:34:01 +1000 Subject: [Python-3000] Bytes to Unicode Conversion In-Reply-To: <49207fc0.4403be0a.380f.ffff9c2f@mx.google.com> References: <49207fc0.4403be0a.380f.ffff9c2f@mx.google.com> Message-ID: <492083B9.7070501@gmail.com> Pb2Au wrote: > On Sun, Nov 16, 2008 at 4:31 PM, Chris Rebert : wrote: >> >>Already exists. Has for quite a while now: >> >>the_unicode = unicode(some_bytes, "name of encoding") > > I know that it had worked in the version 2.5, Python 3.0 rc2 doesn't > seem to recognize it as a function. > > Python 3.0rc2 (r30rc2:67141, Nov 7 2008, 11:43:46) [MSC v.1500 32 bit > (Intel)] > on win32 > Type "help", "copyright", "credits" or "license" for more information. >>>> unicode() > Traceback (most recent call last): > File "", line 1, in > NameError: name 'unicode' is not defined unicode becomes str in Py3k (as "type('')" will tell you). bytes.decode() works as well. Use str.encode() to go the other way. Cheers, Nick. 
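
For reference, the conversion discussed in this thread can be sketched as follows. This is a minimal illustration, assuming Python 3 and UTF-8 encoded data; the variable names and the sample bytes are made up for the example and do not come from the original posts.

```python
# Python 3: bytes -> str with bytes.decode(), str -> bytes with str.encode().
# The sample data and the UTF-8 encoding are assumptions for illustration.
raw = b"caf\xc3\xa9"             # e.g. what urlopen(...).read() might return

text = raw.decode("utf-8")       # bytes -> str, as suggested in the thread
also_text = str(raw, "utf-8")    # same result via the str() constructor

back = text.encode("utf-8")      # str -> bytes, the other direction
```

The encoding has to match the one the data was actually written in; decoding with the wrong one typically raises UnicodeDecodeError or yields wrong characters.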
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From pb2au.alchemy at gmail.com Sun Nov 16 21:37:15 2008 From: pb2au.alchemy at gmail.com (Pb2Au) Date: Sun, 16 Nov 2008 15:37:15 -0500 (Eastern Standard Time) Subject: [Python-3000] Bytes to Unicode Conversion Message-ID: <4920853e.e203be0a.7e5e.7c9a@mx.google.com> An HTML attachment was scrubbed... URL: From greno at verizon.net Sun Nov 16 21:25:27 2008 From: greno at verizon.net (Gerry Reno) Date: Sun, 16 Nov 2008 15:25:27 -0500 Subject: [Python-3000] DBAPI Message-ID: <492081B7.2050500@verizon.net> What database API's have support for Python 3.0? From greno at verizon.net Sun Nov 16 21:44:13 2008 From: greno at verizon.net (Gerry Reno) Date: Sun, 16 Nov 2008 15:44:13 -0500 Subject: [Python-3000] DBAPI Message-ID: <4920861D.5030408@verizon.net> What database API's have support for Python 3.0? From skip at pobox.com Mon Nov 17 18:10:54 2008 From: skip at pobox.com (skip at pobox.com) Date: Mon, 17 Nov 2008 11:10:54 -0600 Subject: [Python-3000] Python3 - it's awesome (fwd) Message-ID: <18721.42398.308355.41948@montanaro-dyndns-org.local> Kudos to the Python 3.0 folks from a poster on comp.lang.python. And it's not even been released yet... Cheers, Skip -------------- next part -------------- An embedded message was scrubbed... From: Johannes Bauer Subject: Python3 - it's awesome Date: Mon, 17 Nov 2008 10:30:07 +0100 Size: 5593 URL: From josiah.carlson at gmail.com Mon Nov 17 23:50:06 2008 From: josiah.carlson at gmail.com (Josiah Carlson) Date: Mon, 17 Nov 2008 14:50:06 -0800 Subject: [Python-3000] None in Comparisons In-Reply-To: <491AB8DD.3010908@canterbury.ac.nz> References: <4919835D.5000605@egenix.com> <49199A67.1050400@egenix.com> <4919EB0B.2090004@egenix.com> <491AB37B.8010408@egenix.com> <491AB8DD.3010908@canterbury.ac.nz> Message-ID: On Wed, Nov 12, 2008 at 3:07 AM, Greg Ewing wrote: > M.-A. 
Lemburg wrote: > >> The difference is that None is a singleton, so the set of all >> None type instances is {None}. You always have an intuitive total order >> relation on one element sets: the identity relation. > > I don't see this having much practical consequence, though, > since sorting members of a 1-element set isn't a very > useful thing to do. This discussion smells like a re-hashing of the PEP 326 discussion. While I will (momentarily) lament None's passing as a (more or less) minimal object in Python, I believe that an explicit maximum and minimum value are much preferred over None. And as I stated previously, having a single implementation of the one true maximum or minimum value is much preferable to everyone writing their own (especially with respect to potential bugs). If a max/min value is desired (votes for None comparing smaller are a vote for a max/min value, just with a specific previously-established spelling), then a single implementation should exist. But then we get back into the same discussion that was had before: do we want them, and if so, what do we call them? - Josiah From barry at python.org Tue Nov 18 02:56:11 2008 From: barry at python.org (Barry Warsaw) Date: Mon, 17 Nov 2008 20:56:11 -0500 Subject: [Python-3000] 2.6.1 and 3.0 Message-ID: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Martin suggests, and I agree, that we should release Python 3.0 final and 2.6.1 at the same time. Makes sense to me. That would mean that Python 2.6.1 should be ready on 03-Dec (well, if Python 3.0 is ready then!). I'm still planning the last Python 3.0 release candidate for this Wednesday. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSSIgu3EjvBPtnXfVAQKzGgP/XH2szIZdG9nvZTI2M9iWXuz/tBwH6ncd Kv70ATpttQEg/bmuRp5nSmg1p7hxSmTqu9waq4qdc07IPa+ofTngbunUKkTrbZoo E/r72dGw29pou7B6NVh/g5Db8Tl0yNJBd6vmpEUbCvUDBpljqgxCdj0uw/RiDluj 5Ek2biim7ww= =twyi -----END PGP SIGNATURE----- From facundobatista at gmail.com Tue Nov 18 11:03:02 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Tue, 18 Nov 2008 08:03:02 -0200 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> Message-ID: 2008/11/17 Barry Warsaw : > Martin suggests, and I agree, that we should release Python 3.0 final and > 2.6.1 at the same time. Makes sense to me. That would mean that Python > 2.6.1 should be ready on 03-Dec (well, if Python 3.0 is ready then!). 2.6.1 only two months after 2.6? Why so quickly? Anyway, I don't see any added value in the synchronization, so taking in consideration all the effort you're putting in these releases, I would just want to minimize your workload... which is easier to you? doing both at the same time or not? Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From ncoghlan at gmail.com Tue Nov 18 11:57:18 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 18 Nov 2008 20:57:18 +1000 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> Message-ID: <49229F8E.7080800@gmail.com> Facundo Batista wrote: > 2008/11/17 Barry Warsaw : > >> Martin suggests, and I agree, that we should release Python 3.0 final and >> 2.6.1 at the same time. Makes sense to me. That would mean that Python >> 2.6.1 should be ready on 03-Dec (well, if Python 3.0 is ready then!). > > 2.6.1 only two months after 2.6? Why so quickly? 
>
> Anyway, I don't see any added value in the synchronization, so taking
> in consideration all the effort you're putting in these releases, I
> would just want to minimize your workload... which is easier to you?
> doing both at the same time or not?

There have been several corrections made to the 2to3 conversion tool -
it would be good to get those in developers' hands at the same time that
3.0 final becomes available.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From victor.stinner at haypocalc.com Tue Nov 18 13:14:37 2008
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 18 Nov 2008 13:14:37 +0100
Subject: [Python-3000] 2.6.1 and 3.0
In-Reply-To:
References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org>
Message-ID: <200811181314.37443.victor.stinner@haypocalc.com>

On Tuesday 18 November 2008 11:03:02, Facundo Batista wrote:
> 2008/11/17 Barry Warsaw :
> > Martin suggests, and I agree, that we should release Python 3.0 final and
> > 2.6.1 at the same time. Makes sense to me. That would mean that Python
> > 2.6.1 should be ready on 03-Dec (well, if Python 3.0 is ready then!).
>
> 2.6.1 only two months after 2.6? Why so quickly?

Release Early, Release Often? I love releases :-) I don't like waiting
months to see the bugfixes applied everywhere.

Victor

From lists at cheimes.de Tue Nov 18 14:07:51 2008
From: lists at cheimes.de (Christian Heimes)
Date: Tue, 18 Nov 2008 14:07:51 +0100
Subject: [Python-3000] 2.6.1 and 3.0
In-Reply-To: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org>
References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org>
Message-ID:

Barry Warsaw wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Martin suggests, and I agree, that we should release Python 3.0 final
> and 2.6.1 at the same time. Makes sense to me.
That would mean that > Python 2.6.1 should be ready on 03-Dec (well, if Python 3.0 is ready > then!). Should we release 2.6.1rc1, too? Christian From barry at python.org Tue Nov 18 15:34:44 2008 From: barry at python.org (Barry Warsaw) Date: Tue, 18 Nov 2008 09:34:44 -0500 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> Message-ID: <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 18, 2008, at 5:03 AM, Facundo Batista wrote: > 2008/11/17 Barry Warsaw : > >> Martin suggests, and I agree, that we should release Python 3.0 >> final and >> 2.6.1 at the same time. Makes sense to me. That would mean that >> Python >> 2.6.1 should be ready on 03-Dec (well, if Python 3.0 is ready then!). > > 2.6.1 only two months after 2.6? Why so quickly? Actually, I've wanted to do timed releases, though I think monthly is unrealistic. Maybe every two months is about the right time frame. Timed releases are nice because everybody then knows when a patch is due, from developers to downstream consumers. > Anyway, I don't see any added value in the synchronization, so taking > in consideration all the effort you're putting in these releases, I > would just want to minimize your workload... which is easier to you? > doing both at the same time or not? We're getting releases down to a science now! :) Actually the most painful part is updating the web site, so I plan adding some automation around that process too. OTOH, this is the first point release I'll be doing with the new script, so it'll be interesting to debug that process. As for synchronization, I think it's a good habit to get into, if my plan to do timed releases works out. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSSLShXEjvBPtnXfVAQL90QP/UfWRXr0esTim+DtQJs9Fd/+Lj9PpuBV6 UCB7pAwl8uf7qIAwjDkCsdg3VD/wxzmzuwDAB8T19PF5dNxsrKWdBEzhymVpLU8T cch0Vlaevm6Co/kDp8VhyoKlPs7LDhGkC2G04qDSOETo8Ci84rBOlWd7n1KvUrYZ 01Pn6eZHdqA= =k9FS -----END PGP SIGNATURE----- From barry at python.org Tue Nov 18 15:35:51 2008 From: barry at python.org (Barry Warsaw) Date: Tue, 18 Nov 2008 09:35:51 -0500 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> Message-ID: <175832DD-76C5-4216-AA20-ED7BB62AF53F@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 18, 2008, at 8:07 AM, Christian Heimes wrote: > Barry Warsaw wrote: >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> Martin suggests, and I agree, that we should release Python 3.0 >> final and 2.6.1 at the same time. Makes sense to me. That would >> mean that Python 2.6.1 should be ready on 03-Dec (well, if Python >> 3.0 is ready then!). > > Should we release 2.6.1rc1, too? Do we need rc's for point releases? - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSSLSyHEjvBPtnXfVAQJuMAP/cv59kjeFz5DxMk1hMrwXdNQvSs5Ge0lZ ICC4DeKmz0gXZ0+PoZc4Yi9HBAQ8g7ZfKptzIPnEUrg65wV8KS6OzcK5KX5aptvF Mqi+cmD3TPImsOEGoPnJUtlUZ7ZETrY2LSzdIIFqIE5yO1HBt3ohBcdM95+V2zQl zt0uV+F4fnw= =7N3R -----END PGP SIGNATURE----- From adigunoble at gmail.com Tue Nov 18 16:01:35 2008 From: adigunoble at gmail.com (Owolabi Abdulkareem) Date: Tue, 18 Nov 2008 16:01:35 +0100 Subject: [Python-3000] could range be smarter? 
Message-ID: <4922D8CF.3010800@gmail.com>

Hello,

I'm not really a programmer, but I am working to become one.

I just thought this may be worth considering as a future enhancement to
Python's range.

Have you considered trying out something of this nature with range:

>>> 9000000 in range(20**30)

You'll have to wait for some time to get a response, but if you try
something like this

>>> 900000.7 in range(20**30)

or even

>>> 0.1 in range(20**30)

range iterates through to the end before it gives you a response. It is
clear that a float cannot be in the above range, and 0.1 is less than a
unit integer, so why does the function range *have to iterate through
till the end*, wasting CPU resources, when it is clear that the value is
not in the range?

To avoid this, I have to type check and test that the number is an
integer and not a float.

Failure to do this would cause my program to freeze until the iteration
is done. This makes range inefficient, and one has to remember this
because it does not do the checking for you. And if a step is included
in the range, I have to test it with the % operator.

Putting this in a function:

def Inrange(begin,end,step,number):
    index=0 # Here is where the index of the number is placed if it is in range
    for i in begin,end,step,number:
        if type(i) == type(int(i)): # Check that all parameters are integers
            pass
        else:
            return 'All parameters must be integers'
        try: # Check that the parameters are appropriate to produce a true range
            if (begin < end and step>0):
                assert(begin <= number < end)
            elif begin > end and step < 0:
                assert(begin >= number > end)
            else:
                raise AssertionError
        except AssertionError:
            return 'Check your parameters; range between them is not executable'
        if (number - begin)%step != 0:
            return 'Your number is not in range'
        else:
            index=(number - begin) // step
            return (True,index)

(I'm sorry about my code; any correction will be educating)

Ruby's range does the comparing check but gives an erroneous 'True' with

(0 .. 20*30) include? 7.7482

where 7.7482 is a float (and should not be found in a range of
integers).

I think it would be nice (and more efficient, for there won't be a need
to iterate through for this) if we could do this:

>>> 7.7483 in range(20**30)
Traceback (most recent call last):
  File "", line 1, in
    7.7483 in range(20**30)
TypeError: 'float' object cannot be in a range of integers

>>> 622168 in range(20**30,8)
(True, 77770)

>>> 1219 in range(-181,20**30,8)
(True, 175)

>>> range(-181,20**30,8)[175]
1219

>>> list(range(-181,20**30,8)).index(1219) # as in index = (number - begin) // step
175

In Python 3.0, range(-181,20**30,8) gives:

Traceback (most recent call last):
  File "", line 1, in
    list(range(20**20)).index(20)
OverflowError: Python int too large to convert to C ssize_t

I believe the simple mathematical expression in the function above could
be used to avoid iteration for this and get past this OverflowError
(with some initial test similar to that in the function 'Inrange'
above):

number = index*step + begin
index = (number - begin) // step # making index the subject of the formula

I don't know if something could be done about this now (or if it is
really a good idea), especially when Python 3000 is in its rc2, but I
thought I should say something on this.

Yours

Abdulkareem Owolabi

From lists at cheimes.de Tue Nov 18 15:52:31 2008
From: lists at cheimes.de (Christian Heimes)
Date: Tue, 18 Nov 2008 15:52:31 +0100
Subject: [Python-3000] 2.6.1 and 3.0
In-Reply-To: <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org>
References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org>
 <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org>
Message-ID: <4922D6AF.6080400@cheimes.de>

Barry Warsaw wrote:
> Actually, I've wanted to do timed releases, though I think monthly is
> unrealistic. Maybe every two months is about the right time frame.
> Timed releases are nice because everybody then knows when a patch is
> due, from developers to downstream consumers.

From my point of view bi-monthly release are too much. For a ?.?.1
release two months are fine because several issues are found by 3rd
party authors. But after that a release every quarter is sufficient.

* .1 release two months after the .0 release
* .2, .3, .4 and .5 release every quarter
* about here the next minor release gets out
* .6 and further releases after 6 months when necessary

Christian

From algorias at yahoo.com Tue Nov 18 16:43:11 2008
From: algorias at yahoo.com (Vitor Bosshard)
Date: Tue, 18 Nov 2008 07:43:11 -0800 (PST)
Subject: [Python-3000] could range be smarter?
References: <4922D8CF.3010800@gmail.com>
Message-ID: <49480.68464.qm@web54402.mail.yahoo.com>

Hello,

You're using range objects in ways that were hardly intended. If range
objects had to be magically smart about everything, the language would
be slower overall just to enable a questionable use case. Plain
comparison operators and isinstance or type checks against int will
make your life a lot easier in this case:

>>> a = .1
>>> type(a) is int and 0 <= a < 20**30
False

Vitor

----- Original message ----
> From: Owolabi Abdulkareem
> To: python-3000 at python.org
> Sent: Tuesday, 18 November 2008 12:01:35
> Subject: [Python-3000] could range be smarter?
>
> Hello,
> I'm not really a programmer but I am working to be one
>
> I just thought may be this is worth considering for future enhancement
> with python's range
>
> Have you considered trying out something of this nature with range:
>
> />>> 9000000 in range(20**30)/
>
> you'll have to wait for some time to get a response but if you try
> something like this
>
> />>> 900000.7 in range(20**30)
>
> /or even
>
> />>> 0.1 in range(20**30)/
>
> range iterates through till the end before it gives you a response. It
> is clear that a float could not be in the above range and /0.1/ is less
> than a unit integer, why does the function /range /*have to iterate
> through till the end* wasting CPU resources when it is clear that it is
> not in the range.
>
> To avoid this, I have to /type check /and test if the number is an
> integer and not a float.
>
> Failure to do this would lead my program to freeze until the
> iteration is done. This makes range inefficient and one would have to
> remember this, for it does not do the job of checking for you, and if
> a step is included in the range I have to test it with the % operator:
>
> Putting this in a function:
> /
> def Inrange(begin,end,step,number):
>     index=0 # Here is where the index of the number is placed if is in range
>     for i in begin,end,step,number:
>         if type(i) == type(int(i)): #Check to see all that parameters are all integers
>             pass
>         else:
>             return 'All parameters must be integers'
>         try: #check to see if parameters are appropriate to produce a true range
>             if (begin < end and step>0):
>                 assert(begin <= number < end)
>             elif begin > end and step < 0:
>                 assert(begin >= number > end)
>             else:
>                 raise AssertionError
>
>         except AssertionError:
>
>             return 'Check your parameters ;range between them is not executable'
>         if (number - begin)%step != 0:
>             return 'Your number is not in range'
>         else:
>             index=(number - begin) // step
>
>             return (True,index)
> /
> (I'm sorry about my code; any correction will be educating)
>
> Ruby's range does the comparing check but gives an erroneous 'True' with
>
> /(0 .. 20*30) include? 7.7482/
>
> Where 7.7482 is a float (and should not be found in a range of integers)
>
> I think it would be nice (and more efficient for there won't be a need
> to iterate through for this) if we could do this:
> /
> >>> 7.7483 in range(20**30)
> Traceback (most recent call last):
>   File "", line 1, in
>     7.7483 range(20**30)
> TypeError: 'float' object cannot be in a range of integers
>
> >>> 622168 in range(20**30,8)
> (True, 77770)
>
> >>> 1219 in range(-181,20**30,8)
> (True, 175)
>
> >>> range(-181,20**30,8)[175]
> 1219
>
> >>> list(range(-181,20**30,8)).index[1219] # as in index = (number - begin) // step
> 175/
>
> In python 3.0 range(-181,20**30,8) gives:
> /
> Traceback (most recent call last):
>   File "", line 1, in
>     list(range(20**20)).index(20)
> OverflowError: Python int too large to convert to C ssize_t/
>
> I believe this simple mathematical expression in the function above
> could be used to avoid iteration for this and
> go past this OverflowError (with some initial test similar to that in
> the function 'Inrange' above):
> /
> number = index*step + begin
> index = (number - begin) // step #making index the subject of formula/
>
> I don't know if something could be done about this now (or if it is
> really a good idea) especially when python 3000 is in its rc2 but I
> thought I should say something on this.
>
>
> Yours
>
> Abdulkareem Owolabi
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/algorias%40yahoo.com
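The index arithmetic proposed in the quoted message can be checked without
iterating. What follows is only a minimal sketch and not how range itself is
implemented; the name range_index and its convention of returning None for
"not present" are assumptions made for illustration. It combines the type
check suggested in the reply with the formula index = (number - begin) // step
from the original post.

```python
def range_index(begin, end, step, number):
    """Return the position of number in range(begin, end, step), or None.

    Hypothetical helper: constant-time arithmetic instead of iteration.
    """
    # Same check as in the reply: anything that is not exactly an int
    # (for example 0.1 or 900000.7) cannot be a member.
    if type(number) is not int:
        return None
    if step == 0:
        raise ValueError("step must not be zero")
    # Bounds test, mirrored for negative steps.
    if step > 0:
        in_bounds = begin <= number < end
    else:
        in_bounds = end < number <= begin
    # The number must also sit exactly on a multiple of step from begin.
    if not in_bounds or (number - begin) % step != 0:
        return None
    return (number - begin) // step


idx = range_index(-181, 20**30, 8, 1219)
missing = range_index(0, 20**30, 1, 0.1)
```

With the values from the post, idx agrees with range(-181,20**30,8)[175]
being 1219, and the float is rejected immediately instead of after a full
scan.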
From stargaming at gmail.com Tue Nov 18 17:30:39 2008
From: stargaming at gmail.com (Robert Lehmann)
Date: Tue, 18 Nov 2008 16:30:39 +0000 (UTC)
Subject: [Python-3000] could range be smarter?
References: <4922D8CF.3010800@gmail.com>
Message-ID:

On Tue, 18 Nov 2008 16:01:35 +0100, Owolabi Abdulkareem wrote:

> I just thought may be this is worth considering for future enhancement
> with python's range
>
> Have you considered trying out something of this nature with range:
>
> />>> 9000000 in range(20**30)/
>
> you'll have to wait for some time to get a response

See http://bugs.python.org/issue1766304 for additional discussion on this
topic. Optimizations for this would have to be implemented on the C level.

> I don't know if something could be done about this now (or if it is
> really a good idea) especially when python 3000 is in its rc2 but I
> thought I should say something on this.

I think optimizations can always be applied (though the core developers
might be busy with caring for release blockers rather than optimizations
right now).

-- 
Robert "Stargaming" Lehmann

From LambertDW at Corning.com Tue Nov 18 21:26:32 2008
From: LambertDW at Corning.com (Lambert, David W (S&T))
Date: Tue, 18 Nov 2008 15:26:32 -0500
Subject: [Python-3000] Possible py3k problem.
Message-ID: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com>

Attached program works with

    callback = GSL_FUNCTION(self.f)
    set_with_values(mnzr,callback,xn,fn,xLB,fLB,xUB,fUB)

But core dumps with

    set_with_values(mnzr,GSL_FUNCTION(self.f),xn,fn,xLB,fLB,xUB,fUB)

I do not understand the difference. Must be one of these possibilities:

1) Python3rc1+ is broken, or I don't understand it.

2) ctypes is broken, or I misunderstand.

3) My system---linux on 64 bit opteron---has installation sickness.

4) The actual cause.

Source attached. You'll possibly need to modify the code that loads
gsl.so's.

If it's a python problem, you should know of it.
If it's my problem---I've found the work 'round but I'd like to understand. Thanks! Dave. -------------- next part -------------- A non-text attachment was scrubbed... Name: dump.py Type: application/octet-stream Size: 7383 bytes Desc: dump.py URL: From lists at cheimes.de Tue Nov 18 22:10:14 2008 From: lists at cheimes.de (Christian Heimes) Date: Tue, 18 Nov 2008 22:10:14 +0100 Subject: [Python-3000] Possible py3k problem. In-Reply-To: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> References: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> Message-ID: <49232F36.6080505@cheimes.de> Lambert, David W (S&T) wrote: > Attached program works with > > callback = GSL_FUNCTION(self.f) > set_with_values(mnzr,callback,xn,fn,xLB,fLB,xUB,fUB) > > But core dumps with > > set_with_values(mnzr,GSL_FUNCTION(self.f),xn,fn,xLB,fLB,xUB,fUB) > I had to change your module a bit to load the right library on my system. Here is a backtrace: This GDB was configured as "x86_64-linux-gnu"... (gdb) run dump.py 1 Starting program: /home/heimes/dev/python/py3k/python dump.py 1 [Thread debugging using libthread_db enabled] [New Thread 0x7fb4ed67d6e0 (LWP 18856)] Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fb4ed67d6e0 (LWP 18856)] 0x00007fb4ebf094b0 in ?? () from /usr/lib/libgsl.so.0 (gdb) bt #0 0x00007fb4ebf094b0 in ?? 
() from /usr/lib/libgsl.so.0 #1 0x00007fb4ebf08e07 in gsl_min_fminimizer_iterate () from /usr/lib/libgsl.so.0 #2 0x00007fb4ec67281c in ffi_call_unix64 () at /home/heimes/dev/python/py3k/Modules/_ctypes/libffi/src/x86/unix64.S:75 #3 0x00007fb4ec67222c in ffi_call (cif=0x7ffff569e650, fn=0x7fb4ebf08dd0 , rvalue=0x7ffff569e710, avalue=0x7ffff569e6f0) at /home/heimes/dev/python/py3k/Modules/_ctypes/libffi/src/x86/ffi64.c:430 #4 0x00007fb4ec667a4d in _call_function_pointer (flags=4353, pProc=0x7fb4ebf08dd0 , avalues=0x7ffff569e6f0, atypes=0x7ffff569e6d0, restype=0xb37830, resmem=0x7ffff569e710, argcount=1) at /home/heimes/dev/python/py3k/Modules/_ctypes/callproc.c:803 #5 0x00007fb4ec668665 in _CallProc (pProc=0x7fb4ebf08dd0 , argtuple=0xb73b50, flags=4353, argtypes=0xae9a00, restype=0xb37370, checker=0x0) at /home/heimes/dev/python/py3k/Modules/_ctypes/callproc.c:1150 #6 0x00007fb4ec6616ca in CFuncPtr_call (self=0xa87e60, inargs=0xb73b50, kwds=0x0) at /home/heimes/dev/python/py3k/Modules/_ctypes/_ctypes.c:3775 #7 0x00000000004e819a in PyObject_Call (func=0xa87e60, arg=0xb73b50, kw=0x0) at Objects/abstract.c:2184 From guido at python.org Tue Nov 18 22:11:51 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 18 Nov 2008 13:11:51 -0800 Subject: [Python-3000] Possible py3k problem. In-Reply-To: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> References: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> Message-ID: Core dumps generated using ctypes are not covered by the warrantee. ;-) On Tue, Nov 18, 2008 at 12:26 PM, Lambert, David W (S&T) wrote: > > Attached program works with > > callback = GSL_FUNCTION(self.f) > set_with_values(mnzr,callback,xn,fn,xLB,fLB,xUB,fUB) > > But core dumps with > > set_with_values(mnzr,GSL_FUNCTION(self.f),xn,fn,xLB,fLB,xUB,fUB) > > > I do not understand the difference. Must be one of these possibilities: > > > 1) Python3rc1+ is broken, or I don't understand it. 
> > 2) ctypes is broken, or I misunderstand. > > 3) My system---linux on 64 bit opteron---has installation sickness. > > 4) The actual cause. > > > Source attached. You'll possibly need to modify the code that loads > gsl.so's. > > If it's a python problem, you should know of it. > If it's my problem---I've found the work 'round but I'd like to > understand. > > Thanks! Dave. > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Tue Nov 18 22:29:45 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 19 Nov 2008 07:29:45 +1000 Subject: [Python-3000] Possible py3k problem. In-Reply-To: References: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> Message-ID: <492333C9.805@gmail.com> Guido van Rossum wrote: > Core dumps generated using ctypes are not covered by the warrantee. ;-) True, although it's a little bizarre that the version with the temporary variable works, but the one without it doesn't. Then again, the temp variable does change the timing on the Python side as well as the memory layout, both of which could be relevant to whether the program dumps core or not. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From martin at v.loewis.de Tue Nov 18 23:14:35 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 18 Nov 2008 23:14:35 +0100 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: <175832DD-76C5-4216-AA20-ED7BB62AF53F@python.org> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <175832DD-76C5-4216-AA20-ED7BB62AF53F@python.org> Message-ID: <49233E4B.5070008@v.loewis.de> >> Should we release 2.6.1rc1, too? 
> > Do we need rc's for point releases? We have been doing them in the past, a week before the release. In this case, I could accept a waiver, given that the previous release acts very well as a release candidate for this release. Regards, Martin From martin at v.loewis.de Tue Nov 18 23:17:23 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 18 Nov 2008 23:17:23 +0100 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: <4922D6AF.6080400@cheimes.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> Message-ID: <49233EF3.9040303@v.loewis.de> > From my point of view bi-monthly release are too much. For a ?.?.1 > release two months are fine because several issues are found by 3rd > party authors. But after that a release every quarter is sufficient. > > * .1 release two months after the .0 release > * .2, .3, .4 and .5 release every quarter > * about here the next minor release gets out > * .6 and further releases after 6 months when necessary In the past, we had been striving for releases every 6 month. This was already very difficult to achieve. While I'm happy that Barry has automated his part to a high degree, my part is, unfortunately, much less automated. I could personally automate the build process a bit more, but part of it is also testing of the installers, which is manual. Regards, Martin From stephen at xemacs.org Wed Nov 19 02:55:32 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 19 Nov 2008 10:55:32 +0900 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <49233EF3.9040303@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> Message-ID: <87wsf0mqcr.fsf@xemacs.org> "Martin v. 
Löwis" writes:

 > While I'm happy that Barry has automated his part to a high degree,
 > my part is, unfortunately, much less automated. I could personally
 > automate the build process a bit more, but part of it is also testing
 > of the installers, which is manual.

Maybe you could delegate a lot of the testing to competent volunteers?
That would be probably 2 times as much work the first couple of times,
(you'd need to formalize your "script" for testing[1] and then check
that the volunteers are understanding it correctly, etc) but if they
are reliable you could turn that around a lot faster in the future.

Footnotes:
[1] Doesn't Windows have a way to send synthetic GUI events to a
program? There ought to be a way to really script that, as the Python
installer process presumably doesn't change much from release to release.

From mike_mp at zzzcomputing.com Wed Nov 19 06:18:51 2008
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Wed, 19 Nov 2008 00:18:51 -0500
Subject: [Python-3000] Python3 - it's awesome (fwd)
In-Reply-To: <18721.42398.308355.41948@montanaro-dyndns-org.local>
References: <18721.42398.308355.41948@montanaro-dyndns-org.local>
Message-ID:

Seconded. I tried Python 3K for the first time this weekend, spent a
few hours with the 2to3 tool and we have 80% of SQLAlchemy unit tests
passing on 3.0 now. It was far easier than I'd hoped, and the decisions
made in PY3K make perfect sense to me. It's a better language and I
think it will become popular more quickly than we've all thought. Great
job to everyone on the list here who's spent many months hammering out
all the details!

To be determined on our end is how to maintain 2.XX and 3.XX branches,
either through an automated 2to3 process, or by maintaining separate
branches. I'm leaning towards the former, possibly by augmenting 2to3
with specially annotated comments that give hints to particularly thorny
sections. As I go through the code base making post-2to3 manual fixes,
I'm adding in comments denoting the manual changes which I hope to turn
into....something.

It will be critical that we get DBAPI implementations going soon; other
than pysqlite I haven't perceived any activity in that area. It will be
interesting to see if we remain with the mainstays of MySQLdb, psycopg2,
cx_oracle, or if new 3.0-era contenders come on the scene.

On Nov 17, 2008, at 12:10 PM, skip at pobox.com wrote:

> Kudos to the Python 3.0 folks from a poster on comp.lang.python. And
> it's not even been released yet...
>
> Cheers,
>
> Skip
>
>
> From: Johannes Bauer
> Date: November 17, 2008 4:30:07 AM EST
> To: python-list at python.org
> Subject: Python3 - it's awesome
>
>
> Hello list,
>
> since I've read so much about Python 3 and ran into some trouble which
> was supposed to be fixed with 3k, I yesterday came around to compile it
> and try it out.
>
> To sum it up: It's awesome. All the promised things like Unicode support
> "just work", all significant changes in the language (try/except-Syntax,
> float division, file opening, print) I've encountered so far made
> absolute sense and it was quite easy to change over (I've been using 2.5
> so far). It was really easy to install it locally as my user (I want to
> try it out some more before I install it system-wide).
>
> So, why I am posting this here: First, thanks to the great work to
> anyone who has contributed. Secondly, I want to encourage anyone who
> hasn't tried Python3 out yet to do - it really is easier than you think
> and you won't be disappointed.
>
> Kind regards,
> Johannes
>
> --
> "My counterclaim against you will then be for deliberate mendacity,
> slander of God, the Bible and me, and deliberate blasphemy."
> -- Prophet and visionary Hans Joss aka HJP in de.sci.physik
> <48d8bf1d$0$7510$5402220f at news.sunrise.ch>
> --
> http://mail.python.org/mailman/listinfo/python-list

From martin at v.loewis.de Wed Nov 19 07:02:23 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 19 Nov 2008 07:02:23 +0100
Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0
In-Reply-To: <87wsf0mqcr.fsf@xemacs.org>
References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org>
Message-ID: <4923ABEF.50900@v.loewis.de>

> > While I'm happy that Barry has automated his part to a high degree,
> > my part is, unfortunately, much less automated. I could personally
> > automate the build process a bit more, but part of it is also testing
> > of the installers, which is manual.
>
> Maybe you could delegate a lot of the testing to competent volunteers?

That's not the issue - I don't mind spending that time. However, it
means that several hours pass between starting the release process, and
making the binaries available - during this time, users always complain
why the Windows binaries are not released yet.

With additional volunteers, availability of the binaries would lag even
more behind the release announcement.

> [1] Doesn't Windows have a way to send synthetic GUI events to a
> program? There ought to be a way to really script that, as the Python
> installer process presumably doesn't change much from release to release.

I also need to involve different machines, e.g.
XP machines and Vista machines, and machines that have Visual Studio
installed and machines that don't. Plus, I need to log into each machine
in different ways: as admin user and non-admin user. The automated GUI
testing only really works for a logged-in user.

Regards,
Martin

From stephen at xemacs.org Wed Nov 19 08:32:18 2008
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 19 Nov 2008 16:32:18 +0900
Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0
In-Reply-To: <4923ABEF.50900@v.loewis.de>
References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de>
Message-ID: <87iqqkmarh.fsf@xemacs.org>

"Martin v. Löwis" writes:

 > That's not the issue - I don't mind spending that time. However, it
 > means that several hours pass between starting the release process,
 > and making the binaries available - during this time, users always
 > complain why the Windows binaries are not released yet.

For "several hours" delay? Shame on the complainers! Ubuntu and
MacPorts users have to wait days or weeks for installers. Debian
stable users, years!

My understanding was that the biggest problem with keeping to a 6-month
cycle has always been that it's still a long enough time frame that
people will rush to get an 80%-done project into the release just
before deadline, causing extra reviewing effort for the senior
committers and effort and delays for everyone for bug fixing. One month
is probably short enough that people will be willing to submit things
at a more appropriate stage in development.

Still, it's the review and polishing-up effort that is the bottleneck,
it seems to me. Not the installers.
From walter at livinglogic.de Wed Nov 19 09:19:54 2008 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed, 19 Nov 2008 09:19:54 +0100 Subject: [Python-3000] Python3 - it's awesome (fwd) In-Reply-To: References: <18721.42398.308355.41948@montanaro-dyndns-org.local> Message-ID: <4923CC2A.9080406@livinglogic.de> Michael Bayer wrote: > [...] > It will be critical that we get DBAPI implementations going soon, other > than pysqlite I haven't perceived any activity in that area. It will > be interesting to see if we remain with the maintsays of MySQLdb, > psycopg2, cx_oracle, or if new 3.0-era contenders come on the scene. Anthony Tuininga is currently updating cx_Oracle for Python 3.0. See for example this checkin: http://cx-oracle.svn.sourceforge.net/viewvc/cx-oracle?view=rev&revision=175 Servus, Walter From victor.stinner at haypocalc.com Wed Nov 19 10:21:16 2008 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 19 Nov 2008 10:21:16 +0100 Subject: [Python-3000] Possible py3k problem. In-Reply-To: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> References: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> Message-ID: <200811191021.17224.victor.stinner@haypocalc.com> > Attached program works with GSL is needed. Debian package: libgsl0-dev dump.py works correctly on computer: - Debian Sid - python 3.0 trunk - i386 Problem specific to x86_64? Where is the issue? :-) Victor From python-3000 at udmvt.ru Wed Nov 19 10:07:44 2008 From: python-3000 at udmvt.ru (python-3000 at udmvt.ru) Date: Wed, 19 Nov 2008 13:07:44 +0400 Subject: [Python-3000] Possible py3k problem. 
In-Reply-To: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> References: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> Message-ID: <20081119090744.GK9225@ruber.office.udmvt.ru> On Tue, Nov 18, 2008 at 03:26:32PM -0500, Lambert, David W (S&T) wrote: > > Attached program works with > > callback = GSL_FUNCTION(self.f) > set_with_values(mnzr,callback,xn,fn,xLB,fLB,xUB,fUB) > > But core dumps with > > set_with_values(mnzr,GSL_FUNCTION(self.f),xn,fn,xLB,fLB,xUB,fUB) > > > I do not understand the difference. Must be one of these possibilities: It would be interesting to know if slightly modified first version works too: set_with_values(mnzr,(typeof(callback))GSL_FUNCTION(self.f),xn,fn,xLB,fLB,xUB,fUB) I'm not sure, that it will segfault too. But can't test it on Opteron, sorry for that. To clear out the situation, just want to ask, what is the type of callback variable and how it differs from the type of GSL_FUNCTION(...) expression. From ncoghlan at gmail.com Wed Nov 19 10:28:28 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 19 Nov 2008 19:28:28 +1000 Subject: [Python-3000] Python3 - it's awesome (fwd) In-Reply-To: References: <18721.42398.308355.41948@montanaro-dyndns-org.local> Message-ID: <4923DC3C.80609@gmail.com> Michael Bayer wrote: > Seconded. I tried Python 3K for the first time this weekend, spent a > few hours with the 2to3 tool and we have 80% of SQLAlchemy unit tests > passing on 3.0 now. It was far easier than I'd hoped, and the > decisions made in PY3K make perfect sense to me. Its a better language > and I think it will become popular more quickly than we've all thought. > Great job to everyone on the list here who's spent many months > hammering out all the details ! > > To be determined on our end is how to maintain 2.XX and 3.XX branches, > either through an automated 2to3 process, or by maintaining separate > branches. 
I'm leaning towards the former, possibly by augmenting 2to3 > with specially annotated comments that give hints to particularly thorny > sections. As I go through the code base making post-2to3 manual fixes, > I'm adding in comments denoting the manual changes which I hope to turn > into....something. Personally, I think some kind of doctest-style comment based hints or directives for 2to3 could be very useful in helping folks to automate the generation of the 3.0 versions of their code. But it will take feedback from those doing the conversions to determine what kind of directives would actually be helpful (if any). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From lists at cheimes.de Wed Nov 19 10:35:01 2008 From: lists at cheimes.de (Christian Heimes) Date: Wed, 19 Nov 2008 10:35:01 +0100 Subject: [Python-3000] Possible py3k problem. In-Reply-To: <200811191021.17224.victor.stinner@haypocalc.com> References: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> <200811191021.17224.victor.stinner@haypocalc.com> Message-ID: Victor Stinner wrote: >> Attached program works with > > GSL is needed. Debian package: libgsl0-dev > > dump.py works correctly on computer: > - Debian Sid > - python 3.0 trunk > - i386 > > Problem specific to x86_64? > > Where is the issue? :-) How did you run dump.py? It crashes only with "python3.0 dump.py 1" Christian From facundobatista at gmail.com Wed Nov 19 10:53:55 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Wed, 19 Nov 2008 07:53:55 -0200 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: <49233EF3.9040303@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> Message-ID: 2008/11/18 "Martin v. 
Löwis" :
> While I'm happy that Barry has automated his part to a high degree,
> my part is, unfortunately, much less automated. I could personally
> automate the build process a bit more, but part of it is also testing
> of the installers, which is manual.

Martin, maybe we can help you with the installers testing.

While I don't have a clue about compiling complex software in Windows
(and also want to stay away from that), I have a virtualbox with a win
xp in my workstation, so I could try an installer.

Maybe you could put a wiki page somewhere with a small recipe about
what to look for when testing an installer, and then produce all the
versions, upload to it, and alert us here. So we go, download one of
them, try it, and then mark it as tested with our name (maybe we can
look after two or three testers per installer).

I don't know if this will be quicker, but it surely will lower your
workload regarding this, which is a good thing.

Regards,

-- 
. Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From victor.stinner at haypocalc.com Wed Nov 19 12:13:41 2008
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 19 Nov 2008 12:13:41 +0100
Subject: [Python-3000] Possible py3k problem.
In-Reply-To: <200811191021.17224.victor.stinner@haypocalc.com>
References: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> <200811191021.17224.victor.stinner@haypocalc.com>
Message-ID: <200811191213.41892.victor.stinner@haypocalc.com>

On Wednesday 19 November 2008 10:21:16, Victor Stinner wrote:
> > Attached program works with
>
> GSL is needed. Debian package: libgsl0-dev
>
> dump.py works correctly on computer:

Ooops, "./python dump.py" is ok but "./python dump.py 1" does crash (on
i386 and x86_64).

On i386, ffi_closure_SYSV_inner() crashes at "cif = closure->cif;"
because closure is NULL.
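A NULL closure at that point would be consistent with the callback object
having been freed before the C library called it. That is only a guess at
this stage, since the attached dump.py was scrubbed from the archive. The
sketch below shows one common way to keep a ctypes callback alive: store it
on a Python object that lives at least as long as C code may call it.
CALLBACK, Minimizer and evaluate are hypothetical names, and the callback is
invoked from Python rather than from GSL so that the example runs without
the library.

```python
import ctypes

# Hypothetical callback type: double f(double). The real GSL_FUNCTION
# from dump.py is not shown in the thread.
CALLBACK = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)


class Minimizer:
    """Owns the callback so it is not collected while still in use."""

    def __init__(self, func):
        # Keep a reference on the instance. A callback object created
        # inline in a call expression has no owner after that call returns.
        self._callback = CALLBACK(func)

    def evaluate(self, x):
        # Stand-in for C code calling the stored function pointer later.
        return self._callback(x)


m = Minimizer(lambda x: (x - 2.0) ** 2)
result = m.evaluate(5.0)
```

The design choice is simply ownership: whoever hands a function pointer to C
also holds the Python object behind it for as long as C might use it.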
Victor

From fuzzyman at voidspace.org.uk Wed Nov 19 12:16:56 2008
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 19 Nov 2008 11:16:56 +0000
Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0
In-Reply-To: <4923ABEF.50900@v.loewis.de>
References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de>
Message-ID: <4923F5A8.5010505@voidspace.org.uk>

Martin v. Löwis wrote:
>> > While I'm happy that Barry has automated his part to a high degree,
>> > my part is, unfortunately, much less automated. I could personally
>> > automate the build process a bit more, but part of it is also testing
>> > of the installers, which is manual.
>>
>> Maybe you could delegate a lot of the testing to competent volunteers?
>>
>
> That's not the issue - I don't mind spending that time. However, it
> means that several hours pass between starting the release process, and
> making the binaries available - during this time, users always complain
> why the Windows binaries are not released yet.
>
> With additional volunteers, availability of the binaries would lag even
> more behind the release announcement.
>
>

Installer tests can definitely be automated, and there is also a Python
API to the virtualbox VM. I wonder if it would be possible to automate
testing all the installers in various scenarios - each running
simultaneously in a VM.

Michael

>> [1] Doesn't Windows have a way to send synthetic GUI events to a
>> program? There ought to be a way to really script that, as the Python
>> installer process presumably doesn't change much from release to release.
>>
>
> I also need to involve different machines, e.g. XP machines and Vista
> machines, and machines that have Visual Studio installed and machines
> that don't. Plus, I need to log into each machine in different ways:
> as admin user and non-admin user.
The automated GUI testing only really > works for a logged-in user. > > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > -- http://www.ironpythoninaction.com/ From sidnei at enfoldsystems.com Wed Nov 19 03:27:22 2008 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Wed, 19 Nov 2008 00:27:22 -0200 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <87wsf0mqcr.fsf@xemacs.org> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> Message-ID: On Tue, Nov 18, 2008 at 11:55 PM, Stephen J. Turnbull wrote: > Footnotes: > [1] Doesn't Windows have a way to send synthetic GUI events to a > program? There ought to be a way to really script that, as the Python > installer process presumbly doesn't change much from release to release. There's at least PyWinAuto[1], Watsup[2] and winGuiAuto[3]. [1] http://pywinauto.seleniumhq.org/ [2] http://www.tizmoi.net/watsup/intro.html [3] http://www.brunningonline.net/simon/blog/archives/winGuiAuto.py.html -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 Skype zopedc From stephen at xemacs.org Wed Nov 19 15:43:44 2008 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Wed, 19 Nov 2008 23:43:44 +0900 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <4923F5A8.5010505@voidspace.org.uk> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <4923F5A8.5010505@voidspace.org.uk> Message-ID: <87bpwbn5cv.fsf@xemacs.org> Michael Foord writes: > Installer tests can definitely be automated, and there is also a Python > API to the virtualbox VM. I wonder if it would be possible to automate > testing all the installers in various scenarios - each running > simultaneously in a VM. Now that would be an impressive tour de force! From gzlist at googlemail.com Wed Nov 19 15:39:43 2008 From: gzlist at googlemail.com (Martin (gzlist)) Date: Wed, 19 Nov 2008 14:39:43 +0000 Subject: [Python-3000] Possible py3k problem. In-Reply-To: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> References: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> Message-ID: On 18/11/2008, Lambert, David W (S&T) wrote: > > Attached program works with > > callback = GSL_FUNCTION(self.f) > set_with_values(mnzr,callback,xn,fn,xLB,fLB,xUB,fUB) > > But core dumps with > > set_with_values(mnzr,GSL_FUNCTION(self.f),xn,fn,xLB,fLB,xUB,fUB) This is covered in the documentation, isn't it? Important note for callback functions: Make sure you keep references to CFUNCTYPE objects as long as they are used from C code. ctypes doesn't, and if you don't, they may be garbage collected, crashing your program when a callback is made. Martin From victor.stinner at haypocalc.com Wed Nov 19 15:56:38 2008 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 19 Nov 2008 15:56:38 +0100 Subject: [Python-3000] Possible py3k problem. 
In-Reply-To:
References: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com>
Message-ID: <200811191556.38126.victor.stinner@haypocalc.com>

On Wednesday 19 November 2008 15:39:43, Martin (gzlist) wrote:
> This is covered in the documentation, isn't it?
>
>
>
> Important note for callback functions:
>
> Make sure you keep references to CFUNCTYPE objects as long as they are
> used from C code. ctypes doesn't, and if you don't, they may be
> garbage collected, crashing your program when a callback is made.

Oh yes, I remember this problem... It took me hours/days to understand
the problem.

Is there a FAQ for ctypes? To list the most common problems.

The bug is in the documentation :-)

Victor

From barry at python.org Wed Nov 19 16:11:32 2008
From: barry at python.org (Barry Warsaw)
Date: Wed, 19 Nov 2008 10:11:32 -0500
Subject: [Python-3000] 2.6.1 and 3.0
In-Reply-To: <4922D6AF.6080400@cheimes.de>
References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de>
Message-ID: <352DBA22-D814-43C5-84F8-BD9AEE756DCD@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Nov 18, 2008, at 9:52 AM, Christian Heimes wrote:

> Barry Warsaw wrote:
>> Actually, I've wanted to do timed releases, though I think monthly
>> is unrealistic. Maybe every two months is about the right time
>> frame. Timed releases are nice because everybody then knows when a
>> patch is due, from developers to downstream consumers.
>
> From my point of view bi-monthly release are too much. For a ?.?.1
> release two months are fine because several issues are found by 3rd
> party authors. But after that a release every quarter is sufficient.
> > * .1 release two months after the .0 release > * .2, .3, .4 and .5 release every quarter > * about here the next minor release gets out > * .6 and further releases after 6 months when necessary Timed releases have a lot of advantages, and I would like to see if we can adopt them and realize these benefits. What I like most about them is that everyone knows what's happening when and can coordinate efforts. Developers will know automatically (no reminders or alarms) when the next release is happening, so they can schedule what they want to do more easily. Release experts can block out the appropriate time on their schedules and plan more efficiently. Downstream consumers have a better idea of when updates are available and can lobby for certain critical bugs to be fixed in a timely and predictable manner. I think 6 months is too long between releases -- it might as well not be timed. It sounds like the Windows side is a bit of a pain, and since we're all busy, one month is probably too soon. That's why I proposed bi-monthly. I really want our releases to be predictable. I don't think we have to worry about nothing getting committed to the trees in 2 months time. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSSQspHEjvBPtnXfVAQIfwwP8DzaIge8b1rL9/zACiwZ5nOn9S5d+ng+p zjSSvDKgfxX5kEMfUQQuJgI6GIOPvUm0wsmdZnH5f5AD86/1Qz1ugsBkHXO6BWWl LEI2jNjsIU9m1icQkQSnENxJoI5BFFA9upewT1zwo9md4cErzQLiK+WQrblu1hXE GKaxW0Xrva4= =ZI9e -----END PGP SIGNATURE----- From barry at python.org Wed Nov 19 16:17:44 2008 From: barry at python.org (Barry Warsaw) Date: Wed, 19 Nov 2008 10:17:44 -0500 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <175832DD-76C5-4216-AA20-ED7BB62AF53F@python.org> Message-ID: <7D574FB8-8806-4670-813B-69C9F6CB810E@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 18, 2008, at 12:46 PM, Georg Brandl wrote: > Barry Warsaw schrieb: >> On Nov 18, 2008, at 8:07 AM, Christian Heimes wrote: >> >>> Barry Warsaw wrote: >>>> -----BEGIN PGP SIGNED MESSAGE----- >>>> Hash: SHA1 >>>> Martin suggests, and I agree, that we should release Python 3.0 >>>> final and 2.6.1 at the same time. Makes sense to me. That would >>>> mean that Python 2.6.1 should be ready on 03-Dec (well, if Python >>>> 3.0 is ready then!). >> >>> Should we release 2.6.1rc1, too? >> >> Do we need rc's for point releases? > > I think we did them in the past. There probably never was a > significant > change between the rc and the final, but Murphy dictates that once you > stop doing the rc, the final will be embarrassingly broken :) True. If the rc's are actually tested and help avoid embarrassment I'm all for them. If it's just extra work that few will test, then let's skip them and just do brown bag releases if necessary. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSSQuGHEjvBPtnXfVAQJCfwQAky+ORhR0LaoZ0nevGBkEkl5LZbP0+A4a p0puGCnxuY6DVqx38dJUPLqt+wle+Zw9QX4PhhaalbTWyOQScKQk0p0CxagLnTeG GvlyTQLUM9RxFzolnzcY8mU8ewGnCJp16d7TR40AmMZ/geV/xMDzxL+tPKwiq/5p C4j+VmFHnMU= =EGqf -----END PGP SIGNATURE----- From barry at python.org Wed Nov 19 16:27:42 2008 From: barry at python.org (Barry Warsaw) Date: Wed, 19 Nov 2008 10:27:42 -0500 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: <49233EF3.9040303@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> Message-ID: <3C7340EB-C07C-4E0A-BA44-CB536012FEBF@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 18, 2008, at 5:17 PM, Martin v. Löwis wrote: >> From my point of view bi-monthly release are too much. For a ?.?.1 >> release two months are fine because several issues are found by 3rd >> party authors. But after that a release every quarter is sufficient. >> >> * .1 release two months after the .0 release >> * .2, .3, .4 and .5 release every quarter >> * about here the next minor release gets out >> * .6 and further releases after 6 months when necessary > > In the past, we had been striving for releases every 6 month. > This was already very difficult to achieve. > > While I'm happy that Barry has automated his part to a high degree, > my part is, unfortunately, much less automated. I could personally > automate the build process a bit more, but part of it is also testing > of the installers, which is manual. Martin, I'm keen on figuring out a way to reduce your workload, and also to coordinate releases better between us. I /think/ with timed releases I can tag a little early and give you something to work on so that the actual release is a matter of fiddling web pages and sending an announcement.
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSSQwbnEjvBPtnXfVAQIOuAP/fxzFpp886TordGNdd4tusqasL/VK2lpr wbhcfwh5TQbVhkhi9CVUFa7BNXCpgxG1nqWT9+ynSdNKIYKnK8kkjRE7FhEYantP TYkuRGI+2DznKjRBtVNXJq+JNktARWKhQwFkc0AmqooCYvhxqt9T5AkEgN4jRn4s YBLaex4g3rA= =Oi0b -----END PGP SIGNATURE----- From LambertDW at Corning.com Wed Nov 19 16:51:55 2008 From: LambertDW at Corning.com (Lambert, David W (S&T)) Date: Wed, 19 Nov 2008 10:51:55 -0500 Subject: [Python-3000] Possible py3k problem resolved. Message-ID: <84B204FFB016BA4984227335D8257FBA2738E9@CVCV0XI05.na.corning.com> Martin's "read the manual" answer is quite satisfying. Unfortunately, it doesn't fix the core dump of Issue4309. I became excited thinking the problems could be related. I appreciate your insight, Dave. From steve at holdenweb.com Wed Nov 19 18:51:42 2008 From: steve at holdenweb.com (Steve Holden) Date: Wed, 19 Nov 2008 12:51:42 -0500 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <4923ABEF.50900@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> Message-ID: Martin v. Löwis wrote: >> > While I'm happy that Barry has automated his part to a high degree, >> > my part is, unfortunately, much less automated. I could personally >> > automate the build process a bit more, but part of it is also testing >> > of the installers, which is manual. >> >> Maybe you could delegate a lot of the testing to competent volunteers? > > That's not the issue - I don't mind spending that time. However, it > means that several hours pass between starting the release process, and > making the binaries available - during this time, users always complain > why the Windows binaries are not released yet. > In which case why not just hold the release until all installers are available?
It's not like Beaujolais Nouveau, with people racing to be the first to get a new release installed. Particularly since the final release is usually just the re-tagged release candidate. Or are the complainers Python developers who know what goes on behind the scenes? > With additional volunteers, availability of the binaries would lag even > more behind the release announcement. > I really appreciate the dedicated work you put in to the Windows installers (as I am sure many others do also), but I wouldn't want to saddle you with it indefinitely. How well is the procedure documented? I ask this in hopes that you aren't a potential single point of failure in the release process. ...] regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ From martin at v.loewis.de Wed Nov 19 19:52:08 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 19 Nov 2008 19:52:08 +0100 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> Message-ID: <49246058.9090009@v.loewis.de> > Martin, maybe we can help you with the installers testing. Thanks for the offer. See my other message, though - this is not the point. If everything goes well, offloading testing just means that I have to wait some time for the testers to come back, and do other stuff meanwhile. For the majority of alpha and beta releases, something went wrong each time. A file was forgotten to be included in the installer generator, causing it to be missing on the target system. I forgot to perform a manual build step, causing the installer to fail, and so on. Then I have to debug the problem, and restart the production process from scratch. 
Offloading to testers in this case would just mean that I wait much longer until I can release, and it might not be possible to complete the build within a single day. > I don't know if this will be quicker, but surely will lower your > workload regarding this, which is a good thing. Thanks again - but I do typically find the time to do the release (if not, it gets delayed by another day). Regards, Martin From martin at v.loewis.de Wed Nov 19 20:02:04 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 19 Nov 2008 20:02:04 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <4923F5A8.5010505@voidspace.org.uk> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <4923F5A8.5010505@voidspace.org.uk> Message-ID: <492462AC.6090601@v.loewis.de> > Installer tests can definitely be automated, and there is also a Python > API to the virtualbox VM. I wonder if it would be possible to automate > testing all the installers in various scenarios - each running > simultaneously in a VM. I do use VMs, yes. However, they don't run on my workstation - which is 32-bit XP. It might be possible to automate it, but IMO, the effort of setting this up would be higher than the actual time spent in doing it manually, assuming we have no more than a dozen releases per year. Regards, Martin From theller at ctypes.org Wed Nov 19 20:15:00 2008 From: theller at ctypes.org (Thomas Heller) Date: Wed, 19 Nov 2008 20:15:00 +0100 Subject: [Python-3000] Possible py3k problem.
In-Reply-To: <200811191556.38126.victor.stinner@haypocalc.com> References: <84B204FFB016BA4984227335D8257FBA2738E8@CVCV0XI05.na.corning.com> <200811191556.38126.victor.stinner@haypocalc.com> Message-ID: Victor Stinner wrote: > On Wednesday 19 November 2008 15:39:43, Martin (gzlist) wrote: >> This is covered in the documentation, isn't it? >> >> >> >> Important note for callback functions: >> >> Make sure you keep references to CFUNCTYPE objects as long as they are >> used from C code. ctypes doesn't, and if you don't, they may be >> garbage collected, crashing your program when a callback is made. > > Oh yes, I remember this problem... It took me hours/days to understand the > problem. > > Is there a FAQ for ctypes? To list the most common problems. The bug is in the > documentation :-) Ok, so can someone please contribute a patch for the docs to fix this 'bug'? Or write a ctypes FAQ somewhere (where?) -- Thanks, Thomas From martin at v.loewis.de Wed Nov 19 20:18:31 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 19 Nov 2008 20:18:31 +0100 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: <3C7340EB-C07C-4E0A-BA44-CB536012FEBF@python.org> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <3C7340EB-C07C-4E0A-BA44-CB536012FEBF@python.org> Message-ID: <49246687.2040204@v.loewis.de> > Martin, I'm keen on figuring out a way to reduce your workload, and also > to coordinate releases better between us. I /think/ with timed releases > I can tag a little early and give you something to work on so that the > actual release is a matter of fiddling web pages and sending an > announcement. Again - the work load is not so much an issue at the moment, and I expect it to be reduced even after 3.0 is finally released and 2.5 retired. I would indeed appreciate tighter coordination.
Anthony's process differed from yours primarily in him waiting for the release announcements until the binaries are actually available. That might mean that a day or two might pass, but it did help to remove the feeling of working under tight deadlines. Regards, Martin From theller at ctypes.org Wed Nov 19 20:27:24 2008 From: theller at ctypes.org (Thomas Heller) Date: Wed, 19 Nov 2008 20:27:24 +0100 Subject: [Python-3000] Possible py3k problem resolved. In-Reply-To: <84B204FFB016BA4984227335D8257FBA2738E9@CVCV0XI05.na.corning.com> References: <84B204FFB016BA4984227335D8257FBA2738E9@CVCV0XI05.na.corning.com> Message-ID: Lambert, David W (S&T) wrote: > Martin's "read the manual" answer is quite satisfying. > Unfortunately, it doesn't fix core dump of Issue4309. > I became excited thinking the problems could be related. Issue4309 is basically this problem calling the printf function:

    libc = CDLL(find_library("c"))
    printf = libc.printf
    printf("foo\n")
    - segfault -

> I appreciate your insight, > Dave. The behaviour is explained in the docs, but you have to read it very 'carefully' (and maybe it should be changed for python 3): "None, integers, byte strings and unicode strings are the only native Python objects that can directly be used as parameters in these function calls. None is passed as a C NULL pointer, byte strings and unicode strings are passed as pointer to the memory block that contains their data (char * or wchar_t *). Python integers are passed as the platforms default C int type, their value is masked to fit into the C type." When you write printf("foo\n") you are passing a /unicode string/ in Python 3, a /byte string/ in Python 2.x. In Python 3 the printf C-function thus will receive a 'wchar_t *' pointer and so may crash or produce garbage since it must be called with a 'char *' pointer. These are the possible solutions for py3: 1.
set the argtypes attribute for the printf function, and the first argument will automatically be converted to a valid 'char *' pointer, or an exception is raised if no conversion is possible:

    printf.argtypes = [c_char_p]
    printf("foo\n");
    printf(b"foo\n");
    ...

2. Pass a bytes string instead of a (normal, unicode) string
3. Wrap the argument into a c_char_p() instance

You have already mentioned the latter two solutions in the bug tracker. The first solution has the advantage that it works for Python 2.x AND Python 3, and also has other advantages. -- Thanks, Thomas From barry at python.org Wed Nov 19 20:37:36 2008 From: barry at python.org (Barry Warsaw) Date: Wed, 19 Nov 2008 14:37:36 -0500 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: <49246687.2040204@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <3C7340EB-C07C-4E0A-BA44-CB536012FEBF@python.org> <49246687.2040204@v.loewis.de> Message-ID: <6B34AD98-187C-4876-A410-E0A843EEF2C4@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 19, 2008, at 2:18 PM, Martin v. Löwis wrote: >> Martin, I'm keen on figuring out a way to reduce your workload, and >> also >> to coordinate releases better between us. I /think/ with timed >> releases >> I can tag a little early and give you something to work on so that >> the >> actual release is a matter of fiddling web pages and sending an >> announcement. > > Again - the work load is not so much an issue at the moment, and I > expect it to be reduced even after 3.0 is finally released and 2.5 > retired. > > I would indeed appreciate tighter coordination. Anthony's process > differed from yours primarily in him waiting for the release > announcements until the binaries are actually available. That might > mean that a day or two might pass, but it did help to remove the > feeling of working under tight deadlines.
Let's try this for 3.0rc4 then. I think all it means is that I won't push the new pages or make the announcement until you verify that the Windows builds are ready and available. We can still use python-committers to coordinate when that will happen, and I'll still do all the release mechanics from my end as normal. It's okay if the announcement happens Friday or over the weekend. I will also try to get up early to do the release before my work day starts, to better coordinate with Euro time. So expect me on #python-dev tomorrow (my morning). Will that work for you? - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSSRrAnEjvBPtnXfVAQJiDwP/ZcbHnwkvWligaP2a3OXEmZ30GZoG1NQn +Lj/j4YNANkhxZ4Vgg9gkMH3mQ+eTwWdqr1/VM3LTW+fFXhdtAaAG1NsvHAlkAE0 N+DgEOEv4aMuO6MZplv/1kh4WeFC7SsnEX7bLext0QZITdBaL65dUN8Kt8G/ZeTG w3lQ01nBFqY= =InnO -----END PGP SIGNATURE----- From martin at v.loewis.de Wed Nov 19 20:40:38 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 19 Nov 2008 20:40:38 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> Message-ID: <49246BB6.7000607@v.loewis.de> > In which case why not just hold the release until all installers are > available? That is how Anthony Baxter handled things, indeed, and I would appreciate if we would return to that procedure. > Or are the complainers Python developers who know what goes on behind > the scenes? No - typically outsiders, who report that the links are broken (if the links get updated and the files are missing), or that the links are old (if the links are not updated).
I think these people also try to be helpful (in addition to being frustrated that the release announcement is meaningless to them, and that they have to poll the release page). >> With additional volunteers, availability of the binaries would lag even >> more behind the release announcement. >> > I really appreciate the dedicated work you put in to the Windows > installers (as I am sure many others do also), but I wouldn't want to > saddle you with it indefinitely. How well is the procedure documented? IIRC, Christian Heimes did one of the alpha or beta releases, with what little documentation is available, so it's definitely doable. The tricky part really is when it breaks (which it does more often than not), in which case you need to understand msi.py, for which you need to understand MSI. IMO, the Microsoft documentation is excellent (in being fairly precise), but the learning curve is high. The mechanical part of it is completely automated - we produce daily MSI files in a buildbot slave (which may or may not work - I haven't checked in a while). > I > ask this in hopes that you aren't a potential single point of failure in > the release process. I think several of the "Windows people" could jump in, not just Christian. That would be best done in a beta release or release candidate, since one does get things wrong the first time.
Regards, Martin From martin at v.loewis.de Wed Nov 19 20:50:41 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 19 Nov 2008 20:50:41 +0100 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: <6B34AD98-187C-4876-A410-E0A843EEF2C4@python.org> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <3C7340EB-C07C-4E0A-BA44-CB536012FEBF@python.org> <49246687.2040204@v.loewis.de> <6B34AD98-187C-4876-A410-E0A843EEF2C4@python.org> Message-ID: <49246E11.6030206@v.loewis.de> > I will also try to get up early to do the release before my work day > starts, to better coordinate with Euro time. So expect me on > #python-dev tomorrow (my morning). > > Will that work for you? If you delay the announcement until the binaries are ready, you should feel free to work on it whenever it suits you best, as far as I'm concerned (of course, to coordinate with Georg, you might still prefer to work during the European daylight). I'll be busy with lectures tomorrow most of the day, and can't start working on the installer before 14:00 UTC (which I think is 9:00 your time). Around what time would you expect to have the tag set? Regards, Martin From fandecheng at gmail.com Thu Nov 20 03:27:57 2008 From: fandecheng at gmail.com (Decheng Fan) Date: Thu, 20 Nov 2008 10:27:57 +0800 Subject: [Python-3000] encode function errors="replace", but print() failed, is this a bug? Message-ID: Hi, Recently I encountered a problem with the str.encode() function. I used the function like this: s.encode("mbcs", "replace"), expecting it will eliminate all invalid characters. However it failed with the following message: UnicodeEncodeError: 'gbk' codec can't encode character '\ue104' in position 4: i Am I using it in a wrong way or is it a bug? Platform: Windows Vista SP1, system default code page: 936 (zh-cn). Program (test.py.txt) in attachment. 
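A minimal sketch of the situation in the question above, not the scrubbed test.py: it assumes the console stream behaves like an io.TextIOWrapper using the 'gbk' codec (the codec named in the error message), and the helper name write_through is made up for illustration. It avoids "mbcs", which exists only on Windows. What it tries to show is that errors="replace" applies only to the call it is passed to; a text stream encodes whatever str it is handed again, with its own encoding and its own error handler.

```python
import io


def write_through(text, encoding, errors):
    """Write text to an in-memory text stream and return the encoded bytes.

    Hypothetical helper: it stands in for print() writing to a console
    stream that has the given encoding and error handler.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


# Sample string containing the private-use character from the error message.
text = "A\ue104B"

# An explicit encode with "replace" substitutes "?" instead of raising.
replaced = text.encode("gbk", "replace")

# A stream with the default strict handler still refuses the same str.
try:
    write_through(text, "gbk", "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Giving the stream itself errors="replace" lets the write go through.
lenient = write_through(text, "gbk", "replace")
```

If this matches the real failure, the fix belongs on the output stream (its encoding or its errors argument) rather than on an earlier encode()/decode() round trip.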
>python3 test.py
A
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    print(str.encode("mbcs", "replace").decode("mbcs", "replace"))
  File "C:\Python30\lib\io.py", line 1485, in write
    b = encoder.encode(s)
UnicodeEncodeError: 'gbk' codec can't encode character '\ue104' in position 4: illegal multibyte sequence
>python3 test.py
A
???????{???????z
B
>python3 test.py
A
?????????q????
B

Thanks, Decheng (AKA Robbie Mosaic) Fan -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test.py.txt URL: From musiccomposition at gmail.com Thu Nov 20 04:50:55 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Wed, 19 Nov 2008 21:50:55 -0600 Subject: [Python-3000] encode function errors="replace", but print() failed, is this a bug? In-Reply-To: References: Message-ID: <1afaf6160811191950l58ac36adida29e28fb4c0966f@mail.gmail.com> 2008/11/19 Decheng Fan : > Hi, > > Recently I encountered a problem with the str.encode() function. I used the > function like this: s.encode("mbcs", "replace"), expecting it will eliminate > all invalid characters. However it failed with the following message: > UnicodeEncodeError: 'gbk' codec can't encode character '\ue104' in position > 4: i > > Am I using it in a wrong way or is it a bug? print() sends its data to stdout, which encodes the data based on its own encoding. If you want to change this behavior, replace sys.stdout with your own io.TextIOWrapper with 'replace' as the errors argument. -- Cheers, Benjamin Peterson "There's nothing quite as beautiful as an oboe... except a chicken stuck in a vacuum cleaner." From martin at v.loewis.de Thu Nov 20 07:45:41 2008 From: martin at v.loewis.de (=?GB2312?B?Ik1hcnRpbiB2LiBMbyJ3aXMi?=) Date: Thu, 20 Nov 2008 07:45:41 +0100 Subject: [Python-3000] encode function errors="replace", but print() failed, is this a bug?
In-Reply-To: References: Message-ID: <49250795.3070702@v.loewis.de> > Am I using it in a wrong way or is it a bug? You are using it in a wrong way. The terminal window, on Windows, does not use the "mbcs" encoding. Microsoft has two system encodings: the "ANSI" code page (CP_ACP), called "mbcs" by Python, and the "OEM" code page (CP_OEMCP). The latter is what the terminal window uses. Python does not directly expose the Microsoft OEMCP codec; instead, it determines the terminal's code page, and then carries its own codec for that code page ("gbk" in your case). To make your example work, replace "mbcs" with sys.stdout.encoding. HTH, Martin From mal at egenix.com Thu Nov 20 11:12:49 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 20 Nov 2008 11:12:49 +0100 Subject: [Python-3000] Using memoryviews Message-ID: <49253821.7010909@egenix.com> I've had a look at the new memoryview and associated buffer API and have a question: how is a C extension supposed to use the buffer API without going directly into the C struct Py_buffer ? I have not found any macros for accessing Py_buffer internals and the docs mention the struct members directly (which is a bit unusual for the Python C API). Shouldn't there be a set of macros providing some form of abstraction for the struct members ? BTW: I was looking for a suitable replacement for the buffer object which isn't available in Python 3 anymore. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 20 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2008-11-12: Released mxODBC.Connect 0.9.3 http://python.egenix.com/ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! 
:::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From mal at egenix.com Thu Nov 20 13:39:50 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 20 Nov 2008 13:39:50 +0100 Subject: [Python-3000] PyObject_HEAD_INIT In-Reply-To: References: <491D4DEF.3050100@v.loewis.de> <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> Message-ID: <49255A96.9080506@egenix.com> On 2008-11-14 22:15, Roger Binns wrote: > My confusion was because I though that the HEAD for the data structure > had to use the same corresponding HEAD_INIT in the type. So for > whatever reason the PyTypeObject is declared as a var object which is > why the var HEAD_INIT is needed. > > It still looks like PyObject_HEAD_INIT should be removed so that people > using earlier versions of Python, following the Py3 docs (before they > are fixed), using older tutorials etc don't get burnt. > > Grepping through the py3 source shows only PyModuleDef_HEAD_INIT using > PyObject_HEAD_INIT. There are no other legitimate uses! Except maybe a few thousand extensions already using it which are waiting to be ported to Python 3. Whether you write: {PyObject_HEAD_INIT(0), 0, ... or {PyVarObject_HEAD_INIT(0, 0), ... for your type definition doesn't really make much difference. They both unwrap to the same code. Since PyTypeObjects are variable length objects, you always need the ob_size entry. However, the macros exist to be used for both variable size and fixed size objects, so having both available is useful and legitimate. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 20 2008) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ 2008-11-12: Released mxODBC.Connect 0.9.3 http://python.egenix.com/ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From rogerb at rogerbinns.com Thu Nov 20 20:34:18 2008 From: rogerb at rogerbinns.com (Roger Binns) Date: Thu, 20 Nov 2008 11:34:18 -0800 Subject: [Python-3000] PyObject_HEAD_INIT In-Reply-To: <49255A96.9080506@egenix.com> References: <491D4DEF.3050100@v.loewis.de> <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> <49255A96.9080506@egenix.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 M.-A. Lemburg wrote: > Whether you write: > > {PyObject_HEAD_INIT(0), 0, ... > > or > > {PyVarObject_HEAD_INIT(0, 0), ... > > for your type definition doesn't really make much difference. Actually in Py 3 it does. If you use the former (which is how Py 2 does it) then you get serious compiler warnings due to misaligned fields in Py 3 and presumably even worse if run the code. See PEP 3123 as to why things changed. That is why all the code in Python 3 was changed from using the former to the latter. > However, the macros exist to be used for both variable size > and fixed size objects, so having both available is useful and > legitimate. ... > Except maybe a few thousand extensions already using it which are > waiting to be ported to Python 3. Can you point to any? All the ones I found (via Google) only use PyObject_HEAD_INIT for PyTypeObjects and every single one of those will have to change to using PyVarObject_HEAD_INIT. 
Roger -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAkklu7UACgkQmOOfHg372QTZBQCgt3kwtUYF3Us8hjPAS2PvDtpm l+EAoJUata+K55mboNB0UpMsLlzoRpnA =WbrQ -----END PGP SIGNATURE----- From mal at egenix.com Fri Nov 21 15:14:18 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 21 Nov 2008 15:14:18 +0100 Subject: [Python-3000] PyObject_HEAD_INIT In-Reply-To: References: <491D4DEF.3050100@v.loewis.de> <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> <49255A96.9080506@egenix.com> Message-ID: <4926C23A.8080108@egenix.com> On 2008-11-20 20:34, Roger Binns wrote: > M.-A. Lemburg wrote: >> Whether you write: > >> {PyObject_HEAD_INIT(0), 0, ... > >> or > >> {PyVarObject_HEAD_INIT(0, 0), ... > >> for your type definition doesn't really make much difference. > > Actually in Py 3 it does. If you use the former (which is how Py 2 does > it) then you get serious compiler warnings due to misaligned fields in > Py 3 and presumably even worse if run the code. You might get warnings (esp. from GCC), but I have yet to see a compiler that doesn't map the above to the same memory. After all, Python 2 has been using this layout for years without any compiler warnings or segfaults because of this. > See PEP 3123 as to why > things changed. That is why all the code in Python 3 was changed from > using the former to the latter. Right. Things are now more standard compliant and you get fewer warnings. >> However, the macros exist to be used for both variable size >> and fixed size objects, so having both available is useful and >> legitimate. > ... >> Except maybe a few thousand extensions already using it which are >> waiting to be ported to Python 3. > > Can you point to any? All the ones I found (via Google) only use > PyObject_HEAD_INIT for PyTypeObjects and every single one of those will > have to change to using PyVarObject_HEAD_INIT. 
True, because PyTypeObjects *are* in fact PyVarObjects and not PyObjects, so they should have used PyVarObject_HEAD_INIT all along. It's only that compilers never really cared or always did the right thing - depending on how you see it :-) BTW: With the "few thousand extensions" I was referring to the current use of the PyObject_HEAD_INIT() macro which you wanted to remove, not to a few thousand extensions using it correctly. Note that it's rather uncommon to create singletons like the type objects in C. -- Marc-Andre Lemburg eGenix.com From barry at python.org Fri Nov 21 16:06:44 2008 From: barry at python.org (Barry Warsaw) Date: Fri, 21 Nov 2008 10:06:44 -0500 Subject: [Python-3000] RELEASED Python 3.0rc3 Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On behalf of the Python development team and the Python community, I am happy to announce the third and last planned release candidate for Python 3.0. This is a release candidate, so while it is not quite suitable for production environments, we strongly encourage you to download and test this release on your software. We expect only critical bugs to be fixed between now and the final release, currently planned for 03-Dec-2008.
If you find things broken or incorrect, please submit bug reports at http://bugs.python.org Please read the RELNOTES file in the distribution for important details about this release. For more information and downloadable distributions, see the Python 3.0 website: http://www.python.org/download/releases/3.0/ See PEP 361 for release schedule details: http://www.python.org/dev/peps/pep-0361/ Enjoy, - -Barry Barry Warsaw barry at python.org Python 2.6/3.0 Release Manager (on behalf of the entire python-dev team) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSSbOhHEjvBPtnXfVAQLzBwP/dS2j4XhZMNdb28TG3ZblkSmlPS4IU20U Vvq85inUkJ6idwKZBqa6brrD1hbqrl4UjKZh4/ppzhIwsJtFMlMiqnkHVrvIYFBG Yg+pQdO5HQzrw9K04aTdtNiKTiiJNIkqWdQQUd573XBFODRAIaq0qwk9C24kXeZM e3xNgNRxfmY= =TvxY -----END PGP SIGNATURE----- From josiah.carlson at gmail.com Fri Nov 21 17:30:30 2008 From: josiah.carlson at gmail.com (Josiah Carlson) Date: Fri, 21 Nov 2008 08:30:30 -0800 Subject: [Python-3000] Using memoryviews In-Reply-To: <49253821.7010909@egenix.com> References: <49253821.7010909@egenix.com> Message-ID: On Thu, Nov 20, 2008 at 2:12 AM, M.-A. Lemburg wrote: > I've had a look at the new memoryview and associated buffer API > and have a question: how is a C extension supposed to use the buffer > API without going directly into the C struct Py_buffer ? > > I have not found any macros for accessing Py_buffer internals and > the docs mention the struct members directly (which is a bit unusual > for the Python C API). > > Shouldn't there be a set of macros providing some form of abstraction > for the struct members ? > > BTW: I was looking for a suitable replacement for the buffer object > which isn't available in Python 3 anymore. > > Thanks, > -- > Marc-Andre Lemburg > eGenix.com >From what I understand of the memoryview when I tried to do the same thing a few months ago (use memoryview to replace buffer in asyncore/asynchat), memoryview is incomplete. 
It didn't support character buffer slicing (you know, the 'offset' and 'size' arguments that were in buffer), and at least a handful of other things (that I can't remember at the moment). - Josiah From rhamph at gmail.com Fri Nov 21 18:36:25 2008 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 21 Nov 2008 10:36:25 -0700 Subject: [Python-3000] PyObject_HEAD_INIT In-Reply-To: <4926C23A.8080108@egenix.com> References: <491D4DEF.3050100@v.loewis.de> <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> <49255A96.9080506@egenix.com> <4926C23A.8080108@egenix.com> Message-ID: On Fri, Nov 21, 2008 at 7:14 AM, M.-A. Lemburg wrote: > On 2008-11-20 20:34, Roger Binns wrote: >> M.-A. Lemburg wrote: >>> Whether you write: >> >>> {PyObject_HEAD_INIT(0), 0, ... >> >>> or >> >>> {PyVarObject_HEAD_INIT(0, 0), ... >> >>> for your type definition doesn't really make much difference. >> >> Actually in Py 3 it does. If you use the former (which is how Py 2 does >> it) then you get serious compiler warnings due to misaligned fields in >> Py 3 and presumably even worse if run the code. > > You might get warnings (esp. from GCC), but I have yet to see a compiler > that doesn't map the above to the same memory. > > After all, Python 2 has been using this layout for years without any > compiler warnings or segfaults because of this. The definition of PyObject_HEAD_INIT and PyVarObject_HEAD_INIT changed. We've gone from a series of common fields to a single struct containing the fields. With the series of fields using PyObject_HEAD_INIT followed by a size was perfectly correct, but with a struct it's gibberish. -- Adam Olsen, aka Rhamphoryncus From mal at egenix.com Fri Nov 21 18:53:52 2008 From: mal at egenix.com (M.-A. 
Lemburg)
Date: Fri, 21 Nov 2008 18:53:52 +0100
Subject: [Python-3000] PyObject_HEAD_INIT
In-Reply-To:
References: <491D4DEF.3050100@v.loewis.de> <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> <49255A96.9080506@egenix.com> <4926C23A.8080108@egenix.com>
Message-ID: <4926F5B0.2020502@egenix.com>

On 2008-11-21 18:36, Adam Olsen wrote:
> On Fri, Nov 21, 2008 at 7:14 AM, M.-A. Lemburg wrote:
>> On 2008-11-20 20:34, Roger Binns wrote:
>>> M.-A. Lemburg wrote:
>>>> Whether you write:
>>>> {PyObject_HEAD_INIT(0), 0, ...
>>>> or
>>>> {PyVarObject_HEAD_INIT(0, 0), ...
>>>> for your type definition doesn't really make much difference.
>>> Actually in Py 3 it does. If you use the former (which is how Py 2 does
>>> it) then you get serious compiler warnings due to misaligned fields in
>>> Py 3 and presumably even worse if run the code.
>> You might get warnings (esp. from GCC), but I have yet to see a compiler
>> that doesn't map the above to the same memory.
>>
>> After all, Python 2 has been using this layout for years without any
>> compiler warnings or segfaults because of this.
>
> The definition of PyObject_HEAD_INIT and PyVarObject_HEAD_INIT
> changed. We've gone from a series of common fields to a single struct
> containing the fields. With the series of fields using
> PyObject_HEAD_INIT followed by a size was perfectly correct, but with
> a struct it's gibberish.

Sigh...

Python 2.5:
-----------

(gdb) print PyUnicode_Type
$2 = {ob_refcnt = 1, ob_type = 0x6404c0, ob_size = 0, tp_name = 0x505972 "unicode",
  tp_basicsize = 48, tp_itemsize = 0, tp_dealloc = 0x470500 , tp_print = 0,
  tp_getattr = 0, tp_setattr = 0, tp_compare = 0, tp_repr = 0x4746c0 ,
  tp_as_number = 0x644f60, tp_as_sequence = 0x644f00, tp_as_mapping = 0x644ee0,
...

(gdb) print &PyUnicode_Type.tp_name
$3 = (const char **) 0x642758
(gdb) print &PyUnicode_Type.ob_refcnt
$4 = (Py_ssize_t *) 0x642740

Python 3.0:
-----------

(gdb) print PyUnicode_Type
$1 = {ob_base = {ob_base = {ob_refcnt = 1, ob_type = 0x733940}, ob_size = 0},
  tp_name = 0x4f52e9 "str", tp_basicsize = 56, tp_itemsize = 0,
  tp_dealloc = 0x42af80 , tp_print = 0, tp_getattr = 0, tp_setattr = 0,
  tp_compare = 0, tp_repr = 0x431ca0 , tp_as_number = 0x736420,
...

(gdb) print &PyUnicode_Type.tp_name
$3 = (const char **) 0x735bf8
(gdb) print &PyUnicode_Type.ob_base
$4 = (PyVarObject *) 0x735be0

In both cases, the fields are 24 bytes apart (on my 64-bit machine).

Yes, it's a different way of writing and accessing the resp. fields.
No, it's not a different memory layout.
Yes, this is binary compatible.
No, this is not going to help you, since the rest of Python 3 is not ;-)

Fortunately, you only rarely have to access the fields in question directly in extensions. And Python 3.0 and 2.6 also add a few macros for abstracting this:

#define Py_REFCNT(ob) ...
#define Py_TYPE(ob) ...
#define Py_SIZE(ob) ...

--
Marc-Andre Lemburg
eGenix.com

From rhamph at gmail.com Fri Nov 21 19:36:37 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 21 Nov 2008 11:36:37 -0700
Subject: [Python-3000] PyObject_HEAD_INIT
In-Reply-To: <4926F5B0.2020502@egenix.com>
References: <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> <49255A96.9080506@egenix.com> <4926C23A.8080108@egenix.com> <4926F5B0.2020502@egenix.com>
Message-ID:

On Fri, Nov 21, 2008 at 10:53 AM, M.-A. Lemburg wrote:
> On 2008-11-21 18:36, Adam Olsen wrote:
>> On Fri, Nov 21, 2008 at 7:14 AM, M.-A. Lemburg wrote:
>>> On 2008-11-20 20:34, Roger Binns wrote:
>>>> M.-A. Lemburg wrote:
>>>>> Whether you write:
>>>>> {PyObject_HEAD_INIT(0), 0, ...
>>>>> or
>>>>> {PyVarObject_HEAD_INIT(0, 0), ...
>>>>> for your type definition doesn't really make much difference.
>>>> Actually in Py 3 it does. If you use the former (which is how Py 2 does
>>>> it) then you get serious compiler warnings due to misaligned fields in
>>>> Py 3 and presumably even worse if run the code.
>>> You might get warnings (esp. from GCC), but I have yet to see a compiler
>>> that doesn't map the above to the same memory.
>>>
>>> After all, Python 2 has been using this layout for years without any
>>> compiler warnings or segfaults because of this.
>>
>> The definition of PyObject_HEAD_INIT and PyVarObject_HEAD_INIT
>> changed. We've gone from a series of common fields to a single struct
>> containing the fields. With the series of fields using
>> PyObject_HEAD_INIT followed by a size was perfectly correct, but with
>> a struct it's gibberish.
>
> Sigh...
>
> Python 2.5:
> -----------
>
> (gdb) print PyUnicode_Type
> $2 = {ob_refcnt = 1, ob_type = 0x6404c0, ob_size = 0, tp_name = 0x505972 "unicode",
> tp_basicsize = 48, tp_itemsize = 0, tp_dealloc = 0x470500 ,
> tp_print = 0,
> tp_getattr = 0, tp_setattr = 0, tp_compare = 0, tp_repr = 0x4746c0 ,
> tp_as_number = 0x644f60, tp_as_sequence = 0x644f00, tp_as_mapping = 0x644ee0,
> ...
>
> (gdb) print &PyUnicode_Type.tp_name
> $3 = (const char **) 0x642758
> (gdb) print &PyUnicode_Type.ob_refcnt
> $4 = (Py_ssize_t *) 0x642740
>
> Python 3.0:
> -----------
>
> (gdb) print PyUnicode_Type
> $1 = {ob_base = {ob_base = {ob_refcnt = 1, ob_type = 0x733940}, ob_size = 0},
> tp_name = 0x4f52e9 "str", tp_basicsize = 56, tp_itemsize = 0,
> tp_dealloc = 0x42af80 , tp_print = 0, tp_getattr = 0,
> tp_setattr = 0,
> tp_compare = 0, tp_repr = 0x431ca0 , tp_as_number = 0x736420,
> ...
>
> (gdb) print &PyUnicode_Type.tp_name
> $3 = (const char **) 0x735bf8
> (gdb) print &PyUnicode_Type.ob_base
> $4 = (PyVarObject *) 0x735be0
>
> In both cases, the fields are 24 bytes apart (on my 64-bit machine).
>
> Yes, it's a different way of writing and accessing the resp. fields.
> No, it's not a different memory layout.
> Yes, this is binary compatible.
> No, this is not going to help you, since the rest of Python 3 is not ;-)

You're comparing already compiled code. The issue is with recompiling, ie source compatibility.

In 2.5 the macros expanded to look like this:

PyTypeObject PyUnicode_Type = {
    1, /* ob_refcnt */
    &PyType_Type, /* ob_type */
    0, /* ob_size */
    "unicode", /* tp_name */
    sizeof(PyUnicodeObject), /* tp_basicsize */

Try the same macro in 3.0 and it'll look like this:

PyTypeObject PyUnicode_Type = {
    {
        {
            1, /* ob_refcnt */
            &PyType_Type, /* ob_type */
        },
        /* Trailing ob_size gets implicitly initialized to 0 */
    },
    0, /* ob_size? Nope, tp_name! */
    "unicode", /* tp_name? Nope, tp_basicsize! */
    sizeof(PyUnicodeObject), /* tp_basicsize? Nope, tp_itemsize! */

The compiler knows what layout a PyTypeObject should have, but the initializer doesn't match up.

--
Adam Olsen, aka Rhamphoryncus

From rogerb at rogerbinns.com Fri Nov 21 19:58:15 2008
From: rogerb at rogerbinns.com (Roger Binns)
Date: Fri, 21 Nov 2008 10:58:15 -0800
Subject: [Python-3000] PyObject_HEAD_INIT
In-Reply-To: <4926C23A.8080108@egenix.com>
References: <491D4DEF.3050100@v.loewis.de> <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> <49255A96.9080506@egenix.com> <4926C23A.8080108@egenix.com>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

M.-A. Lemburg wrote:
> You might get warnings (esp. from GCC), but I have yet to see a compiler
> that doesn't map the above to the same memory.

They don't map the same as Adam showed. Your fields end up off by one which is why the compiler is giving warnings.

> After all, Python 2 has been using this layout for years without any
> compiler warnings or segfaults because of this.

Instead of speculating, please write some Python 3 code to initialize a PyTypeObject using the Python 2 way and look at what the compiler is showing.

> Right. Things are now more standard compliant and you get fewer
> warnings.

The underlying structures also changed.

> BTW: With the "few thousand extensions" I was referring to the current
> use of the PyObject_HEAD_INIT() macro which you wanted to remove, not
> to a few thousand extensions using it correctly. Note that it's rather
> uncommon to create singletons like the type objects in C.

That means you are agreeing with what I said in the first place! You *cannot* use PyObject_HEAD_INIT to initialize a PyTypeObject in Python 3. The fields end up misaligned. You *have to* change to using PyVarObject_HEAD_INIT.

As you point out Py{Var,}Object_HEAD_INIT is for singleton static objects and other than type objects is very uncommon in C. I couldn't find any.

The reason I am suggesting removing PyObject_HEAD_INIT from Python 3 is because there are no demonstrated uses of it.
Python 2 code being ported will be using it, but will be using it incorrectly to initialize PyTypeObjects. The compiler will issue warnings, but not errors. The code will then run and probably crash in various interesting ways. Every single extension author porting to Python 3 is going to encounter this. If PyObject_HEAD_INIT is removed then their code won't compile and they will have to work out what is going on. Roger -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAkknBMMACgkQmOOfHg372QR8xgCghQs2f1+O0p7anwRPRlGyM+fV 7lAAmgIj2FYcqvQ9XTAdQ1C+38/CD0g1 =VFDH -----END PGP SIGNATURE----- From mal at egenix.com Fri Nov 21 20:25:21 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 21 Nov 2008 20:25:21 +0100 Subject: [Python-3000] PyObject_HEAD_INIT In-Reply-To: References: <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> <49255A96.9080506@egenix.com> <4926C23A.8080108@egenix.com> <4926F5B0.2020502@egenix.com> Message-ID: <49270B21.8090403@egenix.com> On 2008-11-21 19:36, Adam Olsen wrote: > On Fri, Nov 21, 2008 at 10:53 AM, M.-A. Lemburg wrote: >> Yes, it's a different way of writing and accessing the resp. fields. >> No, it's not a different memory layout. >> Yes, this is binary compatible. >> No, this is not going to help you, since the rest of Python 3 is not ;-) > > You're comparing already compiled code. The issue is with > recompiling, ie source compatibility. > > In 2.5 the macros expanded to look like this: > > PyTypeObject PyUnicode_Type = { > 1, /* ob_refcnt */ > &PyType_Type, /* ob_type */ > 0, /* ob_size */ > "unicode", /* tp_name */ > sizeof(PyUnicodeObject), /* tp_basicsize */ > > Try the same macro in 3.0 and it'll look like this: > > PyTypeObject PyUnicode_Type = { > { > { > 1, /* ob_refcnt */ > &PyType_Type, /* ob_type */ > }, > /* Trailing ob_size gets implicitly initialized to 0 */ > }, > 0, /* ob_size? Nope, tp_name! */ > "unicode", /* tp_name? Nope, tp_basicsize! 
*/
> sizeof(PyUnicodeObject), /* tp_basicsize? Nope, tp_itemsize! */
>
> The compiler knows what layout a PyTypeObject should have, but the
> initializer doesn't match up.

Well, yes, of course. That's the whole purpose of PEP 3123, isn't it ?

Starting with Python 3, you have to use PyVarObject_HEAD_INIT() on PyVarObjects and PyObject_HEAD_INIT() on PyObjects. I don't see the problem. It's just another change to remember when porting to Python 3.

The whole type slot interface has changed significantly between Python 2 and 3, so this minor clarification is really harmless compared to all the other changes:

* PyNumberMethods have changed due to removal of the division, oct, hex and coercion slots
* PySequenceMethods have changed, but maintained binary compatibility (why ?) by replacing the removed slice functions with dummy pointers
* PyBufferProcs is a completely new design
* A lot of type flags were removed.

--
Marc-Andre Lemburg
eGenix.com

From mal at egenix.com Fri Nov 21 20:41:51 2008
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 21 Nov 2008 20:41:51 +0100
Subject: [Python-3000] Using memoryviews
In-Reply-To:
References: <49253821.7010909@egenix.com>
Message-ID: <49270EFF.3070006@egenix.com>

On 2008-11-21 17:30, Josiah Carlson wrote:
> On Thu, Nov 20, 2008 at 2:12 AM, M.-A.
Lemburg wrote: >> I've had a look at the new memoryview and associated buffer API >> and have a question: how is a C extension supposed to use the buffer >> API without going directly into the C struct Py_buffer ? >> >> I have not found any macros for accessing Py_buffer internals and >> the docs mention the struct members directly (which is a bit unusual >> for the Python C API). >> >> Shouldn't there be a set of macros providing some form of abstraction >> for the struct members ? >> >> BTW: I was looking for a suitable replacement for the buffer object >> which isn't available in Python 3 anymore. >> >> Thanks, >> -- >> Marc-Andre Lemburg >> eGenix.com > >>From what I understand of the memoryview when I tried to do the same > thing a few months ago (use memoryview to replace buffer in > asyncore/asynchat), memoryview is incomplete. It didn't support > character buffer slicing (you know, the 'offset' and 'size' arguments > that were in buffer), and at least a handful of other things (that I > can't remember at the moment). True, memoryview objects aren't as useful in Python as the underlying Py_buffer "C" objects are in the C API. But then I only need it to signal "this is binary data" for the purpose of using the memoryview in DB-API extensions. However, this would only be of effective use if there's a documented way of accessing the actual C char* buffer behind the object, instead of having to allocate a new buffer and copy the data over - only to reference it like that. In the past, we've always tried to provide abstract access methods to C struct internals of Python objects and I wonder whether this was deliberately not done for Py_buffer structs or simply not considered. I don't think it's a good idea to use my_Py_buffer->buf in a C extension and would rather like to write: Py_Buffer_AS_BUFFER(my_Py_buffer) Py_Buffer_GET_SIZE(my_Py_buffer) Py_Buffer_GET_ITEM_SIZE(my_Py_buffer) etc. 
--
Marc-Andre Lemburg
eGenix.com

From rogerb at rogerbinns.com Fri Nov 21 20:47:58 2008
From: rogerb at rogerbinns.com (Roger Binns)
Date: Fri, 21 Nov 2008 11:47:58 -0800
Subject: [Python-3000] PyObject_HEAD_INIT
In-Reply-To: <49270B21.8090403@egenix.com>
References: <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> <49255A96.9080506@egenix.com> <4926C23A.8080108@egenix.com> <4926F5B0.2020502@egenix.com> <49270B21.8090403@egenix.com>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

M.-A. Lemburg wrote:
> Starting with Python 3, you have to use PyVarObject_HEAD_INIT()
> on PyVarObjects and PyObject_HEAD_INIT() on PyObjects. I don't
> see the problem. It's just another change to remember when porting
> to Python 3.

The problem is that unless you are clairvoyant you have no way of knowing about this change. Even in rc3 the documentation shows the old (wrong) way:

http://docs.python.org/dev/3.0/extending/newtypes.html

PyObject_HEAD_INIT is documented:

http://docs.python.org/dev/3.0/search.html?q=PyObject_HEAD_INIT

PyVarObject_HEAD_INIT is not:

http://docs.python.org/dev/3.0/search.html?q=PyVarObject_HEAD_INIT

So anyone porting from Python 2 to Python 3 is just going to compile their code. There will be some warnings but if they consult the docs the code will still look correct.
They will just assume it is a Python quirk. Then the code will run and crash and they will have to examine the Python 3 source to work out what the underlying issue is and how to fix it. If PyObject_HEAD_INIT were removed/renamed then the code wouldn't even compile and so they would realize they have to fix it. > * PyNumberMethods have changed due to removal of the division, ... > * PySequenceMethods have changed, but maintained binary compatibility > (why ?) by replacing the removed slice functions with dummy pointers ... > * PyBufferProcs is a completely new design PyTypeObjects are used in every extension. The above changes affect fewer extensions and the compiler error/warnings will be more meaningful. I would have expected minor changes like these. > * A lot type flags were removed. That is fine. The Python 2 code would fail to compile so you would at least know what to look for and about. Roger -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iEUEARECAAYFAkknEGsACgkQmOOfHg372QR6RwCeOJ6Sj2hYWPVwpHnwc9yOvG2H 2YkAmOb6VXWaqHQL+Xd7ihq9gEWLHiA= =EwXY -----END PGP SIGNATURE----- From barry at python.org Fri Nov 21 21:26:01 2008 From: barry at python.org (Barry Warsaw) Date: Fri, 21 Nov 2008 15:26:01 -0500 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <3C7340EB-C07C-4E0A-BA44-CB536012FEBF@python.org> <49246687.2040204@v.loewis.de> <6B34AD98-187C-4876-A410-E0A843EEF2C4@python.org> Message-ID: <697974D1-698C-4196-9B8A-132207E3D8B0@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 19, 2008, at 3:19 PM, Terry Reedy wrote: >> Let's try this for 3.0rc4 then. > > The current release is rc2. Skipping rc3 would confuse people'-) Yeah, my calendar was wrong, but the PEP (and more importantly... code!) was right :). There is nooooo rc4! 
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iQCVAwUBSScZWXEjvBPtnXfVAQKpQAQAi9Q8rfgcCVXmQ2tIqaiAVKOQHDPQdfhF
lyDWHg+6i2EGrbs0Jju5GB9YML1yNga3X85zfQSedu6mgpA4dV6NvW988N3Wp4oG
ztDGT7yLxwYe4Wy606FF6lxSlXSvXQRLc/Nf1qgn8dDGskQKO2LZ+fUHW0BIWDBN
RFAuZqzdWQY=
=9Z8w
-----END PGP SIGNATURE-----

From tjreedy at udel.edu Fri Nov 21 21:33:22 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 21 Nov 2008 15:33:22 -0500
Subject: [Python-3000] PyObject_HEAD_INIT
In-Reply-To:
References: <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> <49255A96.9080506@egenix.com> <4926C23A.8080108@egenix.com> <4926F5B0.2020502@egenix.com> <49270B21.8090403@egenix.com>
Message-ID:

Roger Binns wrote:

> The problem is that unless you are clairvoyant you have no way of
> knowing about this change. Even in rc3 the documentation shows the old
> (wrong) way:
>
> http://docs.python.org/dev/3.0/extending/newtypes.html
>
> PyObject_HEAD_INIT is documented:
>
> http://docs.python.org/dev/3.0/search.html?q=PyObject_HEAD_INIT
>
> PyVarObject_HEAD_INIT is not:
>
> http://docs.python.org/dev/3.0/search.html?q=PyVarObject_HEAD_INIT

Whatever happens to PyObject..., that should be fixed. Is there a tracker item yet? A doc person will copy, paste, and format if given raw text.

> So anyone porting from Python 2 to Python 3 is just going to compile
> their code.

A "What's New in Py 3 API" document is needed that people could read first instead of digging through the source afterwards.

> There will be some warnings but if they consult the docs
> the code will still look correct. They will just assume it is a Python
> quirk. Then the code will run and crash and they will have to examine
> the Python 3 source to work out what the underlying issue is and how to
> fix it.
>
> If PyObject_HEAD_INIT were removed/renamed then the code wouldn't even
> compile and so they would realize they have to fix it.
>
>> * PyNumberMethods have changed due to removal of the division,
> ...
>> * PySequenceMethods have changed, but maintained binary compatibility >> (why ?) by replacing the removed slice functions with dummy pointers > ... >> * PyBufferProcs is a completely new design > > PyTypeObjects are used in every extension. The above changes affect > fewer extensions and the compiler error/warnings will be more > meaningful. I would have expected minor changes like these. > >> * A lot type flags were removed. This seems like a start. From musiccomposition at gmail.com Fri Nov 21 22:34:29 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Fri, 21 Nov 2008 15:34:29 -0600 Subject: [Python-3000] Using memoryviews In-Reply-To: <49270EFF.3070006@egenix.com> References: <49253821.7010909@egenix.com> <49270EFF.3070006@egenix.com> Message-ID: <1afaf6160811211334v5de44235j936eed156e45cfa2@mail.gmail.com> On Fri, Nov 21, 2008 at 1:41 PM, M.-A. Lemburg wrote: > > In the past, we've always tried to provide abstract access methods to > C struct internals of Python objects and I wonder whether this was > deliberately not done for Py_buffer structs or simply not considered. > > I don't think it's a good idea to use my_Py_buffer->buf in a C > extension and would rather like to write: > > Py_Buffer_AS_BUFFER(my_Py_buffer) > Py_Buffer_GET_SIZE(my_Py_buffer) > Py_Buffer_GET_ITEM_SIZE(my_Py_buffer) > etc. I think that's a good idea, too, and we should get something like that in for 3.1. I rather feel like the new buffer API slipped in without any real review. -- Cheers, Benjamin Peterson "There's nothing quite as beautiful as an oboe... except a chicken stuck in a vacuum cleaner." 
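For readers who have not looked at the struct itself: several of the Py_buffer members that the proposed accessors would wrap are also visible from Python as read-only attributes of a memoryview. The following is only a rough sketch, assuming a reasonably recent CPython 3 (the nbytes attribute, for one, did not exist in 3.0); the names buf, view and info are made up for the illustration.

```python
import array

# An exporter with multi-byte items, so itemsize and nbytes differ.
buf = array.array("i", [1, 2, 3, 4])
view = memoryview(buf)

# Python-level mirrors of some Py_buffer members.
info = {
    "format": view.format,      # struct-style item format string
    "itemsize": view.itemsize,  # bytes per item
    "ndim": view.ndim,          # number of dimensions
    "shape": view.shape,        # item count per dimension
    "readonly": view.readonly,  # whether the buffer may be written to
    "nbytes": view.nbytes,      # total size in bytes
}
```

This says nothing about what C-level macros should look like; it only shows which pieces of information are being discussed.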
From ncoghlan at gmail.com Sat Nov 22 00:34:09 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 22 Nov 2008 09:34:09 +1000 Subject: [Python-3000] Using memoryviews In-Reply-To: <1afaf6160811211334v5de44235j936eed156e45cfa2@mail.gmail.com> References: <49253821.7010909@egenix.com> <49270EFF.3070006@egenix.com> <1afaf6160811211334v5de44235j936eed156e45cfa2@mail.gmail.com> Message-ID: <49274571.5020202@gmail.com> Benjamin Peterson wrote: > On Fri, Nov 21, 2008 at 1:41 PM, M.-A. Lemburg wrote: >> In the past, we've always tried to provide abstract access methods to >> C struct internals of Python objects and I wonder whether this was >> deliberately not done for Py_buffer structs or simply not considered. >> >> I don't think it's a good idea to use my_Py_buffer->buf in a C >> extension and would rather like to write: >> >> Py_Buffer_AS_BUFFER(my_Py_buffer) >> Py_Buffer_GET_SIZE(my_Py_buffer) >> Py_Buffer_GET_ITEM_SIZE(my_Py_buffer) >> etc. > > I think that's a good idea, too, and we should get something like that > in for 3.1. I rather feel like the new buffer API slipped in without > any real review. The review that was done was actually quite extensive - see PEP 3118. However: 1. There's a reason 3118 is still at accepted rather than final - the major foundations (and the all-important underlying protocol) are in place, but there are finishing touches still needed. 2. The review of the PEP focused on the power and capabilities of the underlying protocol and less on the aesthetics of the C API. The PEP was fairly explicit that the fields in the Py_buffer struct were public and accessed directly via C syntax though, as are the current docs (http://docs.python.org/dev/3.0/c-api/buffer.html). Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sat Nov 22 00:45:01 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 22 Nov 2008 09:45:01 +1000 Subject: [Python-3000] PyObject_HEAD_INIT In-Reply-To: References: <491DAC8C.5060701@v.loewis.de> <491DDE16.6080101@v.loewis.de> <49255A96.9080506@egenix.com> <4926C23A.8080108@egenix.com> <4926F5B0.2020502@egenix.com> <49270B21.8090403@egenix.com> Message-ID: <492747FD.7030006@gmail.com> Terry Reedy wrote: > Roger Binns wrote: > >> The problem is that unless you are clairvoyant you have no way of >> knowing about this change. Even in rc3 the documentation shows the old >> (wrong) way: >> >> http://docs.python.org/dev/3.0/extending/newtypes.html >> >> PyObject_HEAD_INIT is documented: >> >> http://docs.python.org/dev/3.0/search.html?q=PyObject_HEAD_INIT >> >> PyVarObject_HEAD_INIT is not: >> >> http://docs.python.org/dev/3.0/search.html?q=PyVarObject_HEAD_INIT > > Whatever happens to PyObject..., that should be fixed. Is there are > tracker item yet? A doc person will copy, paste, and format if given > raw text. I just created a release blocker pointing at this thread. http://bugs.python.org/issue4385 Cheers, Nick. 
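The layout question argued in this thread can be checked from Python with ctypes. This is only an illustrative sketch: the structure names (FlatHead, MockObject, MockVarObject, NestedHead) are invented for the example and mimic just the first few header fields of a 2.x and a 3.0 type object; they are not the real CPython definitions.

```python
import ctypes

# Python 2 style: header fields listed directly in the struct (mock).
class FlatHead(ctypes.Structure):
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ob_size", ctypes.c_ssize_t),
        ("tp_name", ctypes.c_char_p),
    ]

# Python 3 style (PEP 3123): header fields wrapped in nested structs (mock).
class MockObject(ctypes.Structure):
    _fields_ = [("ob_refcnt", ctypes.c_ssize_t), ("ob_type", ctypes.c_void_p)]

class MockVarObject(ctypes.Structure):
    _fields_ = [("ob_base", MockObject), ("ob_size", ctypes.c_ssize_t)]

class NestedHead(ctypes.Structure):
    _fields_ = [("ob_base", MockVarObject), ("tp_name", ctypes.c_char_p)]

# Byte offset of tp_name from the start of each struct.
flat_offset = FlatHead.tp_name.offset
nested_offset = NestedHead.tp_name.offset
same_layout = (flat_offset == nested_offset
               and ctypes.sizeof(FlatHead) == ctypes.sizeof(NestedHead))
```

Under these assumptions the two mock structs have the same binary layout, which matches the gdb output quoted earlier. What differs is the brace nesting a C initializer has to follow, which is why the header macro must match the struct being initialized.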
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From musiccomposition at gmail.com Sat Nov 22 00:52:20 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Fri, 21 Nov 2008 17:52:20 -0600 Subject: [Python-3000] Using memoryviews In-Reply-To: <49274571.5020202@gmail.com> References: <49253821.7010909@egenix.com> <49270EFF.3070006@egenix.com> <1afaf6160811211334v5de44235j936eed156e45cfa2@mail.gmail.com> <49274571.5020202@gmail.com> Message-ID: <1afaf6160811211552y656aef85gde8dd9b6775a6535@mail.gmail.com> On Fri, Nov 21, 2008 at 5:34 PM, Nick Coghlan wrote: > Benjamin Peterson wrote: >> On Fri, Nov 21, 2008 at 1:41 PM, M.-A. Lemburg wrote: >>> In the past, we've always tried to provide abstract access methods to >>> C struct internals of Python objects and I wonder whether this was >>> deliberately not done for Py_buffer structs or simply not considered. >>> >>> I don't think it's a good idea to use my_Py_buffer->buf in a C >>> extension and would rather like to write: >>> >>> Py_Buffer_AS_BUFFER(my_Py_buffer) >>> Py_Buffer_GET_SIZE(my_Py_buffer) >>> Py_Buffer_GET_ITEM_SIZE(my_Py_buffer) >>> etc. >> >> I think that's a good idea, too, and we should get something like that >> in for 3.1. I rather feel like the new buffer API slipped in without >> any real review. > > The review that was done was actually quite extensive - see PEP 3118. > However: > 1. There's a reason 3118 is still at accepted rather than final - the > major foundations (and the all-important underlying protocol) are in > place, but there are finishing touches still needed. > 2. The review of the PEP focused on the power and capabilities of the > underlying protocol and less on the aesthetics of the C API. I'm not talking necessarily about the PEP and API. I find the implementation confusing and contradictory in some places. 
>
> The PEP was fairly explicit that the fields in the Py_buffer struct were
> public and accessed directly via C syntax though, as are the current
> docs (http://docs.python.org/dev/3.0/c-api/buffer.html).

Well, I wrote those based on the PEP. :)

--
Cheers,
Benjamin Peterson
"There's nothing quite as beautiful as an oboe... except a chicken stuck in a vacuum cleaner."

From martin at v.loewis.de Sat Nov 22 10:05:19 2008
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 22 Nov 2008 10:05:19 +0100
Subject: [Python-3000] Eliminating PY_SSIZE_T_CLEAN
Message-ID: <4927CB4F.8040904@v.loewis.de>

I just noticed that the Python 3 C API still contains PY_SSIZE_T_CLEAN.

This macro was a transition mechanism, to allow extensions to use Py_ssize_t in PyArg_ParseTuple, while allowing other modules to continue to use int.

In Python 3, I would like to eliminate the mechanism, making Py_ssize_t the only valid data type for size in, say, s# parsers.

Is it ok to still change that?

Regards,
Martin

From barry at python.org Sat Nov 22 15:29:01 2008
From: barry at python.org (Barry Warsaw)
Date: Sat, 22 Nov 2008 09:29:01 -0500
Subject: [Python-3000] Eliminating PY_SSIZE_T_CLEAN
In-Reply-To: <4927CB4F.8040904@v.loewis.de>
References: <4927CB4F.8040904@v.loewis.de>
Message-ID: <21F72782-F422-4BD2-85CD-98CE9DD2BF12@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Nov 22, 2008, at 4:05 AM, Martin v. Löwis wrote:

> I just noticed that the Python 3 C API still contains
> PY_SSIZE_T_CLEAN.
>
> This macro was a transition mechanism, to allow extensions to use
> Py_ssize_t in PyArg_ParseTuple, while allowing other module continue
> to use int.
>
> In Python 3, I would like the mechanism, making Py_ssize_t the only
> valid data type for size in, say, s# parsers.
>
> Is it ok to still change that?

Given that we just released the last planned candidate, I'd say it was too late to change this for Python 3.0.
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSSgXLnEjvBPtnXfVAQKEVwP7BMofjGhTTfQ847X767ONgkt7gqr6+jeS Fv8y0NR7quMAU6LAsdg3ScpDhXItwiefGGAkaqGojwQKxAcy0xTWVNnhAtytQ3Xc ZuyhFng++jl0qLz3+s3/IUl+gVM/PPlnjf+Kh4dHrjpUW8yuq3wOMCdpL6DAS9xA xI9wiHHoXeU= =WLHV -----END PGP SIGNATURE----- From prologic at shortcircuit.net.au Sat Nov 22 02:17:26 2008 From: prologic at shortcircuit.net.au (James Mills) Date: Sat, 22 Nov 2008 11:17:26 +1000 Subject: [Python-3000] RELEASED Python 3.0rc3 In-Reply-To: References: Message-ID: On Sat, Nov 22, 2008 at 1:06 AM, Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On behalf of the Python development team and the Python community, I am > happy to announce the third and last planned release candidate for Python > 3.0. Whoohoo! :) Great works guys! --JamesMills -- -- -- "Problems are solved by method" From brett at python.org Sat Nov 22 20:51:34 2008 From: brett at python.org (Brett Cannon) Date: Sat, 22 Nov 2008 11:51:34 -0800 Subject: [Python-3000] Eliminating PY_SSIZE_T_CLEAN In-Reply-To: <21F72782-F422-4BD2-85CD-98CE9DD2BF12@python.org> References: <4927CB4F.8040904@v.loewis.de> <21F72782-F422-4BD2-85CD-98CE9DD2BF12@python.org> Message-ID: On Sat, Nov 22, 2008 at 06:29, Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Nov 22, 2008, at 4:05 AM, Martin v. L?wis wrote: > >> I just noticed that the Python 3 C API still contains PY_SSIZE_T_CLEAN. >> >> This macro was a transition mechanism, to allow extensions to use >> Py_ssize_t in PyArg_ParseTuple, while allowing other module continue >> to use int. >> >> In Python 3, I would like the mechanism, making Py_ssize_t the only >> valid data type for size in, say, s# parsers. >> >> Is it ok to still change that? > > Given that we just released the last planned candidate, I'd say it was too > late to change this for Python 3.0. 
> But we can at least document that the macro is a gone as soon as 3.0 final is out the door. -Brett From solipsis at pitrou.net Sun Nov 23 01:18:31 2008 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 23 Nov 2008 00:18:31 +0000 (UTC) Subject: [Python-3000] Using memoryviews References: <49253821.7010909@egenix.com> Message-ID: Josiah Carlson gmail.com> writes: > > From what I understand of the memoryview when I tried to do the same > thing a few months ago (use memoryview to replace buffer in > asyncore/asynchat), memoryview is incomplete. It didn't support > character buffer slicing (you know, the 'offset' and 'size' arguments > that were in buffer), and at least a handful of other things (that I > can't remember at the moment). You should try again, memoryview now supports slicing (with the usual Python syntax, e.g. m[2:5]) as well as slice assignment (with the fairly sensible limitation that you can't resize the underlying buffer). There's no real doc for it, but you can look at test_memoryview.py in the Lib/test directory to have a fairly comprehensive list of the things currently supported. I also support the addition of official functions or macros to access the underlying fields of the Py_buffer struct, rather than access them directly from 3rd party code. Someone please open an issue for that in the tracker. The big, big limitation of memoryviews right now is that they only support one-dimensional byte buffers. The people interested in more complex arrangements (that is, Scipy/Numpy people) have been completely absent from the python-dev community for many months now, and I don't think anyone else cares enough to do the job instead of them. Regards Antoine. 
From josiah.carlson at gmail.com Sun Nov 23 10:12:19 2008 From: josiah.carlson at gmail.com (Josiah Carlson) Date: Sun, 23 Nov 2008 01:12:19 -0800 Subject: [Python-3000] Using memoryviews In-Reply-To: References: <49253821.7010909@egenix.com> Message-ID: On Sat, Nov 22, 2008 at 4:18 PM, Antoine Pitrou wrote: > Josiah Carlson gmail.com> writes: >> >> From what I understand of the memoryview when I tried to do the same >> thing a few months ago (use memoryview to replace buffer in >> asyncore/asynchat), memoryview is incomplete. It didn't support >> character buffer slicing (you know, the 'offset' and 'size' arguments >> that were in buffer), and at least a handful of other things (that I >> can't remember at the moment). > > You should try again, memoryview now supports slicing (with the usual Python > syntax, e.g. m[2:5]) as well as slice assignment (with the fairly sensible > limitation that you can't resize the underlying buffer). There's no real doc for > it, but you can look at test_memoryview.py in the Lib/test directory to have a > fairly comprehensive list of the things currently supported. I meant in the sense of X = memoryview(char_buffer, offset, length). Post-facto slicing is nice, but a little more wasteful than necessary. > I also support the addition of official functions or macros to access the > underlying fields of the Py_buffer struct, rather than access them directly from > 3rd party code. Someone please open an issue for that in the tracker. > > The big, big limitation of memoryviews right now is that they only support > one-dimensional byte buffers. The people interested in more complex arrangements > (that is, Scipy/Numpy people) have been completely absent from the python-dev > community for many months now, and I don't think anyone else cares enough to do > the job instead of them. That's unfortunate, as they were the major pushers for memoryview as it stands today. 
I'm still thinking about trying to convince people to add string methods to them (you have your encoded email message in memory, you chop it and slice it as necessary for viewing...all using pointers to the one block of memory, which minimizes fragmentation, memory copies, etc.). - Josiah From g.brandl at gmx.net Sun Nov 23 11:56:29 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 23 Nov 2008 11:56:29 +0100 Subject: [Python-3000] Using memoryviews In-Reply-To: References: <49253821.7010909@egenix.com> Message-ID: Josiah Carlson schrieb: > On Sat, Nov 22, 2008 at 4:18 PM, Antoine Pitrou wrote: >> Josiah Carlson gmail.com> writes: >>> >>> From what I understand of the memoryview when I tried to do the same >>> thing a few months ago (use memoryview to replace buffer in >>> asyncore/asynchat), memoryview is incomplete. It didn't support >>> character buffer slicing (you know, the 'offset' and 'size' arguments >>> that were in buffer), and at least a handful of other things (that I >>> can't remember at the moment). >> >> You should try again, memoryview now supports slicing (with the usual Python >> syntax, e.g. m[2:5]) as well as slice assignment (with the fairly sensible >> limitation that you can't resize the underlying buffer). There's no real doc for >> it, but you can look at test_memoryview.py in the Lib/test directory to have a >> fairly comprehensive list of the things currently supported. > > I meant in the sense of X = memoryview(char_buffer, offset, length). > Post-facto slicing is nice, but a little more wasteful than necessary. Why? It's only a view, after all. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. 
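
As a rough sketch of the point being made in the exchange above: an
(offset, size) window like the one the old buffer() call provided can be
emulated by slicing a memoryview, and the slice shares memory with the
original object instead of copying it. The helper `view_region` is
hypothetical, written only for this illustration, and the example
assumes a current Python 3 interpreter.

```python
def view_region(obj, offset=0, size=None):
    """Hypothetical helper: zero-copy window onto obj[offset:offset + size]."""
    view = memoryview(obj)
    end = len(view) if size is None else offset + size
    return view[offset:end]


message = bytearray(b"Subject: hello\r\n\r\nbody")
window = view_region(message, 9, 5)
assert window.tobytes() == b"hello"

# The window is only a view: a change made to 'message' afterwards is
# visible through it, which shows that no bytes were copied.
message[9] = ord("H")
assert window.tobytes() == b"Hello"
```

Whether the intermediate full-length view is "wasteful" is a matter of
object allocation, not of copying the data itself.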
From ncoghlan at gmail.com Sun Nov 23 12:37:16 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 23 Nov 2008 21:37:16 +1000 Subject: [Python-3000] Using memoryviews In-Reply-To: References: <49253821.7010909@egenix.com> Message-ID: <4929406C.1030500@gmail.com> Josiah Carlson wrote: > On Sat, Nov 22, 2008 at 4:18 PM, Antoine Pitrou wrote: >> The big, big limitation of memoryviews right now is that they only support >> one-dimensional byte buffers. The people interested in more complex arrangements >> (that is, Scipy/Numpy people) have been completely absent from the python-dev >> community for many months now, and I don't think anyone else cares enough to do >> the job instead of them. > > That's unfortunate, as they were the major pushers for memoryview as > it stands today. I believe the Scipy/Numpy folks mainly needed the underlying protocol for describing and sharing chunks of memory (e.g. when mixing the use of PIL and NumPy in a single program). The memoryview Python object just provides a basic mechanism to access that protocol from pure Python code. At this point in time, I would expect significant uses of the protocol to be largely mediated by extension modules (either existing ones or new ones) rather than via pure Python code. I see it as similar to the way extended slicing was originally introduced without significant support in the builtin types, but still immediately solved a problem for the NumPy folks due to the existence of the new protocol. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From greg at krypto.org Mon Nov 24 00:16:34 2008 From: greg at krypto.org (Gregory P. 
Smith)
Date: Sun, 23 Nov 2008 15:16:34 -0800
Subject: [Python-3000] Eliminating PY_SSIZE_T_CLEAN
In-Reply-To:
References: <4927CB4F.8040904@v.loewis.de>
 <21F72782-F422-4BD2-85CD-98CE9DD2BF12@python.org>
Message-ID: <52dc1c820811231516p397834c6iea9b33108b70e2e1@mail.gmail.com>

On Sat, Nov 22, 2008 at 11:51 AM, Brett Cannon wrote:

> On Sat, Nov 22, 2008 at 06:29, Barry Warsaw wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > On Nov 22, 2008, at 4:05 AM, Martin v. Löwis wrote:
> >
> >> I just noticed that the Python 3 C API still contains PY_SSIZE_T_CLEAN.
> >>
> >> This macro was a transition mechanism, to allow extensions to use
> >> Py_ssize_t in PyArg_ParseTuple, while allowing other module continue
> >> to use int.
> >>
> >> In Python 3, I would like the mechanism, making Py_ssize_t the only
> >> valid data type for size in, say, s# parsers.
> >>
> >> Is it ok to still change that?
> >
> > Given that we just released the last planned candidate, I'd say it was too
> > late to change this for Python 3.0.
> >
>
> But we can at least document that the macro is a gone as soon as 3.0
> final is out the door.
>
> -Brett

I'll commit the following update to the py3k docs if nobody objects. As it
is now, the only mention of PY_SSIZE_T_CLEAN at all is in whatsnew/2.5.rst.
This officially documents it and mentions that it is going away to be always
on in the future. I'm assuming in 3.1 but I just left it as "a future
version" to not commit to that. At least the py3k docs encourage use of s*
rather than s#.

-gps

Index: Doc/extending/extending.rst
===================================================================
--- Doc/extending/extending.rst (revision 67360)
+++ Doc/extending/extending.rst (working copy)
@@ -587,11 +587,16 @@
 
 Some example calls::
 
+   #define PY_SSIZE_T_CLEAN  /* Make "s#" use Py_ssize_t rather than int. */
+   #include <Python.h>
+
+::
+
    int ok;
    int i, j;
    long k, l;
    const char *s;
-   int size;
+   Py_ssize_t size;
 
    ok = PyArg_ParseTuple(args, ""); /* No arguments */
        /* Python call: f() */
Index: Doc/c-api/arg.rst
===================================================================
--- Doc/c-api/arg.rst (revision 67360)
+++ Doc/c-api/arg.rst (working copy)
@@ -42,12 +42,19 @@
    responsible** for calling ``PyBuffer_Release`` with the structure after it
    has processed the data.
 
-``s#`` (string, Unicode or any read buffer compatible object) [const char \*, int]
+``s#`` (string, Unicode or any read buffer compatible object) [const char \*, int or :ctype:`Py_ssize_t`]
    This variant on ``s*`` stores into two C variables, the first one a pointer
    to a character string, the second one its length. All other read-buffer
    compatible objects pass back a reference to the raw internal data
    representation. Since this format doesn't allow writable buffer compatible
-   objects like byte arrays, ``s*`` is to be preferred.
+   objects like byte arrays, ``s*`` is to be preferred. The type of
+   the length argument (int or :ctype:`Py_ssize_t`) is controlled by
+   defining the macro :cmacro:`PY_SSIZE_T_CLEAN` before including
+   :file:`Python.h`. If the macro was defined, the output will be a
+   :ctype:`Py_ssize_t` rather than an int.
+   This behavior will change in a future Python version to only support
+   :ctype:`Py_ssize_t` and drop int support. It is best to always
+   define :cmacro:`PY_SSIZE_T_CLEAN`.
 
 ``y`` (bytes object) [const char \*]
    This variant on ``s`` converts a Python bytes or bytearray object to a C

From greg at krypto.org Mon Nov 24 00:51:49 2008
From: greg at krypto.org (Gregory P.
Smith) Date: Sun, 23 Nov 2008 15:51:49 -0800 Subject: [Python-3000] Eliminating PY_SSIZE_T_CLEAN In-Reply-To: <52dc1c820811231516p397834c6iea9b33108b70e2e1@mail.gmail.com> References: <4927CB4F.8040904@v.loewis.de> <21F72782-F422-4BD2-85CD-98CE9DD2BF12@python.org> <52dc1c820811231516p397834c6iea9b33108b70e2e1@mail.gmail.com> Message-ID: <52dc1c820811231551x695bea30nf5d214e5e96394e6@mail.gmail.com> On Sun, Nov 23, 2008 at 3:16 PM, Gregory P. Smith wrote: > > On Sat, Nov 22, 2008 at 11:51 AM, Brett Cannon wrote: > >> On Sat, Nov 22, 2008 at 06:29, Barry Warsaw wrote: >> > -----BEGIN PGP SIGNED MESSAGE----- >> > Hash: SHA1 >> > >> > On Nov 22, 2008, at 4:05 AM, Martin v. L?wis wrote: >> > >> >> I just noticed that the Python 3 C API still contains PY_SSIZE_T_CLEAN. >> >> >> >> This macro was a transition mechanism, to allow extensions to use >> >> Py_ssize_t in PyArg_ParseTuple, while allowing other module continue >> >> to use int. >> >> >> >> In Python 3, I would like the mechanism, making Py_ssize_t the only >> >> valid data type for size in, say, s# parsers. >> >> >> >> Is it ok to still change that? >> > >> > Given that we just released the last planned candidate, I'd say it was >> too >> > late to change this for Python 3.0. >> > >> >> But we can at least document that the macro is a gone as soon as 3.0 >> final is out the door. >> >> -Brett > > > I'll commit the following update to the py3k docs if nobody objects. As it > is now, the only mention of PY_SSIZE_T_CLEAR at all is in whatsnew/2.5.rst. > This officially documents it and mentions that it is going away to be always > on in the future. I'm assuming in 3.1 but I just left it as "a future > version" to not commit to that. At least the py3k docs encourage use of s* > rather than s#. > > -gps > Wording fixed slightly and committed as r67361. 
> > > Index: Doc/extending/extending.rst > =================================================================== > --- Doc/extending/extending.rst (revision 67360) > +++ Doc/extending/extending.rst (working copy) > @@ -587,11 +587,16 @@ > > Some example calls:: > > + #define PY_SSIZE_T_CLEAN /* Make "s#" use Py_ssize_t rather than int. > */ > + #include > + > +:: > + > int ok; > int i, j; > long k, l; > const char *s; > - int size; > + Py_ssize_t size; > > ok = PyArg_ParseTuple(args, ""); /* No arguments */ > /* Python call: f() */ > Index: Doc/c-api/arg.rst > =================================================================== > --- Doc/c-api/arg.rst (revision 67360) > +++ Doc/c-api/arg.rst (working copy) > @@ -42,12 +42,19 @@ > responsible** for calling ``PyBuffer_Release`` with the structure after > it > has processed the data. > > -``s#`` (string, Unicode or any read buffer compatible object) [const char > \*, int] > +``s#`` (string, Unicode or any read buffer compatible object) [const char > \*, int or :ctype:`Py_ssize_t`] > This variant on ``s*`` stores into two C variables, the first one a > pointer > to a character string, the second one its length. All other > read-buffer > compatible objects pass back a reference to the raw internal data > representation. Since this format doesn't allow writable buffer > compatible > - objects like byte arrays, ``s*`` is to be preferred. > + objects like byte arrays, ``s*`` is to be preferred. The type of > + the length argument (int or :ctype:`Py_ssize_t`) is controlled by > + defining the macro :cmacro:`PY_SSIZE_T_CLEAN` before including > + :file:`Python.h`. If the macro was defined, the output will be a > + :ctype:`Py_ssize_t` rather than an int. > + This behavior will change in a future Python version to only support > + :ctype:`Py_ssize_t` and drop int support. It is best to always > + define :cmacro:`PY_SSIZE_T_CLEAN`. 
>
> ``y`` (bytes object) [const char \*]
>    This variant on ``s`` converts a Python bytes or bytearray object to a
> C
>
>

From biltar at hotmail.com Mon Nov 24 11:55:01 2008
From: biltar at hotmail.com (Ali art)
Date: Mon, 24 Nov 2008 10:55:01 +0000
Subject: [Python-3000] NameError
Message-ID:

I am using Windows XP Professional version 2002 Service Pack 3. AMD
Athlon(TM) XP 2400+ 2.00GHz 992MB RAM.

I have downloaded Windows x86 MSI Installer Python 3.0rc3 (sig) Release:
21-Nov-2008.

Control Panel -> System -> Advanced -> Environment Variables.
System Variables -> Path -> edit C:\Windows\System32\Wbem;C:\Python30

start -> programs -> python 3.0 -> IDLE(Python GUI)
-> IDLE 3.0rc3 -> File -> New Window -> I wrote "print('Hello World')" without quotes
-> File -> Save -> Python30 -> I gave file name "helloworld.py" without quotes
-> and saved -> and closed file and python shell

-> and then start -> programs -> python 3.0 -> Python (command line) ->

Python 3.0rc3 (r30rc3:67313, Nov 21 2008, 07:14:45) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

-> I wrote "helloworld.py" without quotes -> it gives NameError ->

Python 3.0rc3 (r30rc3:67313, Nov 21 2008, 07:14:45) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> helloworld.py
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'helloworld' is not defined
>>>

What did I do wrong?

From biltar at hotmail.com Mon Nov 24 12:13:54 2008
From: biltar at hotmail.com (Ali art)
Date: Mon, 24 Nov 2008 11:13:54 +0000
Subject: [Python-3000] unicode_test
Message-ID:

I am using Windows XP Professional version 2002 Service Pack 3. AMD
Athlon(TM) XP 2400+ 2.00GHz 992MB RAM.

I have downloaded Windows x86 MSI Installer Python 3.0rc3 (sig) Release:
21-Nov-2008.

Control Panel -> System -> Advanced -> Environment Variables.
System Variables -> Path -> edit C:\Windows\System32\Wbem;C:\Python30

start -> programs -> python 3.0 -> IDLE(Python GUI)
-> IDLE 3.0rc3 -> File -> New Window -> I wrote "print('?????')" without quotes
-> File -> Save -> Python30 -> I gave file name "Unicode_test" without quotes
-> and Run -> Run Module -> it gives error "invalid character in identifier"

But if I write in the Python Shell ->
>>> print('?????')
and press Enter -> it gives "?????", so it works.

What is wrong?

From facundobatista at gmail.com Mon Nov 24 12:40:50 2008
From: facundobatista at gmail.com (Facundo Batista)
Date: Mon, 24 Nov 2008 09:40:50 -0200
Subject: [Python-3000] NameError
In-Reply-To:
References:
Message-ID:

2008/11/24 Ali art :
>
> What did I do wrong?
>

You should ask this kind of question on the general Python list:

http://www.python.org/mailman/listinfo/python-list

This Python-3000 list is about developing Python 3 *itself*, not
developing *in* Python.

Regards,

--
.
Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From clp at rebertia.com Mon Nov 24 12:54:10 2008
From: clp at rebertia.com (Chris Rebert)
Date: Mon, 24 Nov 2008 03:54:10 -0800
Subject: [Python-3000] NameError
In-Reply-To:
References:
Message-ID: <47c890dc0811240354o3e32b43dhc0de8eb3126dd223@mail.gmail.com>

On Mon, Nov 24, 2008 at 2:55 AM, Ali art wrote:
>
> I am using Windows XP professional version 2002 Service pack 3. AMD
> Athlon(TM)XP 2400+ 2.00GHz 992MB RAM.
>
> I have downloaded Windows x86 MSI Installer Python 3.0rc3 (sig) Release:
> 21-Nov-2008.
>
> Control Panel -> System -> Advanced -> Environment Variables.
> System Variables -> Path -> edit C:\Windows\System32\Wbem;C:\Python30
>
> start -> programs -> python 3.0 -> IDLE(Python GUI)
> -> IDLE 3.0rc3 -> File -> New Window -> i wrote "print('Hello World')"
> without quotes
> -> File -> Save -> Python30 -> i gave file name "helloworld.py" without
> quotes
> -> and saved -> and closed file and python shell

You do know you can run the file from within IDLE by just pressing F5, right?

>
> -> and then start -> programs -> python 3.0 -> Python (command line) ->
>
> Python 3.0rc3 (r30rc3:67313, Nov 21 2008, 07:14:45) [MSC v.1500 32 bit
> (Intel)]
> on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>>
>
> -> i wrote "helloworld.py"

No! Wrong! See below.

> without quotes -> it gives NameError ->
>
> Python 3.0rc3 (r30rc3:67313, Nov 21 2008, 07:14:45) [MSC v.1500 32 bit
> (Intel)]
> on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> helloworld.py
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'helloworld' is not defined
> >>>
>
> What did i do wrong?

As I believe someone pointed out earlier when you previously asked
this question on one of the other mailing lists, to do what you want to
do, you should:
1. Go Start -> Run, enter "cmd", click Run, to open the DOS shell.
2. cd to whatever folder helloworld.py is located in.
3. Run the following command in the terminal window: python helloworld.py

Python's "shell" is an interactive interpreter of Python *code*, and
is NOT used to run Python script *files*, at least not in the particular
way you're trying. When you enter "helloworld.py" at the Python
interpreter prompt, Python is not trying to run the file
"helloworld.py"; it's looking for a variable named "helloworld" and
then for that object's "py" attribute (i.e. the dot is the
attribute-access operator in this instance). Since there's no such
variable, it raises a NameError exception.

For simplicity, I'd recommend just running the script directly from
IDLE as I explained above.

Finally, I echo the comment that your question is completely off-topic
for this mailing list, which is about developing Python 3.0, not using
Python 3.0 (unless you found a bug, which is not the case here).

Cheers,
Chris
--
Follow the path of the Iguana...
http://rebertia.com

From barry at python.org Mon Nov 24 13:48:57 2008
From: barry at python.org (Barry Warsaw)
Date: Mon, 24 Nov 2008 07:48:57 -0500
Subject: [Python-3000] Eliminating PY_SSIZE_T_CLEAN
In-Reply-To: <52dc1c820811231516p397834c6iea9b33108b70e2e1@mail.gmail.com>
References: <4927CB4F.8040904@v.loewis.de>
 <21F72782-F422-4BD2-85CD-98CE9DD2BF12@python.org>
 <52dc1c820811231516p397834c6iea9b33108b70e2e1@mail.gmail.com>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Nov 23, 2008, at 6:16 PM, Gregory P.
Smith wrote: > On Sat, Nov 22, 2008 at 11:51 AM, Brett Cannon > wrote: > >> On Sat, Nov 22, 2008 at 06:29, Barry Warsaw wrote: >>> -----BEGIN PGP SIGNED MESSAGE----- >>> Hash: SHA1 >>> >>> On Nov 22, 2008, at 4:05 AM, Martin v. L?wis wrote: >>> >>>> I just noticed that the Python 3 C API still contains >>>> PY_SSIZE_T_CLEAN. >>>> >>>> This macro was a transition mechanism, to allow extensions to use >>>> Py_ssize_t in PyArg_ParseTuple, while allowing other module >>>> continue >>>> to use int. >>>> >>>> In Python 3, I would like the mechanism, making Py_ssize_t the only >>>> valid data type for size in, say, s# parsers. >>>> >>>> Is it ok to still change that? >>> >>> Given that we just released the last planned candidate, I'd say it >>> was >> too >>> late to change this for Python 3.0. >>> >> >> But we can at least document that the macro is a gone as soon as 3.0 >> final is out the door. >> >> -Brett > > > I'll commit the following update to the py3k docs if nobody > objects. As it > is now, the only mention of PY_SSIZE_T_CLEAR at all is in whatsnew/ > 2.5.rst. > This officially documents it and mentions that it is going away to > be always > on in the future. I'm assuming in 3.1 but I just left it as "a future > version" to not commit to that. At least the py3k docs encourage > use of s* > rather than s#. Perfect, thanks. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iQCVAwUBSSqiuXEjvBPtnXfVAQIh3QP/WNw2mwCATAZ6oqI1vB0K37R+JZmV/qVZ 8CjgwrJcBPolFB9DYZ8AO6rvdGwnqRf2a/3fCg2ZRPQuJgJh1lPeFXuxm92Qn9fJ aXS0ph1i4r467LyYMqhZYcHRGOATQwc31phd2YHvkeYZCdijq3sPN7ZCq40LDQRQ sxJj+GkjkTE= =ejGW -----END PGP SIGNATURE----- From victor.stinner at haypocalc.com Mon Nov 24 14:06:34 2008 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 24 Nov 2008 14:06:34 +0100 Subject: [Python-3000] unicode_test In-Reply-To: References: Message-ID: <200811241406.34147.victor.stinner@haypocalc.com> Le Monday 24 November 2008 12:13:54 Ali art, vous avez ?crit?: > (...) IDLE 3.0rc3 -> File -> New Window -> i wrote "print('?????')" without > qutes-> File -> Save -> Python30 -> i gave file name "Unicode_test" without > qutes-> and Run -> Run Module -> it gives error "invalid character in > identifier" This bug may be related to: http://bugs.python.org/issue4323 -- Victor Stinner aka haypo http://www.haypocalc.com/blog/ From mal at egenix.com Tue Nov 25 17:56:49 2008 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 25 Nov 2008 17:56:49 +0100 Subject: [Python-3000] Using memoryviews In-Reply-To: <1afaf6160811211552y656aef85gde8dd9b6775a6535@mail.gmail.com> References: <49253821.7010909@egenix.com> <49270EFF.3070006@egenix.com> <1afaf6160811211334v5de44235j936eed156e45cfa2@mail.gmail.com> <49274571.5020202@gmail.com> <1afaf6160811211552y656aef85gde8dd9b6775a6535@mail.gmail.com> Message-ID: <492C2E51.5030800@egenix.com> On 2008-11-22 00:52, Benjamin Peterson wrote: > On Fri, Nov 21, 2008 at 5:34 PM, Nick Coghlan wrote: >> Benjamin Peterson wrote: >>> On Fri, Nov 21, 2008 at 1:41 PM, M.-A. Lemburg wrote: >>>> In the past, we've always tried to provide abstract access methods to >>>> C struct internals of Python objects and I wonder whether this was >>>> deliberately not done for Py_buffer structs or simply not considered. 
>>>> >>>> I don't think it's a good idea to use my_Py_buffer->buf in a C >>>> extension and would rather like to write: >>>> >>>> Py_Buffer_AS_BUFFER(my_Py_buffer) >>>> Py_Buffer_GET_SIZE(my_Py_buffer) >>>> Py_Buffer_GET_ITEM_SIZE(my_Py_buffer) >>>> etc. >>> I think that's a good idea, too, and we should get something like that >>> in for 3.1. I rather feel like the new buffer API slipped in without >>> any real review. >> The review that was done was actually quite extensive - see PEP 3118. >> However: >> 1. There's a reason 3118 is still at accepted rather than final - the >> major foundations (and the all-important underlying protocol) are in >> place, but there are finishing touches still needed. >> 2. The review of the PEP focused on the power and capabilities of the >> underlying protocol and less on the aesthetics of the C API. > > I'm not talking necessarily about the PEP and API. I find the > implementation confusing and contradictory in some places. > >> The PEP was fairly explicit that the fields in the Py_buffer struct were >> public and accessed directly via C syntax though, as are the current >> docs (http://docs.python.org/dev/3.0/c-api/buffer.html). > > Well, I wrote those based on the PEP. :) I find the implementation of the buffer protocol way too complicated. One of the reasons why the buffer protocol in Python 2 never caught on was the fact that it was too complicated and the Python 3 is even worse in this respect. In practice you do want to have the ability to hook directly into the data buffer of an object, but apart from some special needs that PIL and the numeric folks may have, most users will just want to work with a single contiguous chunk of memory and need a simple API to do this - pass in an object, get a void* back. 
With the new interface, programmers will have to deal with a
PyObject_GetBuffer() API having 17 modification flags in order to
deal with many different corner cases and returning a Py_buffer
C struct with another 10 elements.

http://docs.python.org/dev/3.0/c-api/buffer.html#PyObject_GetBuffer

Can we please get something simple like PyObject_AsReadBuffer()
back into Python 3?

http://docs.python.org/c-api/objbuffer.html

(and ideally, this should also work on memoryview objects)

Thanks,
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Nov 25 2008)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
2008-11-12: Released mxODBC.Connect 0.9.3 http://python.egenix.com/

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

From rasky at develer.com Wed Nov 26 20:47:25 2008
From: rasky at develer.com (Giovanni Bajo)
Date: Wed, 26 Nov 2008 19:47:25 +0000 (UTC)
Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0
References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org>
 <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org>
 <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de>
 <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de>
 <49246BB6.7000607@v.loewis.de>
Message-ID:

On Wed, 19 Nov 2008 20:40:38 +0100, Martin v. Löwis wrote:

> The tricky part really is when it breaks (which it does more often than
> not), in which case you need to understand msi.py, for which you need to
> understand MSI. IMO, the Microsoft documentation is excellent (in being
> fairly precise), but the learning curve is high.
The mechanical part of it can > is completely automated - we produce daily MSI files in a buildbot slave > (which may or may not work - I haven't checked in a while) I always wondered why it was necessary to write msi.py in the first place. Maintaining it is surely a big effort and requires understanding of a dark library which a few people have (IMO it's a much higher effort than setting up automated tests in a bunch of VM, which you said is "not worth it"). There are plenty of MSI installer generator programs, and Python's needs do not seem so weird to require a custom MSI generator. I'm sure the Python Software Foundation would easily get a free license of one of the good commercial MSI installer generators. In short: if msi.py and the fact it breaks is part of the issue here, it's very easy to solve in my opinion. -- Giovanni Bajo Develer S.r.l. http://www.develer.com From martin at v.loewis.de Wed Nov 26 21:03:59 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 26 Nov 2008 21:03:59 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> Message-ID: <492DABAF.5020808@v.loewis.de> > I always wondered why it was necessary to write msi.py in the first > place. Maintaining it is surely a big effort and requires understanding > of a dark library which a few people have (IMO it's a much higher effort > than setting up automated tests in a bunch of VM, which you said is "not > worth it"). > > There are plenty of MSI installer generator programs Originally it was written because none of the MSI generator programs were capable of packaging Python. In particular, none was capable of creating 64-bit packages (which were first needed to create the Itanium packages). 
> and Python's needs > do not seem so weird to require a custom MSI generator. Python's needs are fairly weird, so I'm very skeptical that any other generator is capable of doing what msi.py does (or, if it was capable of doing that, that it was then any simpler than msi.py). The critical part is that you need a powerful way to specify what files to package (having to select them in a UI is unacceptable, as the set of files constantly changes - the current generator can cope with many types of file additions without needing any change). > I'm sure the > Python Software Foundation would easily get a free license of one of the > good commercial MSI installer generators. Can you recommend a specific one? In addition, I'm also skeptical wrt. commercial setup tools. We had been using Wise for a while, and it was a management problem because the license was only available on a single machine - so it was difficult for anybody else to jump in and do a release. > In short: if msi.py and the fact it breaks is part of the issue here, > it's very easy to solve in my opinion. I'm very skeptical that this statement is actually true. Regards, Martin From elliot at canonical.com Wed Nov 26 21:20:05 2008 From: elliot at canonical.com (Elliot Murphy) Date: Wed, 26 Nov 2008 15:20:05 -0500 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <492DABAF.5020808@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> Message-ID: <492DAF75.5050408@canonical.com> Martin v. L?wis wrote: >> I'm sure the >> Python Software Foundation would easily get a free license of one of the >> good commercial MSI installer generators. > > Can you recommend a specific one? > > In addition, I'm also skeptical wrt. commercial setup tools. 
We had been > using Wise for a while, and it was a management problem because the > license was only available on a single machine - so it was difficult > for anybody else to jump in and do a release. > I've also had terrible times with installshield and other things in the past, but I've been very very pleased with WiX: http://wix.sourceforge.net/ Free, open source, and it gave me absolute control over how the MSI and MSM modules were built, using text files so I could store them in version control, spit out installers from our automated build, etc. The source format is XML (shrug), so on that project we even wrote a WiX parser to calculate SCons dependencies all the way through the MSM to the MSI, so that we could tell when a source file was changed what installers needed to be resigned and shipped to customers. It's really nice to be able to code review installer changes, and to have automated builds spit out .msi files alongside the .debs and .rpms and more. -- Elliot Murphy | https://launchpad.net/~statik/ From martin at v.loewis.de Wed Nov 26 21:34:18 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 26 Nov 2008 21:34:18 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <492DAFDA.4080909@voidspace.org.uk> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> <492DAFDA.4080909@voidspace.org.uk> Message-ID: <492DB2CA.9020803@v.loewis.de> > Wix is an msi creator (open source) that takes XML files as the input. > It is also capable of creating 64bit installers. At Resolver Systems we > use CPython scripts to generate the XML as input for Wix. > > It would still need *some* code therefore, but maybe simpler if someone > wanted to do the work. 
:-) I had looked at WiX before, and found that it can't do out of the box what I want to do - I still would need to generate the input files, e.g. with a script (and I'm happy to hear that you can confirm that analysis). I also had quite some problems understanding it, and can understand msi.py much better (surprise, surprise). For a newcomer, my feeling is that learning WiX and learning msi.py is about the same effort - you really need to "get" MSI files. Regards, Martin From martin at v.loewis.de Wed Nov 26 21:39:27 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 26 Nov 2008 21:39:27 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <938f42d70811261222g6f027478re1776ca3014ab003@mail.gmail.com> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> <938f42d70811261222g6f027478re1776ca3014ab003@mail.gmail.com> Message-ID: <492DB3FF.8070801@v.loewis.de> > What is the rationale behind using an MSI ? Has anyone attempted to > create a Python installer using something a bit simpler, like NSIS > [http://nsis.sourceforge.net/Main_Page]? If not, what are the reasons? It's a lot of effort to look at any such tool (and I really mean a *lot* of effort - like a full week). That's why nobody did it. When I looked at MSI, I did it because it has a few unique features: - it supports 64-bit installers, which now is an absolute requirement (people really do want to use the AMD64 binaries). A shallow look at the feature list of NSIS suggests that NSIS would fail this requirement. - it supports installation through Windows Domain policy. I would be willing to drop this requirement, but I believe some users would not be happy.
Nothing but MSI has this capability (by design of Windows Active Directory). Regards, Martin From fuzzyman at voidspace.org.uk Wed Nov 26 21:21:46 2008 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 26 Nov 2008 20:21:46 +0000 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <492DABAF.5020808@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> Message-ID: <492DAFDA.4080909@voidspace.org.uk> Martin v. L?wis wrote: >> I always wondered why it was necessary to write msi.py in the first >> place. Maintaining it is surely a big effort and requires understanding >> of a dark library which a few people have (IMO it's a much higher effort >> than setting up automated tests in a bunch of VM, which you said is "not >> worth it"). >> >> There are plenty of MSI installer generator programs >> > > Originally it was written because none of the MSI generator programs > were capable of packaging Python. In particular, none was capable of > creating 64-bit packages (which were first needed to create the > Itanium packages). > > >> and Python's needs >> do not seem so weird to require a custom MSI generator. >> > > Python's needs are fairly weird, so I'm very skeptical that any other > generator is capable of doing what msi.py does (or, if it was capable > of doing that, that it was then any simpler than msi.py). > > The critical part is that you need a powerful way to specify what files > to package (having to select them in a UI is unacceptable, as the set > of files constantly changes - the current generator can cope with many > types of file additions without needing any change). > > Wix is an msi creator (open source) that takes XML files as the input. It is also capable of creating 64bit installers. 
At Resolver Systems we use CPython scripts to generate the XML as input for Wix. It would still need *some* code therefore, but maybe simpler if someone wanted to do the work. :-) It would work well with the files being generated from an XML templating language like Mako which is what we will be switching to at Resolver Systems. http://wix.sourceforge.net/ Michael Foord >> I'm sure the >> Python Software Foundation would easily get a free license of one of the >> good commercial MSI installer generators. >> > > Can you recommend a specific one? > > In addition, I'm also skeptical wrt. commercial setup tools. We had been > using Wise for a while, and it was a management problem because the > license was only available on a single machine - so it was difficult > for anybody else to jump in and do a release. > > >> In short: if msi.py and the fact it breaks is part of the issue here, >> it's very easy to solve in my opinion. >> > > I'm very skeptical that this statement is actually true. > > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > -- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog From josepharmbruster at gmail.com Wed Nov 26 21:22:55 2008 From: josepharmbruster at gmail.com (Joseph Armbruster) Date: Wed, 26 Nov 2008 15:22:55 -0500 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <492DABAF.5020808@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> Message-ID: <938f42d70811261222g6f027478re1776ca3014ab003@mail.gmail.com> Martin, What is the rationale behind 
using an MSI ? Has anyone attempted to create a Python installer using something a bit simpler, like NSIS [ http://nsis.sourceforge.net/Main_Page]? If not, what are the reasons? Joe On Wed, Nov 26, 2008 at 3:03 PM, "Martin v. L?wis" wrote: > > I always wondered why it was necessary to write msi.py in the first > > place. Maintaining it is surely a big effort and requires understanding > > of a dark library which a few people have (IMO it's a much higher effort > > than setting up automated tests in a bunch of VM, which you said is "not > > worth it"). > > > > There are plenty of MSI installer generator programs > > Originally it was written because none of the MSI generator programs > were capable of packaging Python. In particular, none was capable of > creating 64-bit packages (which were first needed to create the > Itanium packages). > > > and Python's needs > > do not seem so weird to require a custom MSI generator. > > Python's needs are fairly weird, so I'm very skeptical that any other > generator is capable of doing what msi.py does (or, if it was capable > of doing that, that it was then any simpler than msi.py). > > The critical part is that you need a powerful way to specify what files > to package (having to select them in a UI is unacceptable, as the set > of files constantly changes - the current generator can cope with many > types of file additions without needing any change). > > > I'm sure the > > Python Software Foundation would easily get a free license of one of the > > good commercial MSI installer generators. > > Can you recommend a specific one? > > In addition, I'm also skeptical wrt. commercial setup tools. We had been > using Wise for a while, and it was a management problem because the > license was only available on a single machine - so it was difficult > for anybody else to jump in and do a release. > > > In short: if msi.py and the fact it breaks is part of the issue here, > > it's very easy to solve in my opinion. 
> > I'm very skeptical that this statement is actually true. > > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/josepharmbruster%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Wed Nov 26 22:54:05 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 26 Nov 2008 22:54:05 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <1227736209.6739.9.camel@ozzu> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> <1227736209.6739.9.camel@ozzu> Message-ID: <492DC57D.1010101@v.loewis.de> > I've had good results with Advanced Installer: > http://www.advancedinstaller.com/feats-list.html So how much effort would it be to create a Python installer? Could you kindly provide one? Regards, Martin From rasky at develer.com Wed Nov 26 22:50:09 2008 From: rasky at develer.com (Giovanni Bajo) Date: Wed, 26 Nov 2008 22:50:09 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <492DABAF.5020808@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> Message-ID: <1227736209.6739.9.camel@ozzu> On mer, 2008-11-26 at 21:03 +0100, "Martin v. 
Löwis" wrote: > > I'm sure the > > Python Software Foundation would easily get a free license of one of the > > good commercial MSI installer generators. > > Can you recommend a specific one? I've had good results with Advanced Installer: http://www.advancedinstaller.com/feats-list.html It does support 64-bit packages, and it uses an XML file as input. It supports Vista and UAC, per-user and per-machine install, registry modification, environment variables, upgrades/downgrades/side installs, online installs. And it's free as in beer. The commercial version has many more features, but I don't think Python needs them. But the basic idea is that this tool totally abstracts the MSI details. I know *nothing* of MSI but I'm fully able to use this tool and produce installers with more features than those I notice within Python's installer. -- Giovanni Bajo Develer S.r.l. http://www.develer.com From lists at cheimes.de Wed Nov 26 23:15:43 2008 From: lists at cheimes.de (Christian Heimes) Date: Wed, 26 Nov 2008 23:15:43 +0100 Subject: [Python-3000] 2.6.1 and 3.0 In-Reply-To: <1227736209.6739.9.camel@ozzu> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> <1227736209.6739.9.camel@ozzu> Message-ID: <492DCA8F.4050105@cheimes.de> Giovanni Bajo wrote: > On mer, 2008-11-26 at 21:03 +0100, "Martin v. Löwis" wrote: > >>> I'm sure the >>> Python Software Foundation would easily get a free license of one of the >>> good commercial MSI installer generators. >> Can you recommend a specific one? > > I've had good results with Advanced Installer: > http://www.advancedinstaller.com/feats-list.html > > It does support 64-bit packages, and it uses a XML file as input.
It > supports Vista and UAC, per-user and per-machine install, registry > modification, environment variables, upgrades/downgrades/side installs, > online installs. And it's free as in beer. The commercial version has > many more features, but I don't think Python needs them. The free edition is missing at least one important feature: Merge Modules into your installation Create self-contained MSI packages, by including and configuring the required merge modules. Christian From martin at v.loewis.de Wed Nov 26 23:38:54 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 26 Nov 2008 23:38:54 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <492DCA8F.4050105@cheimes.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> <1227736209.6739.9.camel@ozzu> <492DCA8F.4050105@cheimes.de> Message-ID: <492DCFFE.2060903@v.loewis.de> > Merge Modules into your installation > Create self-contained MSI packages, by including and configuring the > required merge modules. Right. Still, if people want to go this route (I personally don't), I think it would be useful to build an installer from the free edition. You can then run Tools/msi/merge.py, which adds the CRT merge module into the MSI file (mostly as-is, except for discarding the ALLUSERS property from that merge module). Alternatively, for testing, you can just assume that the CRT is already installed. When we then have a script that generates a mostly-complete installer, I'm sure Giovanni would be happy to add support for the CRT merge module to see how the tool fares (my expectation is that it breaks, as I assume it just doesn't deal with the embedded ALLUSERS property correctly - merge.py really uses a bad hack here). 
Regards, Martin From rasky at develer.com Wed Nov 26 23:46:56 2008 From: rasky at develer.com (Giovanni Bajo) Date: Wed, 26 Nov 2008 23:46:56 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <492DCFFE.2060903@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> <1227736209.6739.9.camel@ozzu> <492DCA8F.4050105@cheimes.de> <492DCFFE.2060903@v.loewis.de> Message-ID: <1227739616.6739.13.camel@ozzu> On mer, 2008-11-26 at 23:38 +0100, "Martin v. L?wis" wrote: > > Merge Modules into your installation > > Create self-contained MSI packages, by including and configuring the > > required merge modules. > > Right. Still, if people want to go this route (I personally don't), > I think it would be useful to build an installer from the free edition. > You can then run Tools/msi/merge.py, which adds the CRT merge module > into the MSI file (mostly as-is, except for discarding the ALLUSERS > property from that merge module). Alternatively, for testing, you can > just assume that the CRT is already installed. So, deducing from your reply, this "merge module" is a thing that allows to install the CRT (and other shared components)? I quickly googled but I'm not really into the msi slang, so I'm not sure I understood. > When we then have a script that generates a mostly-complete installer, > I'm sure Giovanni would be happy to add support for the CRT merge > module to see how the tool fares (my expectation is that it breaks, > as I assume it just doesn't deal with the embedded ALLUSERS property > correctly - merge.py really uses a bad hack here). Another option is to contact the Advanced Installer vendor and ask for a free license for the Python Software Foundation. 
This would mean that everybody in the world would still be able to build an installer without CRT, and only PSF would build the official one with CRT bundled. I personally don't see this as a show-stopper (does anyone ever build the .msi besides Martin?). -- Giovanni Bajo Develer S.r.l. http://www.develer.com From rasky at develer.com Wed Nov 26 23:49:52 2008 From: rasky at develer.com (Giovanni Bajo) Date: Wed, 26 Nov 2008 23:49:52 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <492DC57D.1010101@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> <1227736209.6739.9.camel@ozzu> <492DC57D.1010101@v.loewis.de> Message-ID: <1227739792.6739.15.camel@ozzu> On mer, 2008-11-26 at 22:54 +0100, "Martin v. L?wis" wrote: > > I've had good results with Advanced Installer: > > http://www.advancedinstaller.com/feats-list.html > > So how much effort would it be to create a Python installer? > Could you kindly provide one? In my case, the biggest effort would be finding out what needs to be put within the installer. If you can give me a pointer to where the current build process reads the complete file list to put within the .msi (and their relative destination path), I can try and build a simple test installer, on which we can start doing some evaluations. -- Giovanni Bajo Develer S.r.l. 
http://www.develer.com From martin at v.loewis.de Thu Nov 27 00:29:42 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 27 Nov 2008 00:29:42 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <1227739616.6739.13.camel@ozzu> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> <1227736209.6739.9.camel@ozzu> <492DCA8F.4050105@cheimes.de> <492DCFFE.2060903@v.loewis.de> <1227739616.6739.13.camel@ozzu> Message-ID: <492DDBE6.60205@v.loewis.de> > So, deducing from your reply, this "merge module" is a thing that allows > to install the CRT (and other shared components)? Correct. More generally, a merge module is something like an MSI library (.a). It includes a set of files and snippets of an installation procedure for them. > Another option is to contact the Advanced Installer vendor and ask for a > free license for the Python Software Foundation. This would mean that > everybody in the world would still be able to build an installer without > CRT, and only PSF would build the official one with CRT bundled. I > personally don't see this as a show-stopper (does anyone ever build > the .msi besides Martin?). I personally don't have any interest in spending any time on an alternative technology. The current technology works fine for me, and I understand it fully. Everybody in the world is also able to build an installer today. However, I won't stop anybody else from working on a switch to a different technology, either.
Regards, Martin From martin at v.loewis.de Thu Nov 27 00:44:31 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 27 Nov 2008 00:44:31 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <1227739792.6739.15.camel@ozzu> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> <1227736209.6739.9.camel@ozzu> <492DC57D.1010101@v.loewis.de> <1227739792.6739.15.camel@ozzu> Message-ID: <492DDF5F.7090603@v.loewis.de> > In my case, the biggest effort would be finding out what needs to be put > within the installer. If you can give me a pointer to where the current > build process reads the complete file list to put within the .msi (and > their relative destination path), I can try and build a simple test > installer, on which we can start doing some evaluations. The simplest approach might be to look at what it actually installs. If you want to read the specification: it's in Tools/msi/msi.py:add_files. directory.add_file takes a file, and optionally a source file (which defaults to the respective source directory). You also need to consider the feature structure; there is a "current" feature at any point in time, and all components being added get added to the current feature.
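To illustrate the two points above (add_file falling back to the directory's own source location, and a "current" feature collecting whatever is added next), here is a minimal Python sketch. It is a toy model and not the code in Tools/msi/msi.py: the class names, constructor arguments, feature names and paths are all assumptions made for illustration only.

```python
import posixpath


class Feature:
    """Toy stand-in for an installer feature (hypothetical, not msi.py)."""

    current = None  # the feature that newly added files are attached to

    def __init__(self, name):
        self.name = name
        self.files = []  # (source path, destination path) pairs

    def set_current(self):
        # From now on, every added file belongs to this feature.
        Feature.current = self


class Directory:
    """Toy stand-in for one target directory of the installer."""

    def __init__(self, source_dir, target_dir):
        self.source_dir = source_dir
        self.target_dir = target_dir

    def add_file(self, name, src=None):
        if Feature.current is None:
            raise RuntimeError("no current feature selected")
        # Without an explicit source, look in this directory's source tree.
        if src is None:
            src = posixpath.join(self.source_dir, name)
        dest = posixpath.join(self.target_dir, name)
        Feature.current.files.append((src, dest))
        return dest


core = Feature("DefaultFeature")
docs = Feature("Documentation")

core.set_current()
lib = Directory("Lib", "TARGETDIR/Lib")
lib.add_file("os.py")
lib.add_file("site.py")

docs.set_current()
doc = Directory("Doc", "TARGETDIR/Doc")
doc.add_file("python.chm", src="Doc/build/python.chm")

# Dump the resulting file list, grouped by feature.
for feature in (core, docs):
    for src, dest in feature.files:
        print(feature.name, src, "->", dest)
```

A file list produced this way (source path, destination path, owning feature) is roughly the information another installer generator would need as input, whatever its own input format looks like.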
HTH, Martin From rasky at develer.com Thu Nov 27 10:29:32 2008 From: rasky at develer.com (Giovanni Bajo) Date: Thu, 27 Nov 2008 10:29:32 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <492DDBE6.60205@v.loewis.de> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> <1227736209.6739.9.camel@ozzu> <492DCA8F.4050105@cheimes.de> <492DCFFE.2060903@v.loewis.de> <1227739616.6739.13.camel@ozzu> <492DDBE6.60205@v.loewis.de> Message-ID: <1227778172.6944.8.camel@ozzu> On gio, 2008-11-27 at 00:29 +0100, "Martin v. L?wis" wrote: > > So, deducing from your reply, this "merge module" is a thing that allows > > to install the CRT (and other shared components)? > > Correct. More generally, a merge module is a something like an MSI > library (.a). It includes a set of files and snippets of an installation > procedure for them. OK. One question: why CRT doesn't get installed as regular files near to the python executable? That's how I usually ship it, but maybe Python has some special need. > > Another option is to contact the Advanced Installer vendor and ask for a > > free license for the Python Software Foundation. This would mean that > > everybody in the world would still be able to build an installer without > > CRT, and only PSF would build the official one with CRT bundled. I > > personally don't see this as a show-stopper (does anyone ever build > > the .msi besides Martin?). > > I personally don't have any interest to spend any time on an alternative > technology. The current technology works fine for me, and I understand > it fully. Everybody in the world is able to build an installer today, > also. However, I won't stop anybody else from working a switch to a > different technology, either. 
I proposed an alternative because I read you saying: "The tricky part really is when it breaks (which it does more often than not), in which case you need to understand msi.py, for which you need to understand MSI". This means that maybe everybody *has tools* to build an installer today, but only a few people have the required knowledge to really do releases on Windows. So I believe that switching to an alternative that doesn't require full understanding of MSI and msi.py would probably lower the barrier and allow more people to help you out. -- Giovanni Bajo Develer S.r.l. http://www.develer.com From stefan_ml at behnel.de Thu Nov 27 11:03:26 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 27 Nov 2008 11:03:26 +0100 Subject: [Python-3000] Using memoryviews In-Reply-To: <492C2E51.5030800@egenix.com> References: <49253821.7010909@egenix.com> <49270EFF.3070006@egenix.com> <1afaf6160811211334v5de44235j936eed156e45cfa2@mail.gmail.com> <49274571.5020202@gmail.com> <1afaf6160811211552y656aef85gde8dd9b6775a6535@mail.gmail.com> <492C2E51.5030800@egenix.com> Message-ID: M.-A. Lemburg wrote: > I find the implementation of the buffer protocol way too complicated. > One of the reasons why the buffer protocol in Python 2 never caught > on was the fact that it was too complicated and the Python 3 is > even worse in this respect. > > In practice you do want to have the ability to hook directly into the > data buffer of an object, but apart from some special needs that PIL > and the numeric folks may have, most users will just want to work > with a single contiguous chunk of memory and need a simple API to > do this - pass in an object, get a void* back. Cython makes it that easy to access a buffer (also in Python 2.3-2.5, BTW). You only have to declare the type of a buffer variable. http://wiki.cython.org/enhancements/buffer According to what I hear, at least the NumPy developers make use of this already.
No idea how common it is in the PIL area, but it does work there, too. Stefan From selva_kum9 at yahoo.co.in Thu Nov 27 15:09:51 2008 From: selva_kum9 at yahoo.co.in (selva kum) Date: Thu, 27 Nov 2008 19:39:51 +0530 (IST) Subject: [Python-3000] Catching HTTP requests and serving for it Message-ID: <988339.66333.qm@web8802.mail.in.yahoo.com> I have a requirement to serve HTTP requests. The requests pass some parameters to the server, which should fetch some data from the database and send it back to the client. I would like to use a simple HTTP server instead of Apache. Could anyone help me with this? I have some idea of how to use the socket and SimpleHTTPRequest modules, and I am able to develop a simple server that accepts connections, but from the application's point of view I couldn't do anything more with the HTTP request contents. Please suggest something in this regard. From martin at v.loewis.de Thu Nov 27 16:24:38 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 27 Nov 2008 16:24:38 +0100 Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0 In-Reply-To: <1227778172.6944.8.camel@ozzu> References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <4F0B9197-E929-41B4-8441-D6E9E7762955@python.org> <4922D6AF.6080400@cheimes.de> <49233EF3.9040303@v.loewis.de> <87wsf0mqcr.fsf@xemacs.org> <4923ABEF.50900@v.loewis.de> <49246BB6.7000607@v.loewis.de> <492DABAF.5020808@v.loewis.de> <1227736209.6739.9.camel@ozzu> <492DCA8F.4050105@cheimes.de> <492DCFFE.2060903@v.loewis.de> <1227739616.6739.13.camel@ozzu> <492DDBE6.60205@v.loewis.de> <1227778172.6944.8.camel@ozzu> Message-ID: <492EBBB6.1060203@v.loewis.de> Giovanni Bajo wrote: > On gio, 2008-11-27 at 00:29 +0100, "Martin v.
L?wis" wrote: >>> So, deducing from your reply, this "merge module" is a thing that allows >>> to install the CRT (and other shared components)? >> Correct. More generally, a merge module is a something like an MSI >> library (.a). It includes a set of files and snippets of an installation >> procedure for them. > > OK. One question: why CRT doesn't get installed as regular files near to > the python executable? That's how I usually ship it, but maybe Python > has some special need. When installing "for all users", pythonxy.dll goes into system32. This, in turn, requires the CRT to be installed globally (which meant into system32 for VS6 and VS7.1, but means using SxS for VS 2008). It's necessary to install it into system32 so that PythonCOM can find it (alternatively, we could now also making it an SxS assembly). VS2008 adds another twist: assembly manifests. As a consequence of this technology, if Python 2.6 is installed "just for me" on Windows Vista (i.e. the CRT is next to the executable), it just won't work, because the extension modules (.pyd) can't find the CRT. > I proposed an alternatives because I read you saying: "The tricky part > really is when it breaks (which it does more often than > not), in which case you need to understand msi.py, for which you need to > understand MSI". Which means that maybe everybody *has tools* to build > an installer today, but only a few people have the required knowledge to > really do releases on Windows. > > So I believe that switching to an alternative that doesn't require full > understanding of MSI and msi.py would probably low the barrier and allow > more people to help you out. I remain skeptical. You replace the need to learn MSI with the need to learn this tool, and not only to work around the limitations of MSI, but also around the limitations of the tool you have chosen. 

Regards,
Martin

From aahz at pythoncraft.com Thu Nov 27 17:17:56 2008
From: aahz at pythoncraft.com (Aahz)
Date: Thu, 27 Nov 2008 08:17:56 -0800
Subject: [Python-3000] Catching HTTP requests and serving for it
In-Reply-To: <988339.66333.qm@web8802.mail.in.yahoo.com>
References: <988339.66333.qm@web8802.mail.in.yahoo.com>
Message-ID: <20081127161756.GA7137@panix.com>

On Thu, Nov 27, 2008, selva kum wrote:
>
> I have a requirement to serve HTTP requests. The nature of the requests
> is to pass some parameters to the server, ask the server to get
> some data from the database, and send it back to the client.

Please use comp.lang.python; this list is for discussion of new versions of Python.
--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

From greg at krypto.org Thu Nov 27 21:40:29 2008
From: greg at krypto.org (Gregory P. Smith)
Date: Thu, 27 Nov 2008 12:40:29 -0800
Subject: [Python-3000] [Python-Dev] 2.6.1 and 3.0
In-Reply-To: <492EBBB6.1060203@v.loewis.de>
References: <3E43E049-6F3F-4FAA-9746-3FDE38F34A39@python.org> <492DABAF.5020808@v.loewis.de> <1227736209.6739.9.camel@ozzu> <492DCA8F.4050105@cheimes.de> <492DCFFE.2060903@v.loewis.de> <1227739616.6739.13.camel@ozzu> <492DDBE6.60205@v.loewis.de> <1227778172.6944.8.camel@ozzu> <492EBBB6.1060203@v.loewis.de>
Message-ID: <52dc1c820811271240p7552c55dt16e2e88203703ec4@mail.gmail.com>

I am not at all a Windows person but I have used http://www.dennisbareis.com/makemsi.htm in the past to automate editing and tweaking some MSI files for testing. It can also be used to generate new ones. It looks like it would still require something to generate its own input description. Regardless, just wanted to offer the link so people are aware that it exists. I have no opinion on what actually gets used so long as it's automated.

-gps

On Thu, Nov 27, 2008 at 7:24 AM, "Martin v.
Löwis" wrote:

> Giovanni Bajo wrote:
> > On gio, 2008-11-27 at 00:29 +0100, "Martin v. Löwis" wrote:
> >>> So, deducing from your reply, this "merge module" is a thing that allows
> >>> installing the CRT (and other shared components)?
> >> Correct. More generally, a merge module is something like an MSI
> >> library (.a). It includes a set of files and snippets of an installation
> >> procedure for them.
> >
> > OK. One question: why doesn't the CRT get installed as regular files next to
> > the python executable? That's how I usually ship it, but maybe Python
> > has some special need.
>
> When installing "for all users", pythonxy.dll goes into system32. This,
> in turn, requires the CRT to be installed globally (which meant into
> system32 for VS6 and VS7.1, but means using SxS for VS 2008). It's
> necessary to install it into system32 so that PythonCOM can find it
> (alternatively, we could now also make it an SxS assembly).
>
> VS2008 adds another twist: assembly manifests. As a consequence of this
> technology, if Python 2.6 is installed "just for me" on Windows Vista
> (i.e. the CRT is next to the executable), it just won't work, because
> the extension modules (.pyd) can't find the CRT.
>
> > I proposed an alternative because I read you saying: "The tricky part
> > really is when it breaks (which it does more often than
> > not), in which case you need to understand msi.py, for which you need to
> > understand MSI". Which means that maybe everybody *has tools* to build
> > an installer today, but only a few people have the required knowledge to
> > really do releases on Windows.
> >
> > So I believe that switching to an alternative that doesn't require full
> > understanding of MSI and msi.py would probably lower the barrier and allow
> > more people to help you out.
>
> I remain skeptical.
You replace the need to learn MSI with the need to learn this tool, and not only to work around the limitations of MSI, but also around the limitations of the tool you have chosen.
>
> Regards,
> Martin

From ideasman42 at gmail.com Sun Nov 30 07:49:26 2008
From: ideasman42 at gmail.com (Campbell Barton)
Date: Sun, 30 Nov 2008 17:49:26 +1100
Subject: [Python-3000] Impact of Py_FindMethod removal on external API's
Message-ID: <7c1ab96d0811292249y6ca3d6adj74f3302893d3fa51@mail.gmail.com>

Hey there,

Recently I started to write a new Python/C API for Blender3D with python3.0rc3. This PyApi is a thin wrapper for an autogenerated C API to be used by the UI and animation system (all in C).

Without going into too many details, I'm using tp_getattr so the requested attribute can be forwarded to our internal functions. The Blender3D game engine has used this method to wrap internal data for some years as well, so I don't think this is that unusual.

We moved most of our APIs to use tp_getset, but that only works when the attributes you need don't change.

Quote... http://mail.python.org/pipermail/python-3000/2008-July/014303.html
> The primary use of Py_FindMethod was to implement a tp_getattr slot
> handler. Now that it has been removed, there is nothing remaining in
> the py3k codebase that actually uses the tp_getattr slot!

> It has been 12 years since tp_getattro was introduced. Is it time to
> finally phase out tp_getattr?

I have no problems with breaking compatibility for python3000, however in this case removal of Py_FindMethod removes functionality for external C APIs that rely on dynamic attributes with tp_getattr.

Is the intention to remove support for tp_getattr altogether?

Note - in the meantime I'll add our own version of Py_FindMethod but it's not ideal.

--
- Campbell

From amauryfa at gmail.com Sun Nov 30 22:31:22 2008
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Sun, 30 Nov 2008 22:31:22 +0100
Subject: [Python-3000] Impact of Py_FindMethod removal on external API's
In-Reply-To: <7c1ab96d0811292249y6ca3d6adj74f3302893d3fa51@mail.gmail.com>
References: <7c1ab96d0811292249y6ca3d6adj74f3302893d3fa51@mail.gmail.com>
Message-ID:

Hello,

Campbell Barton wrote:
> Hey there, Recently I started to write a new Python/C API for
> Blender3D with python3.0rc3.
> This PyApi is a thin wrapper for an autogenerated C API to be used by
> the UI and animation system (all in C).
>
> Without going into too many details, I'm using tp_getattr so the
> requested attribute can be forwarded to our internal functions.
> The Blender3D game engine has used this method to wrap internal
> data for some years as well, so I don't think this is that unusual.
>
> We moved most of our APIs to use tp_getset, but that only works when
> the attributes you need don't change.
>
> Quote...
> http://mail.python.org/pipermail/python-3000/2008-July/014303.html
>> The primary use of Py_FindMethod was to implement a tp_getattr slot
>> handler. Now that it has been removed, there is nothing remaining in
>> the py3k codebase that actually uses the tp_getattr slot!
>
>> It has been 12 years since tp_getattro was introduced. Is it time to
>> finally phase out tp_getattr?
>
> I have no problems with breaking compatibility for python3000, however
> in this case removal of Py_FindMethod removes functionality for
> external C APIs that rely on dynamic attributes with tp_getattr.
>
> Is the intention to remove support for tp_getattr altogether?
>
> Note - in the meantime I'll add our own version of Py_FindMethod but
> it's not ideal.

The same functionality can be achieved with the tp_getattro slot: implement your special dynamic attributes there, and then call PyObject_GenericGetAttr for the default behavior.

You may have a look at the implementation of the pyexpat module: Modules/pyexpat.c, function xmlparse_getattro

--
Amaury Forgeot d'Arc
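
The lookup order suggested above (dynamic attributes first, then the generic lookup) can be sketched at the Python level. This is only an analogy, not the C code from Modules/pyexpat.c: `__getattribute__` stands in for the tp_getattro slot, and `object.__getattribute__` stands in for PyObject_GenericGetAttr, which is what it calls in CPython. The names `Wrapper`, `_internal` and `describe` are hypothetical and chosen for illustration only.

```python
class Wrapper:
    """Hypothetical stand-in for an extension type that wraps internal data."""

    def __init__(self, internal):
        # `internal` stands in for data owned by the host application;
        # its keys may appear or disappear at run time.
        self._internal = internal

    def __getattribute__(self, name):
        # Step 1: serve the special dynamic attributes (the part a
        # tp_getattro handler would implement itself).
        internal = object.__getattribute__(self, "_internal")
        if name in internal:
            return internal[name]
        # Step 2: fall back to the default behaviour, so methods and
        # ordinary attributes keep working (the PyObject_GenericGetAttr step).
        return object.__getattribute__(self, name)

    def describe(self):
        return "wrapper with %d dynamic attributes" % len(self._internal)


data = {"location": (1.0, 2.0, 3.0)}
w = Wrapper(data)
print(w.location)    # resolved dynamically from `data`
print(w.describe())  # resolved by the generic lookup
data["scale"] = 2.0
print(w.scale)       # an attribute added later is visible immediately
```

Because the dynamic check runs on every access, attributes that come and go at run time need no fixed table, which is the case tp_getset does not cover.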