
From tim.one@home.com Fri Feb 1 00:21:47 2002 From: tim.one@home.com (Tim Peters) Date: Thu, 31 Jan 2002 19:21:47 -0500 Subject: [Python-Dev] distutils & stderr In-Reply-To: <15449.47153.267248.439654@beluga.mojam.com> Message-ID: [Skip] > Sorry about the missing link. PyInline uses distutils to compile the C > code. How PyInline does its think doesn't really matter to me, so I'm > not going to be interested in distutils' messages. If distutils output isn't interesting to PyInline users, shouldn't PyInline be changed to run setup.py with its -q/--quiet option? From gmcm@hypernet.com Fri Feb 1 00:51:50 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 31 Jan 2002 19:51:50 -0500 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <15449.9882.393698.701265@gondolin.digicool.com> References: <00c801c1aa96$2b521320$6d94fea9@newmexico> Message-ID: <3C59A056.18632.5A29CDD7@localhost> On 31 Jan 2002 at 6:12, Jeremy Hylton wrote: > import mod.sub > creates a binding for "mod" in the global namespace > > The compiler can detect that the import statement is > a package import -- and mark "mod.sub" as a > candidate for optimization. A use of "mod.sub.attr" > in function should be treated just as "mod.attr". How can the compiler tell it's a package import? It's bad practice, but people write "import mod.attr" all the time. Heck, Marc-Andre tricks import so that pkg.mod is really pkg.attr where the attr turns into a mod when accessed. No problem, since it's only import that cares what it is. By the time it's used it's always global.attr.attr.... 
-- Gordon http://www.mcmillan-inc.com/ From jeremy@alum.mit.edu Fri Feb 1 01:13:16 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 31 Jan 2002 20:13:16 -0500 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <3C59A056.18632.5A29CDD7@localhost> References: <00c801c1aa96$2b521320$6d94fea9@newmexico> <3C59A056.18632.5A29CDD7@localhost> Message-ID: <15449.60332.741353.525674@gondolin.digicool.com> >>>>> "GM" == Gordon McMillan writes: GM> On 31 Jan 2002 at 6:12, Jeremy Hylton wrote: >> import mod.sub creates a binding for "mod" in the global >> namespace >> >> The compiler can detect that the import statement is a package >> import -- and mark "mod.sub" as a candidate for optimization. A >> use of "mod.sub.attr" in function should be treated just as >> "mod.attr". GM> How can the compiler tell it's a package import? I'm assuming it can guess based on import statements and that a runtime check in LOAD_GLOBAL_ATTR (or whatever it's called) can verify this assumption. I haven't thought this part through fully, because I'm not aware of the full perversity of what people do with import hooks. GM> It's bad practice, but people write "import mod.attr" all the GM> time. I write it all the time when attr is a module in a package. And I know I can't do it for an actual attr of module. GM> Heck, Marc-Andre tricks import so that pkg.mod is really GM> pkg.attr where the attr turns into a mod when accessed. No GM> problem, since it's only import that cares what it is. By the GM> time it's used it's always global.attr.attr.... Not sure I understand what Marc-Andre is doing. (That's probably true in general .) A client of his code types "import foo.bar." foo is a module? a package? When the "bar" attribute is loaded (LOAD_ATTR) is turns into another module? 
Jeremy From gward@python.net Fri Feb 1 03:06:35 2002 From: gward@python.net (Greg Ward) Date: Thu, 31 Jan 2002 22:06:35 -0500 Subject: [Python-Dev] distutils & stderr In-Reply-To: <15449.45367.46625.175691@beluga.mojam.com> References: <15449.45367.46625.175691@beluga.mojam.com> Message-ID: <20020201030635.GB9864@gerg.ca> On 31 January 2002, Skip Montanaro said: > If I could "cvs up" I would submit a patch, but in the meantime, is there > any good reason that distutils shouldn't write its output to stderr? I'm > using PyInline to execute a little bit of C code that returns some > information about the system to the calling Python code. This code then > sends some output to stdout. Because stderr is for error messages. Most of the noise generated by the Distutils is optional, here's-what-I'm-doing-now stuff -- ie. *not* errors. If there are Distutils messages that are not silenced with -q, that's a bug (and probably pretty easy to fix, too). Greg -- Greg Ward - programmer-at-large gward@python.net http://starship.python.net/~gward/ All of science is either physics or stamp collecting. From gward@python.net Fri Feb 1 03:11:43 2002 From: gward@python.net (Greg Ward) Date: Thu, 31 Jan 2002 22:11:43 -0500 Subject: [Python-Dev] distutils & stderr In-Reply-To: <15449.22538.192214.765110@gondolin.digicool.com> References: <15449.47153.267248.439654@beluga.mojam.com> <15449.22538.192214.765110@gondolin.digicool.com> Message-ID: <20020201031143.GC9864@gerg.ca> On 31 January 2002, Jeremy Hylton said: > I started a thread on similar issues on the distutils-sig mailing list > a week or two ago. There's agreement that output is a problem. The amount of output, or the binary nature of control (total silence vs. total verbosity)? I knew that was a minor problem when I wrote that code initially, but had bigger fish to fry. FWIW, my current thinking is that code that wants to be chatty should do something like this: log(1, "installing foo.bar package") ... 
log(2, "copying foo/bar/baz.py to /usr/local/lib/python2.1/site-packages/foo/bar") The first number is the logging threshold, compared against a global verbosity level. In a strongly OO system like the Distutils, that should probably be spelled log(N, msg) where the logging threshold is carried around in each object (or in some global object). This shouldn't be too hard to bolt onto the existing code -- ISTR that the verbose flag is readily available to every object in the system; just change it from a boolean to an integer and ensure that every log message goes through self.log(). Oh wait: most of the low-level worker code in the Distutils falls outside the main class hierarchy, so the verbose flag isn't *quite* so readily available; it gets passed in to a heck of a lot of functions. Crap. Greg -- Greg Ward - programmer-at-big gward@python.net http://starship.python.net/~gward/ "He's dead, Jim. You get his tricorder and I'll grab his wallet." From gmcm@hypernet.com Fri Feb 1 03:30:23 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 31 Jan 2002 22:30:23 -0500 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <15449.60332.741353.525674@gondolin.digicool.com> References: <3C59A056.18632.5A29CDD7@localhost> Message-ID: <3C59C57F.7192.5ABAF61C@localhost> On 31 Jan 2002 at 20:13, Jeremy Hylton wrote: > GM> How can the compiler tell it's a package import? > > I'm assuming it can guess based on import > statements and that a runtime check in > LOAD_GLOBAL_ATTR (or whatever it's called) can > verify this assumption. I haven't thought this part > through fully, because I'm not aware of the full > perversity of what people do with import hooks. Import hooks are effectively dead. People play namespace games almost exclusively now. > GM> It's bad practice, but people write "import > mod.attr" all the GM> time. > > I write it all the time when attr is a module in a > package. And I know I can't do it for an actual > attr of module. 
import os.path works even though there's no module named path.
import pkg.attr always works.

> GM> Heck, Marc-Andre tricks import so that
> pkg.mod is really GM> pkg.attr where the attr turns
> into a mod when accessed. No GM> problem, since it's
> only import that cares what it is. By the GM> time
> it's used it's always global.attr.attr....
>
> Not sure I understand what Marc-Andre is doing.
> (That's probably true in general .) A client of
> his code types "import foo.bar." foo is a module? a
> package? When the "bar" attribute is loaded
> (LOAD_ATTR) is turns into another module?

foo is a package. The __init__.py creates an instance of LazyModule
named bar. Doing anything with foo.bar triggers an import, and
replacement of the name "bar" in foo with module bar.

That one's clean. Now turn your eye on the shenanigans in PyXML.

-- Gordon
http://www.mcmillan-inc.com/

From mal@lemburg.com Fri Feb 1 10:21:14 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 01 Feb 2002 11:21:14 +0100
Subject: [Python-Dev] Re: opcode performance measurements
References: <00c801c1aa96$2b521320$6d94fea9@newmexico> <3C59A056.18632.5A29CDD7@localhost> <15449.60332.741353.525674@gondolin.digicool.com>
Message-ID: <3C5A6C1A.BEED7152@lemburg.com>

Jeremy Hylton wrote:
>
> >>>>> "GM" == Gordon McMillan writes:
> GM> Heck, Marc-Andre tricks import so that pkg.mod is really
> GM> pkg.attr where the attr turns into a mod when accessed. No
> GM> problem, since it's only import that cares what it is. By the
> GM> time it's used it's always global.attr.attr....
>
> Not sure I understand what Marc-Andre is doing. (That's probably true
> in general .) A client of his code types "import foo.bar."
> foo is a module? a package? When the "bar" attribute is loaded
> (LOAD_ATTR) is turns into another module?

Take a look at e.g. mx.DateTime.__init__ and the included LazyModule
module for more background.
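
The lazy-module trick Gordon describes above can be sketched roughly as
follows. This is not the actual mx LazyModule code: only the class name
comes from the thread, while the constructor arguments, the use of
importlib, and the plain dict standing in for the package namespace of
"foo" are assumptions made for illustration.

```python
import importlib
import sys
import types


class LazyModule(types.ModuleType):
    """Placeholder that imports the real module on first attribute access.

    Hypothetical illustration; the real mx LazyModule may work differently.
    """

    def __init__(self, name, namespace, target):
        super().__init__(name)
        # Write bookkeeping straight into __dict__ so that ordinary
        # attribute lookup finds it without going through __getattr__.
        self.__dict__["_lazy_name"] = name
        self.__dict__["_lazy_namespace"] = namespace
        self.__dict__["_lazy_target"] = target

    def __getattr__(self, attr):
        # Only reached when normal lookup fails, i.e. for anything the
        # placeholder does not define itself.
        module = importlib.import_module(self._lazy_target)
        # Rebind the name so later lookups go to the real module.
        self._lazy_namespace[self._lazy_name] = module
        return getattr(module, attr)


# A dict stands in for the namespace of package "foo" (assumption).
namespace = {}
namespace["bar"] = LazyModule("bar", namespace, "json")
placeholder = namespace["bar"]
encoded = placeholder.dumps([1, 2])   # first use triggers the import
```

After the first attribute access, namespace["bar"] refers to the real
module, which matches Gordon's point that only the import machinery ever
sees the placeholder.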
I don't really use that approach myself, but sometimes it can be handy to be able to reference modules in packages without requiring an import of them, e.g. import mx.DateTime date = mx.DateTime.Parser.DateTimeFromString('2002-02-01') -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mwh@python.net Fri Feb 1 11:11:44 2002 From: mwh@python.net (Michael Hudson) Date: 01 Feb 2002 11:11:44 +0000 Subject: [Python-Dev] distutils & stderr In-Reply-To: Greg Ward's message of "Thu, 31 Jan 2002 22:11:43 -0500" References: <15449.47153.267248.439654@beluga.mojam.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> Message-ID: <2m665h1mof.fsf@starship.python.net> Greg Ward writes: > On 31 January 2002, Jeremy Hylton said: > > I started a thread on similar issues on the distutils-sig mailing list > > a week or two ago. There's agreement that output is a problem. > > The amount of output, or the binary nature of control (total silence > vs. total verbosity)? I knew that was a minor problem when I wrote that > code initially, but had bigger fish to fry. I'm thinking that verbose should range from about -2 (no output at all, even from commands if we can supress it) to about 2 (stupid amounts of output) with the default being 0, where we take our guide from what make outputs by default. -v and -q would then be additive on the command line, so python setup.py -q -v -v -q -q would be an odd way of specifying "verbose==-1". > FWIW, my current thinking is that code that wants to be chatty should do > something like this: > > log(1, "installing foo.bar package") > ... > log(2, "copying foo/bar/baz.py to /usr/local/lib/python2.1/site-packages/foo/bar") > > The first number is the logging threshold, compared against a global > verbosity level. This sounds good. 
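
Greg's log(threshold, msg) idea and the additive -v/-q counting
suggested here can be sketched together as below. The names
verbosity_from_args and make_logger are hypothetical and do not exist
in the Distutils; the sketch only illustrates the threshold comparison
under those assumptions.

```python
import io
import sys


def verbosity_from_args(args, default=0):
    """Each -v adds one to the level, each -q subtracts one."""
    level = default
    for arg in args:
        if arg in ("-v", "--verbose"):
            level += 1
        elif arg in ("-q", "--quiet"):
            level -= 1
    return level


def make_logger(verbosity, stream=None):
    """Return log(threshold, msg); it prints when threshold <= verbosity."""
    def log(threshold, msg):
        if threshold <= verbosity:
            print(msg, file=stream if stream is not None else sys.stdout)
    return log


# The "odd" command line from the message above.
level = verbosity_from_args(["-q", "-v", "-v", "-q", "-q"])

# Capture output in memory so the behaviour is easy to inspect.
captured = io.StringIO()
log = make_logger(1, captured)
log(1, "installing foo.bar package")
log(2, "copying foo/bar/baz.py")      # above the threshold, suppressed
```

Keeping the verbosity inside the closure avoids threading a verbose
argument through every helper function, which is the problem Greg
raises for the low-level worker code.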
> In a strongly OO system like the Distutils, that should probably be > spelled > > log(N, msg) > > where the logging threshold is carried around in each object (or in some > global object). > > This shouldn't be too hard to bolt onto the existing code -- ISTR that > the verbose flag is readily available to every object in the system; > just change it from a boolean to an integer and ensure that every log > message goes through self.log(). > > Oh wait: most of the low-level worker code in the Distutils falls > outside the main class hierarchy, so the verbose flag isn't *quite* so > readily available; it gets passed in to a heck of a lot of functions. > Crap. There are a lot of calls in disutils that go func(...,...,verbose=self.verbose, dry_run=self.dry_run); Would it really be so bad to have a global "verbose" variable in, say, core? (same for dry_run, too). Of course, what I would like is CL-style special variables, but ne'er mind that... -- ARTHUR: Why are there three of you? LINTILLAS: Why is there only one of you? ARTHUR: Er... Could I have notice of that question? -- The Hitch-Hikers Guide to the Galaxy, Episode 11 From mal@lemburg.com Fri Feb 1 11:58:33 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 01 Feb 2002 12:58:33 +0100 Subject: [Python-Dev] distutils & stderr References: <15449.47153.267248.439654@beluga.mojam.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> Message-ID: <3C5A82E9.7367EA03@lemburg.com> Michael Hudson wrote: > > Greg Ward writes: > > > On 31 January 2002, Jeremy Hylton said: > > > I started a thread on similar issues on the distutils-sig mailing list > > > a week or two ago. There's agreement that output is a problem. > > > > The amount of output, or the binary nature of control (total silence > > vs. total verbosity)? I knew that was a minor problem when I wrote that > > code initially, but had bigger fish to fry. 
> > I'm thinking that verbose should range from about -2 (no output at > all, even from commands if we can supress it) to about 2 (stupid > amounts of output) with the default being 0, where we take our guide > from what make outputs by default. > > -v and -q would then be additive on the command line, so > > python setup.py -q -v -v -q -q > > would be an odd way of specifying "verbose==-1". That looks like line noise :-) > > FWIW, my current thinking is that code that wants to be chatty should do > > something like this: > > > > log(1, "installing foo.bar package") > > ... > > log(2, "copying foo/bar/baz.py to /usr/local/lib/python2.1/site-packages/foo/bar") > > > > The first number is the logging threshold, compared against a global > > verbosity level. > > This sounds good. Hmm, that's very close to what I have implemented in mx.Log (see the egenix-mx-base package). > > In a strongly OO system like the Distutils, that should probably be > > spelled > > > > log(N, msg) > > > > where the logging threshold is carried around in each object (or in some > > global object). > > > > This shouldn't be too hard to bolt onto the existing code -- ISTR that > > the verbose flag is readily available to every object in the system; > > just change it from a boolean to an integer and ensure that every log > > message goes through self.log(). > > > > Oh wait: most of the low-level worker code in the Distutils falls > > outside the main class hierarchy, so the verbose flag isn't *quite* so > > readily available; it gets passed in to a heck of a lot of functions. > > Crap. > > There are a lot of calls in disutils that go > > func(...,...,verbose=self.verbose, dry_run=self.dry_run); > > Would it really be so bad to have a global "verbose" variable in, say, > core? (same for dry_run, too). > > Of course, what I would like is CL-style special variables, but ne'er > mind that... 
FYI, I usually use a package/module scope global logging object for this kind of thing (rather than a function which then looks somewhere for the debug level). Works great. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Fri Feb 1 12:31:21 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 01 Feb 2002 13:31:21 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.352,1.353 References: Message-ID: <3C5A8A99.F2D7DCDE@lemburg.com> Tim Peters wrote: > > [MAL] > > Wouldn't it be better to use Win32 APIs for this ? That way, > > other compilers on Windows will have a chance to use the > > same code. > > I have no reason to believe that other compilers on Windows don't follow MS > in this respect; they usually seem to ape the same functions, sometimes with > or without leading underscores, or other trivial name changes. If you want > to wrestle with the Win32 API, be my guest . I'm no Win32 expert, just though that the code in the win32process module (which is part of win32all) probably already provides code in this area. Another candidate for Windows emulation would be os.kill(). win32process has TerminateProcess() which could probably be used for this (no idea however, how you get from a PID to a process handle on Windows). Anyway, just a thought... 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From skip@pobox.com Fri Feb 1 14:38:48 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 1 Feb 2002 08:38:48 -0600 Subject: [Python-Dev] distutils & stderr In-Reply-To: References: <15449.47153.267248.439654@beluga.mojam.com> Message-ID: <15450.43128.601760.548974@12-248-41-177.client.attbi.com> Tim> If distutils output isn't interesting to PyInline users, shouldn't Tim> PyInline be changed to run setup.py with its -q/--quiet option? Probably so, but not all prints are guarded by "if verbose:". Skip From skip@pobox.com Fri Feb 1 14:51:25 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 1 Feb 2002 08:51:25 -0600 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <3C59C57F.7192.5ABAF61C@localhost> References: <3C59A056.18632.5A29CDD7@localhost> <3C59C57F.7192.5ABAF61C@localhost> Message-ID: <15450.43885.275862.430781@12-248-41-177.client.attbi.com> It just occurred to me that my LOAD_GLOBAL/LOAD_ATTR eliding scheme can't work, since LOAD_ATTR calls PyObject_GetAttr, which can wind up calling __getattr__, which is free to inflict all sorts of side effects on the attribute lookup. PEP 267 doesn't appear to be similarly affected, assuming it can conclude that LOAD_GLOBAL is actually loading a module object. (Can it?) LOAD_GLOBAL alone shouldn't be a problem, since all that does is call PyDict_GetItem for globals and builtins. damn... Skip From jack@oratrix.com Fri Feb 1 14:52:11 2002 From: jack@oratrix.com (Jack Jansen) Date: Fri, 1 Feb 2002 15:52:11 +0100 Subject: [Python-Dev] next vs darwin In-Reply-To: Message-ID: <459BBDA3-1723-11D6-B4B0-0030655234CE@oratrix.com> On Friday, February 1, 2002, at 12:09 , Martin v. 
Loewis wrote: > Jack Jansen writes: > >> With the define on it loads all extension modules into the application >> namespace. Some people want this (despite the problems sketched above) >> because they have modules that refer to external symbols defined in >> modules that have been loaded earlier (and I assume there's magic that >> ensures their modules are loaded in the right order). > > On Unix, this is a runtime option via sys.setdlopenflags (RTLD_GLOBAL > turns on import into application namespace). Do you think you could > emulate this API? Shouldn't be a problem. I had never heard of sys.setdlopenflags(), otherwise I would have done so already. >> I prefer the new (OSX 10.1) preferred Apple way of linking plugins >> (which is also the common way to do so on all other non-unix >> platforms) where the plugin has to be linked against the application >> and dynamic libraries it is going to be plugged into, so none of >> this dynamic behaviour goes on. > > I'm not sure linking with a libpython.so is desirable, I'm quite fond > of the approach to let the executable export symbols to the > extensions. If that is possible on OS X, I'd encourage you to follow > such a strategy (in unix gcc/ld, this is enabled through > -Wl,--export-dynamic). Indeed, you link against the embedder (be it .so, framework or application) in a special way that say "this is going to be the host application". 
-- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From jeremy@alum.mit.edu Fri Feb 1 15:10:58 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Feb 2002 10:10:58 -0500 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <15450.43885.275862.430781@12-248-41-177.client.attbi.com> Message-ID: > It just occurred to me that my LOAD_GLOBAL/LOAD_ATTR eliding scheme can't > work, since LOAD_ATTR calls PyObject_GetAttr, which can wind up calling > __getattr__, which is free to inflict all sorts of side effects on the > attribute lookup. PEP 267 doesn't appear to be similarly affected, assuming > it can conclude that LOAD_GLOBAL is actually loading a module object. (Can > it?) LOAD_GLOBAL alone shouldn't be a problem, since all that does is call > PyDict_GetItem for globals and builtins. The approach I'm working on would have to check that the object is a module on each use, but that's relatively cheap compared to the layers of function calls we have now. It's a pretty safe assumption because it would only be made for objects bound by an import statement. I also wanted to answer Samuele's question briefly, because I'm going to be busy with other things most of today. The basic idea, which I need to flesh out by next week, is that the internal binding for "mod.attr" that a module keeps is just a hint. The compiler notices that function f() uses "mod.attr" and that mod is imported at the module level. The "mod.attr" binding must include a pointer to the location where mod is stored and the pointer it found when the "mod.attr" binding was updated. When "mod.attr" is used, the interpreter must check that mod is still bound to the same object. If so, the "mod.attr" binding is still valid. Note that the "mod.attr" binding is a PyObject ** -- a pointer to the location where "attr" is bound in "mod". 
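
The validity check described above can be sketched in Python. This is an
illustration only, not the PEP 267 design or any interpreter code: the
class AttrBindingHint and its counters are hypothetical, and because
Python cannot express a PyObject ** slot pointer, a reference to the
module's dictionary stands in for it.

```python
import types

_UNSET = object()   # marker meaning "no module recorded yet"


class AttrBindingHint:
    """Cached "mod.attr" lookup that is trusted only while "mod" is unchanged."""

    def __init__(self, namespace, mod_name, attr_name):
        self.namespace = namespace    # where the name "mod" is stored
        self.mod_name = mod_name
        self.attr_name = attr_name
        self.seen_module = _UNSET     # module found when the hint was updated
        self.module_dict = None       # stand-in for the slot pointer
        self.hits = 0
        self.misses = 0

    def load(self):
        current = self.namespace[self.mod_name]
        if current is not self.seen_module:
            # Slow path: "mod" was rebound, or this is the first use.
            self.misses += 1
            if not isinstance(current, types.ModuleType):
                self.seen_module = _UNSET
                return getattr(current, self.attr_name)
            self.seen_module = current
            self.module_dict = vars(current)
        else:
            self.hits += 1
        try:
            return self.module_dict[self.attr_name]
        except KeyError:
            # Let the normal lookup produce the usual AttributeError.
            return getattr(current, self.attr_name)


namespace = {"mod": types.ModuleType("mod")}
namespace["mod"].attr = 1
hint = AttrBindingHint(namespace, "mod", "attr")
first = hint.load()          # slow path, fills in the hint
second = hint.load()         # fast path
namespace["mod"].attr = 2
third = hint.load()          # fast path, still sees the new value
replacement = types.ModuleType("mod")
replacement.attr = 3
namespace["mod"] = replacement
fourth = hint.load()         # slow path, because "mod" was rebound
```

The identity test on the module object is the cheap per-use check; going
through the module dictionary rather than caching the value is what lets
a rebinding of mod.attr remain visible.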
Jeremy From guido@python.org Fri Feb 1 15:48:22 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 01 Feb 2002 10:48:22 -0500 Subject: [Python-Dev] distutils & stderr In-Reply-To: Your message of "01 Feb 2002 11:11:44 GMT." <2m665h1mof.fsf@starship.python.net> References: <15449.47153.267248.439654@beluga.mojam.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> Message-ID: <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net> > I'm thinking that verbose should range from about -2 (no output at > all, even from commands if we can supress it) to about 2 (stupid > amounts of output) with the default being 0, where we take our guide > from what make outputs by default. I think the point is that Make has a more useful definition of what should be printed in the default case and what shouldn't, and that's the real problem -- not that there aren't enough levels. Fewer levels is actually better, since there are less ways to screw up. :-) The specific problem is that by default you don't want it to blab about all the things it doesn't have to do because they're already done. Make got this right and distutils got it wrong. I could see three levels at most: - verbose, tells you about everything it could do - default, only tells you about things it does and not about things it skips - quiet, only tells you about errors --Guido van Rossum (home page: http://www.python.org/~guido/) From jack@oratrix.com Fri Feb 1 16:04:05 2002 From: jack@oratrix.com (Jack Jansen) Date: Fri, 1 Feb 2002 17:04:05 +0100 Subject: [Python-Dev] next vs darwin In-Reply-To: <459BBDA3-1723-11D6-B4B0-0030655234CE@oratrix.com> Message-ID: <50CB70C6-172D-11D6-B4B0-0030655234CE@oratrix.com> On Friday, February 1, 2002, at 03:52 , Jack Jansen wrote: >> On Unix, this is a runtime option via sys.setdlopenflags (RTLD_GLOBAL >> turns on import into application namespace). Do you think you could >> emulate this API? 
> > Shouldn't be a problem. I had never heard of sys.setdlopenflags(),
> > otherwise I would have done so already.

Hmm. I had a look at the setdlopenflags() and accompanying
infrastructure, and it seems you can set many flags to dlopen() through
this call, is that right? If it is, is it a good idea to call the
OSX-specific routine setdlopenflags() too, even though it will only
support the "use global namespace" flag? Or is that the only flag you
can reasonably pass to dlopen() anyway?

--
- Jack Jansen http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From gward@python.net Fri Feb 1 17:01:47 2002
From: gward@python.net (Greg Ward)
Date: Fri, 1 Feb 2002 12:01:47 -0500
Subject: [Python-Dev] urllib2 bug
Message-ID: <20020201170147.GA11551@gerg.ca>

I've just discovered a bug in urllib2: it drops caller-supplied headers
when processing HTTP redirects. See

  http://sourceforge.net/tracker/index.php?func=detail&aid=511786&group_id=5470&atid=105470

for details. The fix (to HTTPRedirectHandler.http_error_302()), as near
as I can tell, is trivial:

--- Lib/urllib2.py 2001/11/09 16:46:51 1.24
+++ Lib/urllib2.py 2002/02/01 17:00:05
@@ -416,7 +416,7 @@
         # XXX Probably want to forget about the state of the current
         # request, although that might interact poorly with other
         # handlers that also use handler-specific request attributes
-        new = Request(newurl, req.get_data())
+        new = Request(newurl, req.get_data(), req.headers)
         new.error_302_dict = {}
         if hasattr(req, 'error_302_dict'):
             if len(req.error_302_dict)>10 or \

I'll check this in (2.2.1 candidate) and close the bug unless anyone
howls.

Greg
--
Greg Ward - Unix bigot gward@python.net
http://starship.python.net/~gward/
This message transmitted with 100% recycled electrons.
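
The effect of the one-line fix above can be illustrated without any
network access. The sketch uses Python 3's urllib.request, the successor
of urllib2, and the helper redirected_request is hypothetical: it mimics
only the patched line, not the library's actual redirect handler.

```python
from urllib.request import Request


def redirected_request(req, newurl):
    """Build the follow-up request for a redirect, keeping caller headers."""
    # Passing req.headers is the point of the fix: without it the new
    # request would start with an empty header dictionary.
    return Request(newurl, data=req.data, headers=req.headers)


original = Request("http://example.com/old", headers={"X-Token": "abc"})
follow_up = redirected_request(original, "http://example.com/new")
```

Request normalises header names with str.capitalize(), so the header set
as "X-Token" is retrieved from follow_up as "X-token".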
From trentm@ActiveState.com Fri Feb 1 18:20:50 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Fri, 1 Feb 2002 10:20:50 -0800
Subject: [Python-Dev] distutils & stderr
In-Reply-To: <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Feb 01, 2002 at 10:48:22AM -0500
References: <15449.47153.267248.439654@beluga.mojam.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020201102050.B31242@ActiveState.com>

On Fri, Feb 01, 2002 at 10:48:22AM -0500, Guido van Rossum wrote:
> I could see three levels at most:
>
> - verbose, tells you about everything it could do
> - default, only tells you about things it does and not about things it
>   skips
> - quiet, only tells you about errors

FYI,

The log4j (j==Java) system uses five levels:
    1. debug
    2. info
    3. warn
    4. error
    5. fatal

Application code uses the system something like this (simplified Python
translation):

    # Go through some steps to get a "Logger" singleton object.
    import log4py
    log = log4py.getLogger("distutils")

    # Then you call methods on the 'log' object for the level of message to
    # write.
    log.debug("Distutil *could* do BAR.")
    ...
    log.info("Distutils is now doing FOO")
    ...
    log.warn("Beware, SPAM may not be what you expect.")
    ...
    log.error("This is just wrong.")
    ...
    log.fatal("This is really bad. Aborting EGGS.")

    # The 'log' object knows if, say, log.debug() calls should actually
    # result in any output (because the setup.py option processing sets the
    # level to print). So, if I use 'python setup.py -q' the print level is
    # set to "WARN" (or perhaps "ERROR") and only .warn(), .error(), and
    # .fatal() calls get printed.

That is just an idea of how it could be done. You could reduce the
logging levels down to three, as Guido suggested.

c.f.
http://jakarta.apache.org/log4j/docs/index.html Cheers, Trent -- Trent Mick TrentM@ActiveState.com From gward@python.net Fri Feb 1 18:21:23 2002 From: gward@python.net (Greg Ward) Date: Fri, 1 Feb 2002 13:21:23 -0500 Subject: [Python-Dev] distutils & stderr In-Reply-To: <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net> References: <15449.47153.267248.439654@beluga.mojam.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net> Message-ID: <20020201182123.GA12019@gerg.ca> On 01 February 2002, Guido van Rossum said: > I could see three levels at most: > > - verbose, tells you about everything it could do > > - default, only tells you about things it does and not about things it > skips > > - quiet, only tells you about errors +1 from me. The Distutils' current level of verbosity stems from my desire to see exactly what was happening in the code at the development/debugging stage. That's obsolete and should be fixed. I like Guido's idea. Greg -- Greg Ward - Linux weenie gward@python.net http://starship.python.net/~gward/ Jesus Saves -- and you can too, by redeeming these valuable coupons! From thomas.heller@ion-tof.com Fri Feb 1 18:33:07 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 1 Feb 2002 19:33:07 +0100 Subject: [Python-Dev] distutils & stderr References: <15449.47153.267248.439654@beluga.mojam.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net> <20020201102050.B31242@ActiveState.com> Message-ID: <07c301c1ab4e$e4ad0e70$e000a8c0@thomasnotebook> From: "Trent Mick" > FYI, > > The log4j (j==Java) system uses five levels: > 1. debug > 2. info > 3. warn > 4. error > 5. fatal > I'm also a very happy user of log4* (although * = C at the moment for me). 
IMO: The debug and info levels are for the programmer, only warn, error, and fatal are for the user. Thomas From Samuele Pedroni" Message-ID: <021901c1ab4e$867fb640$6d94fea9@newmexico> First, thanks for the answer :). Here is my input on the topic [Obviously I won't be present the developer day] From: Jeremy Hylton > The approach I'm working on would have to check that the object is a module > on each use, but that's relatively cheap compared to the layers of function > calls we have now. It's a pretty safe assumption because it would only be > made for objects bound by an import statement. > > I also wanted to answer Samuele's question briefly, because I'm going to be > busy with other things most of today. The basic idea, which I need to flesh > out by next week, is that the internal binding for "mod.attr" that a module > keeps is just a hint. The compiler notices that function f() uses > "mod.attr" and that mod is imported at the module level. The "mod.attr" > binding must include a pointer to the location where mod is stored and the > pointer it found when the "mod.attr" binding was updated. When "mod.attr" > is used, the interpreter must check that mod is still bound to the same > object. If so, the "mod.attr" binding is still valid. Note that the > "mod.attr" binding is a PyObject ** -- a pointer to the location where > "attr" is bound in "mod". 
> I see, btw I asked primarily because the PEP as it is is vague, not because I believed the idea cannot fly [for Jython the issue is more complicated, PyObject ** is not something easily expressed in Java ] I think that it is worth to point out that what you propose is a special/ ad-hoc version of what typically other Smalltalk-like dynamic languages do, together with jitting, but the approach is orthogonal to that, namely: for every send site they have a send-site cache: if send-site-cache-still-applies: # (1) dispatch based on site-cache contents # (2) else: normal send lookup and update send-site-cache In Python more or less the same could be applied to load_* instead of sends. Your approach deals with a part of those. These need (only) module-level caches. The extended/general approach could work too and give some benefit. But it is clear that the complexity and overhead of (1) and (2), and the space-demand for the caches depend on how much homogeneous are system object layouts and behaviors. And Python with modules, data-objects, class/instances, types etc is quite a zoo :(. Pushing the class/type unification further, this is an aspect to consider IMHO. If those things where already all known sorry for the boring post. regards. From martin@v.loewis.de Fri Feb 1 19:19:46 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 01 Feb 2002 20:19:46 +0100 Subject: [Python-Dev] next vs darwin In-Reply-To: <50CB70C6-172D-11D6-B4B0-0030655234CE@oratrix.com> References: <50CB70C6-172D-11D6-B4B0-0030655234CE@oratrix.com> Message-ID: Jack Jansen writes: > Hmm. I had a look at the setdlopenflags() and accompanying > infrastructure, and it seems you can > set many flags to dlopen() through this call, is that right? Correct. It is admittedly very Unixish at the moment. > If it is, is it a good idea to call the OSX-specific routine > setdlopenflags() too, even though it will only support the "use > global namespace" flag? 
Or is that the only flag you can reasonably > pass to dlopen() anyway? Effectively, yes. There is also a symbol RTLD_LOCAL, which is 0 on most systems, and it may be reasonable to add RTLD_LAZY (defer resolution of function symbols until they are called the first time). Anyway, my main point is that this should be a run-time option. If the APIs can merge, that might be a good thing (even if it means deprecating setdlopenflags); if that is not feasible, I'd at least recommend that you put the control over extension loading also into sys. Regards, Martin From skip@pobox.com Fri Feb 1 19:59:46 2002 From: skip@pobox.com (Skip Montanaro) Date: Fri, 1 Feb 2002 13:59:46 -0600 Subject: [Python-Dev] distutils & stderr In-Reply-To: <20020201182123.GA12019@gerg.ca> References: <15449.47153.267248.439654@beluga.mojam.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net> <20020201182123.GA12019@gerg.ca> Message-ID: <15450.62386.306227.518014@12-248-41-177.client.attbi.com> On 01 February 2002, Guido van Rossum said: >> - quiet, only tells you about errors And only to stderr, assuming stderr is available. (Can this be detected on Windows?) If you log messages to stdout, scripts that use distutils can't be used as filters. Skip From guido@python.org Fri Feb 1 20:09:16 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 01 Feb 2002 15:09:16 -0500 Subject: [Python-Dev] distutils & stderr In-Reply-To: Your message of "Fri, 01 Feb 2002 13:59:46 CST." 
<15450.62386.306227.518014@12-248-41-177.client.attbi.com> References: <15449.47153.267248.439654@beluga.mojam.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net> <20020201182123.GA12019@gerg.ca> <15450.62386.306227.518014@12-248-41-177.client.attbi.com> Message-ID: <200202012009.g11K9G704961@pcp742651pcs.reston01.va.comcast.net> > >> - quiet, only tells you about errors > > And only to stderr, assuming stderr is available. (Can this be > detected on Windows?) Depends on what you call available. sys.stderr should always exist. > If you log messages to stdout, scripts that use distutils can't > be used as filters. IMO it would be better if there was a way to give distutils a file where to send output. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Fri Feb 1 20:10:49 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 1 Feb 2002 21:10:49 +0100 Subject: [Python-Dev] distutils & stderr References: <15449.47153.267248.439654@beluga.mojam.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net> <20020201182123.GA12019@gerg.ca> <15450.62386.306227.518014@12-248-41-177.client.attbi.com> <200202012009.g11K9G704961@pcp742651pcs.reston01.va.comcast.net> Message-ID: <098801c1ab5c$8a853090$e000a8c0@thomasnotebook> From: "Guido van Rossum" > > >> - quiet, only tells you about errors > > > > And only to stderr, assuming stderr is available. (Can this be > > detected on Windows?) > > Depends on what you call available. sys.stderr should always exist. > > > If you log messages to stdout, scripts that use distutils can't > > be used as filters. > > IMO it would be better if there was a way to give distutils a file > where to send output. 
One additional annoyance under windows is that MSVC (when compiling) always prints messages to the console (stderr, stdout? not sure) which cannot be suppressed (at least I haven't found a way). Thomas From tim.one@home.com Fri Feb 1 21:29:38 2002 From: tim.one@home.com (Tim Peters) Date: Fri, 1 Feb 2002 16:29:38 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.352,1.353 In-Reply-To: <3C5A8A99.F2D7DCDE@lemburg.com> Message-ID: [MAL] > I'm no Win32 expert, just though that the code in the win32process > module (which is part of win32all) probably already provides code in > this area. With a Win32 flavor, which isn't what I need here. There is no distinct "wait for process" function in Win32, it's just another application of the very cool WaitFor{Single,Multiple}Object(s)[Ex] APIs (which can "wait" for sets of "handles" to "do something": kinda like Unix select(), except not braindead ). That's fine, but what I specifically needed (for a Zope Corp project) was a Unixish waitpid() workalike. MS already did most of the work for that in their _cwait function, so it would be silly not to reuse it. BTW, a google search suggested Borland also supports a cwait function, but I have neither a Borland compiler nor time to worry about that platform. You didn't worry much about the Cray T3E when implementing Unicode either . > Another candidate for Windows emulation would be os.kill(). > win32process has TerminateProcess() which could probably be used > for this (no idea however, how you get from a PID to a process handle > on Windows). I'm not looking for random functions to implement; if I *need* an os.kill()-alike, I'll do one, but I don't expect the need. TerminateProcess() is a dangerous function on Windows (read the docs). If you want to risk it, you go from process pid to process handle via the Win32 OpenProcess() function. From mal@lemburg.com Fri Feb 1 21:58:34 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Fri, 01 Feb 2002 22:58:34 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.352,1.353 References: Message-ID: <3C5B0F8A.E1998EB9@lemburg.com> Tim Peters wrote: > > [MAL] > > I'm no Win32 expert, just though that the code in the win32process > > module (which is part of win32all) probably already provides code in > > this area. > > With a Win32 flavor, which isn't what I need here. There is no distinct > "wait for process" function in Win32, it's just another application of the > very cool WaitFor{Single,Multiple}Object(s)[Ex] APIs (which can "wait" for > sets of "handles" to "do something": kinda like Unix select(), except not > braindead ). > > That's fine, but what I specifically needed (for a Zope Corp project) was a > Unixish waitpid() workalike. MS already did most of the work for that in > their _cwait function, so it would be silly not to reuse it. BTW, a google > search suggested Borland also supports a cwait function, but I have neither > a Borland compiler nor time to worry about that platform. You didn't worry > much about the Cray T3E when implementing Unicode either . Touché :-) > > Another candidate for Windows emulation would be os.kill(). > > win32process has TerminateProcess() which could probably be used > > for this (no idea however, how you get from a PID to a process handle > > on Windows). > > I'm not looking for random functions to implement; if I *need* an > os.kill()-alike, I'll do one, but I don't expect the need. > TerminateProcess() is a dangerous function on Windows (read the docs). If > you want to risk it, you go from process pid to process handle via the Win32 > OpenProcess() function. Too bad, because I will have a need for porting a multi-process application to Windows sometime soon :-) Here's an article I found on the topic: http://www.wdj.com/articles/1999/9907/9907c/9907c.htm What a hack... 
now I know why you don't want to use Win32 APIs ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jeremy@alum.mit.edu Fri Feb 1 12:59:52 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Feb 2002 07:59:52 -0500 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <021901c1ab4e$867fb640$6d94fea9@newmexico> References: <021901c1ab4e$867fb640$6d94fea9@newmexico> Message-ID: <15450.37192.51717.328419@gondolin.digicool.com> >>>>> "SP" == Samuele Pedroni writes: SP> But it is clear that the complexity and overhead of (1) and (2), SP> and the space-demand for the caches depend on how SP> homogeneous system object layouts and behaviors are. Good point! It's important to try to extract the general principles at work and see how they can be applied systematically. The general notion I have is that dictionaries are not an efficient way to implement namespaces. The way most namespaces are used -- in particular that most names are known statically -- allows a more efficient implementation. SP> And Python with modules, data-objects, class/instances, types SP> etc is quite a zoo :(. And, again, this is a problem. The same sorts of techniques apply to all namespaces. It would be good to try to make the approach general, but some namespaces are more dynamic than others. Python's classes, lack of declarations, and separate compilation of modules means class/instance namespaces are hard to do right. Need to defer a lot of final decisions to runtime and keep an extra dictionary around just in case. SP> Pushing the class/type unification further, this is an aspect to SP> consider IMHO. SP> If those things were already all known sorry for the boring SP> post. Thanks for good questions and suggestions. Too bad you can't come to dev day. 
I'll try to post slides before or after the talk -- and update the PEP. Jeremy From Jack.Jansen@oratrix.nl Fri Feb 1 23:34:51 2002 From: Jack.Jansen@oratrix.nl (Jack Jansen) Date: Sat, 2 Feb 2002 00:34:51 +0100 Subject: [Python-Dev] Patch to enable sys.setdlopenflags() on MacOSX Message-ID: <49FC9AC9-176C-11D6-9B87-003065517236@oratrix.nl> I put a patch on sourceforge, #511962, which enables sys.setdlopenflags() on MacOSX. The only values you can pass are 0 (the default, dynamic modules are each loaded into their own private symbol namespace) and 0x100 (modules are loaded into the process' global symbol namespace, so they can refer to each other's symbols). As the API is compatible with Linux I hope this solves the problem of the people who want a global namespace. Could they please apply the patch and see whether it works? As I myself think this is a hack upon a hack, could people with Strong Opinions (you know who you are:-) please tell (a) me to commit the patch, (b) me to change the patch, or (c) the people addressed in the previous paragraph to not do what they're doing:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From Samuele Pedroni" <021901c1ab4e$867fb640$6d94fea9@newmexico> <15450.37192.51717.328419@gondolin.digicool.com> Message-ID: <047c01c1ab84$bf7170c0$6d94fea9@newmexico> [Jeremy Hylton] > Thanks for good questions and suggestions. Too bad you can't come to > dev day. I'll try to post slides before or after the talk -- and > update the PEP. Here are some more wild ideas, probably more thought provoking than useful, but this is really an area where only the profiler knows the truth. > SP> And Python with modules, data-objects, class/instances, types > SP> etc is quite a zoo :(. > > And, again, this is a problem. The same sorts of techniques apply to > all namespaces. It would be good to try to make the approach > general, but some namespaces are more dynamic than others. 
Python's > classes, lack of declarations, and separate compilation of modules > means class/instance namespaces are hard to do right. Need to defer a > lot of final decisions to runtime and keep an extra dictionary around > just in case. > * instance namespaces As I said, what eventually happens with class/type unification plays a role. 1. __slots__ are obviously a good thing here :) 2. old-style instances and in general instances with a dict: one can try to guess the slots of a class looking for the "self.attr" pattern at compile time in a more or less clever way. The set of compile-time guessed attrs will be passed to MAKE_CLASS which will construct the runtime guess using the union of the super-classes guesses and the compile time guess for the class. This information can be used to layout a dlict. * zoo problem [yes as I said this whole inline cache thing is supposed to trade memory for speed. And the fact that python internal objects are so inhomogeneous/ polymorphic does not help to keep the amount small, for example having only new-style classes would help] ideally one can assign to each bytecode in a codeobject whose behavior depends/dispatches on the concrete object "type" a "cache line" (or many, polymorphic inline caches in modern Smalltalk impls do that in the context of the jit) (As long as the GIL is there we do not need per-thread version of the caches) the first entries in the "cache-line" could contain the PyObject type and then a function pointer, so then we would have a common logic like:

    if PyObjectType(obj) == cache_line.type:
        cache_line.onType()
    else:
        ...

then the per-type code could use the rest of the space in cache-line polymorphically to contain type-specific cached "dispatch" info. E.g. the index of a dict entry for the load_attr/set_attr logic on an instance ... 
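A minimal Python sketch of the cache-line logic described above, for one attribute-load site. Everything here is an assumption made for illustration: CacheLine, load_attr and _specialize are hypothetical names, the "function pointer" is an ordinary Python callable, and types that override __getattribute__ are ignored. The only type-specific fast path shown is for data descriptors such as __slots__ members; everything else falls back to a plain getattr.

```python
class CacheLine:
    """Per-site cache: the type last seen plus a handler chosen for that type."""

    def __init__(self):
        self.type = None     # concrete type seen at this site last time
        self.on_type = None  # handler specialized for that type
        self.hits = 0
        self.misses = 0


def _specialize(tp, name):
    """Pick a handler for loading `name` from instances of exactly `tp`."""
    for klass in tp.__mro__:
        descr = klass.__dict__.get(name)
        if descr is None:
            continue
        if hasattr(descr, "__get__") and hasattr(descr, "__set__"):
            # Data descriptor (e.g. a __slots__ member): it always wins over
            # the instance dict, so it can be called directly.
            return lambda obj: descr.__get__(obj, tp)
        break  # plain class attribute: the instance dict may shadow it
    return lambda obj: getattr(obj, name)


def load_attr(cache_line, obj, name):
    if type(obj) is cache_line.type:
        # Same concrete type as last time: reuse the cached handler.
        cache_line.hits += 1
        return cache_line.on_type(obj)
    # Different type (or first use): normal lookup, then update the cache line.
    cache_line.misses += 1
    cache_line.type = type(obj)
    cache_line.on_type = _specialize(type(obj), name)
    return cache_line.on_type(obj)
```

A site that keeps seeing the same concrete type pays only the type comparison and the handler call; a site that alternates between types keeps missing, which is the monomorphic-cache weakness that comes up later in the thread.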
Abstractly one can think about a cache-line for a bytecode as the streamlined version in terms of values/or code-pointers of the last time taken path for that bytecode, plus values to check whether the very same path still makes sense. 1. in practice these ideas can perform very poorly 2. this try to address things/internals as they are, 3. Yup, anything on the object layout/behavior side that simplifies this picture probably does a step in the right direction. regards, Samuele. From jeremy@alum.mit.edu Sat Feb 2 01:15:21 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Feb 2002 20:15:21 -0500 Subject: [Python-Dev] Re: opcode performance measurements In-Reply-To: <047c01c1ab84$bf7170c0$6d94fea9@newmexico> References: <021901c1ab4e$867fb640$6d94fea9@newmexico> <15450.37192.51717.328419@gondolin.digicool.com> <047c01c1ab84$bf7170c0$6d94fea9@newmexico> Message-ID: <15451.15785.174046.282855@gondolin.digicool.com> >>>>> "SP" == Samuele Pedroni writes: SP> * instance namespaces SP> As I said but what eventually will happen with class/type SP> unification plays a role. SP> 1. __slots__ are obviously a good thing here :) SP> 2. old-style instances and in general instances with a dict: SP> one can try to guess the slots of a class looking for the SP> "self.attr" pattern at compile time in a more or less clever SP> way. The set of compile-time guessed attrs will be passed to SP> MAKE_CLASS which will construct the runtime guess using the SP> union of the super-classes guesses and the compile time guess SP> for the class. This information can be used to layout a dlict. Right! There's another step necessary to take advantage though. When you execute a method you don't know the receiver type (self.__class__). So you need to specialize the bytecode to a particular receiver the first time the method is called. Since this could be relatively expensive and you don't know how often the method will be executed, you need to decide dynamically when to do it. 
Just like HotSpot. We probably have to worry about a class or instance being modified in a way that invalidates the dlict offsets computed. (Not sure here, but I think that's the case.) If so, we probably need a different object -- call it a template -- that represents the concrete layout and is tied to unmodified concrete class. When objects or classes are modified in dangerous ways, we'd need to invalidate the template pointer for the affected instances. Jeremy From Samuele Pedroni" <021901c1ab4e$867fb640$6d94fea9@newmexico><15450.37192.51717.328419@gondolin.digicool.com><047c01c1ab84$bf7170c0$6d94fea9@newmexico> <15451.15785.174046.282855@gondolin.digicool.com> Message-ID: <004201c1ab90$d3c18540$9d97bac3@newmexico> From: Jeremy Hylton > >>>>> "SP" == Samuele Pedroni writes: ... > SP> one can try to guess the slots of a class looking for the > SP> "self.attr" pattern at compile time in a more or less clever > SP> way. The set of compile-time guessed attrs will be passed to > SP> MAKE_CLASS which will construct the runtime guess using the > SP> union of the super-classes guesses and the compile time guess > SP> for the class. This information can be used to layout a dlict. > > Right! There's another step necessary to take advantage though. When > you execute a method you don't know the receiver type > (self.__class__). So you need to specialize the bytecode to a > particular receiver the first time the method is called. Since this > could be relatively expensive and you don't know how often the method > will be executed, you need to decide dynamically when to do it. Just > like HotSpot. Right, because with multiple inheritance you cannot make the layout of a subclass compatible with that of *all* superclasses, so simple monomorphic inline caches will not work :(. 
OTOH you can use polymorphic inline caches, that means a bunch of class->index lines for each bytecode or not specialize the bytecode but (insane idea) choose on method entry a different bunch of cache-lines based on self class. > We probably have to worry about a class or instance being modified in > a way that invalidates the dlict offsets computed. (Not sure here, > but I think that's the case.) If so, we probably need a different > object -- call it a template -- that represents the concrete layout > and is tied to unmodified concrete class. When objects or classes are > modified in dangerous ways, we'd need to invalidate the template > pointer for the affected instances. This would be similar to the Self VM map concept (although python is type/class based, because of the very dynamic nature of instances, it has similar problems to prototype based languages). I don't know if we need that and if it can be implemented effectively, I considered that too during my brainstorming. AFAIK caching/memoization plays an important role in all high perf dynamic object language impls. Abstractly it seems effective for Python too, but it is unclear if the complexity of the internal models will render it ineffective. With caching you can probably simply timestamp classes, when a class is changed structurally you increment its timestamp and that of all direct and indirect subclasses, you don't touch instances. Then you compare the cached timestamp with that of the instance's class to check if the entry is valid. The tricky part is that in python an instance attribute can be added at any point that shadows a class attribute. I don't know if there are open issues, but an approach would be in that case to increment the timestamp of the instance's class too. The problem is that there are so many cases and situations, that's why the multi-staged cache-lines approach in theory makes some sense but could be anyway totally ineffective in practice. 
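The timestamp scheme just described can be sketched as follows. This is a toy under stated assumptions, not a proposal for the interpreter: the side table _versions, the helpers version_of, bump and set_class_attr, and the SiteCache class are all hypothetical names, structural changes are only noticed when made through set_class_attr, and the shadowing instance attribute (the "tricky part") is deliberately not handled.

```python
_versions = {}  # class -> timestamp; classes never changed are at 0


def version_of(cls):
    return _versions.get(cls, 0)


def bump(cls):
    """Record a structural change to cls and to all its (indirect) subclasses."""
    _versions[cls] = version_of(cls) + 1
    for sub in cls.__subclasses__():
        bump(sub)


def set_class_attr(cls, name, value):
    """Change a class attribute and invalidate caches that depend on cls."""
    setattr(cls, name, value)
    bump(cls)


class SiteCache:
    """Cache for one class-attribute lookup site."""

    def __init__(self, name):
        self.name = name
        self.cls = None
        self.version = -1
        self.value = None
        self.hits = 0

    def lookup(self, obj):
        cls = type(obj)
        if cls is self.cls and version_of(cls) == self.version:
            # Cached timestamp matches the class timestamp: entry still valid.
            self.hits += 1
            return self.value
        # Stale or empty entry: do the normal class-level lookup and re-cache.
        value = getattr(cls, self.name)
        self.cls, self.version, self.value = cls, version_of(cls), value
        return value
```

Instances are never touched: changing a base class bumps the counters of every subclass, so a cache entry keyed on a subclass goes stale without any per-instance bookkeeping.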
These are all interesting topics, although from these more or less informal discussions to results there is a lot of details and code :(. But already improving the global lookup thing would be a good step. Hope this makes some kind of sense. Samuele. From neal@metaslash.com Sat Feb 2 14:38:33 2002 From: neal@metaslash.com (Neal Norwitz) Date: Sat, 02 Feb 2002 09:38:33 -0500 Subject: [Python-Dev] Re: opcode performance measurements References: <021901c1ab4e$867fb640$6d94fea9@newmexico><15450.37192.51717.328419@gondolin.digicool.com><047c01c1ab84$bf7170c0$6d94fea9@newmexico> <15451.15785.174046.282855@gondolin.digicool.com> <004201c1ab90$d3c18540$9d97bac3@newmexico> Message-ID: <3C5BF9E9.1EFEA2FE@metaslash.com> Samuele Pedroni wrote: > > From: Jeremy Hylton > > >>>>> "SP" == Samuele Pedroni writes: > ... > > SP> one can try to guess the slots of a class looking for the > > SP> "self.attr" pattern at compile time in a more or less clever > > SP> way. The set of compile-time guessed attrs will be passed to > > SP> MAKE_CLASS which will construct the runtime guess using the > > SP> union of the super-classes guesses and the compile time guess > > SP> for the class. This information can be used to layout a dlict. > > > > Right! There's another step necessary to take advantage though. When > > you execute a method you don't know the receiver type > > (self.__class__). So you need to specialize the bytecode to a > > particular receiver the first time the method is called. Since this > > could be relatively expensive and you don't know how often the method > > will be executed, you need to decide dynamically when to do it. Just > > like HotSpot. Why not assume the general case is the most common, ie, that the object is an instance of this class or one of its subclasses? That way you could do the specialization at compile time. And for the (presumably) few times that this isn't true fallback to another technique, perhaps like HotSpot. 
Also, doesn't calling a base class method as: Base.method(self) # in particular __init__() vs. self.method() create problems if you specialize for a specific class? Or does specialization necessarily mean for a subclass and all its base clases? > Right, because with multiple inheritance you cannot make the layout > of a subclass compatible with that of *all* superclasses, so simple > monomorphic inline caches will not work :(. ISTM that it would be best to handle single inheritance first. Multiple inheritance could perhaps be handled for the class with the most commonly referenced attribute (assuming 2+ classes don't define the same attr. And use a fallback technique for all other cases. > > We probably have to worry about a class or instance being modified in > > a way that invalidates the dlict offsets computed. (Not sure here, > > but I think that's the case.) If so, we probably need a different Right, if an attr is deleted, methods added/removed dynamically, etc. > > object -- call it a template -- that represents the concrete layout > > and is tied to unmodified concrete class. When objects or classes are > > modified in dangerous ways, we'd need to invalidate the template > > pointer for the affected instances. By using a template, doesn't that become a dict lookup again? > These are all interesting topics, although from these > more or less informal discussions to results there is > a lot of details and code :(. I agree. Since we can't know what will be optimal, it seems safer to keep the existing functionality as a fallback case and try to improve things with small steps (eg, single inheritance, first). > But already improving the global lookup thing > would be a good step. Definitely. 
Neal From Samuele Pedroni" <021901c1ab4e$867fb640$6d94fea9@newmexico><15450.37192.51717.328419@gondolin.digicool.com><047c01c1ab84$bf7170c0$6d94fea9@newmexico> <15451.15785.174046.282855@gondolin.digicool.com> <004201c1ab90$d3c18540$9d97bac3@newmexico> <3C5BF9E9.1EFEA2FE@metaslash.com> Message-ID: <00b601c1ac00$89bc96e0$6d94fea9@newmexico> From: Neal Norwitz > Why not assume the general case is the most common, ie, that the > object is an instance of this class or one of its subclasses? > That way you could do the specialization at compile time. And > for the (presumably) few times that this isn't true fallback to another > technique, perhaps like HotSpot. > > Also, doesn't calling a base class method as: > Base.method(self) # in particular __init__() > vs. > self.method() > > create problems if you specialize for a specific class? Or does > specialization necessarily mean for a subclass and all its base clases? Puzzled. In Python you could specialization at MAKE_CLASS time, which means rewriting all the direct and indirect superclasses methods and the class method under the assumption that self is of the built class. Doing so is probably too expensive. Typically specializing only when a given method is actually called makes more sense. Btw typical systems specialize and native-compile at the same time, if you substract the native-compile part your cost equation change a lot. Given that people can change self (although nobody does) that you need data/control flow analysis, that's too bad: def a(self): self = 3 return self+1 Also: def __add__(self,o): ... You cannot do anything special for o :(. > [Jeremy] > Right, because with multiple inheritance you cannot make the layout > > of a subclass compatible with that of *all* superclasses, so simple > > monomorphic inline caches will not work :(. > > ISTM that it would be best to handle single inheritance first. 
> Multiple inheritance could perhaps be handled for the class with > the most commonly referenced attribute (assuming 2+ classes don't > define the same attr. And use a fallback technique for all other cases. > How do you decide which are the most commonly referenced attributes? > > > We probably have to worry about a class or instance being modified in > > > a way that invalidates the dlict offsets computed. (Not sure here, > > > but I think that's the case.) If so, we probably need a different > > Right, if an attr is deleted, methods added/removed dynamically, etc. It really depends on implementation details. > > > object -- call it a template -- that represents the concrete layout > > > and is tied to unmodified concrete class. When objects or classes are > > > modified in dangerous ways, we'd need to invalidate the template > > > pointer for the affected instances. > > By using a template, doesn't that become a dict lookup again? The good thing about templates as an idea is that they could solve the zoo issue. You're right about lookup but to see the utility you should bring per bytecode instr caches into the picture:

    if obj.template == cache_line.template:
        use cache_line.cached_lookup_result
    else:
        lookup and update cache_line

[The Self VM used maps (read templates) in such a way] There is really a huge hack/implementation space to play with. These comments are mainly informal, if the interest remains after the conference I will be pleased to participate in more focused and into-the-details discussions. regards, Samuele. From tim.one@home.com Sat Feb 2 21:51:37 2002 From: tim.one@home.com (Tim Peters) Date: Sat, 2 Feb 2002 16:51:37 -0500 Subject: [Python-Dev] distutils & stderr In-Reply-To: <15450.43128.601760.548974@12-248-41-177.client.attbi.com> Message-ID: [Tim] > If distutils output isn't interesting to PyInline users, > shouldn't PyInline be changed to run setup.py with its -q/--quiet > option? 
[Skip] > Probably so, but not all prints are guarded by "if verbose:". Have you tried it in the case you complained about at the start of this? These days I routinely build elaborate pieces of Zope using -q, and the only msgs I ever see then are things like """ MultiMapping.c Creating library build\temp.win32-2.3\Release\MultiMapping.lib and object build\temp.win32-2.3\Release\MultiMapping.exp """ I believe those are generated by Microsoft's compiler (the case-sensitive string "Creating" appears nowhere in the distutils source; and yes, these go to stdout too), and if so there's nothing distutils can do about that. I don't see any messages that look like they come from distutils. just-because-you-don't-understand-the-code-doesn't-mean-it-doesn't-do-what-you-want-ly y'rs - tim From skip@pobox.com Sun Feb 3 03:43:43 2002 From: skip@pobox.com (Skip Montanaro) Date: Sat, 2 Feb 2002 21:43:43 -0600 Subject: [Python-Dev] distutils & stderr In-Reply-To: References: <15450.43128.601760.548974@12-248-41-177.client.attbi.com> Message-ID: <15452.45551.652825.195467@12-248-41-177.client.attbi.com> Tim> [Skip] >> Probably so, but not all prints are guarded by "if verbose:". Tim> Have you tried it in the case you complained about at the start of Tim> this? Yes, and it seems to shut things up just fine. I made that comment after having modified my source to dump all prints to stderr. Tim> MultiMapping.c Tim> Creating library build\temp.win32-2.3\Release\MultiMapping.lib and object Tim> build\temp.win32-2.3\Release\MultiMapping.exp Tim> I believe those are generated by Microsoft's compiler (the Tim> case-sensitive string "Creating" appears nowhere in the distutils Tim> source; and yes, these go to stdout too), and if so there's nothing Tim> distutils can do about that. I don't see any messages that look Tim> like they come from distutils. 
Windows matters little to me for most applications, and not at all when I write scripts that I want to work like Unix filters, which is what my original complaint was about. I will suggest to Ken Simpson that PyInline use the -q flag. Thx, Skip From skip@pobox.com Sun Feb 3 15:56:17 2002 From: skip@pobox.com (Skip Montanaro) Date: Sun, 3 Feb 2002 09:56:17 -0600 Subject: [Python-Dev] network access from conference? Message-ID: <15453.23969.582214.695481@12-248-41-177.client.attbi.com> Any hope of network access from the conference? I have ethernet and wireless cards (no modem - one of those stinkin' winmodems came with my laptop). Thx, Skip From guido@python.org Sun Feb 3 18:16:43 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 03 Feb 2002 13:16:43 -0500 Subject: [Python-Dev] network access from conference? In-Reply-To: Your message of "Sun, 03 Feb 2002 09:56:17 CST." <15453.23969.582214.695481@12-248-41-177.client.attbi.com> References: <15453.23969.582214.695481@12-248-41-177.client.attbi.com> Message-ID: <200202031816.g13IGiJ15881@pcp742651pcs.reston01.va.comcast.net> > Any hope of network access from the conference? I have ethernet and > wireless cards (no modem - one of those stinkin' winmodems came with my > laptop). I think there was network access last year so I'm counting on it myself. Wireless might be a possibility too, but bring your ethernet card too. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Feb 3 20:39:55 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 03 Feb 2002 15:39:55 -0500 Subject: [Python-Dev] Re: Network access from conference? Message-ID: <200202032039.g13KdtY16193@pcp742651pcs.reston01.va.comcast.net> Thought this might be good to know! --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Forwarded Message Date: Sun, 03 Feb 2002 14:35:44 -0500 From: Kevin Jacobs To: Subject: FYI: Re: Network access from conference? 
I am bringing a wireless access point, so we can create a CAN (Conference Area Network) if you want. Feel free to send out an e-mail to python-dev or the Python10 list and let people know that they can bring their WiFi cards. The only caveat is that I won't arrive until Monday evening, so the first day participants are somewhat out of luck. - -Kevin - -- - -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com ------- End of Forwarded Message From guido@python.org Mon Feb 4 02:35:25 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 03 Feb 2002 21:35:25 -0500 Subject: [Python-Dev] Want to co-design and implement a logging module? Message-ID: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> I'd like to see a logging module in the standard Python library. Is anybody interested in helping spec out requirements and work on an implementation? Some ideas from Zope's zLOG module should probably go into it (it should eventually be a replacement for that), and some from log4j (http://jakarta.apache.org/log4j/docs/). Any takers? --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@rahul.net Mon Feb 4 06:36:07 2002 From: aahz@rahul.net (Aahz Maruch) Date: Sun, 3 Feb 2002 22:36:07 -0800 (PST) Subject: [Python-Dev] Want to co-design and implement a logging module? In-Reply-To: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> from "Guido van Rossum" at Feb 03, 2002 09:35:25 PM Message-ID: <20020204063607.EA948E8C6@waltz.rahul.net> Guido van Rossum wrote: > > I'd like to see a logging module in the standard Python library. Is > anybody interested in helping spec out requirements and work on an > implementation? Some ideas from Zope's zLOG module should probably go > into it (it should eventually be a replacement for that), and some > from log4j (http://jakarta.apache.org/log4j/docs/). > > Any takers? 
I'm not sure I'm a "taker", but I did a bit of research and found log4p, http://log4p.sourceforge.net/ Have you looked at it, and if yes, what's a short reason why it wouldn't be suitable? (One of the things I disliked about Zlogger (I believe that's the correct name) is that it seems to require an error tuple, based on what I'm reading in http://www.zope.org/Documentation/Misc/LOGGING.txt I believe that loggers should be more generic.) -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From martin@v.loewis.de Mon Feb 4 07:19:40 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 04 Feb 2002 08:19:40 +0100 Subject: [Python-Dev] Want to co-design and implement a logging module? In-Reply-To: <20020204063607.EA948E8C6@waltz.rahul.net> References: <20020204063607.EA948E8C6@waltz.rahul.net> Message-ID: aahz@rahul.net (Aahz Maruch) writes: > I'm not sure I'm a "taker", but I did a bit of research and found log4p, > http://log4p.sourceforge.net/ > > Have you looked at it, and if yes, what's a short reason why it wouldn't > be suitable? The thing I dislike about log4p is that it looks much to java-ish. import java.util.DateFormat; is not something I would like to do when using the standard Python library. Regards, Martin From mal@lemburg.com Mon Feb 4 09:39:27 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 04 Feb 2002 10:39:27 +0100 Subject: [Python-Dev] Want to co-design and implement a logging module? References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C5E56CF.F2641490@lemburg.com> Guido van Rossum wrote: > > I'd like to see a logging module in the standard Python library. Is > anybody interested in helping spec out requirements and work on an > implementation? 
Some ideas from Zope's zLOG module should probably go > into it (it should eventually be a replacement for that), and some > from log4j (http://jakarta.apache.org/log4j/docs/). > > Any takers? You might want to have a look at mx.Log which is part of the egenix-mx-base distribution. It is undocumented, but reading the source should give some insights. The basic idea is that you have logging objects which are usually created as singletons; these can then log various information depending on a fine grained verbosity level to a log file, stdout or stderr. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From aahz@rahul.net Mon Feb 4 14:40:20 2002 From: aahz@rahul.net (Aahz Maruch) Date: Mon, 4 Feb 2002 06:40:20 -0800 (PST) Subject: [Python-Dev] Tuples vs. lists In-Reply-To: <200201282126.QAA30702@pcp742651pcs.reston01.va.comcast.net> from "Guido van Rossum" at Jan 28, 2002 04:26:23 PM Message-ID: <20020204144021.56495E8C3@waltz.rahul.net> Guido van Rossum wrote: > Aahz: >> >> It's a constant. The BCD module is Binary Coded Decimal; instances are >> intended to be as immutable as strings and numbers (well, it *is* a >> number type). Modifying an instance is guaranteed to produce a new >> instance. To a large extent, I guess I feel that if a class is intended >> to be immutable, each of its underlying data attributes should also be >> immutable. > > Or you could assign it to a private variable. And in private e-mail, Guido writes: > > I hate to continue harping on this tiny item in public, but what woud > you do if you needed a constant dictionary? All right, I guess it's time for me to just follow the Python motto: "There's only one way, and that way is Guido's." 
-- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From trentm@ActiveState.com Mon Feb 4 17:41:49 2002 From: trentm@ActiveState.com (Trent Mick) Date: Mon, 4 Feb 2002 09:41:49 -0800 Subject: [Python-Dev] Want to co-design and implement a logging module? In-Reply-To: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net>; from guido@python.org on Sun, Feb 03, 2002 at 09:35:25PM -0500 References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> Message-ID: <20020204094149.C31089@ActiveState.com> On Sun, Feb 03, 2002 at 09:35:25PM -0500, Guido van Rossum wrote: > I'd like to see a logging module in the standard Python library. Is > anybody interested in helping spec out requirements and work on an > implementation? Some ideas from Zope's zLOG module should probably go > into it (it should eventually be a replacement for that), and some > from log4j (http://jakarta.apache.org/log4j/docs/). > > Any takers? I'll take it. I have been (slowly) working on a log4j translation, trying to stay as close to log4j's API as possible. I'll take a look at zLOG. [Aahz said] > I'm not sure I'm a "taker", but I did a bit of research and found log4p, > http://log4p.sourceforge.net/ That one has not seen any development for ages and I don't believe it is even functional. There *is* a log4py out there. http://www.its4you.at/log4py.php http://sourceforge.net/project/showfiles.php?group_id=36216 I took a quick look at it a while ago and thought it was pretty limited. Perhaps not though -- I may have been sufferring from a bout of "Not invented here." [MAL said:] > You might want to have a look at mx.Log which is part of the > egenix-mx-base distribution. It is undocumented, but reading the > source should give some insights. 
> > The basic idea is that you have logging objects which are > usually created as singletons; these can then log various > information depending on a fine grained verbosity level to a > log file, stdout or stderr. Sounds very similar to log4j. I'll take a look at that too. Note that the log4j manual that is currently up (http://jakarta.apache.org/log4j/docs/manual.html) is for the current release version. They have an alpha version that cleans up the naming a little bit mainly, I think, to try to make log4j look a little bit more like the java.util.logging API. Actually, log4j's site *used* to have a bunch of other pages up their that included links to contributed packages and ports of log4k to other languages (C, C++, Perl, Python, etc). How about I try to have a PEP together within a week or two, and perhaps a working base implementation? Trent -- Trent Mick TrentM@ActiveState.com From barry@zope.com Mon Feb 4 19:51:30 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 4 Feb 2002 14:51:30 -0500 Subject: [Python-Dev] Want to co-design and implement a logging module? References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> <20020204094149.C31089@ActiveState.com> Message-ID: <15454.58946.378611.495989@anthem.wooz.org> >>>>> "TM" == Trent Mick writes: TM> How about I try to have a PEP together within a week or two, TM> and perhaps a working base implementation? +1 -Barry From tim.one@home.com Mon Feb 4 20:36:52 2002 From: tim.one@home.com (Tim Peters) Date: Mon, 4 Feb 2002 15:36:52 -0500 Subject: [Python-Dev] Tuples vs. lists In-Reply-To: <20020204144021.56495E8C3@waltz.rahul.net> Message-ID: [Guido] > I hate to continue harping on this tiny item in public, but what woud > you do if you needed a constant dictionary? [Aahz] > All right, I guess it's time for me to just > follow the Python motto: "There's only one way, and that way is Guido's." Well, the other one way is to agitate for, e.g., accepting the new digraphs {? 
?} as delimiting a constant dict . If I were Aahz, I'd keep using tuples: a serious BCD user can have gazillions of these objects sitting around, and tuples also allow significant memory savings over lists. If you have to, think of the digits '3' and '7' of being different types, so that you can fool Guido into believing it's not a homogeneous collection (he doesn't read the fine print in math-related code ). practicality-beats-purity-ly y'rs - tim From jeremy@zope.com Tue Feb 5 04:33:14 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 4 Feb 2002 23:33:14 -0500 Subject: [Python-Dev] Tuples vs. lists In-Reply-To: References: <20020204144021.56495E8C3@waltz.rahul.net> Message-ID: <15455.24714.128907.199785@gondolin.digicool.com> Hey, should I change all the tuples in code objects to be lists, too? A code object has got things like co_names and co_consts. They're currently implemented as tuples, but they're just homogenous, variable-length sequences. 'course if people modified the lists, they'd caused Python to dump core. Jeremy From gh_pythonlist@gmx.de Tue Feb 5 04:52:10 2002 From: gh_pythonlist@gmx.de (Gerhard =?iso-8859-15?Q?H=E4ring?=) Date: Tue, 5 Feb 2002 05:52:10 +0100 Subject: [Python-Dev] Should Python compile as C++? Message-ID: <20020205045209.GB1181@lilith.hqd-internal> I'm currently doing a native mingw32 port of Python, and I've hit the ugly "initializer is not a constant" problem mentioned in the FAQ. Hmm, looks like I have three options: 1 Fix the Python sources in the Object/ directory and initalize the structs in a seperate init_objects function 2 compile Python with a C++ compiler 3 fix the mingw32 compiler When trying option 2, I recognized that a lot of Python's source is not valid ANSI C++. There are even variable names like "class" and "new". There are of course less obvious issues when trying to make the source compile as C++, in particular a lot more casts are needed. 
If it's just that Python is supposed to compile as C++ but it hasn't been
tested for a while, I could do the necessary fixes and submit a patch. But if
that's a new idea, I don't know if fixing it now makes sense.

Because I plan to submit the required changes as a patch when the port is
ready, I'd like to know if you'd accept a patch for option #1.

Gerhard
--
This sig powered by Python!
Außentemperatur in München: 6.1 °C      Wind: 4.0 m/s

From guido@python.org Tue Feb 5 06:34:19 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 05 Feb 2002 01:34:19 -0500
Subject: [Python-Dev] Should Python compile as C++?
In-Reply-To: Your message of "Tue, 05 Feb 2002 05:52:10 +0100." <20020205045209.GB1181@lilith.hqd-internal>
References: <20020205045209.GB1181@lilith.hqd-internal>
Message-ID: <200202050634.g156YJf19271@pcp742651pcs.reston01.va.comcast.net>

> I'm currently doing a native mingw32 port of Python, and I've hit the
> ugly "initializer is not a constant" problem mentioned in the FAQ. Hmm,
> looks like I have three options:
>
> 1 Fix the Python sources in the Object/ directory and initalize the
>   structs in a seperate init_objects function
> 2 compile Python with a C++ compiler
> 3 fix the mingw32 compiler
>
> When trying option 2, I recognized that a lot of Python's source is not
> valid ANSI C++. There are even variable names like "class" and "new".
> There are of course less obvious issues when trying to make the source
> compile as C++, in particular a lot more casts are needed. If it's just
> that Python is supposed to compile as C++ but it hasn't been tested for
> a while, I could do the necessary fixes and submit a patch. But if
> that's a new idea, I don't know if fixing it now makes sense.
>
> Because I plan to submit the required changes as a patch when the port
> is ready, I'd like to know if you'd accept a patch for option #1.

Sounds to me like the Mingw32 compiler is not ANSI compatible.
I don't want to have to change the source to accommodate a broken
compiler that a very small minority of users want to use. So I am
against #1.

We never said that our .c files would be valid C++ (.h files is a
different story) so I think #2 is not an option.

I vote for #3 -- if enough software can't compiled with mingw32 the
compiler will be fixed, as it should, and I'm happy to help encourage
this.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gh_pythonlist@gmx.de Tue Feb 5 09:30:55 2002
From: gh_pythonlist@gmx.de (Gerhard =?iso-8859-15?Q?H=E4ring?=)
Date: Tue, 5 Feb 2002 10:30:55 +0100
Subject: [Python-Dev] Should Python compile as C++?
In-Reply-To: <200202050634.g156YJf19271@pcp742651pcs.reston01.va.comcast.net>
References: <20020205045209.GB1181@lilith.hqd-internal> <200202050634.g156YJf19271@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020205093054.GA7547@lilith.hqd-internal>

Le 05/02/02 à 01:34, Guido van Rossum écrivit:
> > I'm currently doing a native mingw32 port of Python, and I've hit the
> > ugly "initializer is not a constant" problem mentioned in the FAQ. Hmm,
> > looks like I have three options:
> >
> > 1 Fix the Python sources in the Object/ directory and initalize the
> >   structs in a seperate init_objects function
> > 2 compile Python with a C++ compiler
> > 3 fix the mingw32 compiler
> >
> > [Python doesn't compile with C++ compiler]
> >
> > Because I plan to submit the required changes as a patch when the port
> > is ready, I'd like to know if you'd accept a patch for option #1.
>
> Sounds to me like the Mingw32 compiler is not ANSI compatible. I
> don't want to have to change the source to accommodate a broken
> compiler that a very small minority of users want to use. So I am
> against #1.

I now found the reason for the compiler message. I forgot to set
USE_DL_EXPORT when compiling the Python core. Doh! Sorry for the noise.
Everything works reasonably fine now.

> We never said that our .c files would be valid C++ (.h files is a
> different story) [...]

Ok. I must have mistaken Python with a different project.

> I vote for #3 -- if enough software can't compiled with mingw32 the
> compiler will be fixed, as it should, and I'm happy to help encourage
> this.

I'm not quite sure if was really a bug in mingw32, but the fact that the
compiler accepts the code when compiled as C++ is at least inconsistent.

Gerhard
--
This sig powered by Python!
Außentemperatur in München: 10.2 °C      Wind: 3.6 m/s

From mal@lemburg.com Tue Feb 5 10:46:37 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 05 Feb 2002 11:46:37 +0100
Subject: [Python-Dev] Tuples vs. lists
References: <20020204144021.56495E8C3@waltz.rahul.net> <15455.24714.128907.199785@gondolin.digicool.com>
Message-ID: <3C5FB80D.A2931407@lemburg.com>

I haven't really followed this thread, but what's all this talk
about lists vs. tuples about ?

Tuples have a smaller memory footprint, provide faster element access,
can be cached and are generally a good data type for constant data
structures. Lists, OTOH, provide more flexibility when the size of the
object isn't known in advance. They use up more memory, are not
cacheable and slower on access.

For the BCD stuff Aahz was talking about, I'd suggest to have
a look at either arrays or cStringIO buffers.

Jeremy Hylton wrote:
>
> Hey, should I change all the tuples in code objects to be lists, too?
> A code object has got things like co_names and co_consts. They're
> currently implemented as tuples, but they're just homogenous,
> variable-length sequences.
>
> 'course if people modified the lists, they'd caused Python to dump
> core.
I hope I read the correctly :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mhammond@skippinet.com.au Tue Feb 5 11:29:20 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Tue, 5 Feb 2002 22:29:20 +1100 Subject: [Python-Dev] Should Python compile as C++? In-Reply-To: <20020205093054.GA7547@lilith.hqd-internal> Message-ID: > > I vote for #3 -- if enough software can't compiled with mingw32 the > > compiler will be fixed, as it should, and I'm happy to help encourage > > this. > > I'm not quite sure if was really a bug in mingw32, but the fact that the > compiler accepts the code when compiled as C++ is at least inconsistent. IIRC, msvc has the exact same problem, and that is turns out that the error is actually correct. I believe the problem is that C does not guarantee the initialization order of static objects across object modules. The Python idiom of taking the address of a global variable in one module to initialize another global variable in another module is not guaranteed to do what you expect. OTOH, C++ does make such a guarantee. The good news is that if msvc has the same problem, wherever the blame lies, you can be fairly sure that something will be done so msvc works (and has indeed been done for a few modules). Therefore you get mingw for free :) Or-something-like-that ly, Mark. From mal@lemburg.com Tue Feb 5 12:22:48 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 05 Feb 2002 13:22:48 +0100 Subject: [Python-Dev] Should Python compile as C++? References: Message-ID: <3C5FCE98.81E22FB7@lemburg.com> Mark Hammond wrote: > > > > I vote for #3 -- if enough software can't compiled with mingw32 the > > > compiler will be fixed, as it should, and I'm happy to help encourage > > > this. 
> > > > I'm not quite sure if was really a bug in mingw32, but the fact that the > > compiler accepts the code when compiled as C++ is at least inconsistent. > > IIRC, msvc has the exact same problem, and that is turns out that the error > is actually correct. is listening :)> I believe the problem is that C does not guarantee the > initialization order of static objects across object modules. The Python > idiom of taking the address of a global variable in one module to initialize > another global variable in another module is not guaranteed to do what you > expect. OTOH, C++ does make such a guarantee. > > The good news is that if msvc has the same problem, wherever the blame lies, > you can be fairly sure that something will be done so msvc works (and has > indeed been done for a few modules). Therefore you get mingw for free :) If the initialization of type objects is all that needs fixing to get Python compile to on MinGW32, why not simply fix it ? MSVC has had the same problem for years. What's strange is that in some cases, MSVC does seem to get it right where in others it fails with an error -- probably a DLL vs. EXE thing. Or am I missing something here :-? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mhammond@skippinet.com.au Tue Feb 5 12:28:50 2002 From: mhammond@skippinet.com.au (Mark Hammond) Date: Tue, 5 Feb 2002 23:28:50 +1100 Subject: [Python-Dev] Should Python compile as C++? In-Reply-To: <3C5FCE98.81E22FB7@lemburg.com> Message-ID: > If the initialization of type objects is all that needs fixing to > get Python compile to on MinGW32, why not simply fix it ? Gerhard indicated all is working now. > MSVC has had > the same problem for years. What's strange is that in some cases, > MSVC does seem to get it right where in others it fails with an > error -- probably a DLL vs. 
EXE thing. It is a problem for extension modules. Object files in the core DLL have no problem, but object modules in seperate extension DLLs that reference the global in pythonxx.dll generate the error. Thus, we see the error as modules are split out of the core - eg, _socket, _winreg, etc. Mark. From tim.one@comcast.net Tue Feb 5 12:28:27 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 05 Feb 2002 07:28:27 -0500 Subject: [Python-Dev] Should Python compile as C++? In-Reply-To: Message-ID: Static initializers in C++ are much more liberal than in C, without the latter's "constant expression" limitations. This follows from that you couldn't declare a static object of an arbitrary class otherwise: C++ has to be prepared to execute any code whatsoever in order to run user-coded constructors. OTOH, because C++ is much more liberal in this respect, order of module initialization is a much worse problem in C, and C++ doesn't define that any more than C does. Some of the worst debugging problems I ever had in C++ were tracking down quiet assumptions about initialization order that didn't hold x-platform. There are a number of well-known hacks in the C++ world for worming around this, some of which explain why starting a large C++ program can give your disk a major workout. As to making Python source compilable under C++, I quietly nudge it in that direction. If I explained why, it wouldn't be quiet anymore . From tim.one@comcast.net Tue Feb 5 12:31:16 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 05 Feb 2002 07:31:16 -0500 Subject: [Python-Dev] Should Python compile as C++? In-Reply-To: <3C5FCE98.81E22FB7@lemburg.com> Message-ID: > ... > MSVC has had the same problem for years. What's strange is that in some > cases, MSVC does seem to get it right where in others it fails with an > error -- probably a DLL vs. EXE thing. 
MS C can't handle cross-DLL references in initializers, because they're truly not "constant" in the way C requires (but C doesn't say anything about DLLs!). C++'s initialization model is much more liberal (and correspondingly more elaborate and expensive), and C++ can handle cross-DLL references in initializers. MSVC plays both according to reasonable readings of the respective languages' rules. From jack@oratrix.com Tue Feb 5 15:37:06 2002 From: jack@oratrix.com (Jack Jansen) Date: Tue, 5 Feb 2002 16:37:06 +0100 Subject: [Python-Dev] Should Python compile as C++? In-Reply-To: Message-ID: <35AB4C18-1A4E-11D6-ABBF-0030655234CE@oratrix.com> On Tuesday, February 5, 2002, at 01:31 , Tim Peters wrote: >> ... >> MSVC has had the same problem for years. What's strange is that in some >> cases, MSVC does seem to get it right where in others it fails with an >> error -- probably a DLL vs. EXE thing. > > MS C can't handle cross-DLL references in initializers, because they're > truly not "constant" in the way C requires (but C doesn't say anything > about > DLLs!). I've always understood that the problem here was that Microsoft's object file format allows for patching up references in the text segment but not in the data segment. And C++ doesn't have the problem, because it can do initializers in code anyway, so it doesn't need a data segment reference to the symbol from the DLL. > -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From aahz@rahul.net Tue Feb 5 16:56:21 2002 From: aahz@rahul.net (Aahz Maruch) Date: Tue, 5 Feb 2002 08:56:21 -0800 (PST) Subject: [Python-Dev] Tuples vs. lists In-Reply-To: <3C5FB80D.A2931407@lemburg.com> from "M.-A. Lemburg" at Feb 05, 2002 11:46:37 AM Message-ID: <20020205165621.D60DBE8C3@waltz.rahul.net> M.-A. Lemburg wrote: > > For the BCD stuff Aahz was talking about, I'd suggest to have > a look at either arrays or cStringIO buffers. 
cStringIO wouldn't work because I want to store ints. Arrays might work, but I think I'll stick with tuples because they're a bit more familiar to most Pythonistas. I'm not too concerned with raw speed and efficiency before I convert the code to C; remember Knuth. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From mal@lemburg.com Tue Feb 5 17:25:16 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 05 Feb 2002 18:25:16 +0100 Subject: [Python-Dev] Tuples vs. lists References: <20020205165621.D60DBE8C3@waltz.rahul.net> Message-ID: <3C60157C.D498A4F@lemburg.com> Aahz Maruch wrote: > > M.-A. Lemburg wrote: > > > > For the BCD stuff Aahz was talking about, I'd suggest to have > > a look at either arrays or cStringIO buffers. > > cStringIO wouldn't work because I want to store ints. I was thinking of storing integers as chr(value) in these. > Arrays might > work, but I think I'll stick with tuples because they're a bit more > familiar to most Pythonistas. I'm not too concerned with raw speed and > efficiency before I convert the code to C; remember Knuth. If you plan to convert this to C, why not have a look at mxNumber first ? It's a wrapper around GMP and provides high performance implementations for many numeric operations, e.g. it should be easy to create a BCD type using the GMP (arbitrary length) longs and an additional C long for the decimal point position. In fact, there's a GMP extension MPFR which tries to do just this. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From aahz@rahul.net Tue Feb 5 18:03:13 2002 From: aahz@rahul.net (Aahz Maruch) Date: Tue, 5 Feb 2002 10:03:13 -0800 (PST) Subject: [Python-Dev] Tuples vs. 
lists In-Reply-To: <3C60157C.D498A4F@lemburg.com> from "M.-A. Lemburg" at Feb 05, 2002 06:25:16 PM Message-ID: <20020205180313.EDBD8E8C3@waltz.rahul.net> M.-A. Lemburg wrote: > Aahz Maruch wrote: >> M.-A. Lemburg wrote: >>> >>> For the BCD stuff Aahz was talking about, I'd suggest to have >>> a look at either arrays or cStringIO buffers. >> >> cStringIO wouldn't work because I want to store ints. > > I was thinking of storing integers as chr(value) in these. Why spend the conversion time? >> Arrays might >> work, but I think I'll stick with tuples because they're a bit more >> familiar to most Pythonistas. I'm not too concerned with raw speed and >> efficiency before I convert the code to C; remember Knuth. > > If you plan to convert this to C, why not have a look at mxNumber > first ? It's a wrapper around GMP and provides high performance > implementations for many numeric operations, e.g. it should be easy > to create a BCD type using the GMP (arbitrary length) longs and an > additional C long for the decimal point position. In fact, there's a > GMP extension MPFR which tries to do just this. I'm specifically implementing the ANSI BCD spec. If you want to argue the theory of this, poke the Timbot; I think it's simpler to ensure that I'm following the spec if I implement everything by hand. Once I really understand what I'm doing, *then* it's time to optimize. Note that one reason for using BCD over GMP longs (which are presumably similar to Python longs) is speed of I/O conversion. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From mal@lemburg.com Wed Feb 6 08:45:51 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 06 Feb 2002 09:45:51 +0100 Subject: [Python-Dev] Tuples vs. 
lists References: <20020205180313.EDBD8E8C3@waltz.rahul.net> Message-ID: <3C60ED3F.72341290@lemburg.com> Aahz Maruch wrote: > > >> Arrays might > >> work, but I think I'll stick with tuples because they're a bit more > >> familiar to most Pythonistas. I'm not too concerned with raw speed and > >> efficiency before I convert the code to C; remember Knuth. > > > > If you plan to convert this to C, why not have a look at mxNumber > > first ? It's a wrapper around GMP and provides high performance > > implementations for many numeric operations, e.g. it should be easy > > to create a BCD type using the GMP (arbitrary length) longs and an > > additional C long for the decimal point position. In fact, there's a > > GMP extension MPFR which tries to do just this. > > I'm specifically implementing the ANSI BCD spec. If you want to argue > the theory of this, poke the Timbot; I think it's simpler to ensure that > I'm following the spec if I implement everything by hand. Once I really > understand what I'm doing, *then* it's time to optimize. Just thought you might want to take a look at what other people have done in this area. MPFR is specifically aimed at dealing with the problems of rounding; MPFI which implements interval arithmetics based on MPFR takes a slightly different approach: rounding issues are handled using intervals (these are also very handy in optimization). Pointers: http://www.loria.fr/projets/mpfr/ http://www.ens-lyon.fr/~nrevol/nr_software.html > Note that one reason for using BCD over GMP longs (which are presumably > similar to Python longs) is speed of I/O conversion. 
Depends on which base you use for that conversion ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From thomas.heller@ion-tof.com Wed Feb 6 12:56:30 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 6 Feb 2002 13:56:30 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com> Message-ID: <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> From: "Guido van Rossum" > > Does this mean this is the wrong route, or is it absolute impossible > > to create a subtype of PyType_Type in C with additional slots? > > I wish I had time to explain this, but I don't. For now, you'll have > to read how types are initialized in typeobject.c -- maybe there's a > way, maybe there isn't. > > > Any tips about the route to take? > > It can be done easily dynamically. > I'm still struggling with this. How can it be done dynamically? My idea would be to realloc() the object after creation, adding a few bytes at the end. The problem is that I don't know how to find out about the object size without knowledge about the internals. The formula given in PEP 253 type->tp_basicsize + nitems * type->tp_itemsize seems not to be valid any more (at least with CYCLE GC). 
Thomas From mwh@python.net Wed Feb 6 13:54:14 2002 From: mwh@python.net (Michael Hudson) Date: 06 Feb 2002 13:54:14 +0000 Subject: [Python-Dev] Mixing memory management APIs In-Reply-To: Neal Norwitz's message of "Wed, 30 Jan 2002 19:13:58 -0500" References: <3C588C46.2BF27BBE@metaslash.com> Message-ID: <2m4rkuyaux.fsf@starship.python.net> Neal Norwitz writes: > Because of Michael Hudson's request, I tried running Purify > --with-pymalloc enabled. The results were a bit surprising: 13664 errors! > > All the errors were in unicodeobject.c. There were 3 types of errors: > Free Memory Reads, Array Bounds Reads, and Unitialized Memory Reads. > The line #s were in strange places (e.g., in a function declaration > and accessing self->length in an if clause, after it was accessed w/o error). > The line #s are primarily: unicodeobject.c:2875, and unicodeobject.c:2214. Might this have something to do with bug [ #495401 ] Build troubles: --with-pymalloc http://sourceforge.net/tracker/?func=detail&atid=105470&aid=495401&group_id=5470 ? Is there a reason one of the fixes for this problem hasn't been checked in yet? Cheers, M. -- . <- the point your article -> . |------------------------- a long way ------------------------| -- Cristophe Rhodes, ucam.chat From guido@python.org Wed Feb 6 14:36:27 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 06 Feb 2002 09:36:27 -0500 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: Your message of "Wed, 06 Feb 2002 13:56:30 +0100." 
<0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com> <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> Message-ID: <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net> > > I wish I had time to explain this, but I don't. For now, you'll have > > to read how types are initialized in typeobject.c -- maybe there's a > > way, maybe there isn't. > > > > > Any tips about the route to take? > > > > It can be done easily dynamically. > > I'm still struggling with this. How can it be done dynamically? > > My idea would be to realloc() the object after creation, adding > a few bytes at the end. The problem is that I don't know how to > find out about the object size without knowledge about the internals. > The formula given in PEP 253 > type->tp_basicsize + nitems * type->tp_itemsize > seems not to be valid any more (at least with CYCLE GC). I have thought about this a little more and come to the conclusion that you cannot define a metaclass that creates type objects that have more C slots than the standard type object lay-out. It would be the same as trying to add a C slot to the instances of a string subtype: there's variable-length data at the end, and you cannot place anything *before* that variable-length data because all the C code that works with the base type knows where the variable length data start; you cannot place anything *after* that variable-lenth data because there's no way to address it from C. The only way around this would be to duplicate *all* the code of type_new(), which I don't recommend because it will probably have to be changed for each Python version (even for bugfix releases). A better solution is to store additional information in the __dict__. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From jepler@inetnebr.com Wed Feb 6 16:24:31 2002 From: jepler@inetnebr.com (Jeff Epler) Date: Wed, 6 Feb 2002 10:24:31 -0600 Subject: Half-baked idea (was Re: [Python-Dev] Extending types in C - help needed) In-Reply-To: <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net> References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com> <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net> Message-ID: <20020206102426.A20584@unpythonic.dhs.org> On Wed, Feb 06, 2002 at 09:36:27AM -0500, Guido van Rossum wrote: > I have thought about this a little more and come to the conclusion > that you cannot define a metaclass that creates type objects that have > more C slots than the standard type object lay-out. It would be the > same as trying to add a C slot to the instances of a string subtype: > there's variable-length data at the end, and you cannot place anything > *before* that variable-length data because all the C code that works > with the base type knows where the variable length data start; you > cannot place anything *after* that variable-lenth data because there's > no way to address it from C. I had a half-baked idea when I read this. Is there something unworkable about the scheme, aside from being very different from the way Python currently operates? Has anybody written a system that works this way? Is it just plain gross? Jeff Epler jepler@inetnebr.com Half-Baked Idea --------------- The problem is that we have variable-length types. 
For example,

    struct S {
        int nelem;
        int elem[0];
    };

you can allocate a new one by

    struct S *new_S(int nelem) {
        struct S *ret = malloc(sizeof(struct S) + nelem * sizeof(int));
        ret->nelem = nelem;
        return ret;
    }

Normally, we "subclass" structures by appending fields to the end:

    struct BASE {
        int x, y;
    };

    struct DERIVED {
        /* from struct BASE */
        int x, y;
        int flag;
    };

but this doesn't work with a dynamic-length object.

So, with the caveat that you can only have dynamic-length behavior in the base class, why not place the new fields *BEFORE* the fields of base struct:

    struct S2 {
        int flag;
        int nelem;
        int elem[0];
    };

now, whenever you are going to pass S2 to a function on S, you simply pass in

    (struct S*)((char*)s2 + offsetof(struct S2, nelem))

and if you're faced with an instance of S that turns out to be an S2, you can get the pointer to the start of the S2 with

    (struct S2*)((char*)s - offsetof(struct S2, nelem))

Note that neither of these is an additional level of indirection, it's just an offset calculation, one that your compiler may be able to combine with subsequent field accesses through the -> operator.

But how do you free an instance of S-or-subclass, without knowing all the subclasses? Well, you could store a pointer to the real start of the structure, or an offset back to it, in the structure.
You'd use that pointer only in a few occasions, usually using the "add const to pointer" in functions which are for a particular subclass of S:

    struct S {
        void *real_head;
        int nelem;
        int elem[0];
    };

    struct S1 {
        /* derived from S */
        int flag;
        void *real_head;
        int nelem;
        int elem[0];
    };

    struct S1_1 {
        /* derived from S1 */
        int new_flag;
        int flag;
        void *real_head;
        int nelem;
        int elem[0];
    };

now, you can allocate a version of an S subclass by

    struct S *new_S(int nelem, int pre_size) {
        char *mem = malloc(sizeof(struct S) + nelem * sizeof(int) + pre_size);
        struct S *ret = (struct S *)(mem + pre_size);
        ret->real_head = mem;  /* true start of the block, needed by free_S */
        ret->nelem = nelem;
        return ret;
    }

and free it by

    void free_S(struct S* s) {
        free(s->real_head);
    }

I don't know how this will interact with a garbage collector, but it does maintain a pointer to the head of the allocated block, though that pointer is only accessible through a pointer to the inside of a block.

From mal@lemburg.com Wed Feb 6 16:49:24 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 06 Feb 2002 17:49:24 +0100
Subject: [Python-Dev] Mixing memory management APIs
References: <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net>
Message-ID: <3C615E94.AF637093@lemburg.com>

Michael Hudson wrote:
>
> Neal Norwitz writes:
>
> > Because of Michael Hudson's request, I tried running Purify
> > --with-pymalloc enabled. The results were a bit surprising: 13664 errors!
>
> > All the errors were in unicodeobject.c. There were 3 types of errors:
> > Free Memory Reads, Array Bounds Reads, and Unitialized Memory Reads.
> > The line #s were in strange places (e.g., in a function declaration
> > and accessing self->length in an if clause, after it was accessed w/o error).
> > The line #s are primarily: unicodeobject.c:2875, and unicodeobject.c:2214.
>
> Might this have something to do with
>
> bug [ #495401 ] Build troubles: --with-pymalloc
>
> http://sourceforge.net/tracker/?func=detail&atid=105470&aid=495401&group_id=5470
>
> ?
> > Is there a reason one of the fixes for this problem hasn't been > checked in yet? It is currently assigned to Martin. Perhaps I should just take the Unicode patch and check it in (the first one, not the second one for the reasons stated in the bug-tracker) ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Feb 6 18:13:41 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 06 Feb 2002 19:13:41 +0100 Subject: [Python-Dev] Mixing memory management APIs References: <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> Message-ID: <3C617255.6D4E5278@lemburg.com> "M.-A. Lemburg" wrote: > > Michael Hudson wrote: > > > > Neal Norwitz writes: > > > > > Because of Michael Hudson's request, I tried running Purify > > > --with-pymalloc enabled. The results were a bit surprising: 13664 errors! > > > > > > All the errors were in unicodeobject.c. There were 3 types of errors: > > > Free Memory Reads, Array Bounds Reads, and Unitialized Memory Reads. > > > The line #s were in strange places (e.g., in a function declaration > > > and accessing self->length in an if clause, after it was accessed w/o error). > > > The line #s are primarily: unicodeobject.c:2875, and unicodeobject.c:2214. > > > > Might this have something to do with > > > > bug [ #495401 ] Build troubles: --with-pymalloc > > > > http://sourceforge.net/tracker/?func=detail&atid=105470&aid=495401&group_id=5470 > > > > ? > > > > Is there a reason one of the fixes for this problem hasn't been > > checked in yet? > > It is currently assigned to Martin. > > Perhaps I should just take the Unicode patch and check it in (the > first one, not the second one for the reasons stated in the > bug-tracker) ?! I've checked in a patch for the UTF-8 codec problem. 
Could you try Purify against the CVS version ?

Thanks,
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From niemeyer@conectiva.com Wed Feb 6 18:31:26 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Wed, 6 Feb 2002 16:31:26 -0200
Subject: [Python-Dev] Python optmizations
Message-ID: <20020206163126.B4071@ibook.distro.conectiva>

Hello Skip!

I've been reading some books and papers about stack virtual machine optimization, and playing around with Python's bytecode and inner loop organization. As always, I found some interesting results and some frustrating ones. Recently, I have found your paper about peephole optimization, and other attempts you've made at the same job. Well, basically I discovered that I'm not original, and repeated most of your ideas and mistakes. :-) But that's ok. It gave me a good idea of paths to follow if I want to keep playing with this.

One thing I thought about, and also found a reference to in your paper, is that some instruction sequences should be turned into a single opcode. To understand how this would affect the code, I have disassembled the whole Python standard library, and the whole Zope library. After that I ran a script to detect repeated opcode pairs (excluding SET_LINENO). Here are the most frequent pairs:

    23632 LOAD_FAST, LOAD_ATTR
    15382 LOAD_CONST, LOAD_CONST
    12842 JUMP_IF_FALSE, POP_TOP
    12397 CALL_FUNCTION, POP_TOP
    12121 LOAD_FAST, LOAD_FAST

Not by coincidence, I found in your paper references to a LOAD_FAST_ATTR opcode. Since you probably have mentioned this to others, I wouldn't like to bother everyone again asking why it was not implemented. Could you please explain to me the reasons this was left behind?

If you have the time, I'd also like to understand what's the trouble involved in getting a peephole optimizer in the python compiler itself.
Is it just about compiling performance? I don't remember to have read about this in your paper, but you probably thought about that as well. Thank you! -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From thomas.heller@ion-tof.com Wed Feb 6 20:53:08 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 6 Feb 2002 21:53:08 +0100 Subject: [Python-Dev] Extending types in C - help needed References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com> <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net> Message-ID: <104601c1af50$4888bd90$e000a8c0@thomasnotebook> > I have thought about this a little more and come to the conclusion > that you cannot define a metaclass that creates type objects that have > more C slots than the standard type object lay-out. It would be the > same as trying to add a C slot to the instances of a string subtype: > there's variable-length data at the end, and you cannot place anything > *before* that variable-length data because all the C code that works > with the base type knows where the variable length data start; you > cannot place anything *after* that variable-lenth data because there's > no way to address it from C. > It's a pity, isn't it? > A better solution is to store additional information in the __dict__. You loose nice features: access these (new) slots from Python by providing tp_members entries for them (for example). Are you planning to address this issue in the future? 
Thanks, Thomas From neal@metaslash.com Wed Feb 6 21:37:28 2002 From: neal@metaslash.com (Neal Norwitz) Date: Wed, 06 Feb 2002 16:37:28 -0500 Subject: [Python-Dev] Mixing memory management APIs References: <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> Message-ID: <3C61A218.CD3AB779@metaslash.com> "M.-A. Lemburg" wrote: > I've checked in a patch for the UTF-8 codec problem. Could you > try Purify against the CVS version ? with-pymalloc or without or both? Neal From mal@lemburg.com Wed Feb 6 22:41:52 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 06 Feb 2002 23:41:52 +0100 Subject: [Python-Dev] Mixing memory management APIs References: <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> Message-ID: <3C61B130.3040609@lemburg.com> Neal Norwitz wrote: > "M.-A. Lemburg" wrote: > > >>I've checked in a patch for the UTF-8 codec problem. Could you >>try Purify against the CVS version ? >> > > with-pymalloc or without or both? Both if possible -- the leakage showed up with pymalloc AFAIR :-) Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From neal@metaslash.com Wed Feb 6 23:34:24 2002 From: neal@metaslash.com (Neal Norwitz) Date: Wed, 06 Feb 2002 18:34:24 -0500 Subject: [Python-Dev] Mixing memory management APIs References: <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com> Message-ID: <3C61BD80.5F11C66B@metaslash.com> "M.-A. Lemburg" wrote: > > Neal Norwitz wrote: > > > "M.-A. 
Lemburg" wrote: > > > > > >>I've checked in a patch for the UTF-8 codec problem. Could you > >>try Purify against the CVS version ? > >> > > > > with-pymalloc or without or both? > > Both if possible -- the leakage showed up with pymalloc AFAIR :-) There is a lot of data and it's very hard to follow, but I'm trying to provide as much info as I can. Let me know how I can make this info easier to use. Here is a summary: * I'm using gcc version 2.95.3, on Solaris 8, Purify 2002. * The new patches don't fix all the problems, but it may reduce the problems (I'm not sure). I think there were 13k errors on build before, it's 5.5k now. * test_unicodedata fails: *** mismatch between line 3 of expected output and line 3 of actual output: - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18 + Methods: 3051e6d4d117133c3e36a5c22b3a1ae362474321 * Purify now has 2 UMRs now w/o pymalloc, but they are in fwrite() and contain no usable stack trace. * It's probably best to try using Electric Fence and/or dbmalloc. This may give better results than Purify. * There is a warning from sre.h that may be significant: Modules/sre.h:24: warning: `SRE_CODE' redefined Modules/sre.h:19: warning: this is the location of the previous definition I'll try some more things to see if I can get better info. 
Neal -- bash-2.03$ ./configure --with-pymalloc --enable-unicode=ucs4 bash-2.03$ make PURIFY=purify ---> 5542 errors Free Memory Read, Array Bounds Read, and Uninit Memory Read errors at lines unicodeobject.c:2214 & 2875 (both are bogus lines) 2214 is in: PyUnicode_TranslateCharmap() 2875 is in: split_char() bash-2.03$ ./python -E -tt Lib/test/regrtest.py test_unicode.py \ test_unicode_file.py test_unicodedata.py test_unicode test test_unicode crashed -- exceptions.UnicodeError: UTF-8 decoding error: illegal encoding test_unicode_file test_unicodedata test test_unicodedata produced unexpected output: ********************************************************************** *** mismatch between line 3 of expected output and line 3 of actual output: - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18 + Methods: 3051e6d4d117133c3e36a5c22b3a1ae362474321 ********************************************************************** 1 test OK. 2 tests failed: test_unicode test_unicodedata -------------------------------------------------------------------- Without purify, test_unicode completed successfully, but unicodedata produced the same results. The errors produced in purify for these 3 tests were 99745. The errors were in the same places as for the build step. -------------------------------------------------------------------- bash-2.03$ make clean bash-2.03$ ./configure --enable-unicode=ucs4 bash-2.03$ make PURIFY=purify bash-2.03$ ./python -E -tt Lib/test/regrtest.py test_unicode.py \ test_unicode_file.py test_unicodedata.py test test_unicodedata produced unexpected output: ********************************************************************** *** mismatch between line 3 of expected output and line 3 of actual output: - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18 + Methods: 84b72943b1d4320bc1e64a4888f7cdf62eea219a ********************************************************************** 2 tests OK. 
1 test failed: test_unicodedata -------------------------------------------------------------------- Purify did have 2 UMRs, but both contain almost no information: UMR: Uninitialized memory read This is occurring while in: _write [libc.so.1] _xflsbuf [libc.so.1] _fflush_u [libc.so.1] fseek [libc.so.1] *unknown func* [pc=0xe417c] *unknown func* [pc=0xe4db4] *unknown func* [pc=0xe64c4] *unknown func* [pc=0xe5cf0] *unknown func* [pc=0xe5524] *unknown func* [pc=0xe58a0] *unknown func* [pc=0x160464] *unknown func* [pc=0x159b64] Reading 3609 bytes from 0x6a2fcc in the heap (4 bytes at 0x6a3706 uninit). Address 0x6a2fcc is 4 bytes into a malloc'd block at 0x6a2fc8 of 8200 bytes. This block was allocated from: do_mkvalue [modsupport.c:243] _findbuf [libc.so.1] _wrtchk [libc.so.1] _flsbuf [libc.so.1] putc [libc.so.1] *unknown func* [pc=0xe8b9c] *unknown func* [pc=0xed794] *unknown func* [pc=0xe4104] *unknown func* [pc=0xe4db4] *unknown func* [pc=0xe64c4] *unknown func* [pc=0xe5cf0] *unknown func* [pc=0xe5524] -------------------------------------------------------------------- UMR: Uninitialized memory read This is occurring while in: _write [libc.so.1] _xflsbuf [libc.so.1] _fwrite_unlocked [libc.so.1] fwrite [libc.so.1] *unknown func* [pc=0xeaa50] *unknown func* [pc=0xeadf4] *unknown func* [pc=0xeb3c8] *unknown func* [pc=0xed7e8] *unknown func* [pc=0xe411c] *unknown func* [pc=0xe4db4] *unknown func* [pc=0xe64c4] *unknown func* [pc=0xe5cf0] Reading 8192 bytes from 0x79d88c in the heap (4 bytes at 0x79de8d uninit). Address 0x79d88c is 4 bytes into a malloc'd block at 0x79d888 of 8200 bytes. 
This block was allocated from: do_mkvalue [modsupport.c:243] _findbuf [libc.so.1] _wrtchk [libc.so.1] _flsbuf [libc.so.1] putc [libc.so.1] *unknown func* [pc=0xe8b9c] *unknown func* [pc=0xed794] *unknown func* [pc=0xe4104] *unknown func* [pc=0xe4db4] *unknown func* [pc=0xe64c4] *unknown func* [pc=0xe5cf0] *unknown func* [pc=0xe5524] From neal@metaslash.com Wed Feb 6 23:36:19 2002 From: neal@metaslash.com (Neal Norwitz) Date: Wed, 06 Feb 2002 18:36:19 -0500 Subject: [Python-Dev] Mixing memory management APIs References: <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com> Message-ID: <3C61BDF3.848F363C@metaslash.com> "M.-A. Lemburg" wrote: > > Neal Norwitz wrote: > > > "M.-A. Lemburg" wrote: > > > > > >>I've checked in a patch for the UTF-8 codec problem. Could you > >>try Purify against the CVS version ? > >> > > > > with-pymalloc or without or both? > > Both if possible -- the leakage showed up with pymalloc AFAIR :-) I forgot to mention that purify reports no memory leaks either with or without pymalloc. Neal From mal@lemburg.com Thu Feb 7 08:49:52 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 07 Feb 2002 09:49:52 +0100 Subject: [Python-Dev] Mixing memory management APIs References: <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com> <3C61BDF3.848F363C@metaslash.com> Message-ID: <3C623FB0.4C401A0@lemburg.com> Neal Norwitz wrote: > > "M.-A. Lemburg" wrote: > > > > Neal Norwitz wrote: > > > > > "M.-A. Lemburg" wrote: > > > > > > > > >>I've checked in a patch for the UTF-8 codec problem. Could you > > >>try Purify against the CVS version ? > > >> > > > > > > with-pymalloc or without or both? 
> > > > Both if possible -- the leakage showed up with pymalloc AFAIR :-) > > I forgot to mention that purify reports no memory leaks either > with or without pymalloc. So that bug seems to be fixed now. Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Thu Feb 7 08:55:11 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 07 Feb 2002 09:55:11 +0100 Subject: [Python-Dev] Mixing memory management APIs References: <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com> <3C61BD80.5F11C66B@metaslash.com> Message-ID: <3C6240EF.4FE8E9C@lemburg.com> Neal Norwitz wrote: > > "M.-A. Lemburg" wrote: > > > > Neal Norwitz wrote: > > > > > "M.-A. Lemburg" wrote: > > > > > > > > >>I've checked in a patch for the UTF-8 codec problem. Could you > > >>try Purify against the CVS version ? > > >> > > > > > > with-pymalloc or without or both? > > > > Both if possible -- the leakage showed up with pymalloc AFAIR :-) > > There is a lot of data and it's very hard to follow, > but I'm trying to provide as much info as I can. > Let me know how I can make this info easier to use. > > Here is a summary: > > * I'm using gcc version 2.95.3, on Solaris 8, Purify 2002. > > * The new patches don't fix all the problems, but it may > reduce the problems (I'm not sure). I think there were > 13k errors on build before, it's 5.5k now. > > * test_unicodedata fails: > *** mismatch between line 3 of expected output and > line 3 of actual output: > - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18 > + Methods: 3051e6d4d117133c3e36a5c22b3a1ae362474321 Hmm, I did run test_unicode, but forgot test_unicodedata. 
Now, looking at test_unicodedata.py it produces loads of these unpaired Unicode surrogates and then tries to encode them using UTF-8. Since the UTF-8 previously produced wrong results for these, I guess I'll have to recreate the test output. > * Purify now has 2 UMRs now w/o pymalloc, but they are in > fwrite() and contain no usable stack trace. > > * It's probably best to try using Electric Fence and/or dbmalloc. > This may give better results than Purify. > > * There is a warning from sre.h that may be significant: > Modules/sre.h:24: warning: `SRE_CODE' redefined > Modules/sre.h:19: warning: this is the location > of the previous definition > > I'll try some more things to see if I can get better info. > > Neal > -- > > bash-2.03$ ./configure --with-pymalloc --enable-unicode=ucs4 > bash-2.03$ make PURIFY=purify > > ---> 5542 errors > Free Memory Read, Array Bounds Read, and Uninit Memory Read errors > at lines unicodeobject.c:2214 & 2875 > (both are bogus lines) > > 2214 is in: PyUnicode_TranslateCharmap() > 2875 is in: split_char() Hmm, I'll have to look at this one... > bash-2.03$ ./python -E -tt Lib/test/regrtest.py test_unicode.py \ > test_unicode_file.py test_unicodedata.py > test_unicode > test test_unicode crashed -- exceptions.UnicodeError: UTF-8 decoding error: illegal encoding That's strange, because at least on my machine, test_unicode runs through just fine. Could you run the test by hand, so that the error location can be localized ? > test_unicode_file > test_unicodedata > test test_unicodedata produced unexpected output: > ********************************************************************** > *** mismatch between line 3 of expected output and line 3 of actual output: > - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18 > + Methods: 3051e6d4d117133c3e36a5c22b3a1ae362474321 > ********************************************************************** See above. > 1 test OK. 
> 2 tests failed: > test_unicode test_unicodedata > > -------------------------------------------------------------------- > > Without purify, test_unicode completed successfully, but unicodedata > produced the same results. > > The errors produced in purify for these 3 tests were 99745. > The errors were in the same places as for the build step. > > -------------------------------------------------------------------- > > bash-2.03$ make clean > bash-2.03$ ./configure --enable-unicode=ucs4 > bash-2.03$ make PURIFY=purify > > bash-2.03$ ./python -E -tt Lib/test/regrtest.py test_unicode.py \ > test_unicode_file.py test_unicodedata.py > test test_unicodedata produced unexpected output: > ********************************************************************** > *** mismatch between line 3 of expected output and line 3 of actual output: > - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18 > + Methods: 84b72943b1d4320bc1e64a4888f7cdf62eea219a > ********************************************************************** > 2 tests OK. > 1 test failed: > test_unicodedata > > -------------------------------------------------------------------- > > Purify did have 2 UMRs, but both contain almost no information: > > UMR: Uninitialized memory read > This is occurring while in: > _write [libc.so.1] > _xflsbuf [libc.so.1] > _fflush_u [libc.so.1] > fseek [libc.so.1] > *unknown func* [pc=0xe417c] > *unknown func* [pc=0xe4db4] > *unknown func* [pc=0xe64c4] > *unknown func* [pc=0xe5cf0] > *unknown func* [pc=0xe5524] > *unknown func* [pc=0xe58a0] > *unknown func* [pc=0x160464] > *unknown func* [pc=0x159b64] > Reading 3609 bytes from 0x6a2fcc in the heap (4 bytes at 0x6a3706 uninit). > Address 0x6a2fcc is 4 bytes into a malloc'd block at 0x6a2fc8 of 8200 bytes. 
> This block was allocated from: > do_mkvalue [modsupport.c:243] > _findbuf [libc.so.1] > _wrtchk [libc.so.1] > _flsbuf [libc.so.1] > putc [libc.so.1] > *unknown func* [pc=0xe8b9c] > *unknown func* [pc=0xed794] > *unknown func* [pc=0xe4104] > *unknown func* [pc=0xe4db4] > *unknown func* [pc=0xe64c4] > *unknown func* [pc=0xe5cf0] > *unknown func* [pc=0xe5524] > > -------------------------------------------------------------------- > > UMR: Uninitialized memory read > This is occurring while in: > _write [libc.so.1] > _xflsbuf [libc.so.1] > _fwrite_unlocked [libc.so.1] > fwrite [libc.so.1] > *unknown func* [pc=0xeaa50] > *unknown func* [pc=0xeadf4] > *unknown func* [pc=0xeb3c8] > *unknown func* [pc=0xed7e8] > *unknown func* [pc=0xe411c] > *unknown func* [pc=0xe4db4] > *unknown func* [pc=0xe64c4] > *unknown func* [pc=0xe5cf0] > Reading 8192 bytes from 0x79d88c in the heap (4 bytes at 0x79de8d uninit). > Address 0x79d88c is 4 bytes into a malloc'd block at 0x79d888 of 8200 bytes. > This block was allocated from: > do_mkvalue [modsupport.c:243] > _findbuf [libc.so.1] > _wrtchk [libc.so.1] > _flsbuf [libc.so.1] > putc [libc.so.1] > *unknown func* [pc=0xe8b9c] > *unknown func* [pc=0xed794] > *unknown func* [pc=0xe4104] > *unknown func* [pc=0xe4db4] > *unknown func* [pc=0xe64c4] > *unknown func* [pc=0xe5cf0] > *unknown func* [pc=0xe5524] Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mwh@python.net Thu Feb 7 10:40:42 2002 From: mwh@python.net (Michael Hudson) Date: 07 Feb 2002 10:40:42 +0000 Subject: [Python-Dev] Mixing memory management APIs In-Reply-To: "M.-A. 
Lemburg"'s message of "Wed, 06 Feb 2002 23:41:52 +0100" References: <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com> Message-ID: <2m8za5mv6d.fsf@starship.python.net> "M.-A. Lemburg" writes: > Neal Norwitz wrote: > > > "M.-A. Lemburg" wrote: > > > > > >>I've checked in a patch for the UTF-8 codec problem. Could you > >>try Purify against the CVS version ? > >> > > > > with-pymalloc or without or both? > > > Both if possible -- the leakage showed up with pymalloc AFAIR :-) I thought we were chasing memory stomping, not leaking, this time around... Cheers, M. -- /* I'd just like to take this moment to point out that C has all the expressive power of two dixie cups and a string. */ -- Jamie Zawinski from the xkeycaps source From mal@lemburg.com Thu Feb 7 11:42:23 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 07 Feb 2002 12:42:23 +0100 Subject: [Python-Dev] Mixing memory management APIs References: <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com> <3C61BD80.5F11C66B@metaslash.com> <3C6240EF.4FE8E9C@lemburg.com> Message-ID: <3C62681F.D27932B8@lemburg.com> I've just checked in a set of fixes for the UTF-8 encoder and decoder and also updated the test output of test_unicodedata. You should now no longer get the test failures you were seeing (test_unicode failure was due to the old marshal format using illegal UTF-8 sequences, test_unicodedata was due to the same UTF-8 problem but shows up in a different hash value). 
Hope I got it right this time around :-/

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From mal@lemburg.com Thu Feb 7 11:48:11 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 07 Feb 2002 12:48:11 +0100
Subject: [Python-Dev] PYC Magic
Message-ID: <3C62697B.3555501@lemburg.com>

FYI, I've bumped the PYC magic in a non-standard way (the old standard broke on 2002-01-01); please review:

import.c:
"""
/* New way to come up with the low 16 bits of the magic number:
      (YEAR-1995) * 10000 + MONTH * 100 + DAY
   where MONTH and DAY are 1-based.
   XXX Whatever the "old way" may have been isn't documented.
   XXX This scheme breaks in 2002, as (2002-1995)*10000 = 70000 doesn't
       fit in 16 bits.
   XXX Later, sometimes 1 gets added to MAGIC in order to record that
       the Unicode -U option is in use. IMO (Tim's), that's a Bad Idea
       (quite apart from that the -U option doesn't work so isn't used
       anyway).

   XXX MAL, 2002-02-07: I had to modify the MAGIC due to a fix of the
       UTF-8 encoder (it previously produced invalid UTF-8 for unpaired
       high surrogates), so I simply bumped the month value to 20 (invalid
       month) and set the day to 1. This should be recognizable by any
       algorithm relying on the above scheme. Perhaps we should simply
       start counting in increments of 10 from now on ?!
   Known values:
       Python 1.5:   20121
       Python 1.5.1: 20121
       Python 1.5.2: 20121
       Python 2.0:   50823
       Python 2.0.1: 50823
       Python 2.1:   60202
       Python 2.1.1: 60202
       Python 2.1.2: 60202
       Python 2.2:   60717
       Python 2.3a0: 62001
*/
#define MAGIC (62001 | ((long)'\r'<<16) | ((long)'\n'<<24))
"""

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From mal@lemburg.com Thu Feb 7 11:52:38 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 07 Feb 2002 12:52:38 +0100
Subject: [Python-Dev] Mixing memory management APIs
References: <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com> <2m8za5mv6d.fsf@starship.python.net>
Message-ID: <3C626A86.58B12055@lemburg.com>

Michael Hudson wrote:
>
> > >>I've checked in a patch for the UTF-8 codec problem. Could you
> > >>try Purify against the CVS version ?
> > >>
> > >
> > > with-pymalloc or without or both?
> >
> >
> > Both if possible -- the leakage showed up with pymalloc AFAIR :-)
>
> I thought we were chasing memory stomping, not leaking, this time
> around...

Both, I guess: pymalloc doesn't behave well with overallocation and codecs use this technique a lot. I reduced the overallocation in the UTF-8 encoder down from 3*size to 2*size which should cover the most common cases better than before.
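To make the 3*size versus 2*size trade-off concrete, here is a small sketch in present-day Python 3. It is an illustration only, not the encoder's actual code: `utf8_fit` is a hypothetical helper, and it counts code points via `len()`, whereas the C encoder of the time sized its buffer in `Py_UNICODE` units.

```python
def utf8_fit(text, factor=2):
    """Return (size, needed, fits) for a factor*size byte buffer guess.

    size   -- number of code points in text
    needed -- bytes the UTF-8 encoding actually occupies
    fits   -- True if needed <= factor * size
    """
    size = len(text)
    needed = len(text.encode("utf-8"))
    return size, needed, needed <= factor * size


# ASCII, Latin-1 with one two-byte character, and three-byte CJK characters.
samples = ["hello", "h\u00e4llo", "\u65e5\u672c\u8a9e"]
report = [utf8_fit(s) for s in samples]
```

ASCII and Latin-1 text need at most two bytes per character, so a 2*size guess fits. Text made up mostly of three-byte characters exceeds it, and any encoder relying on such a guess would then have to grow its buffer.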
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From thomas.heller@ion-tof.com Thu Feb 7 21:10:18 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 7 Feb 2002 22:10:18 +0100
Subject: [Python-Dev] Extending types in C - help needed
References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com> <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net> <104601c1af50$4888bd90$e000a8c0@thomasnotebook>
Message-ID: <065001c1b01b$d8bcc520$e000a8c0@thomasnotebook>

[Guido]
> > A better solution is to store additional information in the __dict__.
>
[Thomas]
> You loose nice features: access these (new) slots from Python
> by providing tp_members entries for them (for example).

This thread is IMO closed; just for completeness I want to mention that the same effect can be accomplished easily with tp_getset.

Thomas

From Jack.Jansen@oratrix.nl Thu Feb 7 22:43:46 2002
From: Jack.Jansen@oratrix.nl (Jack Jansen)
Date: Thu, 7 Feb 2002 23:43:46 +0100
Subject: [Python-Dev] Extending types in C - help needed
In-Reply-To: <104601c1af50$4888bd90$e000a8c0@thomasnotebook>
Message-ID: <25807C26-1C1C-11D6-B5C4-003065517236@oratrix.nl>

On Wednesday, February 6, 2002, at 09:53 PM, Thomas Heller wrote:
>> A better solution is to store additional information in the __dict__.
>
> You loose nice features: access these (new) slots from Python
> by providing tp_members entries for them (for example).

Martin pointed at a way to solve this. And I think that with my proposed API (...
where is it..., ah yes, found it) void PyType_SetAnnotation(PyTypeObject *tp, char *name, void *unique, void *); void *PyType_GetAnnotation(PyTypeObject *tp, char *name, void *unique); it would be almost as easy to use as a tp_ slot. The only thing needed to make it 100% safe is a registry for name/descr pairs. (Actually the API is changed a little since I understand how the second arg works) For the benefit of whoever missed the previous thread: name is used as the key into the dictionary, and unique is a pointer stored with the entry, which assures that this entry hasn't been used for something else accidentally. So in stead of a new slot tp_foo what you would need to do is come up with a name ("tp_foo" comes to mind) and a global variable whose address can be used for unique. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From guido@python.org Fri Feb 8 03:03:14 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 07 Feb 2002 22:03:14 -0500 Subject: [Python-Dev] Extending types in C - help needed In-Reply-To: Your message of "Wed, 06 Feb 2002 21:53:08 +0100." <104601c1af50$4888bd90$e000a8c0@thomasnotebook> References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com> <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net> <104601c1af50$4888bd90$e000a8c0@thomasnotebook> Message-ID: <200202080303.g1833EO23629@pcp742651pcs.reston01.va.comcast.net> > > A better solution is to store additional information in the __dict__. > > You loose nice features: access these (new) slots from Python > by providing tp_members entries for them (for example). I'm not sure I understand what you mean. 
Why would you need a tp_members entry for something that's in __dict__? > Are you planning to address this issue in the future? David Abrahams (of Boost++ fame) is also interested in a solution for this problem, so I may have to. Not in 2.2.1, though -- this will have to be rearchitected so it's a 2.3 issue. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Feb 8 03:04:47 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 07 Feb 2002 22:04:47 -0500 Subject: [Python-Dev] PYC Magic In-Reply-To: Your message of "Thu, 07 Feb 2002 12:48:11 +0100." <3C62697B.3555501@lemburg.com> References: <3C62697B.3555501@lemburg.com> Message-ID: <200202080304.g1834l423644@pcp742651pcs.reston01.va.comcast.net> > FYI, I've bumped the PYC magic in a non-standard way (the old > standard broke on 2002-01-01); please review: This is fine. I never intended the algorithm as reversible, just as a way to come up with unique magic numbers. There is no requirement that from the magic number one can calculate the date it was assigned. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Fri Feb 8 05:34:10 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 7 Feb 2002 23:34:10 -0600 Subject: [Python-Dev] Re: Python optmizations In-Reply-To: <20020206163126.B4071@ibook.distro.conectiva> References: <20020206163126.B4071@ibook.distro.conectiva> Message-ID: <15459.25426.546160.663165@12-248-41-177.client.attbi.com> Gustavo, Thanks for the note. Funny coincidence the timing of your note like a bolt out of the blue and my attendance at IPC10 where Jeremy Hylton and I led a session this afternoon on optimization issues. Gustavo> Recently, I have found your paper about peephole optimization, Gustavo> ... I should probably check my peephole optimizer into the nondist sandbox on SF. It shouldn't take very much effort. Unlike Rattlesnake, it still pretty much works. Gustavo> ... 
discovered that I'm not original, and repeated most of your Gustavo> ideas and mistakes. I was not original either. I'm sure I repeated the mistakes of many other people as well. Gustavo> One thing I thought and also found a reference in your paper is Gustavo> about some instructions that should be turned into a single Gustavo> opcode. To understand how this would affect the code, I have Gustavo> disassembled the whole Python standard library, and the whole Gustavo> Zope library. After that I've run a script to detect opcode Gustavo> repeatings (excluding SET_LINENO). It sounds like your measurements were made on the static bytecode. I suspect you might find the dynamic opcode pair frequencies interesting as well. That's what most people look at when deciding what really follows what. (For example, did you consider the basic block structure of the bytecode or just adjacency in the disassembler output?) You can get this information by defining the two macros DXPAIRS and DYNAMIC_EXECUTION_PROFILE when compiling Python. I think all you need to recompile is ceval.c and sysmodule.c Once you've done this, run your scripts as usual, then before exit (you might just want to run your subject script with the -i flag), call sys.getdxp(). That will return you a 257x256 list (well, a list containing 257 other lists, each of which contains 256 elements). The value at location [i][j] corresponds to the frequency of opcode j being executed immediately after opcode i (or the other way around - a quick peek at the code will tell you which is which). At one point I had an xmlrpc server running on manatee.mojam.com to which people could submit such opcode frequency arrays, however nobody submitted anything to it, so I eventually turned it off. I would be happy to crank it up again. With the atexit module and xmlrpclib in the core library, it's dead easy to instrument a program so it automatically dumps the data to my server upon program exit. 
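The static counting Gustavo describes can be approximated with the standard dis module. This is only a sketch in modern Python 3 (opcode names and counts differ from the 2.2-era numbers quoted in this thread), and it looks at plain adjacency in the instruction stream, not at what actually executes; the dynamic profile from sys.getdxp() needs a specially compiled interpreter as described above. The names `opcode_pairs` and `sample` are made up for the example.

```python
import dis
from collections import Counter

def opcode_pairs(code):
    """Count adjacent opcode-name pairs in a code object and any nested code objects."""
    counts = Counter()
    names = [ins.opname for ins in dis.get_instructions(code)]
    counts.update(zip(names, names[1:]))
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # nested function or class body
            counts.update(opcode_pairs(const))
    return counts

def sample(obj):
    # Two attribute loads on a local, the pattern discussed in this thread.
    return obj.real + obj.imag

pairs = opcode_pairs(sample.__code__)
for (first, second), n in pairs.most_common(5):
    print(n, first, second)
```

Running the same function over every code object in a package gives a static table comparable in spirit to the one Gustavo posted.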
Gustavo> 23632 LOAD_FAST, LOAD_ATTR This is not all that surprising and supports Jeremy's belief (which I agree with) that self.attr is a very common construct in the language. Gustavo> 15382 LOAD_CONST, LOAD_CONST Now, this is interesting. If those constants are numbers and the next opcode is a BINARY_*, my peephole optimizer can elide that operation and create a new constant, so something like LOAD_CONST 60 LOAD_CONST 60 BINARY_MULTIPLY would get converted to simply LOAD_CONST 3600 Gustavo> 12842 JUMP_IF_FALSE, POP_TOP Gustavo> 12397 CALL_FUNCTION, POP_TOP I don't think these can be avoided. Gustavo> 12121 LOAD_FAST, LOAD_FAST While this pair occurs frequently, they are very cheap instructions. All you'd be saving is a trip around the opcode dispatch loop. Gustavo> Not by casuality, I found in your paper references to a Gustavo> LOAD_FAST_ATTR opcode. Since you probably have mentioned this Gustavo> to others, I wouldn't like to bother everyone again asking why Gustavo> it was not implemented. Could you please explain me the reasons Gustavo> that left this behind? LOAD_ATTR is a *very* expensive opcode (perhaps only second to CALL_FUNCTION on a per-instruction basis). Jeremy measured minimums of around 500 clock cycles and means of around 1200 clock cycles for this opcode. In contrast, it appears that a single trip around the opcode dispatch loop is on the order of 50 clock cycles, so merging a LOAD_FAST/LOAD_ATTR pair into one instruction only saves about 50 cycles. What you want to eliminate is the 500+ cycles from the LOAD_ATTR instruction. Jeremy and I both have ideas about how to accomplish some of that, but it's not a trivial task. I believe in most cases I got about a 5% speedup with peephole optimization. That's nothing to sneeze at I suppose, but there were some barriers to adoption. First and foremost, generating that instruction requires running my optimizer, which isn't blindingly fast. 
(Probably fast enough for a "compileall" step that you execute once at install time, but maybe too slow to use regularly on-the-fly.) It also predates the compiler Jeremy implemented in Python. It would probably be fairly easy to hang my optimizer off the back end of his compiler as an optional pass. It looks like Guido would like to see a little work put into regaining some of the performance that was lost between 1.5.2 and 2.2, so now would probably be a good time to dust off my optimizer. Gustavo> If you have the time, I'd also like to understand what's the Gustavo> trouble involved in getting a peephole optimizer in the python Gustavo> compiler itself. Is it just about compiling performance? I Gustavo> don't remember to have read about this in your paper, but you Gustavo> probably thought about that as well. Mostly just time. Tick tick tick... Skip From guido@python.org Fri Feb 8 16:50:31 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Feb 2002 11:50:31 -0500 Subject: [Python-Dev] Accessing globals without dict lookup Message-ID: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> Inspired by talks by Jeremy and Skip on DevDay, here's a different idea for speeding up access to globals. It retain semantics but (like Jeremy's proposal) changes the type of a module's __dict__. - Let a cell be a really simple PyObject, containing a PyObject pointer and a cell pointer. Both pointers may be NULL. (It may have to be called PyGlobalCell since I believe there's already a PyCell object.) (Maybe it doesn't even have to be an object -- it could just be a tiny struct.) - Let a celldict be a mapping that is implemented using a dict of cells. When you use its getitem method, the PyObject * in the cell is dereferenced, and if a NULL is found, getitem raises KeyError even if the cell exists. 
Using setitem to add a new value creates a new cell and stores the value there; using setitem to change the value for an existing key stores the value in the existing cell for that key. There's a separate API to access the cells.

- We change the module implementation to use a celldict for its __dict__. The module's getattr and setattr operations now map to getitem and setitem on the celldict. I think the type of <module>.__dict__ and globals() is the only backwards incompatibility.

- When a module is initialized, it gets its __builtins__ from the __builtin__ module, which is itself a celldict. For each cell in __builtins__, the new module's __dict__ adds a cell with a NULL PyObject pointer, whose cell pointer points to the corresponding cell of __builtins__.

- The compiler generates LOAD_GLOBAL_CELL <i> (and STORE_GLOBAL_CELL <i> etc.) opcodes for references to globals, where <i> is a small index with meaning only within one code object like the const index in LOAD_CONST. The code object has a new tuple, co_globals, giving the names of the globals referenced by the code indexed by <i>. I think no new analysis is required to be able to do this.

- When a function object is created from a code object and a celldict, the function object creates an array of cell pointers by asking the celldict for cells corresponding to the names in the code object's co_globals. If the celldict doesn't already have a cell for a particular name, it creates an empty one. This array of cell pointers is stored on the function object as func_cells. When a function object is created from a regular dict instead of a celldict, func_cells is a NULL pointer.

- When the VM executes a LOAD_GLOBAL_CELL <i> instruction, it gets cell number <i> from func_cells. It then looks in the cell's PyObject pointer, and if not NULL, that's the global value. If it is NULL, it follows the cell's cell pointer to the next cell, if it is not NULL, and looks in the PyObject pointer in that cell.
If that's also NULL, or if there is no second cell, NameError is raised. (It could follow the chain of cell pointers until a NULL cell pointer is found; but I have no use for this.) Similar for STORE_GLOBAL_CELL <i>, except it doesn't follow the cell pointer chain -- it always stores in the first cell.

- There are fallbacks in the VM for the case where the function's globals aren't a celldict, and hence func_cells is NULL. In that case, the code object's co_globals is indexed with <i> to find the name of the corresponding global and this name is used to index the function's globals dict.

I believe that's it. I think this faithfully implements the current semantics (where a global can shadow a builtin), but without the need for any dict lookups when accessing globals, except in cases where an explicit dict is passed to exec or eval().

Compare this to Jeremy's scheme using dlicts:

http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobals

- My approach doesn't require global agreement on the numbering of the globals; each code object has its own numbering. This avoids the need for more global analysis, and allows adding code to a module using exec that introduces new globals without having to fall back on a less efficient scheme.

- Jeremy's approach might be a teensy bit faster because it may have to do less work in the LOAD_GLOBAL; but I'm not convinced.

Here's an implementation sketch for cell and celldict. Note in particular that keys() only returns the keys for which the cell's objptr is not NULL.

    NULL = object() # used as a token

    class cell(object):

        def __init__(self):
            self.objptr = NULL
            self.cellptr = NULL

    class celldict(object):

        def __init__(self):
            self.__dict = {} # dict of cells

        def getcell(self, key):
            # Return the cell for key, creating an empty one if needed
            c = self.__dict.get(key)
            if c is None:
                c = cell()
                self.__dict[key] = c
            return c

        def __getitem__(self, key):
            c = self.__dict.get(key)
            if c is None:
                raise KeyError, key
            value = c.objptr
            if value is NULL:
                raise KeyError, key
            else:
                return value

        def __setitem__(self, key, value):
            c = self.__dict.get(key)
            if c is None:
                c = cell()
                self.__dict[key] = c
            c.objptr = value

        def __delitem__(self, key):
            # The cell itself stays; only its value is cleared
            c = self.__dict.get(key)
            if c is None or c.objptr is NULL:
                raise KeyError, key
            c.objptr = NULL

        def keys(self):
            # Only keys whose cell currently holds a value
            return [k for k, c in self.__dict.items()
                    if c.objptr is not NULL]

        def clear(self):
            for c in self.__dict.values():
                c.objptr = NULL

        # Etc.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik@pythonware.com Fri Feb 8 18:04:45 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 8 Feb 2002 19:04:45 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
Message-ID: <029a01c1b0cb$195cb400$ced241d5@hagrid>

I propose adding a basic time type (or time base type ;-) to the standard library, which can be subclassed by more elaborate date/time/timestamp implementations, such as mxDateTime, custom types provided by DB-API drivers, etc.

The goal is to make it easy to extract the year, month, day, hour, minute, and second from any given time object.

Or to put it another way, I want the following to work for any time object, including mxDateTime objects, any date/timestamp returned by a DB-API driver, and weird date/time-like types I've developed myself:

    if isinstance(t, basetime):
        # yay! it's a timestamp
        print t.timetuple()

The goal is not to standardize any behaviour beyond this; anything else should be provided by subtypes.
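A minimal sketch of the base-type idea, written in modern Python 3. Only the names `basetime` and `timetuple()` come from the message; the subtype `EpochSeconds`, the helper `describe`, and the choice of UTC (made purely to keep the output deterministic) are assumptions for illustration, not part of the proposal.

```python
import time

class basetime(object):
    """Abstract base: subtypes supply the broken-down time."""

    def timetuple(self):
        raise NotImplementedError

class EpochSeconds(basetime):
    """Hypothetical subtype that stores seconds since the Unix epoch."""

    def __init__(self, seconds):
        self.seconds = seconds

    def timetuple(self):
        # UTC is an arbitrary choice for this sketch.
        return time.gmtime(self.seconds)

def describe(t):
    """Format any basetime subtype without knowing its concrete class."""
    if isinstance(t, basetime):
        return "%04d-%02d-%02d %02d:%02d:%02d" % tuple(t.timetuple()[:6])
    raise TypeError("not a time object")

print(describe(EpochSeconds(0)))  # prints 1970-01-01 00:00:00
```

Client code depends only on the isinstance check and timetuple(); how a subtype stores its value stays private to that subtype.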
More details here: http://effbot.org/ideas/time-type.htm I can produce PEP and patch if necessary. From guido@python.org Fri Feb 8 18:09:27 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Feb 2002 13:09:27 -0500 Subject: [Python-Dev] Speeding up instance attribute access Message-ID: <200202081809.g18I9SD02879@pcp742651pcs.reston01.va.comcast.net> Inspired by the second half of Jeremy's talk on DevDay, here's my alternative approach for speeding up instance attribute access. Like my idea for globals, it uses double indirection rather than recompilation. - We only care about attributes of 'self' (which is identified as the first argument of a method, not by name). We can exclude functions from our analysis that make any assignment to self -- this is extremely rare and would throw off our analysis. We should also exclude static methods and class methods, since their first argument doesn't have the same role. - Static analysis of the source code of a class (without access to the base class) can determine attributes of the class, and to some extent instance variables. Without also analyzing the base classes, this analysis cannot reliably distinguish between instance variables and methods inherited from a base class; it can distinguish between instance variables and methods defined in the current class. - We can guess the status of un-assigned-to inherited attributes by seeing whether they are called or not. This is not 100% accurate, so we need things to work (if slower) even when we guess wrong. - For instance variable references and stores of the form self., the bytecode compiler emits opcodes LOAD_SELF_IVAR and STORE_SELF_IVAR , where is a small int identifying the instance variable (ivar). A particular ivar is identified by the same throughout all methods defined in the same class statement, but there is no attempt to coordinate this across different classes related by inheritance. 
- It would be nice if we also had a single-opcode way to express a method call on self, e.g. CALL_SELF_METHOD , , where identifies the method like above, and and are the number of positional and keyword arguments. Or maybe we should just have LOAD_SELF_METHOD which may be able to skip looking in the instance dict. - Some data structure describing the mapping from to attribute name, and whether it's an ivar or a method, is produced by the compiler and stored in the class __dict__. The function objects representing methods also contain a pointer to this data structure. (Or the code objects? But it needs to be shared. Details, details.) - When a class object is created (at run-time), another data structure is created that accumulates the -to-name mappings from that class and all its base classes. From guido@python.org Fri Feb 8 18:11:06 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Feb 2002 13:11:06 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Fri, 08 Feb 2002 19:04:45 +0100." <029a01c1b0cb$195cb400$ced241d5@hagrid> References: <029a01c1b0cb$195cb400$ced241d5@hagrid> Message-ID: <200202081811.g18IB6a02905@pcp742651pcs.reston01.va.comcast.net> From guido@python.org Fri Feb 8 18:14:34 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Feb 2002 13:14:34 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Fri, 08 Feb 2002 19:04:45 +0100." <029a01c1b0cb$195cb400$ced241d5@hagrid> References: <029a01c1b0cb$195cb400$ced241d5@hagrid> Message-ID: <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net> > http://effbot.org/ideas/time-type.htm > > I can produce PEP and patch if necessary. Yes, a PEP, please! Jim Fulton has been asking for this for a long time too. His main requirement is that timestamp objects are small, both in memory and as pickles, because Zope keeps a lot of these around. 
They are currently represented either as long ints (with a little under 64 bits) or as 8-byte strings. A dedicated timestamp object could be smaller than that. Your idea of a base type (which presumably standarizes at least one form of representation) sounds like a breakthrough that can help satisfy different other needs. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Feb 8 18:16:25 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Feb 2002 13:16:25 -0500 Subject: [Python-Dev] Speeding up instance attribute access In-Reply-To: Your message of "Fri, 08 Feb 2002 13:09:27 EST." <200202081809.g18I9SD02879@pcp742651pcs.reston01.va.comcast.net> References: <200202081809.g18I9SD02879@pcp742651pcs.reston01.va.comcast.net> Message-ID: <200202081816.g18IGPq02967@pcp742651pcs.reston01.va.comcast.net> Forget that. I hit "send" accidentally; my fingers seem jittery after the conference. I'll send the real proposal in a while. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Fri Feb 8 19:05:03 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 08 Feb 2002 20:05:03 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <029a01c1b0cb$195cb400$ced241d5@hagrid> <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C64215F.AF31BB96@lemburg.com> Guido van Rossum wrote: > > > http://effbot.org/ideas/time-type.htm > > > > I can produce PEP and patch if necessary. > > Yes, a PEP, please! Jim Fulton has been asking for this for a long > time too. His main requirement is that timestamp objects are small, > both in memory and as pickles, because Zope keeps a lot of these > around. They are currently represented either as long ints (with a > little under 64 bits) or as 8-byte strings. A dedicated timestamp > object could be smaller than that. 
> > Your idea of a base type (which presumably standarizes at least one > form of representation) sounds like a breakthrough that can help > satisfy different other needs. Sounds like a plan :-) In order to make mxDateTime subtypes of this new type we'd need to make sure that the datetime type uses a true struct subset of what I have in DateTime objects now: typedef struct { PyObject_HEAD /* Representation used to do calculations */ long absdate; /* number of days since 31.12. in the year 1 BC calculated in the Gregorian calendar. */ double abstime; /* seconds since 0:00:00.00 (midnight) on the day pointed to by absdate */ ...lots of broken down values needed to assure roundtrip safety... } Depending on the size of PyObject_HEAD, this should meet Jim Fultons requirements (the base type would of course not implement the "..." part :-). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Fri Feb 8 19:08:43 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 08 Feb 2002 20:08:43 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <029a01c1b0cb$195cb400$ced241d5@hagrid> <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net> <3C64215F.AF31BB96@lemburg.com> Message-ID: <3C64223B.F345E4AE@lemburg.com> "M.-A. Lemburg" wrote: > > In order to make mxDateTime subtypes of this new type we'd need to > make sure that the datetime type uses a true struct subset of what > I have in DateTime objects now: > > typedef struct { > PyObject_HEAD > > /* Representation used to do calculations */ > long absdate; /* number of days since 31.12. in the year 1 BC > calculated in the Gregorian calendar. 
*/ > double abstime; /* seconds since 0:00:00.00 (midnight) > on the day pointed to by absdate */ > > ...lots of broken down values needed to assure roundtrip safety... > > } > > Depending on the size of PyObject_HEAD, this should meet Jim > Fultons requirements (the base type would of course not implement > the "..." part :-). I forgot to mention that there is another object type in mxDateTime too: DateTimeDelta. That's the type needed to represent the time difference between two DateTime instances, or what people usually call "time" :-) It has the following type "signature": typedef struct { PyObject_HEAD double seconds; /* number of delta seconds */ ...some broken down values needed to assure roundtrip safety... } -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Fri Feb 8 19:18:20 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Feb 2002 14:18:20 -0500 Subject: [Python-Dev] Speeding up instance attribute access Message-ID: <200202081918.g18JIKb03242@pcp742651pcs.reston01.va.comcast.net> (By mistake I sent an incomplete version of this earlier. Please ignore that and read this instead.) Inspired by the second half of Jeremy's talk on DevDay, here's my alternative approach for speeding up instance attribute access. Like my idea for globals, it uses double indirection rather than recompilation. - We only care about attributes of 'self' (which is identified as the first argument of a method, not by name). We can exclude functions from our analysis that make any assignment to self -- this is extremely rare and would throw off our analysis. We should also exclude static methods and class methods, since their first argument doesn't have the same role. 
- Static analysis of the source code of a class (without access to the base class) can determine attributes of the class, and to some extent instance variables. Without also analyzing the base classes, this analysis cannot reliably distinguish between instance variables and methods inherited from a base class; it can distinguish between instance variables and methods defined in the current class.

- We can guess the status of un-assigned-to inherited attributes by seeing whether they are called or not. This is not 100% accurate, so we need things to work (if slower) even when we guess wrong.

- For instance variable references and stores of the form self.<name>, the bytecode compiler emits opcodes LOAD_SELF_IVAR <i> and STORE_SELF_IVAR <i>, where <i> is a small int identifying the instance variable (ivar). A particular ivar is identified by the same <i> throughout all methods defined in the same class statement, but there is no attempt to coordinate this across different classes related by inheritance.

- Some data structure describing the mapping from <i> to attribute name, and whether it's an ivar or a method, is produced by the compiler and stored in the class, in a way that a user can't change it. The function objects representing methods also contain a pointer to this data structure. (Or the code objects? But it needs to be shared. Details, details.)

- At *run time*, when a class object is created, another data structure is created that accumulates the <i>-to-name mappings from that class and all its base classes. This has more accurate information because it can collect information from the bases (though it can still be fooled by dynamic manipulations of classes). In particular, it has the canonical set of known instance variables for instances of this class. This is stored in the class object, in a way that a user can't change it.

- When an instance is created, the run-time data structure stored in the class is consulted to know how many instance variables to allocate.
An array is allocated at the end of the instance (or in a separately allocated block?) with one PyObject pointer for each known instance variable. There's also a pointer to the instance __dict__, to hold instance variables that the parser didn't spot, but this pointer starts off as NULL -- a dictionary is created for it only when needed. - There is no requirement that the layout of the ivar array for instances of a subclass is an extension of the layout of the ivar array for its base classes. But there *is* a requirement that all instances of the same class that are not instances of a subclass (i.e., all x and y where x.__class__ is y.__class__) have the same layout, and this layout is determined by the run-time data structure stored in the class. - Now all we need is an efficient way to map LOAD_SELF_IVAR to an index in the array of ivars. Two different classes play a role here: the class of self (the run-time class) and the class that defined the method containing the LOAD_SELF_IVAR opcode (the compile-time class). We assume the run-time class is a subclass of the compile-time class. The correct mapping can easily be calculated from the data structures left behind by the compiler in the compile-time class and by class construction at run time in the run-time class. Since this mapping doesn't change, it can be calculated once and then cached. The cache can be held in the run-time class; it can be a dictionary whose keys are compile-time classes. This means a single dict lookup that must be done once per method call (and only when LOAD_SELF_IVAR or STORE_SELF_IVAR is used). We could save even that dict lookup in most cases by caching the run-time class and the outcome with the method. 
- We need fallbacks for various exceptional cases: * If the compile-time class uses LOAD_SELF_IVAR but the run-time class doesn't think that is an instance variable, LOAD_SELF_IVAR must fall back to look in the instance dict (if non-NULL) and then down the run-time class and its base classes. * If the ivar slot in the instance corresponding to exists but is NULL, LOAD_SELF_IVAR must fall back to searching the run-time class and its base classes. For both cases, the mapping from to attribute names must be available. The language and the current code generation guarantee that 'self' is the first local variable. - The instance __dict__ must become a proxy that knows whether a given name is stored in the array of ivars or in the overflow dict; this is much like Jeremy's DLict. - Note that there are two sources of savings: the major savings (probably) comes from avoiding a dict lookup for every ivar access; an additional minor savings comes from collapsing two opcodes: LOAD_FAST 0 (self) LOAD_ATTR 1 (foo) into one: LOAD_SELF_IVAR 0 (self.foo) - I don't know if we should try to generate LOAD_SELF_IVAR only for things that really are (likely) ivars, or for all attributes. Maybe it would be nice if we also had a single-opcode way to express a method call on self, e.g. CALL_SELF_METHOD , , where identifies the method, and and are the number of positional and keyword arguments. Or maybe we should just have LOAD_SELF_METHOD which looks in the instance overflow dict (but only if non-NULL) and in the class and bases, but can avoid looking in the ivars array if does not describe a known ivar (again this information can be cached). - The required global analysis is a bit hairy, and not something we already do. I believe that Jeremy thinks PyChecker already does this; I'm not sure if that means we can borrow code or just ideas. 
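The double indirection described above can be sketched in plain Python. Everything here is a toy model under stated assumptions: the names `ClassInfo`, `Instance`, `load_self_ivar` and `store_self_ivar` are invented for the illustration, the run-time layout is simply "names from the bases, then names from this class", and the final fallback to the class and its bases is left out.

```python
_NULL = object()  # stands in for a C NULL pointer

class ClassInfo:
    """Per-class bookkeeping (hypothetical; one object per class statement)."""

    def __init__(self, ivar_names, bases=()):
        # Compile-time table: small index -> attribute name, local to this class.
        self.compiled = list(ivar_names)
        # Run-time layout: names accumulated from the bases, then this class.
        names = [n for base in bases for n in base.layout] + self.compiled
        self.layout = list(dict.fromkeys(names))  # drop repeats, keep order
        self.slot_of = {name: slot for slot, name in enumerate(self.layout)}
        # Cache keyed by compile-time class: index -> slot translation tables.
        self.xlate_cache = {}

    def translate(self, compile_class):
        table = self.xlate_cache.get(compile_class)
        if table is None:
            table = [self.slot_of.get(n) for n in compile_class.compiled]
            self.xlate_cache[compile_class] = table
        return table

class Instance:
    def __init__(self, info):
        self.info = info
        self.ivars = [_NULL] * len(info.layout)  # one slot per known ivar
        self.overflow = None                     # dict created only when needed

def store_self_ivar(inst, compile_class, i, value):
    slot = inst.info.translate(compile_class)[i]
    if slot is not None:
        inst.ivars[slot] = value
    else:
        if inst.overflow is None:
            inst.overflow = {}
        inst.overflow[compile_class.compiled[i]] = value

def load_self_ivar(inst, compile_class, i):
    slot = inst.info.translate(compile_class)[i]
    if slot is not None and inst.ivars[slot] is not _NULL:
        return inst.ivars[slot]
    name = compile_class.compiled[i]
    if inst.overflow is not None and name in inst.overflow:
        return inst.overflow[name]
    raise AttributeError(name)  # a real VM would search the class and bases here

base = ClassInfo(["x"])
derived = ClassInfo(["y", "x"], bases=(base,))
obj = Instance(derived)
store_self_ivar(obj, derived, 0, "why")  # self.y in a method defined in derived
store_self_ivar(obj, base, 0, "ex")      # self.x in a method defined in base
print(obj.ivars)                         # prints ['ex', 'why']
```

Index 0 means different attributes in the two classes, yet both stores land in the right slot because the translation table is looked up per compile-time class and then cached on the run-time class.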
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Fri Feb 8 19:51:02 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 14:51:02 -0500
Subject: Half-baked idea (was Re: [Python-Dev] Extending types in C - help needed)
In-Reply-To: Your message of "Wed, 06 Feb 2002 10:24:31 CST." <20020206102426.A20584@unpythonic.dhs.org>
References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com> <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net> <20020206102426.A20584@unpythonic.dhs.org>
Message-ID: <200202081951.g18Jp2g03461@68.49.146.65>

[Idea about extending variable-length structures at the front instead of at the back]

The problem with applying this idea to Python objects, IMO, is that Python requires the object header to be at the start. Anything operating on a PyObject * expects that it can use the Py_INCREF and Py_DECREF macros, and those expect the refcount to be the first field and the type pointer to be the second. So our objects are already constrained at the front.

Also, the GC implementation already uses this trick: it adds three fields in front of the structure. But then it assumes you can use fixed address calculations to translate between the object and the GC header. Adding something in front of the GC header would be too painful.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.one@comcast.net Fri Feb 8 20:16:33 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 08 Feb 2002 15:16:33 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net>
Message-ID:

[/F]
> http://effbot.org/ideas/time-type.htm
>
> I can produce PEP and patch if necessary.

[Guido]
> Yes, a PEP, please! Jim Fulton has been asking for this for a long
> time too. His main requirement is that timestamp objects are small
> both in memory and as pickles, because Zope keeps a lot of these
> around.
They are currently represented either as long ints (with a > little under 64 bits) or as 8-byte strings. A dedicated timestamp > object could be smaller than that. Are you sure Jim is looking to replace the TimeStamp object? All the complaints I've seen aren't about the relatively tiny TimeStamp object, but about Zope's relatively huge DateTime class (note that you won't have source for that if you're looking at a StandaloneZODB checkout -- DateTime is used at higher Zope levels), which is a Python class with a couple dozen(!) instance attributes. See, e.g., http://dev.zope.org/Wikis/DevSite/Proposals/ReplacingDateTime It seems clear from the source code that TimeStamp is exactly what Jim intended it to be . > Your idea of a base type (which presumably standarizes at least one > form of representation) sounds like a breakthrough that can help > satisfy different other needs. Best I can make out, /F is only proposing what Jim would call an Interface: the existence of two methods, timetuple() and utctimetuple(). In a comment on his page, /F calls it an "abstract" base class, which is more C++-ish terminology, and the sample implementation makes clear it's a "pure" abstract base class, so same thing as a Jim Interface in the end. From guido@python.org Fri Feb 8 20:27:16 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Feb 2002 15:27:16 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Fri, 08 Feb 2002 15:16:33 EST." References: Message-ID: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> [Tim] > Are you sure Jim is looking to replace the TimeStamp object? All the > complaints I've seen aren't about the relatively tiny TimeStamp object, but > about Zope's relatively huge DateTime class (note that you won't have source > for that if you're looking at a StandaloneZODB checkout -- DateTime is used > at higher Zope levels), which is a Python class with a couple dozen(!) 
> instance attributes. See, e.g., > > http://dev.zope.org/Wikis/DevSite/Proposals/ReplacingDateTime > > It seems clear from the source code that TimeStamp is exactly what Jim > intended it to be . I'm notoriously bad at channeling Jim. Nevertheless, I do recall him saying he wanted a lightweight time object. I think the mistake of DateTime is that it stores the broken-out info, rather than computing it on request. > > Your idea of a base type (which presumably standarizes at least one > > form of representation) sounds like a breakthrough that can help > > satisfy different other needs. > > Best I can make out, /F is only proposing what Jim would call an Interface: > the existence of two methods, timetuple() and utctimetuple(). In a comment > on his page, /F calls it an "abstract" base class, which is more C++-ish > terminology, and the sample implementation makes clear it's a "pure" > abstract base class, so same thing as a Jim Interface in the end. I'll show the PEP to Jim when it appears. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Feb 8 20:47:28 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 08 Feb 2002 15:47:28 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Guido] > I'm notoriously bad at channeling Jim. Nevertheless, I do recall him > saying he wanted a lightweight time object. Given that most mallocs align to 8-byte boundaries these days (also true of pymalloc), it's impossible in reality to define a smaller object than TimeStamp, provided it needs at least one byte of info beyond PyObject_HEAD. > I think the mistake of DateTime is that it stores the broken-out info, > rather than computing it on request. Possibly, but hard to say, since speed of display is also an issue, and I imagine also speed of range searches. 
At least 2.2 makes it easy to define computed attributes, any of which could choose to cache their ultimate value, but none of which would need to be stored in pickles. From fdrake@acm.org Fri Feb 8 20:50:03 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 8 Feb 2002 15:50:03 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15460.14843.719197.239058@grendel.zope.com> Guido van Rossum writes: > def keys(self): > return [c.objptr for c in self.__dict.keys() if c.objptr is not NULL] I presume you meant values() here rather than keys()? The keys() method could simply delegate to self.__dict. I imagine most of us can fill in any additional dictionary methods, though. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From jeremy@alum.mit.edu Fri Feb 8 01:04:23 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 7 Feb 2002 20:04:23 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15459.9239.83647.334632@gondolin.digicool.com> >>>>> "TP" == Tim Peters writes: TP> [Guido] >> I'm notoriously bad at channeling Jim. Nevertheless, I do recall >> him saying he wanted a lightweight time object. TP> Given that most mallocs align to 8-byte boundaries these days TP> (also true of pymalloc), it's impossible in reality to define a TP> smaller object than TimeStamp, provided it needs at least one TP> byte of info beyond PyObject_HEAD. Also, it may not be necessary to have a TimeStamp object in ZODB 4. There are three uses for the timestamp: tracking how recently an object was used for cache eviction, providing a last modified time to users, and as a simple version number. In ZODB 4, the cache eviction may be done quite differently. 
The version number may be a simple int. The last mod time will not be provided for each object; instead, users will need to define this themselves if they care about it. If they define it themselves, they'd probably use a DateTime object, but we'd care much less about how small it is. Jeremy From niemeyer@conectiva.com Fri Feb 8 20:56:35 2002 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Fri, 8 Feb 2002 18:56:35 -0200 Subject: [Python-Dev] Re: Python optmizations In-Reply-To: <15459.25426.546160.663165@12-248-41-177.client.attbi.com> References: <20020206163126.B4071@ibook.distro.conectiva> <15459.25426.546160.663165@12-248-41-177.client.attbi.com> Message-ID: <20020208185635.B4607@ibook.distro.conectiva> Hi Skip! > Thanks for the note. Funny coincidence the timing of your note like a bolt > out of the blue and my attendance at IPC10 where Jeremy Hylton and I led a > session this afternoon on optimization issues. You have a powerful mind... ;-) [...] > I should probably check my peephole optimizer into the nondist sandbox on > SF. It shouldn't take very much effort. Unlike Rattlesnake, it still > pretty much works. Please, do that. I'd like to have a look at it. [...] > It sounds like your measurements were made on the static bytecode. I Indeed. > suspect you might find the dynamic opcode pair frequencies interesting as > well. That's what most people look at when deciding what really follows > what. (For example, did you consider the basic block structure of the > bytecode or just adjacency in the disassembler output?) You can get this Yes, I have customized the disassembler a little bit. > information by defining the two macros DXPAIRS and DYNAMIC_EXECUTION_PROFILE > when compiling Python. I think all you need to recompile is ceval.c and > sysmodule.c Once you've done this, run your scripts as usual, then before > exit (you might just want to run your subject script with the -i flag), call > sys.getdxp(). 
That will return you a 257x256 list (well, a list containing > 257 other lists, each of which contains 256 elements). The value at > location [i][j] corresponds to the frequency of opcode j being executed > immediately after opcode i (or the other way around - a quick peek at the > code will tell you which is which). I was aware of this because of the code just after the dispatch_opcode label. Again, I'm not original. :-) On the other hand, when running an application you have data about that specific application, and the behavior of that run (next time it may follow other paths). While this is good, because you know what opcodes are being repeated most often, measuring static data may give you a wider view of repeating opcodes. > At one point I had an xmlrpc server running on manatee.mojam.com to which > people could submit such opcode frequency arrays, however nobody submitted > anything to it, so I eventually turned it off. I would be happy to crank it > up again. With the atexit module and xmlrpclib in the core library, it's > dead easy to instrument a program so it automatically dumps the data to my > server upon program exit. Now, *that* is something interesting. If you're really going to put the system up, you can count on my help if you need any. > Gustavo> 15382 LOAD_CONST, LOAD_CONST > > Now, this is interesting. If those constants are numbers and the next > opcode is a BINARY_*, my peephole optimizer can elide that operation and > create a new constant, so something like > > LOAD_CONST 60 > LOAD_CONST 60 > BINARY_MULTIPLY > > would get converted to simply > > LOAD_CONST 3600 Good point!!

>>> def f():
...     return 2+1*5
...
>>> dis.dis(f)
          0 SET_LINENO               1

          3 SET_LINENO               2
          6 LOAD_CONST               1 (2)
          9 LOAD_CONST               2 (1)
         12 LOAD_CONST               3 (5)
         15 BINARY_MULTIPLY
         16 BINARY_ADD
         17 RETURN_VALUE
         18 LOAD_CONST               0 (None)
         21 RETURN_VALUE

That's something we shouldn't leave behind. [...] 
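The folding step discussed above (LOAD_CONST, LOAD_CONST, BINARY_* collapsing into a single LOAD_CONST) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Skip's optimizer: instructions are modelled as hypothetical (opname, argument) tuples with the constant value stored inline, the BINARY_OPS table and fold_constants name are invented here, and jump targets and basic-block boundaries are ignored entirely.

```python
import operator

# Hypothetical table mapping binary opcode names to Python operators.
BINARY_OPS = {
    "BINARY_ADD": operator.add,
    "BINARY_SUBTRACT": operator.sub,
    "BINARY_MULTIPLY": operator.mul,
}


def fold_constants(code):
    """Fold LOAD_CONST, LOAD_CONST, BINARY_* runs into one LOAD_CONST.

    `code` is a list of (opname, arg) tuples; this toy model does not
    track jump targets, so it is only safe on straight-line code.
    """
    out = []
    for op, arg in code:
        foldable = (
            op in BINARY_OPS
            and len(out) >= 2
            and out[-1][0] == "LOAD_CONST"
            and out[-2][0] == "LOAD_CONST"
            and isinstance(out[-1][1], (int, float))
            and isinstance(out[-2][1], (int, float))
        )
        if foldable:
            right = out.pop()[1]
            left = out.pop()[1]
            out.append(("LOAD_CONST", BINARY_OPS[op](left, right)))
        else:
            out.append((op, arg))
    return out


# The body of f() from the session above: return 2+1*5
program = [
    ("LOAD_CONST", 2),
    ("LOAD_CONST", 1),
    ("LOAD_CONST", 5),
    ("BINARY_MULTIPLY", None),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
folded = fold_constants(program)
```

Because the output list doubles as the working stack of already-emitted instructions, a folded constant can take part in the next fold, which is how the multiply and the add both disappear from this example.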
> Gustavo> 12121 LOAD_FAST, LOAD_FAST > > While this pair occurs frequently, they are very cheap instructions. All > you'd be saving is a trip around the opcode dispatch loop. I see.. > Gustavo> Not by casuality, I found in your paper references to a > Gustavo> LOAD_FAST_ATTR opcode. Since you probably have mentioned this > Gustavo> to others, I wouldn't like to bother everyone again asking why > Gustavo> it was not implemented. Could you please explain me the reasons > Gustavo> that left this behind? > > LOAD_ATTR is a *very* expensive opcode (perhaps only second to CALL_FUNCTION > on a per-instruction basis). Jeremy measured minimums of around 500 clock > cycles and means of around 1200 clock cycles for this opcode. In contrast, > it appears that a single trip around the opcode dispatch loop is on the > order of 50 clock cycles, so merging a LOAD_FAST/LOAD_ATTR pair into one > instruction only saves about 50 cycles. What you want to eliminate is the > 500+ cycles from the LOAD_ATTR instruction. Jeremy and I both have ideas > about how to accomplish some of that, but it's not a trivial task. Hummmm... pretty interesting! Thanks for the explanation. > I believe in most cases I got about a 5% speedup with peephole optimization. > That's nothing to sneeze at I suppose, but there were some barriers to > adoption. First and foremost, generating that instruction requires running > my optimizer, which isn't blindingly fast. (Probably fast enough for a > "compileall" step that you execute once at install time, but maybe too slow > to use regularly on-the-fly.) It also predates the compiler Jeremy > implemented in Python. It would probably be fairly easy to hang my > optimizer off the back end of his compiler as an optional pass. It looks I understand. That's something to be implemented in C, once we know the efforts are worthwhile. Maybe 5% is not that much, but optimization is something we do once, and benefit forever. 
A good peephole interface, with plugable passes, will also motivate new optimizations, in the peephole itself and around it. > like Guido would like to see a little work put into regaining some of the > performance that was lost between 1.5.2 and 2.2, so now would probably be a > good time to dust off my optimizer. No doubts. [...] > Mostly just time. Tick tick tick... I don't have much of this thing lately.. :-) But I'll try to use some of it helping wherever possible. Thank you! -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From guido@python.org Fri Feb 8 21:01:16 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Feb 2002 16:01:16 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Your message of "Fri, 08 Feb 2002 15:50:03 EST." <15460.14843.719197.239058@grendel.zope.com> References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> <15460.14843.719197.239058@grendel.zope.com> Message-ID: <200202082101.g18L1GA04598@pcp742651pcs.reston01.va.comcast.net> > Guido van Rossum writes: > > def keys(self): > > return [c.objptr for c in self.__dict.keys() if c.objptr is not NULL] > > I presume you meant values() here rather than keys()? The keys() > method could simply delegate to self.__dict. I imagine most of us can > fill in any additional dictionary methods, though. Oops, I was indeed confused. I think I meant this: def keys(self): return [k for k, c in self.__dict.iteritems() if c.objptr is not NULL] And indeed I expected that you could extrapolate to the other methods. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Feb 8 21:03:33 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Feb 2002 16:03:33 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Thu, 07 Feb 2002 20:04:23 EST." 
<15459.9239.83647.334632@gondolin.digicool.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <15459.9239.83647.334632@gondolin.digicool.com> Message-ID: <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net> > In ZODB 4, the cache eviction may be done quite differently. The > version number may be a simple int. The last mod time will not be > provided for each object; instead, users will need to define this > themselves if they care about it. If they define it themselves, > they'd probably use a DateTime object, but we'd care much less about > how small it is. In that case, I take back everything I've said about Jim Fulton's requirements. I'm quite sure that in the past he said he needed a very lightweight date/time object, but from what you say it appears this need has disappeared. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Fri Feb 8 21:04:28 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 8 Feb 2002 16:04:28 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <200202082101.g18L1GA04598@pcp742651pcs.reston01.va.comcast.net> References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> <15460.14843.719197.239058@grendel.zope.com> <200202082101.g18L1GA04598@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15460.15708.873799.157131@grendel.zope.com> Guido van Rossum writes: > Oops, I was indeed confused. I think I meant this: > > def keys(self): > return [k for k, c in self.__dict.iteritems() if c.objptr is not NULL] Was I not clear, or am I missing something entirely? keys() needs *no* special treatment, but items() and values() do:

class celldict(object):
    ...

    def keys(self):
        return self.__dict.keys()

    def items(self):
        return [(k, c.objptr) for k, c in self.__dict.iteritems()
                if c.objptr is not NULL]

    def values(self):
        return [c.objptr for c in self.__dict.itervalues()
                if c.objptr is not NULL]

-Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From fdrake@acm.org Fri Feb 8 21:07:04 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 8 Feb 2002 16:07:04 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <15459.9239.83647.334632@gondolin.digicool.com> <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15460.15864.266226.241495@grendel.zope.com> Guido van Rossum writes: > In that case, I take back everything I've said about Jim Fulton's > requirements. I'm quite sure that in the past he said he needed a > very lightweight date/time object, but from what you say it appears > this need has disappeared. He wanted this for the catalog, and I suspect he still does. Both size and performance (of comparisons) were important, not rendering time. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From neal@metaslash.com Fri Feb 8 21:10:25 2002 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 08 Feb 2002 16:10:25 -0500 Subject: [Python-Dev] Speeding up instance attribute access References: <200202081918.g18JIKb03242@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C643EC1.A5D6B094@metaslash.com> Guido van Rossum wrote: > - The required global analysis is a bit hairy, and not something we > already do. I believe that Jeremy thinks PyChecker already does > this; I'm not sure if that means we can borrow code or just ideas. The algorithm pychecker uses is pretty simple. I think something like this should work:

for each method:
    self = method.co_varnames[0]
    for each byte code in method:
        if op == STORE_FAST and oparg == self:
            break  # we don't know self anymore
        if (op == STORE_ATTR or op == LOAD_ATTR) and selfOnTop:
            # we have an attribute; store it off
        selfOnTop = (op == LOAD_FAST and oparg == self)

Note that storing the attributes could be done during the compilation step. 
This means that it should be simple housekeeping in the compiler to store the info and not require another pass (as in pychecker). Neal From guido@python.org Fri Feb 8 21:19:54 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Feb 2002 16:19:54 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Your message of "Fri, 08 Feb 2002 16:04:28 EST." <15460.15708.873799.157131@grendel.zope.com> References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> <15460.14843.719197.239058@grendel.zope.com> <200202082101.g18L1GA04598@pcp742651pcs.reston01.va.comcast.net> <15460.15708.873799.157131@grendel.zope.com> Message-ID: <200202082119.g18LJs405546@pcp742651pcs.reston01.va.comcast.net> > Guido van Rossum writes: > > Oops, I was indeed confused. I think I meant this: > > > > def keys(self): > > return [k for k, c in self.__dict.iteritems() if c.objptr is not NULL] > > Was I not clear, or am I missing something entirely? I'm guessing both. ;-) > keys() needs > *no* special treatment, but items() and values() do: > > class celldict(object): > ... > > def keys(self): > return self.__dict.keys() Wrong. keys() *does* need special treatment. If c.objptr is NULL, the cell exists, but keys() should not return the corresponding key. This is so that len(x.keys()) == len(x.values()), amongst other reasons! > def items(self): > return [k, c.objptr for k, c in self.__dict.iteritems() > if c.objptr is not NULL] > > def values(self): > return [c.objptr for c in self.__dict.itervalues() > if c.objptr is not NULL] Yes, these are correct. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Fri Feb 8 21:17:28 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Fri, 8 Feb 2002 16:17:28 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <200202082119.g18LJs405546@pcp742651pcs.reston01.va.comcast.net> References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> <15460.14843.719197.239058@grendel.zope.com> <200202082101.g18L1GA04598@pcp742651pcs.reston01.va.comcast.net> <15460.15708.873799.157131@grendel.zope.com> <200202082119.g18LJs405546@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15460.16488.20690.943775@grendel.zope.com> Guido van Rossum writes: > I'm guessing both. ;-) ... > Wrong. keys() *does* need special treatment. If c.objptr is NULL, > the cell exists, but keys() should not return the corresponding key. > This is so that len(x.keys()) == len(x.values()), amongst other > reasons! Ow! Bad Fred! I should know better than to speak up on a Friday! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido@python.org Fri Feb 8 21:22:24 2002 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Feb 2002 16:22:24 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Fri, 08 Feb 2002 16:07:04 EST." <15460.15864.266226.241495@grendel.zope.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <15459.9239.83647.334632@gondolin.digicool.com> <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net> <15460.15864.266226.241495@grendel.zope.com> Message-ID: <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net> > Guido van Rossum writes: > > In that case, I take back everything I've said about Jim Fulton's > > requirements. I'm quite sure that in the past he said he needed a > > very lightweight date/time object, but from what you say it appears > > this need has disappeared. > > He wanted this for the catalog, and I suspect he still does. Both > size and performance (of comparisons) were important, not rendering > time. 
Is comparison the same as what Tim mentioned as range searches? I guess a representation like current Zope timestamps or what time.time() returns is fine for that -- it is monotonic even if not necessarily continuous. I guess a broken-out time tuple is much harder to compare. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Fri Feb 8 21:31:09 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 8 Feb 2002 16:31:09 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <15459.9239.83647.334632@gondolin.digicool.com> <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net> <15460.15864.266226.241495@grendel.zope.com> <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15460.17309.905113.103005@grendel.zope.com> Guido van Rossum writes: > Is comparison the same as what Tim mentioned as range searches? I guess > a representation like current Zope timestamps or what time.time() > returns is fine for that -- it is monotonic even if not necessarily > continuous. I guess a broken-out time tuple is much harder to compare. Yes; as long as ordering is easy to check, we're fine with a long int or some such thing. The range search is indeed the specific application Jim has in mind. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mal@lemburg.com Fri Feb 8 22:17:31 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Fri, 08 Feb 2002 23:17:31 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <15459.9239.83647.334632@gondolin.digicool.com> <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net> <15460.15864.266226.241495@grendel.zope.com> <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net> <15460.17309.905113.103005@grendel.zope.com> Message-ID: <3C644E7B.F69AF73C@lemburg.com> "Fred L. Drake, Jr." wrote: > > Guido van Rossum writes: > > Is comparison the same as what Tim mentioned as range searches? I guess > > a representation like current Zope timestamps or what time.time() > > returns is fine for that -- it is monotonic even if not necessarily > > continuous. I guess a broken-out time tuple is much harder to compare. > > Yes; as long as ordering is easy to check, we're fine with a long int > or some such thing. The range search is indeed the specific > application Jim has in mind. Uhm... I think this thread is heading in the wrong direction. Fredrik wasn't proposing a solution to Jim's particular problem (whatever it was ;-), but instead opting for a solution for a large number of Python users out there. While mxDateTime probably works for most of them (and is used by pretty much all major database modules out there), some may feel that they don't want to rely on external libs for their software to run. I would be willing to make the mxDateTime types subtypes of whatever Fredrik comes up with. The only requirement I have is that the binary footprint of the types needs to match today's layout of mxDateTime types since I need to maintain binary compatibility. The other possibility would be adding a set of new types to mxDateTime which focus on low memory requirements rather than data roundtrip safety and speed. 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From neal@metaslash.com Fri Feb 8 22:54:51 2002 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 08 Feb 2002 17:54:51 -0500 Subject: [Python-Dev] Python 2.2 group missing on SF Patches Message-ID: <3C64573B.3412540F@metaslash.com> There is no 2.2 (or 2.2.1) choice under Group when submitting a patch on SourceForge. Neal From tim.one@comcast.net Fri Feb 8 23:04:49 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 08 Feb 2002 18:04:49 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> Message-ID: I'm not looking for point-by-point answers here, I'm just pointing out things that were hard to follow so that they may get addressed in a revision. [Guido] > Inspired by talks by Jeremy and Skip on DevDay, here's a different > idea for speeding up access to globals. It retains semantics but (like > Jeremy's proposal) changes the type of a module's __dict__. > > - Let a cell be a really simple PyObject, containing a PyObject > pointer and a cell pointer. Meaning a pointer to a cell, I bet. Note that in the pseudo-code at the end, the cellptr member of cell objects is never referenced, so it's hard to be sure. > Both pointers may be NULL. (It may have to be called PyGlobalCell > since I believe there's already a PyCell object.) There is a PyCellObject already. > (Maybe it doesn't even have to be an object -- it could just be a tiny > struct.) Would probably make it much harder to use the existing dict code (which maps PyObjects to PyObjects). > - Let a celldict be a mapping that is implemented using a dict of > cells. Presumably this is a mapping *to* cells, and from ...? String objects? 
> When you use its getitem method, the PyObject * in the cell is > dereferenced, and if a NULL is found, getitem raises KeyError > even if the cell exists. Had a hard time with this: 1. Letting p be "the PyObject* in the cell", are you saying p==NULL or *p==NULL is the KeyError trigger? "dereference" suggests the latter, but the former seems to make more sense. 2. Presumably the first "the cell" in this sentence refers to a different cell than the second "the cell" intends. > Using setitem to add a new value creates a new cell and stores the > value there; Presumably in the PyObject* member of the new cell. To what is the cellptr member of the new cell set? I think NULL. > using setitem to change the value for an existing key stores the > value in the existing cell for that key. There's a separate API to > access the cells. delitem is missing, but presumably straightforward. > - We change the module implementation to use a celldict for its > __dict__. The module's getattr and setattr operations now map to > getitem and setitem on the celldict. I think the type of > .__dict__ and globals() is the only backwards > incompatibility. > > - When a module is initialized, it gets its __builtins__ from the > __builtin__ module, which is itself a celldict. Surely the __builtin__ module isn't a celldict, but rather has a __dict__ that is a celldict. > For each cell in __builtins__, the new module's __dict__ adds a cell > with a NULL PyObject pointer, whose cell pointer points to the > corresponding cell of __builtins__. > > - The compiler generates LOAD_GLOBAL_CELL i (and STORE_GLOBAL_CELL i > etc.) opcodes for references to globals, where i is a small > index with meaning only within one code object like the const index > in LOAD_CONST. The code object has a new tuple, co_globals, giving > the names of the globals referenced by the code indexed by i. I > think no new analysis is required to be able to do this. Me too. 
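As a reading aid for the cell and celldict description quoted above, here is a minimal sketch in present-day Python. It is an assumption-laden illustration, not the proposal's code: the names Cell, CellDict, getcell and the NULL sentinel are hypothetical stand-ins, the new cell's cellptr defaults to None (the "I think NULL" guess above), and delitem, which the proposal leaves out, is guessed to clear the value while keeping the cell alive.

```python
NULL = object()  # hypothetical sentinel standing in for a C NULL pointer


class Cell:
    """A value slot plus an optional pointer to a fallback cell."""

    def __init__(self, objptr=NULL, cellptr=None):
        self.objptr = objptr    # the value, or NULL if unset
        self.cellptr = cellptr  # e.g. the matching builtins cell, or None


class CellDict:
    """Mapping from names to values, stored indirectly through cells."""

    def __init__(self):
        self._cells = {}

    def getcell(self, key):
        # The "separate API to access the cells": return the cell for
        # key, creating an empty one if it does not exist yet.
        cell = self._cells.get(key)
        if cell is None:
            cell = self._cells[key] = Cell()
        return cell

    def __getitem__(self, key):
        cell = self._cells.get(key)
        # A cell whose value is NULL counts as a missing key.
        if cell is None or cell.objptr is NULL:
            raise KeyError(key)
        return cell.objptr

    def __setitem__(self, key, value):
        # Reuses the existing cell, so holders of the cell see the update.
        self.getcell(key).objptr = value

    def __delitem__(self, key):
        # Assumption: deleting empties the cell but keeps it, since
        # functions may still hold a pointer to it.
        cell = self._cells.get(key)
        if cell is None or cell.objptr is NULL:
            raise KeyError(key)
        cell.objptr = NULL

    def keys(self):
        # Empty cells are hidden from the mapping interface.
        return [k for k, c in self._cells.items() if c.objptr is not NULL]


d = CellDict()
d["x"] = 1
cell_x = d.getcell("x")
d["x"] = 2          # same cell, new value
d.getcell("y")      # creates an empty cell; "y" is still not a key
```

The point of the indirection is that a function can grab cell_x once and keep reading cell_x.objptr afterwards without ever hashing the name again.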
> - When a function object is created from a code object and a celldict, > the function object creates an array of cell pointers by asking the > celldict for cells corresponding to the names in the code object's > co_globals. If the celldict doesn't already have a cell for a > particular name, it creates an empty one. This array of cell > pointers is stored on the function object as func_cells. I expect that the more we use these guys (cells), the more valuable to make them PyObjects in their own right (for uniformity, ease of introspection, etc). > When a function object is created from a regular dict instead of a > celldict, func_cells is a NULL pointer. This part is regrettable, since it's Yet Another NULL check at the *top* of code using this stuff (meaning it slows the normal case, assuming that it's unusual not to get a celldict). I'm not clear on how code ends up getting created from a regular dict instead of a celldict -- is this because of stuff like "exec whatever in mydict"? > - When the VM executes a LOAD_GLOBAL_CELL i instruction, it gets > cell number i from func_cells. It then looks in the cell's > PyObject pointer, and if not NULL, that's the global value. If it > is NULL, it follows the cell's cell pointer to the next cell, if it > is not NULL, and looks in the PyObject pointer in that cell. If > that's also NULL, or if there is no second cell, NameError is > raised. (It could follow the chain of cell pointers until a NULL > cell pointer is found; but I have no use for this.) Similar for > STORE_GLOBAL_CELL i, except it doesn't follow the cell pointer > chain -- it always stores in the first cell. If I'm reading this right, then in the normal case of resolving "len" in

def mylen(s):
    return len(s)

1. We test func_cells for NULL and find out it isn't. 2. A pointer to a cell object is read out of func_cells at a fixed (wrt this function) offset. This points to len's cell object in the module's celldict. 3. 
The cell object's PyObject* pointer is tested and found to be NULL. 4. The cell object's cellptr pointer is tested and found not to be NULL. This points to len's cell object in __builtin__'s celldict. 5. The cell object's cellptr's PyObject* is tested and found not to be NULL. 6. The cell object's cellptr's PyObject* is returned. > - There are fallbacks in the VM for the case where the function's > globals aren't a celldict, and hence func_cells is NULL. In that > case, the code object's co_globals is indexed with i to find the > name of the corresponding global and this name is used to index the > function's globals dict. Which may not succeed, so we also need another level to back off to the builtins. I'd like to pursue getting rid of the func_cells==NULL special case, even if it means constructing a celldict out of a regular dict for the duration, and feeding mutations back in to the regular dict afterwards. > I believe that's it. I think this faithfully implements the current > semantics (where a global can shadow a builtin), but without the need > for any dict lookups when accessing globals, except in cases where an > explicit dict is passed to exec or eval(). I think I agree. Note that a chain of 4 test+branches against NULL in "the usual case" for builtins may not be faster on average than inlining the first few useful lines of lookdict_string twice (the expected path in this routine became fat-free for 2.2):

    i = hash;
    ep = &ep0[i];
    if (ep->me_key == NULL || ep->me_key == key)
        return ep;

Win or lose, that's usually the end of a dict lookup. That is, I'm certain we're paying significantly more for layers of C-level function call overhead today than for what the dict implementation actually does now (in the usual cases). 
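The six-step resolution of "len" traced above, along with the store rule from the quoted proposal, can be sketched in present-day Python as follows. This is a simplified model under assumptions, not VM code: Cell, NULL, load_global_cell and store_global_cell are hypothetical names, func_cells is modelled as a plain list, and the func_cells-is-NULL fallback path is left out.

```python
NULL = object()  # hypothetical sentinel standing in for a C NULL pointer


class Cell:
    def __init__(self, objptr=NULL, cellptr=None):
        self.objptr = objptr    # the value, or NULL if unset
        self.cellptr = cellptr  # fallback (builtins) cell, or None


def load_global_cell(func_cells, i, co_globals):
    """Model of LOAD_GLOBAL_CELL i: at most two cells are inspected."""
    cell = func_cells[i]
    if cell.objptr is not NULL:          # the module global is set
        return cell.objptr
    fallback = cell.cellptr
    if fallback is not None and fallback.objptr is not NULL:
        return fallback.objptr           # the builtin is used instead
    raise NameError(co_globals[i])


def store_global_cell(func_cells, i, value):
    """Model of STORE_GLOBAL_CELL i: always writes to the first cell."""
    func_cells[i].objptr = value


# Cells for "len": the module-level cell is empty and chains to builtins.
builtin_len_cell = Cell(objptr=len)
module_len_cell = Cell(cellptr=builtin_len_cell)
co_globals = ("len", "missing")
func_cells = [module_len_cell, Cell()]

resolved = load_global_cell(func_cells, 0, co_globals)


def shadow(seq):
    return -1


# A module-level assignment shadows the builtin without touching it.
store_global_cell(func_cells, 0, shadow)
shadowed = load_global_cell(func_cells, 0, co_globals)
```

No name is hashed on either path; co_globals is consulted only to build the NameError message, which matches the claim that globals can be reached without a dict lookup in the common case.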
> Compare this to Jeremy's scheme using dlicts: > > http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobals > > - My approach doesn't require global agreement on the numbering of the > globals; each code object has its own numbering. This avoids the > need for more global analysis, Don't really care about that. > and allows adding code to a module using exec that introduces new > globals without having to fall back on a less efficient scheme. That is indeed lovely. > - Jeremy's approach might be a teensy bit faster because it may have > to do less work in the LOAD_GLOBAL; but I'm not convinced. LOAD_GLOBAL is executed much more often than STORE_GLOBAL, so whichever scheme wins for LOAD_GLOBAL will enjoy a multiplier effect when measuring overall performance. [and skipping the code] From tim.one@comcast.net Sat Feb 9 00:31:06 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 08 Feb 2002 19:31:06 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <3C644E7B.F69AF73C@lemburg.com> Message-ID: [M.-A. Lemburg] > Uhm... I think this thread is heading in the wrong direction. Maybe from your POV, but from our POV the only way we can get time to work on Python is talk all of you into doing Zope work for Jim . > Fredrik wasn't proposing a solution to Jim's particular > problem (whatever it was ;-), but instead opting for a solution > of a large number of Python users out there. I believe all /F is asking for is that all datetime types supply two specific methods, so that he can get the year etc out of anybody's datetime object via a uniform spelling. It's a fine idea. > ... > I would be willing to make the mxDateTime types subtypes of > whatever Fredrik comes up with. The only requirement I have is > that the binary footprint of the types needs to match todays > layout of mxDateTime types since I need to maintain binary > compatibility. 
If /F is asking more than that datetime types implement a specific interface, he's got some major rewriting to do . Python doesn't have a good way to spell "interface" now, so think of it as a do-nothing base class, inheriting from which means absolutely nothing except that (a) you promise to supply the methods /F specified, and (b) /F can use isinstance to determine whether or not a given object supports this interface. > The other possibility would be adding a set of new types > to mxDateTime which focus on low memory requirements rather > than data roundtrip safety and speed. That's getting back to what Jim wants. Maybe someone should ask him what that is . From jason@jorendorff.com Sat Feb 9 04:09:59 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Fri, 8 Feb 2002 22:09:59 -0600 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum wrote: > - Let a cell be a really simple PyObject, containing a PyObject > pointer and a cell pointer. Both pointers may be NULL. (It may > have to be called PyGlobalCell since I believe there's already a > PyCell object.) (Maybe it doesn't even have to be an object -- it > could just be a tiny struct.) > > - Let a celldict be a mapping that is implemented using a dict of > cells. When you use its getitem method, the PyObject * in the cell > is dereferenced, and if a NULL is found, getitem raises KeyError > even if the cell exists. Using setitem to add a new value creates a > new cell and stores the value there; using setitem to change the > value for an existing key stores the value in the existing cell for > that key. There's a separate API to access the cells. The following is totally unimportant, but I feel compelled to share: I implemented this once, long ago, for Python 1.5-ish, I believe. I got it to the point where it was only 15% slower than ordinary Python, then abandoned it. 
;) In my implementation, "cells" were real first-class objects, and "celldict" was a copy-and-hack version of dictionary. I forget how the rest worked. Anyway, this is all very exciting to me. :) ## Jason Orendorff http://www.jorendorff.com/ From tim.one@comcast.net Sat Feb 9 04:35:22 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 08 Feb 2002 23:35:22 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Message-ID: [Jason Orendorff] > The following is totally unimportant, but I feel compelled to share: > > I implemented this once, long ago, for Python 1.5-ish, I believe. I got > it to the point where it was only 15% slower than ordinary Python, then > abandoned it. ;) In my implementation, "cells" were real first-class > objects, That shouldn't matter to speed via any first-order effect, unless you also used accessor functions instead of direct reference to get at the data members. > and "celldict" was a copy-and-hack version of dictionary. Hmm. > I forget how the rest worked. > > Anyway, this is all very exciting to me. :) Don't worry -- it will run much faster if Guido codes it. One key difference is that Guido will run each cell in its own thread . From tim.one@comcast.net Sat Feb 9 05:02:28 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 09 Feb 2002 00:02:28 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <15459.9239.83647.334632@gondolin.digicool.com> Message-ID: [Jeremy Hylton] > Also, it may not be necessary to have a TimeStamp object in ZODB 4. > There are three uses for the timestamp: tracking how recently an > object was used for cache evication, providing a last modified time to > users, and as a simple version number. > > In ZODB 4, the cache eviction may be done quite differently. The > version number may be a simple int. WRT RAM usage, a Python int is no smaller than a TimeStamp object. An int pickle is likely much smaller, though. 
> The last mod time will not be provided for each object; instead, users > will need to define this themselves if they care about it. If they > define it themselves, they'd probably use a DateTime object, but we'd > care much less about how small it is. Unclear that we'd care less, if the catalog remains full of DateTime objects, and Fred is channeling more faithfully than the rest of us . From tim.one@comcast.net Sat Feb 9 05:30:34 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 09 Feb 2002 00:30:34 -0500 Subject: [Python-Dev] PYC Magic In-Reply-To: <3C62697B.3555501@lemburg.com> Message-ID: [M.-A. Lemburg] > FYI, I've bumped the PYC magic in a non-standard way (the old > standard broke on 2002-01-01); please review: Fine by me, except you should also check in a NEWS blurb about it. The current NEWS file says: """ - Because Python's magic number scheme broke on January 1st, we decided to stop Python development. Thanks for all the fish! """ That's why PythonLabs hasn't done much of anything on Python since 2.2 was released . > algorithm relying on the above scheme. Perhaps we should simply > start counting in increments of 10 from now on ?! Why 10? I'd rather see it incremented by 1. If you respond that you want to make room for more hacks akin to -U, my response would be that's exactly what I want to prevent by blessing 1 <0.4 wink>. From srichter@cbu.edu Sat Feb 9 07:22:37 2002 From: srichter@cbu.edu (Stephan Richter) Date: Sat, 09 Feb 2002 01:22:37 -0600 Subject: [Python-Dev] proposal: add basic time type to the standard library Message-ID: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu> --=====================_112555085==_.ALT Content-Type: text/plain; charset="us-ascii"; format=flowed Hello everyone, what a coincidence. I just was discussing this issue with Jason O. today. Here is my original mail: Hey Jason, I also want to start to think about a DateTime module. 
PostGres has a nice discussion of their implementation:
http://www.ca.postgresql.org/users-lounge/docs/7.1/user/datatype-datetime.html

Here is Java's stuff on it:
http://www3.cbu.edu/sciences/jdk1.3/docs/api/java/sql/Date.html

Low Level Data Types:

Date
Time
DateTime
TimeStamp - Timestamps are always in UTC.

* Intervals can be added or subtracted from themselves and the types above.
DateInterval
TimeInterval
DateTimeInterval
TimeStampInterval

Notes:

- The basic data type must be as small as possible, so that applications that save these types often (e.g. ZODB/Persistent) do not increase the amount of stored data by much.

- We should then have high-level classes that put in all the functionality, such as lower-level in/output. I think high-level in/output should be handled by functions inside the module, such as getDateTimeFromString(str, someI18NspecificInfo=None).

- We need flexible i18n support!!! This is very important, especially for Zope. By default the system should come with a gettext implementation, but I would like to have the module generic enough that we can define other types of translation and localization mechanisms. Mh, the more I think of it, the more I think we will end up building our own stuff and then exposing that via an API.

- The parsing of Date, Time and DateTimes as well as their Intervals (PostGreSQL has some very nice ways for that) should be tremendously flexible. I am thinking here about a plugin-type architecture, where you can create your own plugins for parsing. For example, while the "." notation was reserved for the European Date Formats until now, more and more American companies (which are totally ignorant that there might be another country besides the US in the world) use this notation to write the American Date Format too. Therefore we need to have a mechanism to switch between the two.
I thought of some sort of a list of regular expressions which try to resolve a string. Oh yeah, we need internationalization here as well of course, even though the parser should be generic enough.

- The tough part will be time zones. I am almost thinking that we need our own object for handling that. Timezones are horribly complex, but we need to handle them well. I know Zope's current DateTime implementation has a good handle on that, even though I think the code is horrible (sorry Jim).

- A professor just mentioned that we should also handle daylight saving. This is not even that trivial, but I agree with him; there needs to be support for that, even though most apps handle that via the time zone, which is ok for the numeric version, but not if you say "CST" for example.

PS: Jim, I cc'ed you so that you might be able to comment on some of the points I made. FYI, Jason and I are thinking about implementing a DateTime module for Python in general, which is small and sweet. We are shooting for our calendar system only.

Regards,
Stephan

--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management
--=====================_112555085==_.ALT
Content-Type: text/html; charset="us-ascii"

Hello everyone,

Web2k - Web Design/Development & Technical Project Management
--=====================_112555085==_.ALT--

From fredrik@pythonware.com Sat Feb 9 11:21:00 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 9 Feb 2002 12:21:00 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <029a01c1b0cb$195cb400$ced241d5@hagrid> <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net> <3C64215F.AF31BB96@lemburg.com>
Message-ID: <00ac01c1b15b$e26b7d00$ced241d5@hagrid>

mal wrote:

> In order to make mxDateTime subtypes of this new type we'd need to
> make sure that the datetime type uses a true struct subset of what
> I have in DateTime objects now:
>
> typedef struct {
>     PyObject_HEAD
>
>     /* Representation used to do calculations */
>     long absdate;     /* number of days since 31.12. in the year 1 BC
>                          calculated in the Gregorian calendar. */
>     double abstime;   /* seconds since 0:00:00.00 (midnight)
>                          on the day pointed to by absdate */
>
>     ...lots of broken down values needed to assure roundtrip safety...
>
> }

as Tim has pointed out, what I have in mind is:

    typedef struct {
        PyObject_HEAD
        /* nothing here: subtypes should implement timetuple
           and, if possible, utctimetuple */
    } basetimeObject;

    /* maybe: */

    PyObject*
    basetime_timetuple(PyObject* self, PyObject* args)
    {
        PyErr_SetString(PyExc_NotImplementedError, "must override");
        return NULL;
    }

(to adapt mxDateTime, all you should have to do is to inherit from baseObject, and add an alias for your "tuple" method)

:::

since it's really easy to do, we should probably also add a simpletime type to the standard library, which wraps the standard time_t:

    typedef struct {
        PyObject_HEAD
        time_t time;
        /* maybe: int timezone; */
    } simpletimeObject;

:::

What I'm looking for is "decoupling", and making it easier for people to experiment with different implementations.
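
[Editorial sketch] In Python terms, the do-nothing base type plus a time_t wrapper might look roughly like the following. `basetime`, `simpletime`, `timetuple` and `utctimetuple` are the names used in this thread; `year_of` is a hypothetical consumer added purely for illustration, and the conversions lean on the standard `time` module. It is a sketch of the idea, not the proposed C implementation.

```python
import time


class basetime:
    """Do-nothing base type: subclasses promise to provide timetuple()."""

    def timetuple(self):
        raise NotImplementedError("must override")

    def utctimetuple(self):
        # Optional in the proposal: implement only if UTC is available.
        raise NotImplementedError("must override")


class simpletime(basetime):
    """Wraps seconds since the epoch, like the proposed time_t wrapper."""

    def __init__(self, seconds):
        self.time = seconds

    def timetuple(self):
        return time.localtime(self.time)

    def utctimetuple(self):
        return time.gmtime(self.time)


def year_of(t):
    """Hypothetical consumer, e.g. a logger or database adapter."""
    if not isinstance(t, basetime):
        raise TypeError("expected a basetime instance")
    return t.utctimetuple()[0]


epoch = simpletime(0)
epoch_year = year_of(epoch)
```

The consumer only relies on `isinstance` and the tuple protocol, which is the decoupling being asked for: any other implementation that inherits from `basetime` would work with `year_of` unchanged.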
Things like xmlrpclib, the logging system, database adapters, etc can look for basetime instances, and use the standard protocol to extract time information from any time object implementation. (I can imagine similar "abstract" basetypes for money/decimal data -- a basetype plus standardized behaviour for __int__, __float__, __str__ -- and possibly some other data types: baseimage, base- sound, basedomnode, ...) Hopefully, such base types can be converted to "interfaces" when- ever we get that. But I don't want to wait for a datetime working group to solve everything that MAL has already solved in mxDate- Time, and then everything he hasn't addressed. Nor do I want to wait for an interface working group to sort that thing out. Let's do something really simple instead. From mal@lemburg.com Sat Feb 9 11:31:38 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 09 Feb 2002 12:31:38 +0100 Subject: [Python-Dev] PYC Magic References: Message-ID: <3C65089A.F2607E59@lemburg.com> Tim Peters wrote: > > [M.-A. Lemburg] > > FYI, I've bumped the PYC magic in a non-standard way (the old > > standard broke on 2002-01-01); please review: > > Fine by me, except you should also check in a NEWS blurb about it. The > current NEWS file says: > > """ > - Because Python's magic number scheme broke on January 1st, we decided > to stop Python development. Thanks for all the fish! > """ > > That's why PythonLabs hasn't done much of anything on Python since 2.2 was > released . Done. > > algorithm relying on the above scheme. Perhaps we should simply > > start counting in increments of 10 from now on ?! > > Why 10? I'd rather see it incremented by 1. If you respond that you want > to make room for more hacks akin to -U, my response would be that's exactly > what I want to prevent by blessing 1 <0.4 wink>. The reason is that I don't want to break the -U scheme. 
I know it's a hack, but until someone comes up with a better way to add flags to store PYC compile options, we'll have to stick with it (-U changes the semantics of the language in a pretty nasty way ... nothing works anymore ;-). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Sat Feb 9 11:40:27 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 09 Feb 2002 12:40:27 +0100 Subject: [Python-Dev] proposal: add basic time type to the standardlibrary References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu> Message-ID: <3C650AAB.878CC336@lemburg.com> [You should never post HTML email to mailing lists...] Stephan Richter wrote: > > Hello everyone, > > what a coincidence. I just was discussing this issue with Jason O. > today. Here is my original mail: > > Hey Jason, > > I also want to start to think about a DateTime module. PostGres has a > nice discussion of their impementation: > http://www.ca.postgresql.org/users-lounge/docs/7.1/user/datatype-datetime.html > > Here is Java's stuff on it: > http://www3.cbu.edu/sciences/jdk1.3/docs/api/java/sql/Date.html > > Low Level Data Types: > > Date > Time > DateTime > TimeStamp - Timestamps are always in UTC. See below... you don't need that many types. > * Intervals can be added or subtracted from themselves and the types > above. > DateInterval > TimeInterval > DateTimeInterval > TimeStampInterval Intervals are a bad idea. You really only need two types: one referencing fixed points in time and another one for storing the delta between two such fixed points. Everything else can be modeled on top of those two. Please have a look at mxDateTime. It has these two types and much of what you described in your notes. BTW, you wouldn't believe how complicated dealing with date and time really is... 
ah, yes, and don't even think of ever getting DST to work properly :-/ -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Sat Feb 9 11:57:05 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 09 Feb 2002 12:57:05 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <029a01c1b0cb$195cb400$ced241d5@hagrid> <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net> <3C64215F.AF31BB96@lemburg.com> <00ac01c1b15b$e26b7d00$ced241d5@hagrid> Message-ID: <3C650E91.529D9D29@lemburg.com> Fredrik Lundh wrote: > > mal wrote: > > > In order to make mxDateTime subtypes of this new type we'd need to > > make sure that the datetime type uses a true struct subset of what > > I have in DateTime objects now: > > > > typedef struct { > > PyObject_HEAD > > > > /* Representation used to do calculations */ > > long absdate; /* number of days since 31.12. in the year 1 BC > > calculated in the Gregorian calendar. */ > > double abstime; /* seconds since 0:00:00.00 (midnight) > > on the day pointed to by absdate */ > > > > ...lots of broken down values needed to assure roundtrip safety... > > > > } > > as Tim has pointed out, what I have in mind is: > > typedef struct { > PyObject_HEAD > /* nothing here: subtypes should implement timetuple > and, if possible, utctimetuple */ > } basetimeObject; > > /* maybe: */ > > PyObject* > basetime_timetuple(PyObject* self, PyObject* args) > { > PyErr_SetString(PyExc_NotImplementedError, "must override"); > return NULL; > } > > (to adapt mxDateTime, all you should have to do is to inherit from > baseObject, and add an alias for your "tuple" method) Ok. Sounds like you are inventing something like a set of abstract types here. 
I'm very much +1 on that idea, provided the interfaces we define for these types are simple enough (I think the DB SIG has shown that simple interface can go a looong way). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From srichter@cbu.edu Sat Feb 9 12:20:13 2002 From: srichter@cbu.edu (Stephan Richter) Date: Sat, 09 Feb 2002 06:20:13 -0600 Subject: [Python-Dev] proposal: add basic time type to the standardlibrary In-Reply-To: <3C650AAB.878CC336@lemburg.com> References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu> Message-ID: <5.1.0.14.2.20020209061810.02ce9dd0@mercury-1.cbu.edu> At 12:40 PM 2/9/2002 +0100, M.-A. Lemburg wrote: >[You should never post HTML email to mailing lists...] I know. I noticed it only after I had seen the archive entry. Did you guys could still read it? If not, I will resend it. Sorry! Regards, Stephan -- Stephan Richter CBU - Physics and Chemistry Student Web2k - Web Design/Development & Technical Project Management From guido@python.org Sat Feb 9 13:55:07 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 09 Feb 2002 08:55:07 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Sat, 09 Feb 2002 00:02:28 EST." References: Message-ID: <200202091355.g19Dt7H07636@pcp742651pcs.reston01.va.comcast.net> > WRT RAM usage, a Python int is no smaller than a TimeStamp object. Wrong, unless TimeStamps also use a custom allocator. The custom allocator uses 12 bytes per int (on a 32-bit machine) and incurs malloc overhead + 8 bytes of additional overhead for every 82 ints. That's about 12.2 bytes per int object; using malloc it would probably be 24 bytes. (PyMalloc would probably do a little better, except it would still round up to 16 bytes.) 
If TimeStamp objects were to use a similar allocation scheme, they could be pushed down to 16.2 bytes. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Feb 9 14:23:01 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 09 Feb 2002 09:23:01 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Your message of "Fri, 08 Feb 2002 18:04:49 EST." References: Message-ID: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net> > I'm not looking for point-by-point answers here, I'm just pointing out > things that were hard to follow so that they may get addressed in a > revision. Do you think it's PEP time yet? > > When you use its getitem method, the PyObject * in the cell is > > dereferenced, and if a NULL is found, getitem raises KeyError > > even if the cell exists. > > Had a hard time with this: > [...] > > 2. Presumably the first "the cell" in this sentence refers to a > different cell than the second "the cell" intends. No, they are the same. See __getitem__ pseudo code. > delitem is missing, but presumably straightforward. I left it out intentionally because it adds nothing new. Maybe that was wrong -- it's important that deleting a global stores NULL in its cell.objptr but does not delete the cell from the celldict. > > When a function object is created from a regular dict instead of a > > celldict, func_cells is a NULL pointer. > > This part is regrettable, since it's Yet Another NULL check at the > *top* of code using this stuff (meaning it slows the normal case, > assuming that it's unusual not to get a celldict). I'm not clear on > how code ends up getting created from a regular dict instead of a > celldict -- is this because of stuff like "exec whatever in mydict"? Yes, I don't want to break such code because that's been the politically correct way for ages. We do have to deprecate it to encourage people to use celldicts here. 
To avoid the NULL check at the top, we could stuff func_cells with empty cells and do the special-case check at the end (just before we would raise NameError). Then there still needs to be a check for STORE and DELETE, because we don't want to store into the dummy cells. Sound like a hack to assess separately later. (Another hack probably not worth it right now is to make the module's cell.cellptr point to itself if it's not shadowing a builtin cell -- then the first NULL check for cell.cellptr can be avoided in the case of finding a builtin name successful.) > > - There are fallbacks in the VM for the case where the function's > > globals aren't a celldict, and hence func_cells is NULL. In that > > case, the code object's co_globals is indexed with to find the > > name of the corresponding global and this name is used to index the > > function's globals dict. > > Which may not succeed, so we also need another level to back off to > the builtins. I'd like to pursue getting rid of the > func_cells==NULL special case, even if it means constructing a > celldict out of a regular dict for the duration, and feeding > mutations back in to the regular dict afterwards. The problem is that *during* the execution accessing the dict doesn't give the right results. I don't care about this case being fast (after all it's exec and if people want it faster they can switch to using a celldict). I do care about not changing corners of the semantics. > Note that a chain of 4 test+branches against NULL in "the usual case" for > builtins may not be faster on average than inlining the first few useful > lines of lookdict_string twice (the expected path in this routine became > fat-free for 2.2): > > i = hash; > ep = &ep0[i]; > if (ep->me_key == NULL || ep->me_key == key) > return ep; > > Win or lose, that's usually the end of a dict lookup. 
That is, I'm certain > we're paying significantly more for layers of C-level function call overhead > today than for what the dict implementation actually does now (in the usual > cases). This should be tried!!! > > Compare this to Jeremy's scheme using dlicts: > > > > http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobals > > > > - My approach doesn't require global agreement on the numbering of the > > globals; each code object has its own numbering. This avoids the > > need for more global analysis, > > Don't really care about that. I do. The C code in compiler.c is already at a level of complexity that nobody understands it in its entirety! (I don't understand what Jeremy added, and Jeremy has to ask me about the original code. :-( ) Switching to the compiler.py package is unrealistic for 2.3; there's a bootstrap problem, plus it's much slower. I know that we cache the bytecode, but there are enough situations where we can't and the slowdown would kill us (imagine starting Zope for the first time from a fresh CVS checkout). > > and allows adding code to a module using exec that introduces new > > globals without having to fall back on a less efficient scheme. > > That is indeed lovely. Forgot a there? It seems a pretty minor advantage to me. I would like to be able to compare the two schemes more before committing to any implementation. Unfortunately there's no description of Jeremy's scheme that we can compare easily (though I'm glad to see he put up his slides on the web: http://www.python.org/~jeremy/talks/spam10/PEP-267-1.html). I guess there's so much handwaving in Jeremy's proposal about how to deal with exceptional cases that I'm uncomfortable with it. But that could be fixed. --Guido van Rossum (home page: http://www.python.org/~guido/) From Jeff Graysmith Sat Feb 9 18:17:08 2002 From: Jeff Graysmith (Jeff Graysmith) Date: 09 Feb 2002 10:17:08 -0800 Subject: [Python-Dev] You know your email is vulnerable to SPAM Robots? 
Message-ID: ------=_QO8SlNmY_Hqe7FiRQ_MA Content-Type: text/plain Content-Transfer-Encoding: 8bit Hello, Please pardon the intrusion, but I saw that the email address python- dev@python.org is in plain text on your site at http://python.sourceforge.net/peps/pep-0226.html making it vulnerable to be harvested by SPAM robots. Check this out there's a way to hide your email from robots, but still have it visible to human users. http://www.email-cloak.net Sincerely, Jeff Graysmith ------=_QO8SlNmY_Hqe7FiRQ_MA Content-Type: text/html Content-Transfer-Encoding: 8bit Hello,

------=_QO8SlNmY_Hqe7FiRQ_MA--

From aahz@rahul.net Sat Feb 9 20:16:42 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Sat, 9 Feb 2002 12:16:42 -0800 (PST)
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <029a01c1b0cb$195cb400$ced241d5@hagrid> from "Fredrik Lundh" at Feb 08, 2002 07:04:45 PM
Message-ID: <20020209201642.0FECEE8C4@waltz.rahul.net>

Fredrik Lundh wrote:
>
> Or to put it another way, I want the following to work for any time object,
> including mxDateTime objects, any date/timestamp returned by a DB-API
> driver, and weird date/time-like types I've developed myself:
>
>     if isinstance(t, basetime):
>         # yay! it's a timestamp
>         print t.timetuple()

Looks good! I'd prefer None to -1, though, for the last three items of the tuple. Also, the raise on utctime() should be NotImplementedError, maybe?
--
--- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.

From tim.one@comcast.net Sat Feb 9 20:48:21 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 09 Feb 2002 15:48:21 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <200202091355.g19Dt7H07636@pcp742651pcs.reston01.va.comcast.net>
Message-ID:

[Tim]
> WRT RAM usage, a Python int is no smaller than a TimeStamp object.

[Guido]
> Wrong, unless TimeStamps also use a custom allocator.

Good point, and it doesn't (it uses PyObject_NEW). I don't think counting fractions of bytes is of great interest here, though, since I (still) believe it's the massive Zope DateTime type that's the focus of complaints.

> The custom allocator uses 12 bytes per int (on a 32-bit machine) and
> incurs malloc overhead + 8 bytes of additional overhead for every 82 ints.
> That's about 12.2 bytes per int object; using malloc it would probably
> be 24 bytes.
(PyMalloc would probably do a little better, except it > would still round up to 16 bytes.) pymalloc overhead is a few percent; would work out to 16+f bytes per int object, for some f < 1.0. A difference is that "total memory dedicated to ints" never shrinks using the custom allocator, but can get reused for other objects under pymalloc. From mal@lemburg.com Sat Feb 9 21:55:02 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 09 Feb 2002 22:55:02 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <20020209201642.0FECEE8C4@waltz.rahul.net> Message-ID: <3C659AB6.73DB91A2@lemburg.com> Aahz Maruch wrote: > > Fredrik Lundh wrote: > > > > Or to put it another way, I want the following to work for any time object, > > including mxDateTime objects, any date/timestamp returned by a DB-API > > driver, and weird date/time-like types I've developed myself: > > > > if isinstance(t, basetime): > > # yay! it's a timestamp > > print t.timetuple() > > Looks good! I'd prefer None to -1, though, for the last three items of > the tuple. None would be better from an interface design POV, but for historic reasons (compatibility to localtime()) -1 is better. > Also, the raise on utctime() should be NotImplementedError, > maybe? In the DB API we let the implementors decide: if the functionality cannot be provided per design, then it should not be implemented; if it can be implemented, but only works under certain conditions, a DB API NotSupportedError is raised instead. For mxDateTime I would implement both methods since mxDateTime does not store a timezone with the value but instead defines methods (and other operations) based on assumptions about the value. Time zones are on the plate, though, and the parser already knows about them. 
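A minimal sketch of the base-type idea under discussion, in present-day Python rather than the 2.2-era code of the thread. The name basetime and the methods timetuple() and utctime() come from the messages above; the simple_date class, the describe() helper, and the choice to raise NotImplementedError are assumptions made only for illustration.

```python
class basetime:
    """Abstract marker type: anything that can report itself as a time tuple."""

    def timetuple(self):
        raise NotImplementedError

    def utctime(self):
        # One of the options floated in the thread for types that
        # cannot provide UTC; not a settled decision.
        raise NotImplementedError


class simple_date(basetime):
    """Hypothetical date-only type that knows nothing about time zones."""

    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day

    def timetuple(self):
        # Last three items (weekday, day of year, DST flag) are unknown here,
        # so follow the localtime()-style -1 convention mentioned above.
        return (self.year, self.month, self.day, 0, 0, 0, -1, -1, -1)


def describe(t):
    """Accept any object derived from basetime, whoever implemented it."""
    if isinstance(t, basetime):
        return t.timetuple()
    raise TypeError("not a time object: %r" % (t,))
```

The point of the sketch is only that client code tests against one shared base type instead of knowing every concrete date/time implementation.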
The C lib only provides APIs for local time and UTC; if you ever tried to convert a non-local time value into UTC, you'll know that this is not easy at all (mostly because of the troubles caused by DST and sometimes also due to leap seconds getting in the way). About the proposed interface: I'd rename the type to datetimebase and the methods to .tuple() and .gmtuple(). y,m,d = datetime.tuple()[:3] h,m,s = datetime.utctuple()[3:6] IMHO, it looks better :-) One thing I'm missing is a definition for a constructor (type objects are callable, so it'll have to do something, I guess...) and there should also be a datetimedeltabase type (this one is needed for dealing with the difference between two datetime values). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@comcast.net Sat Feb 9 22:48:29 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 09 Feb 2002 17:48:29 -0500 Subject: [Python-Dev] PYC Magic In-Reply-To: <3C65089A.F2607E59@lemburg.com> Message-ID: [M.-A. Lemburg] > The reason is that I don't want to break the -U scheme. But -U doesn't work anyway: > (-U changes the semantics of the language in a pretty nasty > way ... nothing works anymore ;-). The only things -U have bought us are a bizarre definition of "magic number", code complication, and complaints from people who see -U in the "python -h" blurb and want to know why everything breaks when they try it. It may be a hack you want to use for internal testing, but stuff that's been broken since the day it was introduced, and makes no progress towards working, doesn't belong in the general release. > I know it's a hack, but until someone comes up with a better way to > add flags to store PYC compile options, we'll have to stick with > it. 
But there is no need to store info about PYC compile options: -U is its only use now, and -U has never worked. Since it's worse than useless, better to throw it out, then dream up a rational way to store PYC compile options if and when (and only if and when) there's an actual need for such. What would we lose if we tossed the -U support code? I can see what we'd gain. From tim.one@comcast.net Sat Feb 9 23:57:26 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 09 Feb 2002 18:57:26 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Guido] > Do you think it's PEP time yet? If the ideas aren't written down in an editable form while they're fresh on at least one person's mind, I'm afraid they'll just get lost. If it's a PEP, at least there will be a nagging reminder that someone once had a promising idea . >> 2. Presumably the first "the cell" in this sentence refers to a >> different cell than the second "the cell" intends. > No, they are the same. See __getitem__ pseudo code. I persist in my delusion. Original text: When you use its getitem method, the PyObject * in the cell is dereferenced, and if a NULL is found, getitem raises KeyError even if the cell exists. Since we're doing something with "the PyObject* in the cell", surely "the cell" *must* exist. So what is the "even if the cell exists" trying to say? I believe it means to say even if the cell's cellptr is not NULL and "the cell's cellptr is not NULL" is quite different from "the cell exists". > ... > To avoid the NULL check at the top, we could stuff func_cells with > empty cells and do the special-case check at the end (just before we > would raise NameError). That would be better -- getting cycles out of the most-frequent paths is my only goal here. > Then there still needs to be a check for STORE and DELETE, because we > don't want to store into the dummy cells. 
Sound like a hack to assess > separately later. Another idea: a celldict could contain a "real dict" pointer, normally NULL, and pointing to a plain dict when a real dict is given. The celldict constructor would populate the cells from the realdict's contents when not NULL. Then getitem wouldn't have to do anything special (realdict==NULL and realdict!=NULL would be the same to it). setitem and delitem would propagate mutations immediately into the realdict too when non-NULL. Since mutations are almost certainly much rarer than accesses, this makes the rarer operations pay. The eval loop would always see a celldict. > (Another hack probably not worth it right now is to make the module's > cell.cellptr point to itself if it's not shadowing a builtin cell -- > then the first NULL check for cell.cellptr can be avoided in the case > of finding a builtin name successful.) I don't think I followed this. If, e.g., a module's "len" cell is normally {NULL, pointer to __builtin__'s "len" cell} under the original scheme, how would that change? {NULL, pointer to this very cell} wouldn't make sense. {builtin len, pointer to this very cell} would make sense, but then the pointer to self is useless -- except as a hint that we copied the value up from the builtins? But then a change to __builtin__.len wouldn't be visible to the module. > ... > The problem is that *during* the execution accessing the dict doesn't > give the right results. I don't care about this case being fast > (after all it's exec and if people want it faster they can switch to > using a celldict). I do care about not changing corners of the > semantics. I expect that a write-through realdict (see above) attached to a celldict in such cases would address this, keeping the referencing code uniform and fast, and moving the special-casing into the celldict implementation for mutating operations. 
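As a rough illustration of the write-through idea in the preceding paragraphs, here is a small Python model. The real thing would be C inside the interpreter; the class names Cell and CellDict, the NULL sentinel, and the attribute names are assumptions for this sketch only, and the builtin-shadowing cellptr is left out.

```python
NULL = object()  # stand-in for a C NULL pointer


class Cell:
    """One slot holding a value, or NULL when the name is unbound."""

    def __init__(self, obj=NULL):
        self.obj = obj


class CellDict:
    """Dict-like object whose values live in cells.

    If a plain dict is supplied, the cells are populated from it once and
    every later mutation is propagated into it immediately.
    """

    def __init__(self, realdict=None):
        self.realdict = realdict
        self.cells = {}
        if realdict is not None:
            for key, value in realdict.items():
                self.cells[key] = Cell(value)

    def __getitem__(self, key):
        # Lookup never consults realdict: both cases take the same path.
        cell = self.cells.get(key)
        if cell is None or cell.obj is NULL:
            raise KeyError(key)
        return cell.obj

    def __setitem__(self, key, value):
        cell = self.cells.get(key)
        if cell is None:
            cell = self.cells[key] = Cell()
        cell.obj = value
        if self.realdict is not None:
            self.realdict[key] = value  # write-through

    def __delitem__(self, key):
        cell = self.cells.get(key)
        if cell is None or cell.obj is NULL:
            raise KeyError(key)
        cell.obj = NULL  # keep the cell itself; code may still point at it
        if self.realdict is not None:
            self.realdict.pop(key, None)  # write-through
```

In this model the extra work sits entirely in the mutating methods. Note that propagation runs in one direction only: a change made directly to the plain dict after construction is not seen by the cells.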
>> i = hash; >> ep = &ep0[i]; >> if (ep->me_key == NULL || ep->me_key == key) >> return ep; >> >> Win or lose, that's usually the end of a dict lookup. That is, >> I'm certain we're paying significantly more for layers of C-level >> function call overhead today than for what the dict implementation >> actually does now in the usual cases). > This should be tried!!! It's less promising after more thought. The chief snag is that "usually the end" relies on that we're usually looking for things that are there. But when looking for a builtin, it's usually not in the module's dict, where we look first. In that case, about half the time we'll find an occupied irrelevant slot in the module's dict, and then we need the rest of lookdict_string to do a (usually brief, but there's no getting away from the loop because we can't know how brief in advance) futile chase down the collision chain. >>> This avoids the need for more global analysis, >> Don't really care about that. > I do. The C code in compiler.c is already at a level of complexity > that nobody understands it in its entirety! (I don't understand what > Jeremy added, and Jeremy has to ask me about the original code. :-( ) I don't care because I care about something else : it would add to the pressure to refactor this code mercilessly, and that would be a Good Thing over the long term. The current complexity isn't inherent, it's an artifact of outgrowing the original concrete-syntax-tree direct-to bytecode one-pass design. Now we've got multiple passes crawling over a now- inappropriate program representation, glued together more by "reliable accidents" than sensible design. That's all curable, and the pressures *to* cure it will continue to multiply over time (e.g., it would take a certain insanity to even think about folding pychecker-like checks into the current architecture). > Switching to the compiler.py package is unrealistic for 2.3; there's a > bootstrap problem, plus it's much slower.
I know that we cache the > bytecode, but there are enough situations where we can't and the > slowdown would kill us (imagine starting Zope for the first time from > a fresh CVS checkout). I'm a fan of fast compilation. Heck, I was upset in 1982 when Cray's compiler dropped below 100,000 lines/minute for the first time . >>> and allows adding code to a module using exec that introduces new >>> globals without having to fall back on a less efficient scheme. >> That is indeed lovely. > Forgot a there? It seems a pretty minor advantage to me. No, it's lovely, not major. It's simply a good sign when the worst semantic nightmares "just work". It's also a lovely sign. Most flowers aren't terribly important either, but they are lovely. > I would like to be able to compare the two schemes more before > committing to any implementation. Unfortunately there's no > description of Jeremy's scheme that we can compare easily (though I'm > glad to see he put up his slides on the web: > http://www.python.org/~jeremy/talks/spam10/PEP-267-1.html). > > I guess there's so much handwaving in Jeremy's proposal about how to > deal with exceptional cases that I'm uncomfortable with it. But that > could be fixed. I agree it needs more detail, but at the start I'm more interested in the normal cases. I'll reattach my no-holds-barred description of resolving normal-case "len" in this scheme. Perhaps Jeremy could do the same for his. Jeremy is also aiming at speeding things like math.pi (global.attribute) as a whole (not just speeding the "math" part of it). Regurgitatia: """ If I'm reading this right, then in the normal case of resolving "len" in def mylen(s): return len(s) 1. We test func_cells for NULL and find out it isn't. 2. A pointer to a cell object is read out of func_cells at a fixed (wrt this function) offset. This points to len's cell object in the module's celldict. 3. The cell object's PyObject* pointer is tested and found to be NULL. 4. 
The cell object's cellptr pointer is tested and found not to be NULL. This points to len's cell object in __builtin__'s celldict. 5. The cell object's cellptr's PyObject* is tested and found not to be NULL. 6. The cell object's cellptr's PyObject* is returned. """ For a module global, the same description applies, but the outcome of #3 is not-NULL and it ends there then. For global.attr, step #3 yields the global, and then attr lookup is the same as today. Jeremy, can you do the same level of detail for your scheme? Skip? From andreas@andreas-jung.com Sat Feb 9 00:50:30 2002 From: andreas@andreas-jung.com (Andreas Jung) Date: Fri, 8 Feb 2002 19:50:30 -0500 Subject: [Python-Dev] Re: [Zope3-dev] Strange behaviour of python2.2 -U with Zope 3 References: <00ee01c1b103$72040990$02010a0a@suxlap> Message-ID: <00f701c1b103$cbc5a330$02010a0a@suxlap> Another followup: the module import seems to completely broken when using the -U option. Andreas ----- Original Message ----- From: "Andreas Jung" To: Sent: Friday, February 08, 2002 19:48 Subject: [Zope3-dev] Strange behaviour of python2.2 -U with Zope 3 > python2.2 utilities/unittestgui.py Zope.Testing.allZopeTests > start the GUI for the Zope3 unittests. So far so good. > > I tried to run all tests with unicode as default string type: > > python2.2 -U utilities/unittestgui.py Zope.Testing.allZopeTests > > This fails with the following traceback: > > yetix@/develop/DC/sandboxes/3x(57)% python2.2 -U utilities/unittestgui.py > Zope.Testing.allZopeTests > Traceback (most recent call last): > File "utilities/unittestgui.py", line 30, in ? > import linecache > ImportError: No module linecache > > Also "python2.2 -U -c "import linecache" " fails > > Any ideas ? 
> > Andreas > > > > _______________________________________________ > Zope3-dev mailing list > Zope3-dev@zope.org > http://lists.zope.org/mailman/listinfo/zope3-dev > From tim@zope.com Sun Feb 10 01:02:02 2002 From: tim@zope.com (Tim Peters) Date: Sat, 9 Feb 2002 20:02:02 -0500 Subject: [Python-Dev] RE: [Zope3-dev] Strange behaviour of python2.2 -U with Zope 3 In-Reply-To: <00f701c1b103$cbc5a330$02010a0a@suxlap> Message-ID: [Andreas Jung] > Another followup: > > the module import seems to completely broken when using the -U option. See my last email on zope3-dev: leave -U alone. It doesn't work and isn't supported. It shouldn't even exist (IMO). From mal@lemburg.com Sun Feb 10 13:29:36 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 10 Feb 2002 14:29:36 +0100 Subject: [Python-Dev] PYC Magic References: Message-ID: <3C6675C0.BE7B42FE@lemburg.com> Tim Peters wrote: > > [M.-A. Lemburg] > > The reason is that I don't want to break the -U scheme. > > But -U doesn't work anyway: > > > (-U changes the semantics of the language in a pretty nasty > > way ... nothing works anymore ;-). > > The only things -U have bought us are a bizarre definition of "magic > number", code complication, and complaints from people who see -U in the > "python -h" blurb and want to know why everything breaks when they try it. > It may be a hack you want to use for internal testing, but stuff that's been > broken since the day it was introduced, and makes no progress towards > working, doesn't belong in the general release. Wait... the -U option was added in order to be able to see how well the 8-bit string / Unicode integration works. It's a known fact that the Python standard lib is not Unicode compatible yet and that's exactly what the -U option allows you to test (in a very simple way). In the long run, Python's std lib should move in the direction of being Unicode compatible, so I don't really see the need for removing -U altogether.
To reduce the noise about Python failing to run with the option set, it may be a good idea to remove the mention of it from the -h blurb, though. > > I know it's a hack, but until someone comes up with a better way to > > add flags to store PYC compile options, we'll have to stick with > > it. > > But there is no need to store info about PYC compile options: -U is its > only use now, and -U has never worked. Since it's worse than useless, > better to throw it out, then dream up a rational way to store PYC compile > options if and when (and only if and when) there's an actual need for such. The -U option is currently the only application of such a flag. We will definitely have a need for these options in the future to make the runtime aware of certain assumptions which have been made in the compiled byte code, e.g. byte code using special opcodes, byte code compiled for a different Python virtual machine (once we get pluggable Python compiler / VM combos), byte code which was compiled using special literal interpretations (such as in the -U case or when compiling the source code with a different source code encoding assumption). I would be more than happy to get rid of the current PYC magic hack for -U and have it replaced with a better and extensible alternative, e.g. a combination of PYC version number and marshalled option dictionary.
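One way the "version number plus marshalled option dictionary" idea could look, as a sketch only. This is not the actual .pyc layout of any Python release; FORMAT_VERSION, the header layout, the function names, and the option name unicode_literals are all hypothetical. Only the standard marshal and struct modules are used.

```python
import marshal
import struct

FORMAT_VERSION = 1   # hypothetical format version, not a real PYC magic
HEADER = "<HI"       # version (2 bytes) + length of the option blob (4 bytes)


def pack_compiled(code, options):
    """Serialize a code object together with the options it was compiled under."""
    blob = marshal.dumps(options)
    return struct.pack(HEADER, FORMAT_VERSION, len(blob)) + blob + marshal.dumps(code)


def unpack_compiled(data, supported_options):
    """Return (options, code), refusing versions or options the runtime does not know."""
    version, size = struct.unpack_from(HEADER, data, 0)
    if version != FORMAT_VERSION:
        raise ValueError("unsupported format version %d" % version)
    start = struct.calcsize(HEADER)
    options = marshal.loads(data[start:start + size])
    unknown = set(options) - set(supported_options)
    if unknown:
        raise ValueError("unsupported compile options: %s" % sorted(unknown))
    return options, marshal.loads(data[start + size:])
```

A new compile option then means a new dictionary key rather than a new magic number, and a runtime that does not recognise the key can reject the file cleanly.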
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From srichter@cbu.edu Sun Feb 10 16:07:50 2002 From: srichter@cbu.edu (Stephan Richter) Date: Sun, 10 Feb 2002 10:07:50 -0600 Subject: [Python-Dev] proposal: add basic time type to the standardlibrary In-Reply-To: <3C650AAB.878CC336@lemburg.com> References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu> Message-ID: <5.1.0.14.2.20020210091537.02026d38@mercury-1.cbu.edu> > > * Intervals can be added or subtracted from themselves and the types > > above. > > DateInterval > > TimeInterval > > DateTimeInterval > > TimeStampInterval > >Intervals are a bad idea. Why? They are the same as your Deltas. Interval is the more common term I think, therefore I chose it. Maybe having a Time/Date/DateTime{Interval} is too much and they should be really one. So you would have DateTimeInterval and TimeStampInterval for the same reasons I describe below. On the other hand Java does not seem to implement intervals at all, which I think is a bad idea, since RDBs support it. >>> import DateTime >>> DateTime.parseInterval('6 mins 3 secs') # DateTime.DateTimeInterval is the default 6 minutes 3 seconds >>> DateTime.parseInterval('50 secs 3 millis', type=DateTime.TimeStampInterval) # returns ticks 50.003 I still think that many types are a good thing; it leaves the developer with choice. However the module should be smart and hide some of the choice from you, if you are a beginner. 
For example I imagine this to work: >>> import DateTime >>> date = DateTime.parseDateTime('2.1.2001') >>> type(date).__name__ Date >>> time = DateTime.parseDateTime('12:00:00') >>> type(time).__name__ Time >>> datetime = DateTime.parseDateTime('2.1.2001 12:00:00') >>> type(datetime).__name__ DateTime >You really only need two types: one referencing fixed points in >time and another one for storing the delta between two such >fixed points. Everything else can be modeled on top of those >two. Well yes, but this is a reason why I have such a hard time to get mxDateTime into Zope. Your module is well suited for certain tasks, but not everybody wants to use mxDateTime for Date/Time manipulation. So, saving components of a date is for some uses much better than saving ticks and vice versa. I also talked with Jim Fulton about it, and he agrees that there is a need for more than one Date/Time type. However it should be easy of course to convert between both, the Timestamp and the DateTime type. Here are some more examples: >>> import DateTime >>> date = DateTime.parseDateTime('2.1.2001') >>> type(date).__name__ Date >>> stamp = DateTime.TimeStamp(date) >>> type(stamp).__name__ TimeStamp BTW, something I do not want to support is: >>> import DateTime >>> date = DateTime.DateTime('2.1.2001') Since putting parsing into the object itself is a big mess, as we noticed in the Zope 2.x DateTime implementation. I think there should be only two ways to initialize a DateTime object, one of which I showed above, which is responsible of converting TimeStamps to DateTimes (mmh, maybe that should be a module function as well).
The other one is: >>> import DateTime >>> DateTime.DateTime(2001, 2, 3) February 3, 2001 >>> DateTime.DateTime('2001', '02', '03') # Of course it also supports strings here February 3, 2001 >>> DateTime.DateTime(2001, 2, 3, 12, 0) February 3, 2001 12:00:00 >>> DateTime.DateTime(2001, hour=12) # missing pieces will be replaced by 1 or 0 January 1, 2001 12:00:00 >>> DateTime.DateTime(year=2001, month=2, day=3, hour=1, minute=2, second=3, millisecond=4, timezone=-6) # max amount of arguments February 3, 2001 01:02:03.004 -06:00 >Please have a look at mxDateTime. It has these two types and >much of what you described in your notes. I know mxDateTime very well and have even suggested before to make it the Zope DateTime module and even put it in the standard Python distribution. Here is the mail from the Zope-Coders list: http://lists.zope.org/pipermail/zope-coders/2001-October/000100.html. You can follow the thread to see some responses. Also, the list of notes was made from my experience working with mxDateTime, Zope DateTime and PostGreSQL Dates/Times. I know it was not complete, but it had some of the hotspots in it. >BTW, you wouldn't believe how complicated dealing with date >and time really is... ah, yes, and don't even think of ever >getting DST to work properly :-/ Oh, I have seen and fixed the Zope DateTime implementation plenty and I have thought of the problem for 2.5 years now. The problem is that the US starts to use the German "." notation (as mentioned in my original mail) and other issues, which make it much harder. That is the reason why I want to build an ultra-flexible parsing engine. 
So you can do things like: >>> import DateTime >>> DateTime.parseDateTime('03/02/01', format=DateTime.ISO) February 1, 2003 >>> DateTime.parseDateTime('03/02/01', format=DateTime.US) March 2, 2001 >>> DateTime.parseDateTime('03.02.01', format=DateTime.US) March 2, 2001 >>> DateTime.parseDateTime('03/02/01', format=DateTime.GERMAN) # just in case Europe/Germany goes insane as well. February 3, 2001 But by default: >>> DateTime.parseDateTime('03/02/01') March 2, 2001 >>> DateTime.parseDateTime('03.02.01') February 3, 2001 Regards, Stephan -- Stephan Richter CBU - Physics and Chemistry Student Web2k - Web Design/Development & Technical Project Management From guido@python.org Sun Feb 10 16:20:30 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 10 Feb 2002 11:20:30 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Your message of "Sat, 09 Feb 2002 18:57:26 EST." References: Message-ID: <200202101620.g1AGKUm17544@pcp742651pcs.reston01.va.comcast.net> > I persist in my delusion. Original text: > > When you use its getitem method, the PyObject * in the cell is > dereferenced, and if a NULL is found, getitem raises KeyError > even if the cell exists. > > Since we're doing something with "the PyObject* in the cell", surely "the > cell" *must* exist. So what is the "even if the cell exists" trying to say? It is trying to say "despite the cell's existence". See the sample code. > I believe it means to say > > even if the cell's cellptr is not NULL > > and "the cell's cellptr is not NULL" is quite different from "the cell > exists". No, it doesn't try to say that. But you're right that it's useful to add that the cell's cellptr is irrelevant to getitem. > Another idea: a celldict could contain a "real dict" pointer, > normally NULL, and pointing to a plain dict when a real dict is > given. The celldict constructor would populate the cells from the > realdict's contents when not NULL. 
Then getitem wouldn't have to do > anything special (realdict==NULL and realdict!=NULL would be the > same to it). setitem and delitem would propagate mutations > immediately into the realdict too when non-NULL. Since mutations > are almost certainly much rarer than accesses, this makes the rarer > operations pay. The eval loop would always see a celldict. This works for propagating changes from the celldict to the real dict, but not the other way around. Example: d = {'x': 10} def set_x(x): d['x'] = x exec "...some code that calls set_x()..." in d > > (Another hack probably not worth it right now is to make the module's > > cell.cellptr point to itself if it's not shadowing a builtin cell -- > > then the first NULL check for cell.cellptr can be avoided in the case > > of finding a builtin name successful.) > > I don't think I followed this. If, e.g., a module's "len" cell is normally > > {NULL, pointer to __builtin__'s "len" cell} > > under the original scheme, how would that change? > > {NULL, pointer to this very cell} > > wouldn't make sense. > > {builtin len, pointer to this very cell} > > would make sense, but then the pointer to self is useless -- except as a > hint that we copied the value up from the builtins? But then a change to > __builtin__.len wouldn't be visible to the module. I meant that for "len" it would not change, i.e. it would be {NULL, pointer to __builtin__'s "len" cell} but for a global "foo" it would change to {value of foo or NULL if foo is undefined, pointer to this very cell} Then if foo is defined, the code would find the value of foo in the first cell it tries, and if foo is undefined, it would find a NULL in the cell and in the cell it points to. > > I do. The C code in compiler.c is already at a level of > > complexity that nobody understands it in its entirety! (I don't > > understand what Jeremy added, and Jeremy has to ask me about the > > original code. 
:-( ) > > I don't care because I care about something else : it would > add to the pressure to refactor this code mercilessly, and that > would be a Good Thing over the long term. The current complexity > isn't inherent, it's an artifact of outgrowing the original > concrete-syntax-tree direct-to bytecode one-pass design. Now we've > got multiple passes crawling over a now- inappropriate program > representation, glued together more by "reliable accidents" > than sensible design. That's all curable, and the pressures *to* > cure it will continue to multiply over time (e.g., it would take a > certain insanity to even think about folding pychecker-like checks > into the current architecture). Actually, the concrete syntax tree was never a very good representation; it was convenient for the parser to generate that, and it was "okay" (or "good enough") to generate code from and to do anything else from. I agree that it's a good idea to start thinking about changing the parse tree representation to a proper abstract syntax tree. Maybe the normalization that the compiler.py package uses would be a good start? Except that I've never quite grasped the visitor architecture there. :-( > I agree it needs more detail, but at the start I'm more interested > in the normal cases. I'll reattach my no-holds-barred description > of resolving normal-case "len" in this scheme. Perhaps Jeremy could > do the same for his. Jeremy is also aiming at speeding things like > math.pi (global.attribute) as a whole (not just speeding the "math" > part of it). One problem with that is that it's hard to know when in . is a module, and when it's something else. I guess global analysis could help -- if it's imported ("import math") it's likely a module, if it's assigned from an expression ("L = []") or a locally defined function or class, it's likely not a module. 
But "from X import Y" creates a mystery -- X could be a package containing a module Y, or it could be a module containing a function or class Y. > Regurgitatia: > > """ > If I'm reading this right, then in the normal case of resolving "len" in > > def mylen(s): > return len(s) > > 1. We test func_cells for NULL and find out it isn't. This step could be avoided using my trick of an array of dummy cells or using your trick of a celldict containing an optional reference to a real dict, so let's skip it. > 2. A pointer to a cell object is read out of func_cells at a fixed (wrt > this function) offset. This points to len's cell object in the > module's celldict. > 3. The cell object's PyObject* pointer is tested and found to be NULL. > 4. The cell object's cellptr pointer is tested and found not to be NULL. This NULL test shouldn't be needed given my trick of linking cells that do not shadow globals to themselves. > This points to len's cell object in __builtin__'s celldict. > 5. The cell object's cellptr's PyObject* is tested and found not to be > NULL. > 6. The cell object's cellptr's PyObject* is returned. > """ > > For a module global, the same description applies, but the outcome of #3 is > not-NULL and it ends there then. > > For global.attr, step #3 yields the global, and then attr lookup is the same > as today. > > Jeremy, can you do the same level of detail for your scheme? Skip? Jeremy is probably still recovering with his family from the conference. I know I got sick there and am now stuck with a horrible cold (the umpteenth one this season). --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Sun Feb 10 18:16:05 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Sun, 10 Feb 2002 19:16:05 +0100 Subject: [Python-Dev] proposal: add basic time type to thestandardlibrary References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu> <5.1.0.14.2.20020210091537.02026d38@mercury-1.cbu.edu> Message-ID: <3C66B8E5.DC2DD1CA@lemburg.com> Stephan Richter wrote: > > > > * Intervals can be added or subtracted from themselves and the types > > > above. > > > DateInterval > > > TimeInterval > > > DateTimeInterval > > > TimeStampInterval > > > >Intervals are a bad idea. > > Why? They are the same as your Deltas. Interval is the more common term I > think, therefore I chose it. Maybe having a Time/Date/DateTime{Interval} is > too much and they should be really one. So you would have DateTimeInterval > and TimeStampInterval for the same reasons I describe below. As I explained in my reply, most of these intervals are not needed as *base types*. You can easily model them on top of the two types I have in mxDateTime. Some may not like this model because it comes from a more mathematical point of view, but in reality it works quite nicely and simplifies the API structure significantly. A time interval is basically just a number of seconds, nothing more. There's no need to have 4 different types to wrap a single double ;-) > On the other hand Java does not seem to implement intervals at all, which I > think is a bad idea, since RDBs support it. > > >>> import DateTime > >>> DateTime.parseInterval('6 mins 3 secs') # DateTime.DateTimeInterval is > the default > 6 minutes 3 seconds > >>> DateTime.parseInterval('50 secs 3 millis', > type=DateTime.TimeStampInterval) # returns ticks > 50.003 > > I still think that many types are a good thing; it leaves the developer > with choice. However the module should be smart and hide some of the choice > from you, if you are a beginner.
For example I imagine this to work: > > >>> import DateTime > >>> date = DateTime.parseDateTime('2.1.2001') > >>> type(date).__name__ > Date > >>> time = DateTime.parseDateTime('12:00:00') > >>> type(time).__name__ > Time > >>> datetime = DateTime.parseDateTime('2.1.2001 12:00:00') > >>> type(datetime).__name__ > DateTime Just think of all the possible combinations you have in operations like '+', '-' and comparisons. You don't want to go down this road... > >You really only need two types: one referencing fixed points in > >time and another one for storing the delta between two such > >fixed points. Everything else can be modeled on top of those > >two. > > Well yes, but this is a reason why I have such a hard time to get > mxDateTime into Zope. Your module is well suited for certain tasks, but not > everybody wants to use mxDateTime for Date/Time manipulation. Uhm, where did you get the impression that I want all the world to use mxDateTime :-? I wrote it for use in mxODBC since at the time there was no DateTime type around which could handle dates prior to 1970. As a result, mxDateTime was written to provide everything you need for database interfacing. That's also the reason why there is no time zone support in mxDateTime's base types: databases don't have time zone support built into their date/time types either (and for a good reason: time zones are better handled at application level). > So, saving > components of a date is for some uses much better than saving ticks and > vice versa. I also talked with Jim Fulton about it, and he agrees that > there is a need for more than one Date/Time type. However it should be easy > of course to convert between both, the Timestamp and the DateTime type. That's why mxDateTime provides so many interfaces to other forms of storing and reading date/time values, e.g. COMDate, ticks, doubles, tuples, strings, various scientific formats, in two different calendars etc.
> Here are some more examples: > > >>> import DateTime > >>> date = DateTime.parseDateTime('2.1.2001') > >>> type(date).__name__ > Date > >>> stamp = DateTime.TimeStamp(date) > >>> type(stamp).__name__ > TimeStamp > > BTW, something I do not want to support is: > > >>> import DateTime > >>> date = DateTime.DateTime('2.1.2001') > > Since putting parsing into the object itself is a big mess, as we noticed > in the Zope 2.x DateTime implementation. I think there should be only two > ways to initialize a DateTime object, one of which I showed above, which is > responsible of converting TimeStamps to DateTimes (mmh, maybe that should > be a module function as well). The other one is: > > >>> import DateTime > >>> DateTime.DateTime(2001, 2, 3) > February 3, 2001 > >>> DateTime.DateTime('2001', '02', '03') # Of course it also supports > strings here > February 3, 2001 > >>> DateTime.DateTime(2001, 2, 3, 12, 0) > February 3, 2001 12:00:00 > >>> DateTime.DateTime(2001, hour=12) # missing pieces will be replaced by > 1 or 0 > January 1, 2001 12:00:00 > >>> DateTime.DateTime(year=2001, month=2, day=3, hour=1, > minute=2, second=3, millisecond=4, timezone=-6) # max > amount of arguments > February 3, 2001 01:02:03.004 -06:00 You really just want to support one way for the type constructor (broken down numbers). All other possibilities can be had via factory functions. > >Please have a look at mxDateTime. It has these two types and > >much of what you described in your notes. > > I know mxDateTime very well and have even suggested before to make it the > Zope DateTime module and even put it in the standard Python distribution. > Here is the mail from the Zope-Coders list: > http://lists.zope.org/pipermail/zope-coders/2001-October/000100.html. You > can follow the thread to see some responses. > Also, the list of notes was made from my experience working with > mxDateTime, Zope DateTime and PostGreSQL Dates/Times. 
I know it was not > complete, but it had some of the hotspots in it. > > >BTW, you wouldn't believe how complicated dealing with date > >and time really is... ah, yes, and don't even think of ever > >getting DST to work properly :-/ > > Oh, I have seen and fixed the Zope DateTime implementation plenty and I > have thought of the problem for 2.5 years now. The problem is that the US > starts to use the German "." notation (as mentioned in my original mail) > and other issues, which make it much harder. That is the reason why I want > to build an ultra-flexible parsing engine. So you can do things like: > > >>> import DateTime > >>> DateTime.parseDateTime('03/02/01', format=DateTime.ISO) > February 1, 2003 > >>> DateTime.parseDateTime('03/02/01', format=DateTime.US) > March 2, 2001 > >>> DateTime.parseDateTime('03.02.01', format=DateTime.US) > March 2, 2001 > >>> DateTime.parseDateTime('03/02/01', format=DateTime.GERMAN) # just in > case Europe/Germany goes insane as well. > February 3, 2001 > > But by default: > > >>> DateTime.parseDateTime('03/02/01') > March 2, 2001 > >>> DateTime.parseDateTime('03.02.01') > February 3, 2001 You can do all this with Parser module in mxDateTime. It allows you to specify a list of parsers to try and in which order to try them. Chuck Esterbrook has kept me working on it for quite some time, so it should be very complete by now :-) For more specific (and strict) formats, there are two other modules ISO and ARPA which can handle the respective formats used in Internet standards. 
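The idea of trying a list of parsers in a caller-chosen order can be sketched roughly as follows. This is not the mxDateTime Parser API: `parse_date`, `US_FORMATS` and `GERMAN_FORMATS` are hypothetical names, and the sketch assumes the standard library's `datetime.strptime` as the underlying parser.

```python
from datetime import datetime

# Hypothetical format tables; each list is tried in the order given.
US_FORMATS = ["%m/%d/%y", "%m.%d.%y"]
GERMAN_FORMATS = ["%d.%m.%y", "%d/%m/%y"]

def parse_date(text, formats):
    """Return the date produced by the first format in `formats` that matches."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this parser rejected the text; try the next one
    raise ValueError("no parser accepted %r" % (text,))

us = parse_date("03/02/01", US_FORMATS)
german = parse_date("03.02.01", GERMAN_FORMATS)
print(us.isoformat(), german.isoformat())
```

The ambiguity between the two readings of the same digits is resolved purely by the order of the list the caller passes in.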
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From srichter@cbu.edu Sun Feb 10 19:08:09 2002 From: srichter@cbu.edu (Stephan Richter) Date: Sun, 10 Feb 2002 13:08:09 -0600 Subject: [Python-Dev] proposal: add basic time type to thestandardlibrary In-Reply-To: <3C66B8E5.DC2DD1CA@lemburg.com> References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu> <5.1.0.14.2.20020210091537.02026d38@mercury-1.cbu.edu> Message-ID: <5.1.0.14.2.20020210124257.01eb3820@mercury-1.cbu.edu> >As I explained my reply, most of these intervals are not needed >as *base types*. You can easily model them on top of the two >types I have in mxDateTime. Some may not like this model because it >comes from a more mathematical point of view, but in reality it >works quite nicely and simplifies the API structure significantly. > >A time interval is basically just an amount of seconds, nothing more. >There's no need to have 4 different types to wrap a single >double ;-) Well, this is an okay representation, if you want to do a lot of math and use it mainly for this reason. On the other hand it might be fairly expensive, if I always want to extract components. In fact, only 10% of my usage requires mathematical operations. Most of the time I get the interval out of the database and want to simply display it (localized), such as in calendars. > > >>> import DateTime > > >>> date = DateTime.parseDateTime('2.1.2001') > > >>> type(date).__name__ > > Date > > >>> time = DateTime.parseDateTime('12:00:00') > > >>> type(time).__name__ > > Time > > >>> datetime = DateTime.parseDateTime('2.1.2001 12:00:00') > > >>> type(datetime).__name__ > > DateTime > >Just think of all the possible combinations you have in operations >like '+', '-' and comparisons. You don't want to go down this >road... 
Well, but again, 90% of the time I do not need to do any manipulation whatsoever. For this reason you would have time stamps or you know (because you used this type) that it will be less efficient to do '+' and '-' with DateTime objects, since it does need some more conversions. >Uhm, where did you get the impression that I want all the world >to use mxDateTime :-? I wrote it for use in mxODBC since at the time >there was no DateTime type around which could handle dates prior >to 1970. As a result, mxDateTime was written to provide everything >you need for database interfacing. That's also the reason why there >is no time zone support in mxDateTime's base types: databases >don't have time zone support built into their date/time types >either (and for a good reason: time zones are better handled at >application level). Well, back then (when I wrote the mail) I thought so. But now I see the limitations and have a better idea what people need; hence this proposal. For the same reason you say mxDateTime is not good for everything we need a solution that works for more situations. >You really just want to support one way for the type constructor >(broken down numbers). All other possibilities can be had via >factory functions. Probably so. I will have to think about it some more and look at some applications. > > But by default: > > > > >>> DateTime.parseDateTime('03/02/01') > > March 2, 2001 > > >>> DateTime.parseDateTime('03.02.01') > > February 3, 2001 > >You can do all this with Parser module in mxDateTime. It allows >you to specify a list of parsers to try and in which order >to try them. Chuck Esterbrook has kept me working on it for >quite some time, so it should be very complete by now :-) > >For more specific (and strict) formats, there are two other >modules ISO and ARPA which can handle the respective >formats used in Internet standards. Right. And I am not saying that we will not reuse some of the mxDateTime or the Zope DateTime code. 
I certainly do not want to reimplement stuff that already works very well. Also, we need to support I18N, which means the module needs to understand things like "February", but also "Februar" if the German locale was requested.

I have no desire to compete with the mxDateTime implementation. I want to look at some of the solutions out there and take the best from everyone and provide a module that will suit 95-100% of the people. For several reasons, which I tried to point out in my mails, neither mxDateTime nor Zope's DateTime in its current state is suitable.

Regards,
Stephan

--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management

From jeremy@alum.mit.edu Sun Feb 10 00:04:19 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Sat, 9 Feb 2002 19:04:19 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To:
References: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15461.47363.911259.672824@gondolin.digicool.com>

Here's a brief review of the example function.

    def mylen(s):
        return len(s)

    LOAD_BUILTIN       0 (len)
    LOAD_FAST          0 (s)
    CALL_FUNCTION      1
    RETURN_VALUE

The interpreter has a dlict for all the builtins. The details don't matter here. Let's say that len is at index 4.

The function mylen has an array:

    func_builtin_index = [4]  # an index for each builtin used in mylen

The entry at index 0 of func_builtin_index is the index of len in the interpreter's builtin dlict. It is either initialized when the function is created or on first use of len. (It doesn't matter for the mechanism and there's no need to decide which is better yet.)

The module has an md_globals_dirty flag. If it is true, then a global was introduced dynamically, i.e. a name binding op occurred that the compiler did not detect statically.

The code object has a co_builtin_names that is like co_names except that it only contains the names of builtins used by LOAD_BUILTIN. It's there to get the correct behavior when shadowing of a builtin by a local occurs at runtime.

The frame grows a bunch of pointers --

    f_module from the function (which stores it instead of func_globals)
    f_builtin_names from the code object
    f_builtins from the interpreter

The implementation of LOAD_BUILTIN 0 is straightforward -- in pidgin C:

    case LOAD_BUILTIN:
        if (f->f_module->md_globals_dirty) {
            PyObject *w = PyTuple_GET_ITEM(f->f_builtin_names, oparg);
            ... /* rest is just like current LOAD_GLOBAL
                   except that it uses PyDLict_GetItem()
                */
        } else {
            int builtin_index = f->f_builtin_index[oparg];
            PyObject *x = f->f_builtins[builtin_index];
            if (x == NULL)
                raise NameError
            Py_INCREF(x);
            PUSH(x);
        }

The LOAD_GLOBAL opcode ends up looking basically the same, except that it doesn't need to check md_globals_dirty.

    case LOAD_GLOBAL:
        int global_index = f->f_global_index[oparg];
        PyObject *x = f->f_module->md_globals[global_index];
        if (x == NULL) {
            check for dynamically introduced builtin
        }
        Py_INCREF(x);
        PUSH(x);

In the x == NULL case above, we need to take extra care for a builtin that the compiler didn't expect. It's an odd case. There is a global for the module named spam that hasn't yet been assigned to in the module and there's also a builtin named spam that will be hidden once spam is bound in the module.

Jeremy

From skip@pobox.com Sun Feb 10 20:34:08 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 10 Feb 2002 14:34:08 -0600
Subject: [Python-Dev] -U flag
In-Reply-To: <3C6675C0.BE7B42FE@lemburg.com>
References: <3C6675C0.BE7B42FE@lemburg.com>
Message-ID: <15462.55616.207319.531285@12-248-41-177.client.attbi.com>

(I think we've had this discussion before...)

    MAL> Wait... the -U option was added in order to be able to see how well
    MAL> the 8-bit string / Unicode integration works. It's a known fact that
    MAL> the Python standard lib is not Unicode compatible yet and that's
    MAL> exactly what the -U option allows you to test (in a very simple
    MAL> way).
If -U is really just a "test" flag, I don't think it should show up in "python -h" output.

Skip

From jeremy@alum.mit.edu Sun Feb 10 01:13:53 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Sat, 9 Feb 2002 20:13:53 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To:
References: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15461.51537.381585.439205@gondolin.digicool.com>

Let's try an attribute of a module.

    import math

    def mysin(x):
        return math.sin(x)

There are two variants of support for this that differ in the way they handle math being rebound. Say another function is:

    def yikes():
        global math
        import string as math

We can either check on each use of math.attr to see if math is rebound, or we can require that STORE_GLOBAL marks all the math.attr entries as invalid. I'm not sure which is better, so I'll try to describe both.

Case #1: Binding operation responsible for invalidating cache.

The module has a dlict for globals that contains three entries: [math, mysin, yikes]. Each is a PyObject *.

The module also has a global attrs cache, where each entry is

    struct {
        int ce_initialized;  /* just a flag */
        PyObject **ce_ref;
    } cache_entry;

In the case we're considering, ce_module points to math and ce_module_index is math's index in the globals dlict. It's assigned to when the module object is created and never changes.

There is one entry in the global attrs cache, for math.sin. There's only one entry because the compiler only found one attribute access of a global bound by an import statement.

The function mysin(x) uses LOAD_GLOBAL_ATTR 0 (math.sin).

    case LOAD_GLOBAL_ATTR:
        cache_entry *e = f->f_module->md_cache[oparg];
        if (!e->ce_initialized) {
            /* lookup module and find its sin attr.
               store pointer to module dlict entry in ce_ref.
               NB: cache shared by all functions.

               if the thing we expected to be a module isn't actually
               a module, handle that case here and leave initialized
               set to false.
            */
        }
        if (*e->ce_ref == NULL) {
            /* raise NameError if global module isn't bound yet.
               raise AttributeError if module is bound, but doesn't
               have attr.
            */
        }
        Py_INCREF(*e->ce_ref);
        PUSH(*e->ce_ref);

To support invalidation of cache entries, we need to arrange the cache entries in a particular order and add an auxiliary data structure that maps from module globals to the cache entries it must invalidate.

For example, say a module uses math.sin, math.cos, and math.tan. The three cache entries for the math module should be stored contiguously in the cache.

    cache_entry *cache[] = {
        math.sin entry,
        math.cos entry,
        math.tan entry,
    }

    struct {
        int index;   /* first attr of this module in cache */
        int length;  /* number of attrs for this module in cache */
    } invalidation_info;

There is one invalidation_info for each module that has cached attributes. (And only for things that the compiler determines to be modules.) The invalidation_info for math would be {0, 3}. If a STORE_GLOBAL rebinds math, it must walk through the cache and set ce_initialized to false for each cache entry.

This isn't exactly the scheme I described in the slides, where I suggested that the LOAD_GLOBAL_ATTR would check if the module binding was still valid on each use. A question from Ping pushed me back in favor of the approach that I just described. No time this weekend to describe that check-on-each-use scheme.

Jeremy

From tim.one@comcast.net Sun Feb 10 21:13:07 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 10 Feb 2002 16:13:07 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202101620.g1AGKUm17544@pcp742651pcs.reston01.va.comcast.net>
Message-ID:

>>> When you use its getitem method, the PyObject * in the cell is
>>> dereferenced, and if a NULL is found, getitem raises KeyError
>>> even if the cell exists.

>> Since we're doing something with "the PyObject* in the cell",
>> surely "the cell" *must* exist. So what is the "even if the cell
>> exists" trying to say?
> It is trying to say "despite the cell's existence".

Then s/even if/even though/, and that's what it will say.

>> I believe it means to say
>>
>>     even if the cell's cellptr is not NULL
>>
>> and "the cell's cellptr is not NULL" is quite different from "the cell
>> exists".

> No, it doesn't try to say that. But you're right that it's useful to
> add that the cell's cellptr is irrelevant to getitem.

So even though the cell exists, and even if its cellptr is non-NULL.

[on a write-through dict]
> This works for propagating changes from the celldict to the real dict,
> but not the other way around.

Fudge, that's right. Make it a 2-way write-through dict pair.

> Example:
>
>     d = {'x': 10}
>     def set_x(x):
>         d['x'] = x
>     exec "...some code that calls set_x()..." in d

Understood. What about this?

    import cheater
    def f():
        cheater.cheat()
        return pachinko()
    print f()

where cheater.py contains:

    def cheat():
        import __builtin__
        __builtin__.pachinko = lambda: 666

That works fine today, and prints 666. Under the proposed scheme, I believe:

1. The main module's globals don't get a cell for pachinko at module creation time, because pachinko isn't in the builtins at that point.

2. When the function object for f() is created, the main module's globals grow a {NULL, NULL} cell for pachinko.

3. When cheat() is called, __builtin__'s celldict grows a {lambda: 666, NULL} slot for pachinko.

4. But the main module's globals neither sees that nor has a pointer to the new cell.

5. The reference to pachinko in f's return finds {NULL, NULL} in the globals, and so raises NameError.

I'm thinking more radically now, about using module dicts mapping to pairs

    PyObject* (the actual value)
    a "I got this PyObject* from the builtins" flag (the "shadow flag")

Invariants:

1. value==NULL implies flag is false.

2. flag is true implies value is the value in the builtin dict

Suppose that, at module creation time, the module's globals still got an entry for every then-existing builtin, but rather than point to the builtin's cell, copied the ultimate PyObject* into one of these pairs (and set the flag). Most of the other machinery in the proposal stays the same.

Accessing a global (builtin or not) via LOAD_GLOBAL_CELL is then very simple: the pair does or doesn't contain NULL, and that's the end of it either way. This makes the most frequent operation as fast as I can imagine it becoming (short of plugging ultimate PyObject* values directly into func_cells -- still thinking about that).

How can this get out of synch?

1. Mutations of the builtin dict (creation of a new builtin, rebinding an existing builtin, del'ing an existing builtin). This has got to be exceedingly rare, and never happens in most programs: it doesn't matter how expensive this is. Each dict serving as a builtin dict could, e.g., maintain a list of weakrefs to the modules that were initialized from it. Then mutations of the dict could reach into the relevant module dicts and adjust them accordingly. The shadow flags in the module dicts' pairs let the fixup code know whether a given entry really refers to the builtin. This makes mutations of the builtins very expensive, but I'd be surprised to find a real program where it's a measurable expense.

   Note: It may be helpful to view this as akin to propagating changes in new-style base classes down to their descendants.

2. del'ing a module global may uncover a builtin of the same name. While not as exceedingly rare as mutations of the builtins, it's still a rare thing. Seems like it would be reasonably cheap anyway:

       Module delitem:
           raise exception if no such key
           # key exists
           raise exception if it came from the builtins (flag is set)
           # key exists and is a module global, not a builtin; flag is false
           set value to NULL
           if the builtins have this name as a key:
               copy the current builtin value
               set the flag

3. Module setitem:
       if key exists:
           overwrite the value and clear the flag
       else:
           add new {value, false} pair

4. Module getitem:
       if name isn't a key or flag is set or value is NULL:
           raise exception
       else:
           return value

That's for non-builtin module timdicts. I expect the same code would work for the builtin module's timdict too provided it were given an empty dict as the source from which to initialize itself (then the flag component of all its pairs would start out, and remain, false, and the code above would "do the right thing" by magic, reading "the builtins" as "the timdict from which I got initialized").

> ...
> I meant that for "len" it would not change, i.e. it would be
>
>     {NULL, pointer to __builtin__'s "len" cell}
>
> but for a global "foo" it would change to
>
>     {value of foo or NULL if foo is undefined, pointer to this very cell}
>
> Then if foo is defined, the code would find the value of foo in the
> first cell it tries, and if foo is undefined, it would find a NULL in
> the cell and in the cell it points to.

Whereas the original scheme stored

    {value of foo or NULL if foo is undefined, NULL}

in this case. So that's just as quick if foo is defined, but if it isn't defined as a module global, has to do an extra NULL check on the cellptr. Gotcha. OTOH, if "foo" later *becomes* defined in the builtins too, the module dict won't know that it must change its foo's cellptr.

[on the compiler's architecture]
> Actually, the concrete syntax tree was never a very good
> representation; it was convenient for the parser to generate that, and
> it was "okay" (or "good enough") to generate code from and to do
> anything else from.
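The {value, flag} pairs and the delitem/setitem/getitem rules listed earlier in this message can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `TimDict`, `load_global` and `_MISSING` (standing in for a NULL pointer) are hypothetical names, the real proposal is about C structures, and the propagation of builtin mutations via weakrefs is left out.

```python
_MISSING = object()  # stands in for the NULL pointer of the C sketch


class TimDict:
    """Hypothetical module dict mapping names to [value, from_builtins] pairs."""

    def __init__(self, builtins):
        self.builtins = builtins  # plain dict this module dict was initialized from
        # Every then-existing builtin is copied in with the shadow flag set.
        self.pairs = {name: [value, True] for name, value in builtins.items()}

    def load_global(self, name):
        # The fast path: the pair does or doesn't hold a value, nothing else.
        pair = self.pairs.get(name)
        if pair is None or pair[0] is _MISSING:
            raise NameError(name)
        return pair[0]

    def __getitem__(self, name):
        pair = self.pairs.get(name)
        if pair is None or pair[1] or pair[0] is _MISSING:
            raise KeyError(name)
        return pair[0]

    def __setitem__(self, name, value):
        pair = self.pairs.get(name)
        if pair is not None:
            # Mutate in place so anything holding a reference to the pair stays valid.
            pair[0], pair[1] = value, False
        else:
            self.pairs[name] = [value, False]

    def __delitem__(self, name):
        pair = self.pairs.get(name)
        # Assumption: an already-cleared pair is treated like a missing key.
        if pair is None or pair[1] or pair[0] is _MISSING:
            raise KeyError(name)
        pair[0] = _MISSING
        if name in self.builtins:
            # Deleting the global uncovers the builtin of the same name.
            pair[0], pair[1] = self.builtins[name], True


module_globals = TimDict({"len": len})
module_globals["len"] = lambda s: -1            # a module global shadows the builtin
shadowed = module_globals.load_global("len")("abc")
del module_globals["len"]                       # the builtin becomes visible again
restored = module_globals.load_global("len")("abc")
print(shadowed, restored)
```

Note that `load_global` never consults the builtins at lookup time; all the bookkeeping is pushed into the rare setitem and delitem paths.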
I think it worked great until the local-variable optimization got added. Unfortunately, that happened shortly after the dawn of time. > I agree that it's a good idea to start thinking about changing the > parse tree representation to a proper abstract syntax tree. Maybe the > normalization that the compiler.py package uses would be a good start? > Except that I've never quite grasped the visitor architecture there. :-( This msg is too long already <0.7 wink>. ... [on Jeremy's global.attr scheme] > One problem with that is that it's hard to know when in > . is a module, and when it's something else. I > guess global analysis could help -- if it's imported ("import math") > it's likely a module, if it's assigned from an expression ("L = []") > or a locally defined function or class, it's likely not a module. But > "from X import Y" creates a mystery -- X could be a package containing > a module Y, or it could be a module containing a function or class Y. Jeremy is aware of all this (I've heard him ponder these specific points), but I don't think he has a fully fleshed out approach to all of it yet. [a rework of the "len resolution" example, incorporating Guido's comments] """ If I'm reading this right, then in the normal case of resolving "len" in def mylen(s): return len(s) 1. A pointer to a cell object is read out of func_cells at a fixed (wrt this function) offset. This points to len's cell object in the module's celldict. 2. The cell object's PyObject* pointer is tested and found to be NULL. [3'. The cell object's cellptr pointer is tested and found not to be NULL. This points to len's cell object in __builtin__'s celldict. > This NULL test shouldn't be needed given my trick of linking cells > that do not shadow globals to themselves. As above, in the presence of mutations to builtins, a global that didn't shadow a builtin at first may end up shadowing one later. Perhaps you want to punt on preserving current behavior in such cases. 
The variant I sketched above is intended to preserve all current behavior, while running faster in non-pathological cases. ] 3. The cell object's cellptr's PyObject* is tested and found not to be NULL. 4. The cell object's cellptr's PyObject* is returned. """ In the {PyObject*, flag} variant: 1. A pointer to a pair is read out of func_cells at a fixed (wrt this function) offset. This points to len's pair in the module's timdict. 2. The pair's PyObject* pointer is tested and found to be non-NULL. 3. The pair's PyObject* is returned. > ... > I know I got sick there and am now stuck with a horrible cold (the > umpteenth one this season). In empathy, I'll refrain from painting a word-picture of my neck . From skip@pobox.com Sun Feb 10 21:39:28 2002 From: skip@pobox.com (Skip Montanaro) Date: Sun, 10 Feb 2002 15:39:28 -0600 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15462.59536.874232.817836@12-248-41-177.client.attbi.com> >> When a function object is created from a regular dict instead of a >> celldict, func_cells is a NULL pointer. Tim> This part is regrettable, since it's Yet Another NULL check at the Tim> *top* of code using this stuff (meaning it slows the normal case, Tim> assuming that it's unusual not to get a celldict). I'm not clear Tim> on how code ends up getting created from a regular dict instead of Tim> a celldict -- is this because of stuff like "exec whatever in Tim> mydict"? I'm still working my way through this thread, so forgive me if this has been hashed out already. It seems to me that the correct thing to do is to convert plain dicts to celldicts when creating functions. Besides, where are functions going to get created that are outside of your (PyhonLabs) control? 
Skip From skip@pobox.com Sun Feb 10 21:53:07 2002 From: skip@pobox.com (Skip Montanaro) Date: Sun, 10 Feb 2002 15:53:07 -0600 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: References: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15462.60355.540553.195176@12-248-41-177.client.attbi.com> Tim> If I'm reading this right, then in the normal case of resolving Tim> "len" in Tim> def mylen(s): Tim> return len(s) ... Tim> Jeremy, can you do the same level of detail for your scheme? Skip? Yeah, it's TRACK_GLOBAL 'len' LOAD_FAST LOAD_FAST CALL_FUNCTION 1 UNTRACK_GLOBAL 'len' RETURN_VALUE or something similar. (Stuff in <...> represent array indexes.) My scheme makes update of my local copy of __builtins__.len the responsibility of the guy who changes the global copy. Most of the time this never changes, so as the number of accesses to len increase, the average time per lookup approaches that of a simple LOAD_FAST. Skip From tim.one@comcast.net Sun Feb 10 21:51:59 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 10 Feb 2002 16:51:59 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <15462.59536.874232.817836@12-248-41-177.client.attbi.com> Message-ID: [Skip Montanaro] > I'm still working my way through this thread, so forgive me if > this has been hashed out already. It seems to me that the correct > thing to do is to convert plain dicts to celldicts when creating > functions. There's the problem of object identity: it's possible for exec'ed code to mutate the original dict while the exec'ed code is running, and Guido gave an example where that can matter. I had originally suggested building a celldict that *contained* the original dict, reflecting mutations from the former to the latter as they happened. Mutations in the other direction go unnoticed, though. If the binary layouts are compatible enough, it may suffice to replace the dict's type pointer for the duration. 
Even then, the exec'ed code may get tripped up via testing (directly or indirectly) the type of the original dict (I suppose it could lie about its type ...). > Besides, where are functions going to get created that are outside > of your (PyhonLabs) control? They aren't, but eval and exec and execfile allow users to pass in plain dicts to be used for locals and/or globals. From skip@pobox.com Sun Feb 10 22:29:53 2002 From: skip@pobox.com (Skip Montanaro) Date: Sun, 10 Feb 2002 16:29:53 -0600 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15462.62561.295245.440138@12-248-41-177.client.attbi.com> Guido> Inspired by talks by Jeremy and Skip on DevDay, here's a Guido> different idea for speeding up access to globals. It retain Guido> semantics but (like Jeremy's proposal) changes the type of a Guido> module's __dict__. Just to see if I have a correct mental model of what Guido proposed, I drew a picture: http://manatee.mojam.com/~skip/python/celldict.png The cells are the small blank boxes. I guess the celldict would be the stuff I labelled "module dict". The "func cells" would be an array like fastlocals, but would refer to cells in the module's dict. I'm not clear where/how builtins are accessed though. Is that what the extra indirection is, or are builtins incorporated into the module dict somehow? 
If anyone wants to correct my picture, the Dia diagram is at http://manatee.mojam.com/~skip/python/celldict Skip From skip@pobox.com Sun Feb 10 22:35:49 2002 From: skip@pobox.com (Skip Montanaro) Date: Sun, 10 Feb 2002 16:35:49 -0600 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: References: <15462.59536.874232.817836@12-248-41-177.client.attbi.com> Message-ID: <15462.62917.505362.521566@12-248-41-177.client.attbi.com> Tim> There's the problem of object identity: it's possible for exec'ed Tim> code to mutate the original dict while the exec'ed code is running, Tim> and Guido gave an example where that can matter. In the face of exec statements or calls to execfile can't the compiler just generate the usual LOAD_NAME fallback instead of the new-fangled LOAD_GLOBAL opcode? Skip From tim.one@comcast.net Sun Feb 10 22:45:34 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 10 Feb 2002 17:45:34 -0500 Subject: [Python-Dev] PYC Magic In-Reply-To: <3C6675C0.BE7B42FE@lemburg.com> Message-ID: [MAL] > Wait... the -U option was added in order to be able to see how well > the 8-bit string / Unicode integration works. It's a know fact that > the Python standard lib is not Unicode compatible yet and that's > exactly what the -U option allows you to test (in a very simple > way). I don't object to testing hacks provided they don't trip up the innocent; it would help to remove -U from the user-visible docs (which I'll do). Note that, by coincidence, Andreas Jung (at Zope Corp) pissed away time worrying about -U breakage yesterday independent of our thread here: it's doing harm. If you're the only one who tries -U on purpose (anyone? it's clear that I don't ...), it would be better done via a preprocessor define. How often is this used even by you? If it's once per release just to make sure it's still broken , a variant build wouldn't be a real burden. > ... > The -U option is currently the only application of such a flag. 
> We will definitely have a need for these options in the future
> to make the runtime aware of certain assumptions which have been
> made in the compiled byte code, e.g. byte code using special
> opcodes, byte code compiled for a different Python virtual
> machine (once we get pluggable Python compiler / VM combos),
> byte code which was compiled using special literal
> interpretations (such as in the -U case or when compiling
> the source code with a different source code encoding
> assumption).

There remains no current use for any of these things. When a real use appears, "magic number" abuse won't be appropriate: imp.get_magic() doesn't return a vector; we're not doing the Unixish /etc/magic database any favors by *ever* changing it; and needing to register umpteen distinct magic numbers per release for Linux binfmt would make Python even more irritating to live with there.

> I would be more than happy to get rid of the current PYC magic hack
> for -U and have it replaced with a better and extensible alternative,
> e.g. a combination of PYC version number and marshalled option
> dictionary.

I agree, except that I still think having -U now is a net loss.

From tim.one@comcast.net Sun Feb 10 22:51:19 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 10 Feb 2002 17:51:19 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <15462.62917.505362.521566@12-248-41-177.client.attbi.com>
Message-ID:

[Skip]
> In the face of exec statements or calls to execfile can't the
> compiler just generate the usual LOAD_NAME fallback instead of the
> new-fangled LOAD_GLOBAL opcode?

Note that exec doesn't have to be passed a string: you can pass it a compiled code object just as well. The compiler can't guess how a code object will be used at the time it's compiled.
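To make the point about code objects concrete, here is a small sketch. It assumes a current Python 3 interpreter, where `exec` is a function rather than the `exec code in d` statement used in this thread; the names `first` and `second` are illustrative only.

```python
# Compile once, with no knowledge of which namespace will be supplied later.
code = compile("result = len(items)", "<sketch>", "exec")

# Run the very same code object against two unrelated plain dicts.
first = {"items": [1, 2, 3]}
second = {"items": "ab", "len": lambda seq: -1}  # this dict shadows the builtin len

exec(code, first)
exec(code, second)

print(first["result"], second["result"])
```

The name `len` resolves to the builtin in one run and to the dict's own entry in the other, which is why the decision cannot be baked in at compile time.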
In theory there would be nothing to stop exec from rewriting the bytecode in a compiled code object passed to it, but I doubt we could get Guido to buy that trick until he first buys rewriting bytecode to set debugger breakpoints . From tim.one@comcast.net Sun Feb 10 23:27:09 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 10 Feb 2002 18:27:09 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <15462.62561.295245.440138@12-248-41-177.client.attbi.com> Message-ID: [Skip Montanaro] > Just to see if I have a correct mental model of what Guido > proposed, I drew a picture: > > http://manatee.mojam.com/~skip/python/celldict.png > > The cells are the small blank boxes. I guess the celldict would be the > stuff I labelled "module dict". The "func cells" would be an array like > fastlocals, but would refer to cells in the module's dict. Yup, except that fastlocals are part of a frame, not part of a function object. Guido didn't make a big deal about this, but it's key to efficiency: the expense of setting up func_cells is *not* incurred on a per-call basis, it's done once when a function object is created (MAKE_FUNCTION), then reused across all calls to that function object. > I'm not clear where/how builtins are accessed though. __builtin__ is just another module, and also has a celldict for a __dict__. The empty squares in your diagram (the "bottom half" of your cells) sometimes point to cells in __builtin__'s celldict. They remain empty (NULL) in __builtin__'s celldict, though. > Is that what the extra indirection is, or are builtins incorporated into > the module dict somehow? In Guido's proposal, module celldicts sometimes point to builtin's cells. It's set up so that *all* names of builtins get an entry in the module's dict, even names that aren't referenced in the module (this avoids global analysis). 
Their initial entries look like: "len": {NULL, pointer to the "len" cell in the builtins} Setting "len" as a module global (if you ever do that) overwrites the NULL. Then later del'ing "len" again (if you ever do that) restores the NULL. For *most* purposes, a cell with a NULL first pointer acts as if it didn't exist. It's only the eval loop that understands the "deep structure". In the variant I sketched today, there are no cross-dict pointers, and the initial entries look like "len": {the actual value of "len" from builtins, true} instead. Then mutating the builtins requires reaching back into modules and updating their timdicts. In return, access code is simpler+faster, and there aren't semantic changes (compared to today) if the builtins mutate *after* a module's dict is initially populated (Guido's scheme appears vulnerable here in at least two ways). From guido@python.org Mon Feb 11 00:09:29 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 10 Feb 2002 19:09:29 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Your message of "Sun, 10 Feb 2002 16:35:49 CST." <15462.62917.505362.521566@12-248-41-177.client.attbi.com> References: <15462.59536.874232.817836@12-248-41-177.client.attbi.com> <15462.62917.505362.521566@12-248-41-177.client.attbi.com> Message-ID: <200202110009.g1B09TV18217@pcp742651pcs.reston01.va.comcast.net> [Skip] > In the face of exec statements or calls to execfile can't the > compiler just generate the usual LOAD_NAME fallback instead of the > new-fangled LOAD_GLOBAL opcode? Very good! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Feb 11 00:10:53 2002 From: guido@python.org (Guido van Rossum) Date: Sun, 10 Feb 2002 19:10:53 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Your message of "Sun, 10 Feb 2002 17:51:19 EST." 
References: Message-ID: <200202110010.g1B0Asb18230@pcp742651pcs.reston01.va.comcast.net> > [Skip] > > In the face of exec statements or calls to execfile can't the > > compiler just generate the usual LOAD_NAME fallback instead of the new- > > fangled LOAD_GLOBAL opcode? [Tim] > Note that exec doesn't have to be passed a string: you can pass it a > compiled code object just as well. The compiler can't guess how a > code object will be used at the time it's compiled. In theory there > would be nothing to stop exec from rewriting the bytecode in a > compiled code object passed to it, but I doubt we could get Guido to > buy that trick until he first buys rewriting bytecode to set > debugger breakpoints . Arg. So much for that idea. (Although I think the mutable bytecode idea *is* the right idea for setting breakpoints after all.) --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@alum.mit.edu Sun Feb 10 04:21:01 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Sat, 9 Feb 2002 23:21:01 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <15461.51537.381585.439205@gondolin.digicool.com> References: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net> <15461.51537.381585.439205@gondolin.digicool.com> Message-ID: <15461.62765.301509.19821@gondolin.digicool.com> >>>>> "JH" == Jeremy Hylton writes: JH> Case #1: Binding operation responsible for invalidating cache. JH> The module has a dlict for globals that contains three entries: JH> [math, mysin, yikes]. Each is a PyObject *. JH> The module also has a global attrs cache, where each entry is JH> struct { JH> int ce_initialized; /* just a flag */ PyObject **ce_ref; JH> } cache_entry; JH> In the case we're considering, ce_module points to math and JH> ce_module_index is math's index in the globals dlict. It's JH> assigned to when the module object is created and never changes. Just pretend I didn't write this paragraph :-(. 
I was going to describe the other case first, then changed my mind. The previous paragraph describes Case #2. The text before and after this paragraph looks clear to me. Does anyone else agree? I didn't think I had done any hand waving on globals and module attributes in the slides; so I expect that I'm not a good judge of what is hand waving and what is high-level description. Jeremy From aahz@rahul.net Mon Feb 11 00:41:34 2002 From: aahz@rahul.net (Aahz Maruch) Date: Sun, 10 Feb 2002 16:41:34 -0800 (PST) Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <200202110010.g1B0Asb18230@pcp742651pcs.reston01.va.comcast.net> from "Guido van Rossum" at Feb 10, 2002 07:10:53 PM Message-ID: <20020211004134.B799FE8C3@waltz.rahul.net> Guido van Rossum wrote: > >> [Skip] >>> In the face of exec statements or calls to execfile can't the >>> compiler just generate the usual LOAD_NAME fallback instead of the new- >>> fangled LOAD_GLOBAL opcode? > > [Tim] >> Note that exec doesn't have to be passed a string: you can pass it a >> compiled code object just as well. The compiler can't guess how a >> code object will be used at the time it's compiled. In theory there >> would be nothing to stop exec from rewriting the bytecode in a >> compiled code object passed to it, but I doubt we could get Guido to >> buy that trick until he first buys rewriting bytecode to set >> debugger breakpoints . > > Arg. So much for that idea. (Although I think the mutable bytecode > idea *is* the right idea for setting breakpoints after all.) Let me play stupid for a sec: how does a compiled code object get created? Is Tim saying that one can pass foo.bar to exec, where bar() is a function in module foo? If not, why can't we force compile() to generate the slower code? Alternatively, can we change the semantics of exec to require the use of compile() to generate code objects? (compile() on an existing code object would do an explicit rewrite of the bytecode.) 
--
--- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.

From jeremy@alum.mit.edu Sun Feb 10 05:01:31 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Sun, 10 Feb 2002 00:01:31 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <20020211004134.B799FE8C3@waltz.rahul.net>
References: <200202110010.g1B0Asb18230@pcp742651pcs.reston01.va.comcast.net> <20020211004134.B799FE8C3@waltz.rahul.net>
Message-ID: <15461.65195.711574.143348@gondolin.digicool.com>

You can exec a code object and specify the environment to use for names.

Jeremy

>>> def f():
...     print x + y
...
>>> x = 1
>>> y = 3
>>> f()
4
>>> exec f.func_code in {'x':0, 'y':-3}, {}
-3

From tim.one@comcast.net Mon Feb 11 01:14:07 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 10 Feb 2002 20:14:07 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <20020211004134.B799FE8C3@waltz.rahul.net>
Message-ID:

[Aahz]
> Let me play stupid for a sec: how does a compiled code object get
> created?

Explicitly via passing strings to compile/exec/eval or via execfile, or implicitly due to the normal operation of class, def and lambda statements, and the interactive prompt.

> Is Tim saying that one can pass foo.bar to exec, where bar() is a function
> in module foo?

No, foo.bar is a function object, meaning basically that it's a code object bound to a specific name and a specific bag of globals, and whose default argument values (if any) have been computed and frozen based on those globals. You can pass foo.bar.func_code to exec, though (that's the raw code object). Note that marshal can't handle function objects, but can handle code objects, and some people make extremely heavy use of extracting code objects for marshaling, then later unmarshaling and exec'ing them.
When I'm tempted to exec, I'm more likely to use compile() in a separate step (to get better control over errors) and exec the resulting code object. > ... > Alternatively, can we change the semantics of exec to require the use > of compile() to generate code objects? > (compile() on an existing code object would do an explicit rewrite of > the bytecode.) I didn't follow this, but am not sure it would help if I did . From tim.one@comcast.net Mon Feb 11 08:13:46 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 11 Feb 2002 03:13:46 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <15462.60355.540553.195176@12-248-41-177.client.attbi.com> Message-ID: [Skip Montanaro, on def mylen(s): return len(s) ] > Yeah, it's > > TRACK_GLOBAL 'len' > LOAD_FAST > LOAD_FAST > CALL_FUNCTION 1 > UNTRACK_GLOBAL 'len' > RETURN_VALUE > > or something similar. (Stuff in <...> represent array indexes.) > > My scheme makes update of my local copy of __builtins__.len Who is the "me" in "my"? That is, is "my local copy" attached to the frame, or to the function object, or to the module globals, or ...? Since it's accessed via LOAD_FAST, I'm assuming it's attached to the frame object. > the responsibility of the guy who changes the global copy. Also in my variant of Guido's proposal (and the value of len is cached in the module dict there, which tracks all changes to the builtins as they occur). > Most of the time this never changes, Right. > so as the number of accesses to len increase, the average time per > lookup approaches that of a simple LOAD_FAST. You mean number of accesses to len per function call, I think. If I do for i in xrange(1000000): print mylen("abc") I'm going to do a TRACK_GLOBAL and UNTRACK_GLOBAL thingie too for each LOAD_FAST of len, and then the average time per len lookup really has to count the average time for those guys too. 
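
Tim's objection above is an amortization argument: if tracking is set up and torn down once per call, that cost is spread only over the lookups made during that same call. A rough sketch of the arithmetic follows. The function name and all three costs are assumptions in made-up units, chosen only to illustrate the shape of the argument; they are not measurements of any implementation.

```python
def avg_lookup_cost(n_lookups, track=5.0, untrack=5.0, fast=1.0):
    """Average cost per lookup of a tracked global within one call.

    track, untrack and fast are hypothetical costs in arbitrary units:
    one TRACK_GLOBAL, one UNTRACK_GLOBAL, and one LOAD_FAST-style access.
    """
    if n_lookups < 1:
        raise ValueError("need at least one lookup")
    # The setup/teardown pair is paid once; each lookup pays `fast`.
    return (track + untrack) / n_lookups + fast


if __name__ == "__main__":
    # mylen() looks up len once per call, so nothing is amortized.
    print(avg_lookup_cost(1))
    # A loop inside a single call spreads the overhead over many lookups.
    print(avg_lookup_cost(1000))
```

With one lookup per call the overhead dominates, which is why Skip's follow-up restricts tracking to names accessed inside loops.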
From tim.one@comcast.net Mon Feb 11 09:27:27 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 11 Feb 2002 04:27:27 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <15461.47363.911259.672824@gondolin.digicool.com> Message-ID: [Jeremy Hylton] > Here's a brief review of the example function. > > def mylen(s): > return len(s) > > LOAD_BUILTIN 0 (len) > LOAD_FAST 0 (s) > CALL_FUNCTION 1 > RETURN_VALUE > > The interpreter has a dlict for all the builtins. The details don't > matter here. Actually, the details are everything here . > Let's say that len is at index 4. > > The function mylen has an array: > func_builtin_index = [4] # an index for each builtin used in mylen > > The entry at index 0 of func_builtin_index is the index of len in the > interpreter's builtin dlict. It is either initialized when the > function is created or on first use of len. All clear except for the referent of "It" (the subject of the preceding sentence is "The entry at index 0", but that doesn't seem to make much sense as a referent). > (It doesn't matter for the mechanism and there's no need to decide which > is better yet.) > > The module has an md_globals_dirty flag. If it is true, then a > global was introduced dynamically, i.e. a name binding op occurred > that the compiler did not detect statically. Once it becomes true, can md_globals_dirty ever become false again? > The code object has a co_builtin_names that is like co_names except > that it only contains the names of builtins used by LOAD_BUILTIN. > It's there to get the correct behavior when shadowing of a builtin by > a local occurs at runtime. ^^^^^ Can that happen? Or did you mean when shadowing of a builtin by a global occurs at runtime? The LOAD_BUILTIN code below seems most consistent with the "global" rewording. 
> The frame grows a bunch of pointers --
>
>     f_module from the function (which stores it instead of func_globals)
>     f_builtin_names from the code object
>     f_builtins from the interpreter
>
> The implementation of LOAD_BUILTIN 0 is straightforward -- in pidgin C:
>
>     case LOAD_BUILTIN:
>         if (f->f_module->md_globals_dirty) {
>             PyObject *w = PyTuple_GET_ITEM(f->f_builtin_names);

Presumably this is missing an ", oparg" argument.

>             ... /* rest is just like current LOAD_GLOBAL
>                    except that is used PyDLict_GetItem()
>                  */
>         } else {
>             int builtin_index = f->f_builtin_index[oparg];
>             PyObject *x = f->f_builtins[builtin_index];
>             if (x == NULL)
>                 raise NameError
>             Py_INCREF(x);
>             PUSH(x);
>         }

OK, that's the gritty detail I was looking for. When it comes time to code, note that it's better to negate the test and swap the "if" branches (a not-taken branch is usually quicker than a taken branch, and you want to favor the expected case).

Question: couldn't the LOAD_BUILTIN opcode use builtin_index directly as its argument (and so skip one level of indirection)? We know which builtins the interpreter supplies, and the compiler could be taught a fixed correspondence between builtin names and little integers. There are only 114 keys in __builtin__.__dict__ today, so there's plenty of room in an instruction to hold the index. A tuple of std builtin names could also be a C extern shared by everyone, eliminating the need for f_builtin_names.

> The LOAD_GLOBAL opcode ends up looking basically the same, except that
> it doesn't need to check md_globals_dirty.
>
>     case LOAD_GLOBAL:
>         int global_index = f->f_global_index[oparg];
>         PyObject *x = f->f_module->md_globals[global_index];
>         if (x == NULL) {
>             check for dynamically introduced builtin
>         }
>         Py_INCREF(x);
>         PUSH(x);

f_global_index wasn't mentioned before its appearance in this code block. I can guess what it is.
Again I wonder whether it's possible to snip a layer of indirection (for a fixed function and fixed oparg, can f->f_global_index[oparg] change across invocations of LOAD_GLOBAL? I'm guessing "no", in which case a third of the normal-case code is burning cycles without real need). > In the x == NULL case above, we need to take extra care for a builtin > that the compiler didn't expect. It's an odd case. There is a > global for the module named spam The module is named spam, or the global is named spam? I think the latter was intended. > that hasn't yet been assigned to in the module and there's also a > builtin named spam that will be hidden once spam is bound in the module. And can also be revealed again if someone reaches into the module and del's spam again, right? This looks fast, provided it works , and is along the lines of what I had in mind when I first tortured Guido with the idea of dlicts way back when. One major correction: you pronounce it "dee-likt". That's a travesty. I picked the name dlict because it's unpronounceable in any human language -- as befits an unthinkable idea . From mal@lemburg.com Mon Feb 11 11:12:47 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 11 Feb 2002 12:12:47 +0100 Subject: [Python-Dev] Accessing globals without dict lookup References: Message-ID: <3C67A72F.B313E47@lemburg.com> Just a few quick questions before go back into lurcking mode: Will it still be possible to: a) install new builtins in the __builtin__ namespace and have them available in all already loaded modules right away ? b) override builtins (e.g. open()) with my own copies (e.g. to increase security) in a way that makes these new copies override the previous ones in all modules ? Also, how does the new scheme get along with the restricted execution model ? 
(I have a feeling that this model needs some auditing since so many new ways of accessing variables and attributes were introduced since the days of 1.5.2) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From ping@lfw.org Mon Feb 11 13:14:09 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Mon, 11 Feb 2002 07:14:09 -0600 (CST) Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Message-ID: All right -- i have attempted to diagram a slightly more interesting example, using my interpretation of Guido's scheme. http://lfw.org/repo/cells.gif http://lfw.org/repo/cells-big.gif for a bigger image http://lfw.org/repo/cells.ai for the source file The diagram is supposed to represent the state of things after "import spam", where spam.py contains import eggs i = -2 max = 3 def foo(n): y = abs(i) + max return eggs.ham(y + n) How does it look? Guido, is it anything like what you have in mind? A couple of observations so far: 1. There are going to be lots of global-cell objects. Perhaps they should get their own allocator and free list. 2. Maybe we don't have to change the module dict type. We could just use regular dictionaries, with the special case that if retrieving the value yields a cell object, we then do the objptr/cellptr dance to find the value. (The cell objects have to live outside the dictionaries anyway, since we don't want to lose them on a rehashing.) 3. Could we change the name, please? It would really suck to have two kinds of things called "cell objects" in the Python core. 4. I recall Tim asked something about the cellptr-points-to-itself trick. 
Here's what i make of it -- it saves a branch: instead of

    PyObject* cell_get(PyGlobalCell* c)
    {
        if (c->cell_objptr) return c->cell_objptr;
        if (c->cell_cellptr) return c->cell_cellptr->cell_objptr;
    }

it's

    PyObject* cell_get(PyGlobalCell* c)
    {
        if (c->cell_objptr) return c->cell_objptr;
        return c->cell_cellptr->cell_objptr;
    }

This makes no difference when c->cell_objptr is filled, but it saves one check when c->cell_objptr is NULL in a non-shadowed variable (e.g. after "del x"). I believe that's the only case in which it matters, and it seems fairly rare to me that a module function will attempt to access a variable that's been deleted from the module.

Because the module can't know what new variables might be introduced into __builtin__ after the module has been loaded, a failed lookup must finally fall back to a lookup in __builtin__. Given that, it seems like a good idea to set c->cell_cellptr = c when c->cell_objptr is set (for both shadowed and non-shadowed variables). In my picture, this would change the cell that spam.max points to, so that it points to itself instead of __builtin__.max's cell. That is:

    PyObject* cell_set(PyGlobalCell* c, PyObject* v)
    {
        c->cell_objptr = v;
        c->cell_cellptr = c;
    }

This simplifies things further:

    PyObject* cell_get(PyGlobalCell* c)
    {
        return c->cell_cellptr->cell_objptr;
    }

This buys us no branches, which might be a really good thing on today's speculative execution styles.

I know i'm a few messages behind on the discussion -- i'll do some reading to catch up before i say any more. But i hope the diagram is somewhat helpful, anyway.
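
The loopback variant described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: `GlobalCell`, the `NULL` sentinel and the `NameError` check are illustrative stand-ins rather than anything from PEP 280 or the CPython sources, and the un-shadowing path (what `del` should restore) is deliberately left out because the message does not specify it.

```python
NULL = object()  # stands in for a C NULL pointer (assumption for this model)


class GlobalCell:
    """Two-pointer cell: an object pointer plus a pointer to another cell."""

    def __init__(self, objptr=NULL, cellptr=None):
        self.objptr = objptr
        # A cell with no explicit target loops back to itself.
        self.cellptr = self if cellptr is None else cellptr


def cell_set(c, v):
    """Bind a value and make the cell its own lookup target."""
    c.objptr = v
    c.cellptr = c


def cell_get(c):
    """Branch-free fetch; the NULL test models the caller's error check."""
    x = c.cellptr.objptr
    if x is NULL:
        raise NameError("name is not defined")
    return x


if __name__ == "__main__":
    builtin_max = GlobalCell(objptr=max)        # __builtin__.max, loops back
    spam_max = GlobalCell(cellptr=builtin_max)  # spam.max before "max = 3"
    print(cell_get(spam_max) is max)            # the builtin is visible
    cell_set(spam_max, 3)                       # module executes "max = 3"
    print(cell_get(spam_max))                   # the module global shadows it
```

The fetch itself never inspects `objptr` of the starting cell, which is the point of the loopback: shadowed and non-shadowed names take the same path.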
-- ?!ng From ping@lfw.org Mon Feb 11 13:22:27 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Mon, 11 Feb 2002 07:22:27 -0600 (CST) Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Message-ID: On Mon, 11 Feb 2002, Ka-Ping Yee wrote: > This simplifies things further: > > PyObject* cell_get(PyGlobalCell* c) > { > return c->cell_cellptr->cell_objptr; > } I forgot to mention that this would also add loopback cellptrs for the two cells pointed to by __builtin__.abs and __builtin__.max. But hey... in that case the cellptr is always two steps away from the object. So why not just use PyObject**s instead of cells? dict -> ptr -> ptr -> object (Or, if we want to maintain backward compatibility with existing dictionaries, let a cell be an object, so we can check its type, and have it contain just one pointer instead of two?) Am i out to lunch? -- ?!ng From martin@v.loewis.de Mon Feb 11 12:15:59 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 11 Feb 2002 13:15:59 +0100 Subject: [Python-Dev] Speeding up instance attribute access In-Reply-To: <200202081918.g18JIKb03242@pcp742651pcs.reston01.va.comcast.net> References: <200202081918.g18JIKb03242@pcp742651pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > - We need fallbacks for various exceptional cases: I think assignment to __class__ also needs to be considered. Therefore, it may be best if the member array is a separate block (not allocated with the instances). It might also be worthwhile to incorporate __slots__ access into that scheme, to avoid having to find the member descriptor in the class dictionary. 
Regards, Martin From skip@pobox.com Mon Feb 11 14:16:32 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 11 Feb 2002 08:16:32 -0600 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: References: <15462.60355.540553.195176@12-248-41-177.client.attbi.com> Message-ID: <15463.53824.600024.850814@12-248-41-177.client.attbi.com> Tim> [Skip Montanaro, on Tim> def mylen(s): Tim> return len(s) Tim> ] >> Yeah, it's >> >> TRACK_GLOBAL 'len' >> LOAD_FAST >> LOAD_FAST >> CALL_FUNCTION 1 >> UNTRACK_GLOBAL 'len' >> RETURN_VALUE >> >> or something similar. (Stuff in <...> represent array indexes.) >> >> My scheme makes update of my local copy of __builtins__.len Tim> Who is the "me" in "my"? Sorry, should have been "the" instead of "my". TRACK_GLOBAL is responsible for making the original copy. I should have added another argument to it: TRACK_GLOBAL 'len', LOAD_FAST LOAD_FAST CALL_FUNCTION 1 UNTRACK_GLOBAL 'len', RETURN_VALUE Tim> You mean number of accesses to len per function call, I think. Yes. Tim> If I do Tim> for i in xrange(1000000): Tim> print mylen("abc") Tim> I'm going to do a TRACK_GLOBAL and UNTRACK_GLOBAL thingie too for Tim> each LOAD_FAST of len, and then the average time per len lookup Tim> really has to count the average time for those guys too. Actually, no. I originally meant to say "Ignoring the fact that my optimizer would leave this example untouched...", but deleted it while editing the message as more detail than you were asking for. Your example: def mylen(s): return len(s) doesn't access len in a loop, so it would be ignored. On the other hand: for i in xrange(1000000): print mylen("abc") would track mylen (but not xrange). Skip From skip@pobox.com Mon Feb 11 14:26:46 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 11 Feb 2002 08:26:46 -0600 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: References: Message-ID: <15463.54438.240250.933946@12-248-41-177.client.attbi.com> Ping> But hey... 
in that case the cellptr is always two steps away from Ping> the object. So why not just use PyObject**s instead of cells? I think it's because they aren't objects. You need to make the indirection explicit so that when some code does the equivalent of module.abs it realizes it needs to follow the chain. Thanks for the great diagram, btw. I knew if I did something feeble it would get rewritten correctly. Skip From guido@python.org Mon Feb 11 14:31:32 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Feb 2002 09:31:32 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Your message of "Mon, 11 Feb 2002 12:12:47 +0100." <3C67A72F.B313E47@lemburg.com> References: <3C67A72F.B313E47@lemburg.com> Message-ID: <200202111431.g1BEVWJ19544@pcp742651pcs.reston01.va.comcast.net> > Just a few quick questions before go back into lurcking mode: Note that I've moved my design to a new PEP, PEP 280. Tim has added his approach there too. Please read it!!! > Will it still be possible to: > a) install new builtins in the __builtin__ namespace and have them > available in all already loaded modules right away ? > b) override builtins (e.g. open()) with my own copies > (e.g. to increase security) in a way that makes these new > copies override the previous ones in all modules ? Yes, this is the whole point of this design. In the original approach, when LOAD_GLOBAL_CELL finds a NULL in the second cell, it should go back to see if the __builtins__ dict has been modified (the pseudo code doesn't have this yet). Tim's alternative also takes care of this. > Also, how does the new scheme get along with the restricted > execution model ? Yes, again. > (I have a feeling that this model needs some auditing since so many > new ways of accessing variables and attributes were introduced since > the days of 1.5.2) You may be right. 
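
The behaviour asked about in (a) can be checked against ordinary lookup semantics, which any faster scheme would have to preserve. Below is a small sketch in modern Python 3 spelling (`builtins` rather than `__builtin__`). The module name `spam`, the builtin name `mylen` and the helper `call_with_new_builtin` are made up for illustration.

```python
import builtins
import types

# A throwaway module whose function refers to a name that does not exist yet.
spam = types.ModuleType("spam")
exec("def f(s):\n    return mylen(s)\n", spam.__dict__)


def call_with_new_builtin():
    """Install a new builtin after spam is 'loaded', call f, then clean up."""
    builtins.mylen = len
    try:
        # f finds mylen through the builtins fallback, with no re-import.
        return spam.f("abc")
    finally:
        del builtins.mylen


if __name__ == "__main__":
    print(call_with_new_builtin())
```

Before the builtin is installed, and again after it is removed, calling `spam.f` raises `NameError`, which is the dynamic behaviour the cell-based designs have to keep.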
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Mon Feb 11 14:39:35 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 11 Feb 2002 08:39:35 -0600 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: References: Message-ID: <15463.55207.217946.237969@12-248-41-177.client.attbi.com> Ping> All right -- i have attempted to diagram a slightly more Ping> interesting example, using my interpretation of Guido's scheme. Very nice. One case I would like to see covered is that of a global that is deleted. Something like: import eggs i = -2 max = 3 j = 4 def foo(n): y = abs(i) + max return eggs.ham(y + n) del j I presume there would still be an entry in spam's module dict with a NULL objptr. The whole think makes sense to me if it avoids the possible two PyDict_GetItem calls in the LOAD_GLOBAL opcode. As I understand it, if accessed inside a function, LOAD_GLOBAL could be implemented something like this: case LOAD_GLOBAL: cell = func_cells[oparg]; if (cell.objptr) x = cell->objptr; else x = cell->cellptr->objptr; if (x == NULL) { ... error recovery ... break; } Py_INCREF(x); continue; This looks a lot better to me (no complex function calls). What happens in the module's top-level code where there is presumably no func_cells array? Do we simply have two different opcodes, one for use at the global level and one for use in functions? Skip From guido@python.org Mon Feb 11 14:42:54 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Feb 2002 09:42:54 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Your message of "Mon, 11 Feb 2002 07:22:27 CST." 
References: Message-ID: <200202111442.g1BEgsZ19628@pcp742651pcs.reston01.va.comcast.net> > On Mon, 11 Feb 2002, Ka-Ping Yee wrote: > > This simplifies things further: > > > > PyObject* cell_get(PyGlobalCell* c) > > { > > return c->cell_cellptr->cell_objptr; > > } > > I forgot to mention that this would also add loopback cellptrs for > the two cells pointed to by __builtin__.abs and __builtin__.max. > > But hey... in that case the cellptr is always two steps away from > the object. So why not just use PyObject**s instead of cells? > > dict -> ptr -> ptr -> object > > (Or, if we want to maintain backward compatibility with existing > dictionaries, let a cell be an object, so we can check its type, > and have it contain just one pointer instead of two?) > > Am i out to lunch? I think so. Think of max in the example used for your diagram (thanks for that BTW!). The first cell for it contains 3; the second cell for it contains the built-in function 'max'. A double dereference would get the wrong value. Or did I misread your suggestion? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Feb 11 15:02:38 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Feb 2002 10:02:38 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Your message of "Mon, 11 Feb 2002 08:39:35 CST." <15463.55207.217946.237969@12-248-41-177.client.attbi.com> References: <15463.55207.217946.237969@12-248-41-177.client.attbi.com> Message-ID: <200202111502.g1BF2cD19707@pcp742651pcs.reston01.va.comcast.net> > Very nice. One case I would like to see covered is that of a global > that is deleted. Something like: > > > import eggs > > i = -2 > max = 3 > j = 4 > > def foo(n): > y = abs(i) + max > return eggs.ham(y + n) > > del j > > I presume there would still be an entry in spam's module dict with a > NULL objptr. Yes. > The whole think makes sense to me if it avoids the possible two > PyDict_GetItem calls in the LOAD_GLOBAL opcode. 
As I understand it, if
> accessed inside a function, LOAD_GLOBAL could be implemented something like
> this:
>
>     case LOAD_GLOBAL:

Surely you meant LOAD_GLOBAL_CELL.

>         cell = func_cells[oparg];
>         if (cell.objptr) x = cell->objptr;
>         else x = cell->cellptr->objptr;
>         if (x == NULL) {
>             ... error recovery ...
>             break;
>         }
>         Py_INCREF(x);
>         continue;
>
> This looks a lot better to me (no complex function calls).

Here's my version:

    case LOAD_GLOBAL_CELL:
        cell = func_cells[oparg];
        x = cell->objptr;
        if (x == NULL) {
            x = cell->cellptr->objptr;
            if (x == NULL) {
                ... error recovery ...
                break;
            }
        }
        Py_INCREF(x);
        continue;

> What happens in the module's top-level code where there is
> presumably no func_cells array? Do we simply have two different
> opcodes, one for use at the global level and one for use in
> functions?

It could use LOAD_GLOBAL which should use PyMapping_GetItem on the globals dict. Or maybe even LOAD_NAME which should do the same. But we could also somehow create a func_cells array (hm, it would have to be called differently then I suppose).

(I've added these to the FAQs in PEP 280 too.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gward@python.net Mon Feb 11 15:03:38 2002
From: gward@python.net (Greg Ward)
Date: Mon, 11 Feb 2002 10:03:38 -0500
Subject: [Python-Dev] Want to co-design and implement a logging module?
In-Reply-To: <20020204094149.C31089@ActiveState.com>
References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> <20020204094149.C31089@ActiveState.com>
Message-ID: <20020211150338.GA20372@gerg.ca>

On 04 February 2002, Trent Mick volunteered to "do something" about a standard logging module for Python:
> How about I try to have a PEP together within a week or two, and perhaps a
> working base implementation?

Well, it's been exactly 7 days. Trent, are you halfway done yet? ;-)

(Yes, I've been thinking that the solution to the Distutils' verbosity problem lies somewhere down this road.)
Greg -- Greg Ward - Unix geek gward@python.net http://starship.python.net/~gward/ Laziness, Impatience, Hubris. From guido@python.org Mon Feb 11 15:31:33 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Feb 2002 10:31:33 -0500 Subject: [Python-Dev] Speeding up instance attribute access In-Reply-To: Your message of "11 Feb 2002 13:15:59 +0100." References: <200202081918.g18JIKb03242@pcp742651pcs.reston01.va.comcast.net> Message-ID: <200202111531.g1BFVXO19857@pcp742651pcs.reston01.va.comcast.net> > Guido van Rossum writes: > > > - We need fallbacks for various exceptional cases: > > I think assignment to __class__ also needs to be > considered. Therefore, it may be best if the member array is a > separate block (not allocated with the instances). Good point. When __class__ is assigned and the new __class__ has a different layout of the member array then the old, all instance variables must be moved from the member array into a temporary dict, and then from the temporary dict redistributed over the new member array. If you're lucky, the temporary dict is empty after that; otherwise, it becomes the overflow dict. > It might also be worthwhile to incorporate __slots__ access into that > scheme, to avoid having to find the member descriptor in the class > dictionary. __slots__ are really allocated in the object, not in a separate memory block. But I agree it would be nice if they could somehow be integrated. --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh@python.net Mon Feb 11 15:31:53 2002 From: mwh@python.net (Michael Hudson) Date: 11 Feb 2002 15:31:53 +0000 Subject: [Python-Dev] Want to co-design and implement a logging module? 
In-Reply-To: Greg Ward's message of "Mon, 11 Feb 2002 10:03:38 -0500"
References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> <20020204094149.C31089@ActiveState.com> <20020211150338.GA20372@gerg.ca>
Message-ID: <2mheoom3va.fsf@starship.python.net>

Greg Ward writes:

> On 04 February 2002, Trent Mick volunteered to "do something"
> about a standard logging module for Python:
> > How about I try to have a PEP together within a week or two, and perhaps a
> > working base implementation?
>
> Well, it's been exactly 7 days. Trent, are you halfway done yet? ;-)
>
> (Yes, I've been thinking that the solution to the Distutils' verbosity
> problem lies somewhere down this road.)

But I believe that 1.5.2 compatibility is still relevant for distutils, so a logging module in 2.3 is not especially helpful, unless one can come up with some scheme whereby the standalone distutils packages can use a bundled logger and the 2.3 distutils use the library one.

I had a go at implementing a very KISS approach to distutils logging this morning and found what I was doing conflicted horribly with distutils' current practice, so I stopped.

Cheers,
M.

--
Also, remember to put the galaxy back when you've finished, or an angry mob of astronomers will come round and kneecap you with a small telescope for littering.
    -- Simon Tatham, ucam.chat, from Owen Dunn's review of the year

From gward@python.net Mon Feb 11 15:32:01 2002
From: gward@python.net (Greg Ward)
Date: Mon, 11 Feb 2002 10:32:01 -0500
Subject: [Python-Dev] Proposed standard module: Optik
Message-ID: <20020211153201.GA20417@gerg.ca>

Hi all --

I would like to propose adding my Optik module to the standard library. Optik is the all-singing, all-dancing, featureful, extensible, well-documented option-parsing module that I have always wanted. Now I have it, and I like it -- so much that I'd like it to just always be there whenever I fire up Python (2.3 or greater, of course).
Please take a look at http://optik.sourceforge.net/ for the whole story, including all the documentation and code via CVS. Note that Optik is currently distributed as a package with three modules; it's a hair over 1000 lines of text (563 lines of code), though, so could easily be munged into a single file if that's preferred for the standard library. Two good arguments for adding Optik to the standard library: * David Goodger wants to use it for the standard doc-processing tools * I'd like to use it in the Distutils (and ditch distutils.fancy_getopt) Greg -- Greg Ward - Unix bigot gward@python.net http://starship.python.net/~gward/ No man is an island, but some of us are long peninsulas. From guido@python.org Mon Feb 11 15:42:34 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Feb 2002 10:42:34 -0500 Subject: [Python-Dev] Proposed standard module: Optik In-Reply-To: Your message of "Mon, 11 Feb 2002 10:32:01 EST." <20020211153201.GA20417@gerg.ca> References: <20020211153201.GA20417@gerg.ca> Message-ID: <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> > I would like to propose adding my Optik module to the standard library. No immediate objection, although there are some other fancy options packages around, and IMO you have to explain why Optik is better. Can we change the name? Optik is nice for a standalone 3rd party module/package but a bit too fancyful for a standard library module. It could be a new function in getopt: from getopt import OptionParser [...] parser = OptionParser() --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Feb 11 15:41:35 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 11 Feb 2002 16:41:35 +0100 Subject: [Python-Dev] Accessing globals without dict lookup References: <3C67A72F.B313E47@lemburg.com> <200202111431.g1BEVWJ19544@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C67E62F.5A0AD72F@lemburg.com> Guido van Rossum wrote: > > > Just a few quick questions before go back into lurking mode: > > Note that I've moved my design to a new PEP, PEP 280. Tim has added > his approach there too. Please read it!!! Thanks for the answers; looks like I can safely go back into lurking mode :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From gward@python.net Mon Feb 11 16:10:25 2002 From: gward@python.net (Greg Ward) Date: Mon, 11 Feb 2002 11:10:25 -0500 Subject: [Python-Dev] Proposed standard module: Optik In-Reply-To: <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> Message-ID: <20020211161025.GA20794@gerg.ca> On 11 February 2002, Guido van Rossum said: > No immediate objection, although there are some other fancy options > packages around, and IMO you have to explain why Optik is better. Well, here's what I like about Optik: * it ties short options and long options together, so once you define your options you never have to worry about the fact that -f and --file are the same * it's strongly typed: if you say option --foo expects an int, then Optik makes sure the user supplied a string that can be int()'ified, and supplies that int to you * it automatically generates full help based on snippets of help text you supply with each option * it has a wide range of "actions" -- ie. what to do with the value supplied with each option. Eg. 
you can store that value in a variable, append it to a list, pass it to an arbitrary callback function, etc. * you can add new types and actions by subclassing -- how to do this is documented and tested * it's dead easy to implement simple, straightforward, GNU/POSIX- style command-line options, but using callbacks you can be as insanely flexible as you like * provides lots of mechanism and only a tiny bit of policy (namely, the --help and (optionally) --version options -- and you can trash that convention if you're determined to be anti-social) Anyways, read the docs at optik.sourceforge.net for the whole deal. > Can we change the name? Optik is nice for a standalone 3rd party > module/package but a bit too fancyful for a standard library module. Sure, no problem. > It could be a new function in getopt: > > from getopt import OptionParser > [...] > parser = OptionParser() I guess that's OK if we're agreed that Optik is the be-all, end-all option-parsing tool. (I happen to think so, but I'd like to get opinions from a few other python-dev'ers before I let this go to my head.) I'm pretty cool to names like "super_getopt" or "fancy_getopt", despite having perpetrated precisely the latter in the Distutils. ;-( Greg -- Greg Ward - geek-at-large gward@python.net http://starship.python.net/~gward/ Know thyself. If you need help, call the CIA. From gward@python.net Mon Feb 11 16:13:54 2002 From: gward@python.net (Greg Ward) Date: Mon, 11 Feb 2002 11:13:54 -0500 Subject: [Python-Dev] Want to co-design and implement a logging module? 
In-Reply-To: <2mheoom3va.fsf@starship.python.net> References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> <20020204094149.C31089@ActiveState.com> <20020211150338.GA20372@gerg.ca> <2mheoom3va.fsf@starship.python.net> Message-ID: <20020211161354.GB20794@gerg.ca> On 11 February 2002, Michael Hudson said: > But I believe that 1.5.2 compatibility is still relavent for > distutils I'm still catching up on distutils-sig traffic from the past year, so I don't want to overcommit myself here... but I've been thinking that we (I) should do one last Distutils release that is 1.5.2 compatible, and then we can decide if future Distutils releases will stick to 2.0-compatibility, or are allowed to require the version of Python that they go with. However, please *don't* everyone jump in and start a thread about this now. I'll take it up on distutils-sig when I've caught up. > I had a go at implementing a very KISS approach to distutils logging > this morning and found what I was doing conflicted horribly with > distutils' current practice, so I stopped. Probably because the Distutils current practice is an ill-thought-out mishmash. That'll have to be fixed first, I suspect. Sorry. ;-( Greg -- Greg Ward - Unix bigot gward@python.net http://starship.python.net/~gward/ NOBODY expects the Spanish Inquisition! From paul@prescod.net Mon Feb 11 16:16:57 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 11 Feb 2002 08:16:57 -0800 Subject: [Python-Dev] Proposed standard module: Optik References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C67EE79.9C39087F@prescod.net> Guido van Rossum wrote: > > > I would like to propose adding my Optik module to the standard library. > > No immediate objection, although there are some other fancy options > packages around, and IMO you have to explain why Optik is better. 
Maybe we should turn the Optik documentation into a PEP (or at least make a PEP with a pointer to it) so that people with competitive solutions can either suggest improvements or claim that their solution is a better starting point. Paul Prescod From martin@v.loewis.de Mon Feb 11 16:26:00 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 11 Feb 2002 17:26:00 +0100 Subject: [Python-Dev] Proposed standard module: Optik In-Reply-To: <20020211161025.GA20794@gerg.ca> References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> <20020211161025.GA20794@gerg.ca> Message-ID: Greg Ward writes: > > It could be a new function in getopt: > > > > from getopt import OptionParser > > [...] > > parser = OptionParser() > > I guess that's OK if we're agreed that Optik is the be-all, end-all > option-parsing tool. (I happen to think so, but I'd like to get > opinions from a few other python-dev'ers before I let this go to my > head.) I'd also be in favour of providing option parsing through getopt only. If getopt is not enough, extend it (in moderate ways, rather adding customization mechanisms instead of alternatives, etc). If that involves incorporating code from Optik, fine. However, I don't think the standard library should have two modules that do essentially the same thing; such scenarious will raise question whether one is better than the other and which of them is maintained. Regards, Martin From guido@python.org Mon Feb 11 16:28:59 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Feb 2002 11:28:59 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Your message of "Mon, 11 Feb 2002 07:14:09 CST." References: Message-ID: <200202111628.g1BGSxF20159@pcp742651pcs.reston01.va.comcast.net> > All right -- i have attempted to diagram a slightly more interesting > example, using my interpretation of Guido's scheme. [...] > How does it look? Guido, is it anything like what you have in mind? 
Yes, exactly. I've added pointers to your images to PEP 280. Maybe
you can also create a diagram for Tim's "more aggressive" scheme?

> A couple of observations so far:
>
> 1. There are going to be lots of global-cell objects.
>    Perhaps they should get their own allocator and free list.

Yes.

> 2. Maybe we don't have to change the module dict type.
>    We could just use regular dictionaries, with the special
>    case that if retrieving the value yields a cell object,
>    we then do the objptr/cellptr dance to find the value.
>    (The cell objects have to live outside the dictionaries
>    anyway, since we don't want to lose them on a rehashing.)

And who would do the special dance? If PyDict_GetItem, it would add
an extra test to code whose speed is critical in lots of other cases
(plus it would be impossible to create a dictionary containing cells
without having unwanted special magic). If in a wrapper, then
.__dict__[] would return a surprise cell instead of a value.

> 3. Could we change the name, please? It would really suck
>    to have two kinds of things called "cell objects" in
>    the Python core.

Agreed. Or we could add a cellptr to the existing cell objects; or
maybe a scheme could be devised that wouldn't need a cell to have a
cellptr, and then we could use the existing cell objects unchanged.

> 4. I recall Tim asked something about the cellptr-points-to-itself
>    trick. Here's what i make of it -- it saves a branch: instead of
>
>        PyObject* cell_get(PyGlobalCell* c)
>        {
>            if (c->cell_objptr) return c->cell_objptr;
>            if (c->cell_cellptr) return c->cell_cellptr->cell_objptr;
>        }
>
>    it's
>
>        PyObject* cell_get(PyGlobalCell* c)
>        {
>            if (c->cell_objptr) return c->cell_objptr;
>            return c->cell_cellptr->cell_objptr;
>        }

That's what my second "additional idea" in PEP 280 proposes:

| - Make c.cellptr equal to c when a cell is created, so that
|   LOAD_GLOBAL_CELL can always dereference c.cellptr without a NULL
|   check.
>    This makes no difference when c->cell_objptr is filled,
>    but it saves one check when c->cell_objptr is NULL in
>    a non-shadowed variable (e.g. after "del x"). I believe
>    that's the only case in which it matters, and it seems
>    fairly rare to me that a module function will attempt to
>    access a variable that's been deleted from the module.

Agreed. When x is not defined, it doesn't matter how much extra code
we execute as long as we don't dereference NULL. :-)

>    Because the module can't know what new variables might
>    be introduced into __builtin__ after the module has been
>    loaded, a failed lookup must finally fall back to a lookup
>    in __builtin__. Given that, it seems like a good idea to
>    set c->cell_cellptr = c when c->cell_objptr is set (for
>    both shadowed and non-shadowed variables). In my picture,
>    this would change the cell that spam.max points to, so
>    that it points to itself instead of __builtin__.max's cell.
>    That is:
>
>        PyObject* cell_set(PyGlobalCell* c, PyObject* v)
>        {
>            c->cell_objptr = v;
>            c->cell_cellptr = c;
>        }

But now you'd have to work harder when you delete the global again
(i.e. in cell_delete()); the shadowed built-in must be restored.

>    This simplifies things further:
>
>        PyObject* cell_get(PyGlobalCell* c)
>        {
>            return c->cell_cellptr->cell_objptr;
>        }
>
>    This buys us no branches, which might be a really good
>    thing on today's speculative execution styles.

Good idea! (And before I *did* misread your followup, because I
hadn't fully digested this msg. I think you're right that we might be
able to use just a PyObject **; but I haven't fully digested Tim's
more aggressive idea.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Mon Feb 11 16:35:00 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 11:35:00 -0500
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: Your message of "Mon, 11 Feb 2002 08:16:57 PST."
<3C67EE79.9C39087F@prescod.net> References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> <3C67EE79.9C39087F@prescod.net> Message-ID: <200202111635.g1BGZ0R20230@pcp742651pcs.reston01.va.comcast.net> > Maybe we should turn the Optik documentation into a PEP (or at least > make a PEP with a pointer to it) so that people with competitive > solutions can either suggest improvements or claim that their solution > is a better starting point. IMO we don't need a PEP, but we do need to solicit feedback from people with competitive solutions. Can you post something to c.l.py and c.l.py.announce? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Feb 11 16:40:35 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Feb 2002 11:40:35 -0500 Subject: [Python-Dev] Proposed standard module: Optik In-Reply-To: Your message of "11 Feb 2002 17:26:00 +0100." References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> <20020211161025.GA20794@gerg.ca> Message-ID: <200202111640.g1BGeZT20267@pcp742651pcs.reston01.va.comcast.net> > I'd also be in favour of providing option parsing through getopt > only. If getopt is not enough, extend it (in moderate ways, rather > adding customization mechanisms instead of alternatives, etc). If that > involves incorporating code from Optik, fine. However, I don't think > the standard library should have two modules that do essentially the > same thing; such scenarious will raise question whether one is better > than the other and which of them is maintained. I think Optik provides one key idea that makes it better: an options parser object that can be invoked multiple times and each time returns a new options object whose attributes are variables corresponding to various options. I'd be happy to say that the old getopt.getopt() interface will be deprecated. 
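The parser-object idea described above can be sketched roughly as follows. This is a minimal illustration, not Optik's own code: it assumes the standard-library optparse module (the name under which Optik was later added to Python) exposes the same OptionParser interface discussed in this thread, and the helper build_parser and the -n/--count option are made up for the example.

```python
from optparse import OptionParser  # assumption: stdlib descendant of Optik


def build_parser():
    """Hypothetical helper: define the options once, reuse the parser."""
    parser = OptionParser(usage="usage: %prog [options] FILE...")
    # -f and --file are tied together and stored in options.filename
    parser.add_option("-f", "--file", dest="filename", metavar="FILE",
                      help="read data from FILE")
    # type="int" makes the parser convert the string to an int for us
    parser.add_option("-n", "--count", type="int", dest="count", default=1,
                      help="repeat COUNT times")
    return parser


parser = build_parser()

# Each call returns a fresh options object plus the leftover positional args.
first, first_args = parser.parse_args(["-f", "a.txt", "x", "y"])
second, second_args = parser.parse_args(["--file", "b.txt", "--count", "3"])

print(first.filename, first.count, first_args)
print(second.filename, second.count, second_args)
```

Passing an explicit argument list to parse_args() keeps the sketch independent of sys.argv; a real script would normally call it with no arguments.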
--Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Mon Feb 11 16:36:31 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 11 Feb 2002 08:36:31 -0800 Subject: [Python-Dev] Proposed standard module: Optik References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> <3C67EE79.9C39087F@prescod.net> <200202111635.g1BGZ0R20230@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C67F30F.76C4A67A@prescod.net> Guido van Rossum wrote: > > > Maybe we should turn the Optik documentation into a PEP (or at least > > make a PEP with a pointer to it) so that people with competitive > > solutions can either suggest improvements or claim that their solution > > is a better starting point. > > IMO we don't need a PEP, but we do need to solicit feedback from > people with competitive solutions. Can you post something to c.l.py > and c.l.py.announce? I am happy to do an announcement but I feel like there needs to be a place to redirect conversation. Should we set up a mailing list? Or do all interested people want to join comp.lang.python and perhaps use a subject prefix for filtering? "OPT: ..." Paul Prescod From fdrake@acm.org Mon Feb 11 16:41:51 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 11 Feb 2002 11:41:51 -0500 Subject: [Python-Dev] Python 2.2 group missing on SF Patches In-Reply-To: <3C64573B.3412540F@metaslash.com> References: <3C64573B.3412540F@metaslash.com> Message-ID: <15463.62543.247745.715347@grendel.zope.com> Neal Norwitz writes: > There is no 2.2 (or 2.2.1) choice under Group when submitting a patch > on Source Forge. I've added 2.2.x for the release22-maint branch. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido@python.org Mon Feb 11 16:47:07 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Feb 2002 11:47:07 -0500 Subject: [Python-Dev] Proposed standard module: Optik In-Reply-To: Your message of "Mon, 11 Feb 2002 08:36:31 PST." 
<3C67F30F.76C4A67A@prescod.net> References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> <3C67EE79.9C39087F@prescod.net> <200202111635.g1BGZ0R20230@pcp742651pcs.reston01.va.comcast.net> <3C67F30F.76C4A67A@prescod.net> Message-ID: <200202111647.g1BGl7C20309@pcp742651pcs.reston01.va.comcast.net> > I am happy to do an announcement but I feel like there needs to be a > place to redirect conversation. Should we set up a mailing list? Or do > all interested people want to join comp.lang.python and perhaps use a > subject prefix for filtering? "OPT: ..." They can post to python-dev (and even subscribe -- it's open these days) or someone can summarize. I don't expect a huge discussion. Please make it clear in the announcement that followups in c.l.py will be ignored -- they must send at least one email to python-dev to make us aware that they're competing. Ideally, it should compare their solution to Greg's list of key features of Optik, which he posted here: http://mail.python.org/pipermail/python-dev/2002-February/019937.html Please include that link in your announcement. --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh@python.net Mon Feb 11 17:28:11 2002 From: mwh@python.net (Michael Hudson) Date: 11 Feb 2002 17:28:11 +0000 Subject: [Python-Dev] Want to co-design and implement a logging module? In-Reply-To: Greg Ward's message of "Mon, 11 Feb 2002 11:13:54 -0500" References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> <20020204094149.C31089@ActiveState.com> <20020211150338.GA20372@gerg.ca> <2mheoom3va.fsf@starship.python.net> <20020211161354.GB20794@gerg.ca> Message-ID: <2mlme0j5ck.fsf@starship.python.net> Greg Ward writes: > On 11 February 2002, Michael Hudson said: > > But I believe that 1.5.2 compatibility is still relavent for > > distutils > > I'm still catching up on distutils-sig traffic from the past year, so I > don't want to overcommit myself here... 
but I've been thinking that we
> (I) should do one last Distutils release that is 1.5.2 compatible, and
> then we can decide if future Distutils releases will stick to
> 2.0-compatibility, or are allowed to require the version of Python that
> they go with.

I'm not sure that idea will get widespread support.

> However, please *don't* everyone jump in and start a thread about this
> now. I'll take it up on distutils-sig when I've caught up.

But I'll wait until you get caught up.

> > I had a go at implementing a very KISS approach to distutils logging
> > this morning and found what I was doing conflicted horribly with
> > distutils' current practice, so I stopped.
>
> Probably because the Distutils current practice is an ill-thought-out
> mishmash. That'll have to be fixed first, I suspect. Sorry. ;-(

It was more to do with options processing (the fact that basically
speaking all options translate to attributes on some object) than
logging. I suspect I could have used Optik more easily...

I'm also not sure how politic it would be to take an axe to the
interfaces of the various *util modules.

Cheers,
M.

--
You sound surprised. We're talking about a government department
here - they have procedures, not intelligence.
                                        -- Ben Hutchings, cam.misc

From trentm@ActiveState.com Mon Feb 11 17:54:41 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Mon, 11 Feb 2002 09:54:41 -0800
Subject: [Python-Dev] Want to co-design and implement a logging module?
In-Reply-To: <20020211150338.GA20372@gerg.ca>; from gward@python.net on Mon, Feb 11, 2002 at 10:03:38AM -0500 References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> <20020204094149.C31089@ActiveState.com> <20020211150338.GA20372@gerg.ca> Message-ID: <20020211095441.B3536@ActiveState.com> On Mon, Feb 11, 2002 at 10:03:38AM -0500, Greg Ward wrote: > On 04 February 2002, Trent Mick volunteered to "do something" > about a standard logging module for Python: > > How about I try to have a PEP together within a week or two, and perhaps a > > working base implementation? > > Well, it's been exactly 7 days. Trent, are you halfway done yet? ;-) I have my thoughts together. I'll write up and post tonight. Meanwhile I have to get some work done for my employer. :) Regarding Distutils for Python 1.5.2 usage: the potential logging support *could* be back ported and included in the distutils package that gets put together for Python 1.5.2. Trent -- Trent Mick TrentM@ActiveState.com From mal@lemburg.com Mon Feb 11 18:08:32 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 11 Feb 2002 19:08:32 +0100 Subject: [Python-Dev] proposal: add basic time type tothestandardlibrary References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu> <5.1.0.14.2.20020210091537.02026d38@mercury-1.cbu.edu> <5.1.0.14.2.20020210124257.01eb3820@mercury-1.cbu.edu> Message-ID: <3C6808A0.9DCA4CA3@lemburg.com> > I have no desire to compete with the mxDateTime implementation. I want to > look at some of the solutions out there and take the best from everyone and > provide a module that will suit 95-100% of the people. For several reasons, > which I tried to point out in my mails, mxDateTime or Zope's Datetime in > its current states is not suitable. That's a strange conclusion since both of these modules have been around for quite some time (mxDateTime was started in Dec. 
1997) and obviously *are* quite suitable for a large share of Python's
users :-)

BTW, mxDateTime can do quite a bit in terms of i18n:

>>> from mx.DateTime import *
>>> DateTimeFrom('11. Februar 2002')
>>> DateTimeFrom('February, 11 2002')
>>> from mx.DateTime import Locale
>>> Locale.French.str(now())
'lundi 11 f\xe9vrier 2002 19:07:12'
>>> Locale.Spanish.str(now())
'lunes 11 febrero 2002 19:07:19'
>>> Locale.German.str(now())
'Montag 11 Februar 2002 19:07:25'

(hmm, I ought to insert some extra punctuation...)

Nevermind,
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/

From Gerson.Kurz@t-online.de Mon Feb 11 18:07:55 2002
From: Gerson.Kurz@t-online.de (Gerson Kurz)
Date: Mon, 11 Feb 2002 19:07:55 +0100
Subject: [Python-Dev] RE: RFC: Option Parsing Libraries
Message-ID:

I will get beat for this, but: can it be optionally non-case-sensitive? I
know, I know, in time-honoured unix-tradition a commandline should be
dangerous and unforgiving in use, but still, please?

Also, I've just scanned the specs and didn't find some
"rest-of-the-commandline-whatever-that-is" option. As in:

    filename options file1 file2 ... filen

Other than that, it looks pretty neat.

From mal@lemburg.com Mon Feb 11 18:29:04 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Feb 2002 19:29:04 +0100
Subject: [Python-Dev] -U flag
References: <3C6675C0.BE7B42FE@lemburg.com> <15462.55616.207319.531285@12-248-41-177.client.attbi.com>
Message-ID: <3C680D70.D30EB38D@lemburg.com>

Skip Montanaro wrote:
>
> (I think we've had this discussion before...)
>
> MAL> Wait... the -U option was added in order to be able to see how well
> MAL> the 8-bit string / Unicode integration works.
It's a known fact that
> MAL> the Python standard lib is not Unicode compatible yet and that's
> MAL> exactly what the -U option allows you to test (in a very simple
> MAL> way).
>
> If -U is really just a "test" flag, I don't think it should show up in
> "python -h" output.

If no one objects, I'll remove the flag from the -h output. Ok ?

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/

From guido@python.org Mon Feb 11 18:40:51 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 13:40:51 -0500
Subject: [Python-Dev] -U flag
In-Reply-To: Your message of "Mon, 11 Feb 2002 19:29:04 +0100." <3C680D70.D30EB38D@lemburg.com>
References: <3C6675C0.BE7B42FE@lemburg.com> <15462.55616.207319.531285@12-248-41-177.client.attbi.com> <3C680D70.D30EB38D@lemburg.com>
Message-ID: <200202111840.g1BIeq721259@pcp742651pcs.reston01.va.comcast.net>

> If no one objects, I'll remove the flag from the -h output. Ok ?

+1

--Guido van Rossum (home page: http://www.python.org/~guido/)

From mal@lemburg.com Mon Feb 11 18:48:21 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Feb 2002 19:48:21 +0100
Subject: [Python-Dev] -U flag
References: <3C6675C0.BE7B42FE@lemburg.com> <15462.55616.207319.531285@12-248-41-177.client.attbi.com> <3C680D70.D30EB38D@lemburg.com> <200202111840.g1BIeq721259@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C6811F5.93E1607B@lemburg.com>

Guido van Rossum wrote:
>
> > If no one objects, I'll remove the flag from the -h output. Ok ?
>
> +1

Done.
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/

From marklists@mceahern.com Mon Feb 11 19:16:52 2002
From: marklists@mceahern.com (Mark McEahern)
Date: Mon, 11 Feb 2002 11:16:52 -0800
Subject: [Python-Dev] RE: Option Parsing Libraries
In-Reply-To: <3C67F75D.D1EF37DA@prescod.net>
Message-ID:

[Paul Prescod]
> If you have a competitive library, or suggestions for changes to Optik,
> please forward your comments to python-dev mailing list
> (python-dev@python.org).

I love optik. We use it for all of our option parsing. I have one
feature request related to error handling. I've attached sample code
below that shows a common thing I end up doing: raising an error if a
required option is missing. I guess it's not really an option, then, is
it?

Anyway, I searched optik's documentation for some way to "inspect" the
options collection itself for the original information used when creating
the option. In the code below, you'll notice the requiredVar method takes
a description parameter. It would be nice to be able to do something like
this instead:

    if var is None:
        parser.error("Missing: %s" % options.var.description)

In fact, it would seem that this itself is so common it would be
"built-in" to optik itself. So that all I have to do is declare an option
as required when I add it with add_option?

Thanks,

// mark

#! /usr/bin/env python
# testError.py

from optik import OptionParser

def requiredVar(parser, options, var, description):
    """Raise a parser error if var is None."""
    if var is None:
        # Here's where it'd be nice to have access to the attributes of the
        # options; at the very least, so I could say which option is missing
        # without having to pass in the description.
        parser.error("Missing: %s" % description)

def parseCommandLine():
    """Parse the command line options and return (options, args)."""

    usage = """usage: %prog [options]
Testing optik's error handling.
"""

    parser = OptionParser(usage)
    parser.add_option("-f", "--file", type="string", dest="filename",
                      metavar="FILE", help="read data from FILE",
                      default=None)

    options, args = parser.parse_args()

    requiredVar(parser, options, options.filename, "filename")

    # Return the parsed values so that main() can unpack them.
    return options, args

def main():
    options, args = parseCommandLine()

if __name__=="__main__":
    main()

From gward@python.net Mon Feb 11 19:29:04 2002
From: gward@python.net (Greg Ward)
Date: Mon, 11 Feb 2002 14:29:04 -0500
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: <200202111647.g1BGl7C20309@pcp742651pcs.reston01.va.comcast.net>
References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> <3C67EE79.9C39087F@prescod.net> <200202111635.g1BGZ0R20230@pcp742651pcs.reston01.va.comcast.net> <3C67F30F.76C4A67A@prescod.net> <200202111647.g1BGl7C20309@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020211192904.GA22667@gerg.ca>

On 11 February 2002, Guido van Rossum said:
> They can post to python-dev (and even subscribe -- it's open these
> days) or someone can summarize. I don't expect a huge discussion.
> Please make it clear in the announcement that followups in c.l.py will
> be ignored -- they must send at least one email to python-dev to make
> us aware that they're competing.

A good starting point for modules that compete with Optik can be found
in the "User Interfaces" section of the Vaults of Parnassus:

    http://www.vex.net/parnassus/apyllo.py/808292924

The contenders are:

    Cmdline
    Getargs
    GetPotPython
    Optik
    Options
    pypopt

...wow! I must confess, it didn't occur to me to check Parnassus before
writing Optik; I just arrogantly assumed that I would get it right.

It would be interesting to hear from the authors *and users* of the
above modules.
If you're curious what Optik's users have said, see the optik-users list archive: http://www.geocrawler.com/redir-sf.php3?list=optik-users Greg -- Greg Ward - nerd gward@python.net http://starship.python.net/~gward/ If you're not part of the solution, you're part of the precipitate. From gward@python.net Mon Feb 11 19:40:12 2002 From: gward@python.net (Greg Ward) Date: Mon, 11 Feb 2002 14:40:12 -0500 Subject: [Python-Dev] RE: RFC: Option Parsing Libraries In-Reply-To: References: Message-ID: <20020211194012.GB22667@gerg.ca> On 11 February 2002, Gerson Kurz said: > I will get beat for this, but: can it be optionally non-case-sensitive? I > know, I know, in time-honoured unix-tradition a commandline should be > dangerous and unforgiving in use, but still, please? Interesting idea; should be trivial given a case-insensitive dictionary. And hasn't such a beast been bandied about as an example of subclassing built-in types with Python 2.2? Anyways, that's an Optik feature requests, and belongs on optik-users@lists.sourceforge.net. If you're serious, take it up there. > Also, I've just scanned the specs and didn't find some > "rest-of-the-commandline-whatever-that-is" option. As in: > > filename options file1 file2 ... filen When you do this: parser = OptionParser(...) (options, args) = parser.parse_args() then args is the list of positional arguments left over after parsing options. But again, that's a question about Optik, and belongs (for now) on the optik-users list. Greg -- Greg Ward - just another Python hacker gward@python.net http://starship.python.net/~gward/ Paranoia is simply an optimistic outlook on life. From oren-py-d@hishome.net Mon Feb 11 20:09:54 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 11 Feb 2002 22:09:54 +0200 Subject: [Python-Dev] patch: speed up name access by up to 80% Message-ID: <20020211220954.A12061@hishome.net> Problem: Python name lookup in dictionaries is relatively slow. Possible solutions: 1. 
Find ways to bypass dictionary lookup.
2. Optimize the hell out of dictionary lookup.

Examples of approach #1 are the existing fastlocals mechanism, PEP 266,
PEP 267 and the recent proposal by GvR for speeding up instance attribute
access. These proposals all face major difficulties resulting from the
dynamic nature of Python's namespace and require tricky code-analysis
techniques to check for assignments that may change the visible namespace.

I have chosen to try #2. The biggest difficulty with this approach is that
dictionaries are already very well optimized :-) But I have found that
Python dictionaries are mostly optimized for general-purpose use, not for
use as namespace for running code. There are some characteristics unique
to namespace access which have not been used for optimization:

* Lookup keys are always interned strings.
* The local/global/builtin fallback means that most accesses fail.

The patch adds the function PyDict_GetItem_Fast to dictobject. This
function is equivalent to PyDict_GetItem but is much faster for lookup
using interned string keys and for lookups with a negative result.
LOAD_NAME and LOAD_GLOBAL have been converted to use this function.

Here are the timings for 200,000,000 accesses to fastlocals, locals,
globals and builtins for Python 2.2 with and without the fastnames patch:

                2.2        2.2+fastnames
---------------------------------------
builtin.py      68.540s    37.900s
global.py       54.570s    34.020s
local.py        41.210s    32.780s
fastlocal.py    24.530s    24.540s

Fastlocals are still significantly faster than locals, but not by such a
wide margin. You can notice that the speed differences between locals,
globals and builtins are almost gone.

The machine is a Pentium III at 866MHz running Linux. Both interpreters
were compiled with "./configure ; make". The test code is at the bottom of
this message. You will not see any effect on the results of pybench
because it uses only fastlocals inside the test loops.
Real code that uses a lot of globals and builtins should see a significant improvement. Get the patch: http://www.tothink.com/python/fastnames/fastnames.patch The optimization techniques used: * Inline code PyDict_GetItem_Fast is actually an inline macro. If the item is stored at the first hash location it will be returned without any function calls. It also requires that the key used for the entry is the same interned string as the key. This fast inline version is possible because the first argument is known to be a valid dictionary and the second argument is known to be a valid interned string with a valid cached hash. * Key comparison by pointer If the inline macro fails PyDict_GetItem_Fast2 is called. This function searches for an entry with a key identical to the requested key. This search is faster than lookdict or lookdict_string because there are no expensive calls to external compare-by-value functions. Another small speedup is gained by not checking for free slots since this function is never used for setting items. If this search fails, the dictionary's ma_lookup function is called. * Interning of entry keys One reason the quick search could fail is because the entry was set using direct access to __dict__ instead of standard name assignment and therefore the entry key is not interned. In this case the entry key is replaced with the interned lookup key. The next fast search for the same key will succeed. There is a very good chance that it will be handled by the inline macro. * Negative entries In name lookup most accesses fail. In order to speed them up negative entries can mark a name as "positively not there", usually detected by the macro without requiring any function calls. Negative entries have the interned key as their me_key and me_value is NULL. Negative entries occupy real space in the hash table and cannot be reused as empty slots. 
This optimization technique is not practical for general purpose dictionaries because some types of code would quickly overload the dictionary with many negative entries. For name lookup the number of negative entries is bounded by the number of global and builtin names referenced by the code that uses the dictionary as a namespace.

This new type of slot has a surprisingly small impact on the rest of dictobject.c. Only one assertion had to be removed to accommodate it. All other code treats it as either an active entry (key !=NULL, !=dummy) or a deleted entry (value == NULL, key != NULL) and just happens to do the Right Thing for each case.

If an entry with the same key as a negative entry is subsequently inserted into the dictionary it will overwrite the negative entry and be reflected immediately in the namespace. There is no caching and therefore no cache coherency issues.

Known bugs: Negative entries do not resize the table. If there is not enough free space in the table they are simply not inserted. Assumes CACHE_HASH, INTERN_STRINGS without checking.

Future directions: It should be possible to apply this to more than just LOAD_NAME and LOAD_GLOBAL: attributes, modules, setting items, etc. The hit rate for the inline macro varies from 100% in most simple cases to 0% in cases where the first hash position in the table happens to be occupied by another entry. Even in these cases it is still very fast, but I want to get more consistent performance. I am starting to experiment with probabilistic techniques that shuffle entries in the hash table and try to ensure that the entries accessed most often are kept in the first hash positions as much as possible.
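The lookup order described above can be modelled in a few lines. What follows is only a toy sketch in present-day Python 3, not the C patch: the NameTable class, its fixed size, its linear probing and the load_name helper are all assumptions made for illustration. Only the ideas (a first-slot check by identity, identity-only probing, and negative entries that are overwritten by a later store) come from the description.

```python
import sys

NULL = object()  # plays the role of a NULL me_value in the C struct


class NameTable:
    """Toy open-addressing table keyed by interned strings (hypothetical model)."""

    def __init__(self, size=8):
        self.size = size              # fixed size: this sketch never resizes
        self.slots = [None] * size    # each slot is None or [key, value]
        self.fast_hits = 0            # lookups answered by the first slot alone

    def _probe(self, key):
        """Yield slot indexes in probe order, starting at the first hash position."""
        start = hash(key) % self.size
        for step in range(self.size):
            yield (start + step) % self.size

    def store(self, name, value):
        key = sys.intern(name)
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is None:
                self.slots[i] = [key, value]
                return
            if slot[0] is key:        # also overwrites a negative entry
                slot[1] = value
                return
        raise MemoryError("table full")

    def lookup(self, name):
        """Return the bound value, or NULL if the name is not bound here."""
        key = sys.intern(name)
        first = self.slots[hash(key) % self.size]
        if first is not None and first[0] is key:
            self.fast_hits += 1       # the "inline macro" case
            return first[1]
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is None:
                self.slots[i] = [key, NULL]   # remember the miss
                return NULL
            if slot[0] is key:        # compare by identity, never by value
                return slot[1]
        return NULL                   # no free slot: no negative entry is added


def load_name(name, *namespaces):
    """Try each namespace in order, as in the local/global/builtin fallback."""
    for namespace in namespaces:
        value = namespace.lookup(name)
        if value is not NULL:
            return value
    raise NameError(name)
```

In this model the second lookup of a builtin name finds the negative entry in the first slot of the globals table, so the miss costs one identity comparison before the search moves on to the builtins table.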
Test code:

* builtin.py

class f:
    for i in xrange(10000000):
        hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;
        hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;

* global.py

hex = 5
class f:
    for i in xrange(10000000):
        hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;
        hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;

* local.py

class f:
    hex = 5
    for i in xrange(10000000):
        hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;
        hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;

* fastlocal.py

def f():
    hex = 5
    for i in xrange(10000000):
        hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;
        hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;
f()

Oren

From gward@python.net Mon Feb 11 20:37:16 2002
From: gward@python.net (Greg Ward)
Date: Mon, 11 Feb 2002 15:37:16 -0500
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: <20020211192904.GA22667@gerg.ca>
References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> <3C67EE79.9C39087F@prescod.net> <200202111635.g1BGZ0R20230@pcp742651pcs.reston01.va.comcast.net> <3C67F30F.76C4A67A@prescod.net> <200202111647.g1BGl7C20309@pcp742651pcs.reston01.va.comcast.net> <20020211192904.GA22667@gerg.ca>
Message-ID: <20020211203716.GA22837@gerg.ca>

--TB36FDmn/VVEgNH/
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On 11 February 2002, I said:
> A good starting point for modules that compete with Optik can be found
> in "User Interfaces" section of the Vaults of Parnassus:
>
>   http://www.vex.net/parnassus/apyllo.py/808292924

OK, I've looked at all the option-parsing packages listed in Parnassus. I've read the docs for all of them, and flipped through the source for some of them.
Here's the executive summary: * only one of them, arglist.py by Ben Wolfson, has a nice OO design similar to Optik * the one feature that several of the competition offer but Optik does not (yet) is the ability to specify an option that *may* take a value, but doesn't necessarily *have to* take a value. Ironically, this is one of my requirements for the Distutils, motivated by the --home option to the "install" command. I think arglist.py is the only serious contender here. Based on my cursory inspection, all of the others have rather deep flaws. (Eg. they implement a non-standard syntax, or they do all their work at import time rather than providing a class to instantiate and do option-parsing work, or they have painful/awkward/hairy programming interface.) I'll attach my full notes. Anyone else who feels like doing this should start at the *bottom* of the list on Parnassus, since I devoted progressively less time and energy to each package along the way. ;-) Greg -- Greg Ward - geek gward@python.net http://starship.python.net/~gward/ Save energy: be apathetic. --TB36FDmn/VVEgNH/ Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="competition.txt" THE COMPETITION --------------- arglist.py (Feb 2002) author: Ben Wolfson url: http://home.uchicago.edu/~wolfson/Python/ * fairly clean OO design, much like Optik: Option for each option, Argument for a collection of options * results of parsing command line (option values and leftover positional args) are accessible through Arguments object -- no separate "option values" object * handles short options much like Optik: "-ffoo" and "-f foo" seem to work, as does "-avx" where -a, -v, -x all value-less options * subtly different notion of "default value" from Optik -- if an option takes a value, and no default value is provided, the user must provide a value. 
With Optik (<= 1.2), if an option takes a value the user must always provide a value; the default value is for when that option isn't present at all. * dependent on Python 2.2 -- even uses a metaclass! (not that it really *needs* to) * no strong typing, much weaker callback interface; but "behaviors" are like Optik's "actions" -- there just aren't as many of them * main advantage over Optik: it's possible to define an option that takes a value, but doesn't require a value * error-handling? not sure -- think it raises an exception * long option abbreviations allowed? not sure Cmdline (1.0) author: Daniel Gindikin url: http://members.home.com/gindikin/dev/python/cmdline/ * weird API: just import the module and it does everything then * slightly weird user interface: in addition to the standard "--foo=bar" and "--foo bar", "foo=bar" and "foo:bar" also work: yuck * very cool error-handling: prints out the command-line, underlining the option with errors -- nice! * rudimentary type-checking -- if you ask for an integer value, and user supplied a string, it bombs with a useful error message * not extensible -- everything's done at module-level, no classes or anything nice like that * long option abbreviations allowed? not sure Getargs (1.3) author: ? (Ivan Van Laningham?) url: http://www.pauahtun.org/ftp.html * painful, clunky interface (eg. None specifies a boolean option, 0j a "count" option, 0 an integer option, 0.0 a float option) * I don't see how to specify a plain old string option! * documentation is confusing and poorly written * "long options" are Tk-style, eg. "-file", rather than GNU-style "--file" * order of options is lost -- not clear what happens if user does -ffoo -fbar? what is the value of -f? * long options can be abbreviated * last updated 1999 GetPot Python author: Frank-Rene Schaefer url: http://getpot.sourceforge.net/ * written in C++, so an extension is needed... or is it? 
not clear * docs cover C++ version * LGPL'd * seems to define a mini-language for defining command-line options; not sure where you're supposed to put those .pot source files Options author: Tim Colles Johan Vromans * port of Perl's Getopt::Long * not really OO or extensible, as near as I could tell * possible to specify option types and required-ness, but the syntax is hairy -- I think it's all done in one fell swoop (single call to GetOptions() does everything) --TB36FDmn/VVEgNH/-- From tim.one@comcast.net Mon Feb 11 20:45:56 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 11 Feb 2002 15:45:56 -0500 Subject: [Python-Dev] Proposed standard module: Optik In-Reply-To: <20020211153201.GA20417@gerg.ca> Message-ID: [Greg Ward] > ... > Two good arguments for adding Optik to the standard library: > > * David Goodger wants to use it for the standard doc-processing tools IIRC, David was the most recent person to have a massive getopt enhancement path rejected, so if you've got his backing both of the key players are covered . From skip@pobox.com Mon Feb 11 20:48:36 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 11 Feb 2002 14:48:36 -0600 Subject: [Python-Dev] patch: speed up name access by up to 80% In-Reply-To: <20020211220954.A12061@hishome.net> References: <20020211220954.A12061@hishome.net> Message-ID: <15464.11812.114930.543584@beluga.mojam.com> Oren> Problem: Python name lookup in dictionaries is relatively slow. ... [ lots of interesting stuff elided ] ... This looks pretty much like a no-brainer to me, assuming it stands up to close scrutiny. The only thing I'd change is that if PyDict_GetItem_Fast is a macro that only works for interned strings, I'd change its name to reflect its use: PyDict_GET_ITEM_INTERNED or something similar. As a further test, I suggest you give the pystone benchmark a whirl, not because it's such a kickass benchmark, but because it occasionally fiddles global variable values. 
I imagine it will do just fine and probably run a bit faster to boot. Skip From skip@pobox.com Mon Feb 11 20:52:30 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 11 Feb 2002 14:52:30 -0600 Subject: [Python-Dev] patch: speed up name access by up to 80% In-Reply-To: <20020211220954.A12061@hishome.net> References: <20020211220954.A12061@hishome.net> Message-ID: <15464.12046.67498.758229@beluga.mojam.com> The only thing I'd change is that if PyDict_GetItem_Fast is a macro that only works for interned strings, I'd change its name to reflect its use: PyDict_GET_ITEM_INTERNED or something similar. One other naming change is to prefix PyDict_GetItem_Fast2 with an underscore, since the comments about its use make it clear that it's an internal function. Skip From oren-py-d@hishome.net Mon Feb 11 21:29:32 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 11 Feb 2002 16:29:32 -0500 Subject: [Python-Dev] patch: speed up name access by up to 80% In-Reply-To: <15464.12046.67498.758229@beluga.mojam.com> References: <20020211220954.A12061@hishome.net> <15464.12046.67498.758229@beluga.mojam.com> Message-ID: <20020211212932.GA82642@hishome.net> On Mon, Feb 11, 2002 at 02:52:30PM -0600, Skip Montanaro wrote: > > The only thing I'd change is that if PyDict_GetItem_Fast is a macro that > only works for interned strings, I'd change its name to reflect its use: > PyDict_GET_ITEM_INTERNED or something similar. > > One other naming change is to prefix PyDict_GetItem_Fast2 with an > underscore, since the comments about its use make it clear that it's an > internal function. Done and up on the same URL. PyDict_GetItem_Fast -> PyDict_GETITEM_INTERNED PyDict_GetItem_Fast2 -> _PyDict_GetItem_Interned Oren From tim.one@comcast.net Mon Feb 11 21:57:03 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 11 Feb 2002 16:57:03 -0500 Subject: [Python-Dev] Want to co-design and implement a logging module? 
In-Reply-To: <20020211095441.B3536@ActiveState.com> Message-ID: [Trent Mick] > I have my thoughts together. I'll write up and post tonight. > Meanwhile I have to get some work done for my employer. :) No problem: Guido says this is your top priority . everyone-works-for-guido!-ly y'rs - tim From neal@metaslash.com Mon Feb 11 22:26:01 2002 From: neal@metaslash.com (Neal Norwitz) Date: Mon, 11 Feb 2002 17:26:01 -0500 Subject: [Python-Dev] patch: speed up name access by up to 80% References: <20020211220954.A12061@hishome.net> Message-ID: <3C6844F9.954E1DA2@metaslash.com> Oren Tirosh wrote: > > Problem: Python name lookup in dictionaries is relatively slow. > http://www.tothink.com/python/fastnames/fastnames.patch I tried this patch (*) by running the regression tests: make && time ./python -E -tt Lib/test/regrtest.py All the expected tests passed and there were no failures, this is good. The bad news is that it was slower. It took 42 user seconds longer with the patch than without. Before patch: real 3m1.031s user 1m19.480s sys 0m2.400s After patch: real 3m38.071s user 1m51.760s sys 0m2.790s The box is Linux 2.4, Athlon 650, 256 MB. (*) Pretty sure this was patch #1, running sum yields: 53200 10. But it shouldn't matter, since it was only a name change, right? Neal From oren-py-d@hishome.net Mon Feb 11 22:55:38 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 11 Feb 2002 17:55:38 -0500 Subject: [Python-Dev] patch: speed up name access by up to 80% In-Reply-To: <3C6844F9.954E1DA2@metaslash.com> References: <20020211220954.A12061@hishome.net> <3C6844F9.954E1DA2@metaslash.com> Message-ID: <20020211225538.GA93506@hishome.net> On Mon, Feb 11, 2002 at 05:26:01PM -0500, Neal Norwitz wrote: > Oren Tirosh wrote: > > > > Problem: Python name lookup in dictionaries is relatively slow. 
> > > http://www.tothink.com/python/fastnames/fastnames.patch > > I tried this patch (*) by running the regression tests: > > make && time ./python -E -tt Lib/test/regrtest.py > > All the expected tests passed and there were no failures, this is good. > The bad news is that it was slower. It took 42 user seconds longer > with the patch than without.

I have tried this and got the same results for the patched and unpatched versions (+-1 second). The regression tests spend most of their time on things like threads, sockets, signals, etc. that have a lot of variance and are not really affected by name lookup speed.

I got strange results when comparing against the python2.2 RPM package (some faster, some slower). I didn't start to get consistent results until I used two freshly compiled interpreters. I wonder with what options this package was compiled.

Oren

From neal@metaslash.com Mon Feb 11 23:20:01 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 11 Feb 2002 18:20:01 -0500
Subject: [Python-Dev] patch: speed up name access by up to 80%
References: <20020211220954.A12061@hishome.net> <3C6844F9.954E1DA2@metaslash.com> <20020212004640.A20174@hishome.net>
Message-ID: <3C6851A1.83609558@metaslash.com>

Oren Tirosh wrote: > > On Mon, Feb 11, 2002 at 05:26:01PM -0500, Neal Norwitz wrote: > > I tried this patch (*) by running the regression tests: > > > > make && time ./python -E -tt Lib/test/regrtest.py > > > > All the expected tests passed and there were no failures, this is good. > > The bad news is that it was slower. It took 42 user seconds longer > > with the patch than without. > > I tried this and the results were identical for both version (+-1 second) > The regression tests most of their time on things like threads, sockets, > signals and stuff like that which is barely affected by this patch and > has a lot of other sources of variance. Some subset of the regression > tests may be good as a benchmark, though.
> > I also got very strange results (some faster, some slower) with the > python2.2 RPM package and didn't start to get consistent results until I > used a freshly compiled interpreter for both the reference and DUT. I want > to see how the package was compiled and why it got such strange results. I was surprised by the results. I am using the latest version from CVS, plus I have some outstanding changes that would be unlikely to cause a conflict. They deal with removing consecutive line numbers (there is a patch on SF) and speeding up conditionals. Nothing that is remotely close to dictionaries. Here are all the options from the compile: gcc -c -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fkeep-inline-functions -fno-inline -fprofile-arcs -ftest-coverage -I. -I./Include -DHAVE_CONFIG_H It's possible that no inlining is hurting, but I would still have expected your patch to be faster than without. I also wouldn't expect the test coverage to be hurting (-fprofile-arcs -ftest-coverage), but that is possible. It would be nice if someone else could duplicate either of our results. Neal From wolfson@uchicago.edu Mon Feb 11 23:31:49 2002 From: wolfson@uchicago.edu (Ben Wolfson) Date: Mon, 11 Feb 2002 17:31:49 -0600 Subject: [Python-Dev] Re: RFC: Option Parsing Libraries References: Message-ID: <200202112333.g1BNX5Z24781@midway.uchicago.edu> I have an option-parsing module at http://home.uchicago.edu/~wolfson/Python/arglist.py ; from reading Optik's docs it isn't as featureful but it does answer the list given in http://mail.python.org/pipermail/python-dev/2002-February/019937.html. I haven't used Optik, but my module seems simpler to use. 
It supports arbitrary callbacks, can be strongly typed (insofar as passing, e.g., the int function as a callback will generate an error if the value isn't int-able), recognizes the identity of short and long forms, and has a reasonable range of actions--if an option has no argument, by default it records if it appeared and how often; if it does, it can append multiple occurrences to a list or keep only the last. Callbacks aren't as flexible as Optik's (they are functions of one argument) but with nested scopes that probably wouldn't be terribly problematic.

I'm currently in the process of re-writing arglist.py in my spare time since the code is rather messy, so if it's seen as a contender I could add features.

-- BTR BEN WOLFSON HAS RUINED ROCK MUSIC FOR A GENERATION -- Crgre Jvyyneq

From greg@cosc.canterbury.ac.nz Mon Feb 11 23:39:35 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Feb 2002 12:39:35 +1300 (NZDT)
Subject: [Python-Dev] RE: Option Parsing Libraries
In-Reply-To:
Message-ID: <200202112339.MAA20826@s454.cosc.canterbury.ac.nz>

Mark McEahern : > raising an error if a required option is missing. I guess it's not > really an option, then, is it?

Maybe it should be called an argument parser instead of an option parser. Although "Argik" doesn't have quite the same ring to it. :-(

Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From aahz@rahul.net Mon Feb 11 23:44:54 2002 From: aahz@rahul.net (Aahz Maruch) Date: Mon, 11 Feb 2002 15:44:54 -0800 (PST) Subject: [Python-Dev] RE: Option Parsing Libraries In-Reply-To: <200202112339.MAA20826@s454.cosc.canterbury.ac.nz> from "Greg Ewing" at Feb 12, 2002 12:39:35 PM Message-ID: <20020211234455.744A5E8C5@waltz.rahul.net> Greg Ewing wrote: > > Although "Argik" doesn't have quite the same ring to it. :-( Would you like some roast turkey and mashed potatoes with that? (Sorry, an old in-joke for people who've been on netnews for more than a decade.) -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From neal@metaslash.com Mon Feb 11 23:57:46 2002 From: neal@metaslash.com (Neal Norwitz) Date: Mon, 11 Feb 2002 18:57:46 -0500 Subject: [Python-Dev] patch: speed up name access by up to 80% References: <20020211220954.A12061@hishome.net> <3C6844F9.954E1DA2@metaslash.com> <20020211225538.GA93506@hishome.net> Message-ID: <3C685A7A.4D39A35D@metaslash.com> Oren Tirosh wrote: > > On Mon, Feb 11, 2002 at 05:26:01PM -0500, Neal Norwitz wrote: > > Oren Tirosh wrote: > > > > > > Problem: Python name lookup in dictionaries is relatively slow. > > > > > http://www.tothink.com/python/fastnames/fastnames.patch > > > > I tried this patch (*) by running the regression tests: > > > > make && time ./python -E -tt Lib/test/regrtest.py > > > > All the expected tests passed and there were no failures, this is good. > > The bad news is that it was slower. It took 42 user seconds longer > > with the patch than without. > > I have tried this and got the same results for the patched and unpatched > versions (+-1 second). 
The regression tests spend most of their time on > things like threads, sockets, signals, etc that have a lot of variance > and are not really affected by name lookup speed. I rebuilt everything from scratch and got results similar to Oren's, ie, roughly the same. This time I took off the test-coverage flags. (Sorry, I must have had them off for stock, but on with the Oren's patch). Before patch: real 2m57.416s user 1m12.830s sys 0m2.580s After patch: real 2m56.017s user 1m14.960s sys 0m2.380s I still have inlines turned off. Neal From tim.one@comcast.net Tue Feb 12 00:01:04 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 11 Feb 2002 19:01:04 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <200202111628.g1BGSxF20159@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Guido] > ... > I think you're right that we might be able to use just a PyObject **; but > I haven't fully digested Tim's more aggressive idea.) The overwhelming thrust of Tim's variant is to reduce the (by far) most frequent namespace operation to this: case LOAD_GLOBAL_CELL: cell = func_cells[oparg]; x = cell->objptr; /* note: not two levels, just one */ if (x != NULL) { Py_INCREF(x); continue; } ... error recovery ... break; *Everything* else follows from that; it's really a quite minor variant of the original proposal, consisting mostly of changes to spelling details due to having a different kind of cell. The sole big change is requiring that mutations to builtins propagate at once to their cached values in module celldicts. 
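A minimal Python sketch of that variant, under stated assumptions: the Cell, Builtins and CellDict classes and their method names are hypothetical, and Python 3 objects stand in for the C structs. It only illustrates the two points made above, namely that the load path dereferences a cell once, and that a change to a builtin is pushed at once into every module cell that caches it.

```python
NULL = object()  # stands in for a C NULL pointer


class Cell:
    """One cell per name used by a module; objptr is the cached value."""

    def __init__(self):
        self.objptr = NULL
        self.is_global = False   # True while a module-level binding exists


class Builtins:
    """Hypothetical builtins namespace that pushes changes to every module."""

    def __init__(self, **values):
        self.values = dict(values)
        self.modules = []        # celldicts that cache our values

    def set(self, name, value):
        self.values[name] = value
        for module in self.modules:
            module.builtin_changed(name, value)


class CellDict:
    """Hypothetical module namespace built from cells."""

    def __init__(self, builtins):
        self.builtins = builtins
        self.cells = {}
        builtins.modules.append(self)

    def cell(self, name):
        """Return the cell for name, creating it pre-filled from builtins."""
        if name not in self.cells:
            new = Cell()
            new.objptr = self.builtins.values.get(name, NULL)
            self.cells[name] = new
        return self.cells[name]

    def set_global(self, name, value):
        c = self.cell(name)
        c.objptr, c.is_global = value, True

    def del_global(self, name):
        c = self.cell(name)
        if not c.is_global:
            raise NameError(name)
        # Uncover the builtin (if any) right away, so the cache never goes stale.
        c.objptr, c.is_global = self.builtins.values.get(name, NULL), False

    def builtin_changed(self, name, value):
        c = self.cells.get(name)
        if c is not None and not c.is_global:
            c.objptr = value     # propagate at once to the cached value


def load_global_cell(func_cells, oparg):
    """Fast path: one dereference, then a NULL check."""
    x = func_cells[oparg].objptr
    if x is not NULL:
        return x
    raise NameError("cell %d is unbound" % oparg)
```

Because every store and delete keeps objptr current, the failure branch in load_global_cell has nothing to recover: reaching it means the name really is unbound.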
I believe Jeremy's scheme *could* do better than this for builtins, but not the way it's currently set up (I don't see why we can't define a fixed bijection between the standard builtin names and a range of contiguous little integers, and use that fixed bijection everywhere; the compiler could exploit this global (in the cross-module sense) bijection directly for LOAD_BUILTIN offsets, eliminating all the indirection for standard builtins, and eliminating the code-object-specific vectors of referenced builtin names; note that I don't care about speeding access to builtins with non-standard names -- fine by me if they're handled via LOAD_GLOBAL instead, and fall into its "oops! it's not a module global after all" case). From jeremy@alum.mit.edu Tue Feb 12 00:14:25 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Mon, 11 Feb 2002 19:14:25 -0500 Subject: [Python-Dev] patch: speed up name access by up to 80% In-Reply-To: <3C685A7A.4D39A35D@metaslash.com> References: <20020211220954.A12061@hishome.net> <3C6844F9.954E1DA2@metaslash.com> <20020211225538.GA93506@hishome.net> <3C685A7A.4D39A35D@metaslash.com> Message-ID: <15464.24161.581535.548441@gondolin.digicool.com> So simple benchmark programs are a lot more interesting. I'd pick pystone, test_hmac, and test_htmlparser. Jeremy From greg@cosc.canterbury.ac.nz Tue Feb 12 00:03:17 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 12 Feb 2002 13:03:17 +1300 (NZDT) Subject: [Python-Dev] RE: Option Parsing Libraries In-Reply-To: <20020211234455.744A5E8C5@waltz.rahul.net> Message-ID: <200202120003.NAA20832@s454.cosc.canterbury.ac.nz> aahz@rahul.net (Aahz Maruch): > Greg Ewing wrote: > > > > Although "Argik" doesn't have quite the same ring to it. :-( > > Would you like some roast turkey and mashed potatoes with that? No, thank you. I was just sneezing and hiccupping at the same time. 
:-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From martin@v.loewis.de Tue Feb 12 00:10:42 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Tue, 12 Feb 2002 01:10:42 +0100 Subject: [Python-Dev] Incorporating Expat Message-ID: <200202120010.g1C0AgM01277@mira.informatik.hu-berlin.de> At IPC10, I discussed with Fred a strategy for incorporating Expat 1.95.2 into Python. I've now implemented most of this, consisting of: - adding the lib/ directory of Expat as Modules/expat - changing setup.py to build the included Expat library into the extension module - likewise for PCbuild/pyexpat.dsp - dropping support for older Expat versions from pyexpat (support for older Python version is still maintained) AFAICT, the only missing part is to add the relevant changes to Modules/Setup.in, and to update various documentation files. Please make sure to pick up new directories when updating your CVS sandbox. If you find any problems with that change, please let me know. Regards, Martin From guido@python.org Tue Feb 12 04:30:43 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Feb 2002 23:30:43 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: Your message of "Mon, 11 Feb 2002 19:01:04 EST." References: Message-ID: <200202120430.g1C4UhV29560@pcp742651pcs.reston01.va.comcast.net> [Ping, suggesting to always dereference the cell twice] > But hey... in that case the cellptr is always two steps away from > the object. So why not just use PyObject**s instead of cells? [Tim, sketching a scheme that always dereferences the cell once] > The sole big change is requiring that mutations to builtins > propagate at once to their cached values in module celldicts. Let's combine these ideas. 
Suppose there's a vector of pointers to pointers to objects, indexed by some index calculated by the compiler. Then the fast track in LOAD_GLOBAL_CELL could look like this:

    case LOAD_GLOBAL_CELL:
        x = *globals_vector[oparg];
        if (x != NULL) {
            Py_INCREF(x);
            continue;
        }
        ... handle uncommon cases and errors here ...

Here, globals_vector[i] is usually the address of the me_value slot of a PyDictEntry in either the globals dict or the builtins dict. These are subclasses of dict that trap assignment, deletion, and rehashing. There's a special C-global variable

    PyObject *unfilled = NULL;

whose content is always NULL; when globals_vector is initialized, every element of it is set to &unfilled.

The code to handle uncommon cases and errors does a "lookdict" operation on the globals dict using the name of the global variable, which it gets from (e.g.) globals_names[oparg]. This requires some opening up of the dict implementation; lookdict is an internal routine that returns a PyDictEntry *, call it e. If e->me_value != NULL, we set globals_vector[oparg] to &e->me_value, and we're done. Otherwise, we do another lookdict on the builtins dict, and again if e->me_value != NULL, set globals_vector[oparg] to &e->me_value. Otherwise, we raise a NameError.

Now we need to take care of a number of additional special cases that could invalidate the pointers we're collecting in globals_vector. The following things may invalidate those pointers:

- globals_vector[i] points to a builtin, and a global with the same name is created

- globals_vector[i] points to either a builtin or a global, and the dict into whose hashtable it points is rehashed (as the result of adding an item)

- globals_vector[i] points to a builtin or global, which is deleted

- globals_vector[i] points to a builtin, and the special global named __builtins__ is assigned to (switching to a different builtins dict)

To deal with all these cases, our dict subclass keeps a list of weak references.
The builtins dict has weak references pointing to all globals dicts that shadow this builtins dict (because of rexec there can be multiple builtins dicts); each globals dict has weak references pointing to all the globals_vector structures that reference it. On a rehash, all entries in each affected globals_vector are reset to &unfilled. The uncommon case handling code will gradually populate them again. On assignment or deletion it might pay off to be a little more careful and only invalidate the entry in the globals_vector corresponding to the affected name. (In particular, assignment to a global that's already set, and deletion of a global that doesn't shadow a built-in, should probably be handled somewhat efficiently.)

The globals_vector structure should contain a pointer to the corresponding globals_names array, and it should also contain a reference to the globals dict into whose hashtable it may point, to keep it alive. So it should probably be an object that contains a vector of pointers to pointers in addition to some other stuff.

The globals_vector may be shared by all code objects compiled together; this makes it similar to the dlict. But the overflow handling is quite different, and by pointing directly into the hash table it is possible to handle all globals and builtins uniformly.

I expect that the implementation won't be particularly hard; the lookdict operation already exists and we can easily subclass dict to trap the setitem and delitem operations. We will have to be careful not to use PyDict_SetItem() and PyDict_DelItem() on these subclasses, but that should be easy: I think that the only offenders here are STORE_NAME and friends, and these are exactly the operations that we're going to change anyway. (STORE_NAME or STORE_GLOBAL will become a bit slower, because of the check whether it needs to update any globals_vector structures; but that's OK since we're speeding up the corresponding LOAD operation quite a bit.)
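The scheme can be tried out in a rough Python 3 model. Everything named here (Entry, TrackedDict, GlobalsVector, the watchers set) is an assumption made for illustration; Python cannot point into a hash table, so a small Entry object stands in for the me_value slot. As a simplification, each vector registers directly with both dicts instead of going through the builtins-to-globals chain of weak references, and every invalidation resets the whole vector rather than a single entry.

```python
import weakref

NULL = object()  # stands in for a C NULL pointer


class Entry:
    """Stands in for the me_value slot of a PyDictEntry."""

    def __init__(self, value=NULL):
        self.me_value = value


UNFILLED = Entry()  # shared entry whose me_value is always NULL


class TrackedDict:
    """Hypothetical dict that tells interested vectors when entries may move."""

    def __init__(self, **items):
        self.entries = {name: Entry(value) for name, value in items.items()}
        self.watchers = weakref.WeakSet()   # GlobalsVector objects to reset

    def lookdict(self, name):
        return self.entries.get(name)

    def setitem(self, name, value):
        entry = self.entries.get(name)
        if entry is not None:
            entry.me_value = value          # already set: pointers stay valid
            return
        self.entries[name] = Entry(value)   # new name: may shadow or "rehash"
        self._invalidate()

    def delitem(self, name):
        del self.entries[name]
        self._invalidate()

    def _invalidate(self):
        for vector in self.watchers:
            vector.reset()


class GlobalsVector:
    """Per-code-unit vector of entry references, filled in on demand."""

    def __init__(self, names, globals_dict, builtins_dict):
        self.names = list(names)            # plays the role of globals_names
        self.globals = globals_dict         # strong references keep dicts alive
        self.builtins = builtins_dict
        self.slots = [UNFILLED] * len(self.names)
        self.slow_path_calls = 0
        globals_dict.watchers.add(self)
        builtins_dict.watchers.add(self)

    def reset(self):
        self.slots = [UNFILLED] * len(self.names)

    def load(self, oparg):
        x = self.slots[oparg].me_value      # fast track
        if x is not NULL:
            return x
        return self._load_slow(oparg)

    def _load_slow(self, oparg):
        self.slow_path_calls += 1
        name = self.names[oparg]
        for namespace in (self.globals, self.builtins):
            entry = namespace.lookdict(name)
            if entry is not None and entry.me_value is not NULL:
                self.slots[oparg] = entry   # remember where the value lives
                return entry.me_value
        raise NameError(name)
```

After the first load of each name the slow path is not taken again until a name is added or deleted, at which point the vector refills gradually on the next loads.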
(I'd add this to PEP 280 but I'll wait for Tim to shoot holes in it first. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.one@comcast.net Tue Feb 12 04:31:39 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 11 Feb 2002 23:31:39 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <15463.55207.217946.237969@12-248-41-177.client.attbi.com>
Message-ID:

[Skip Montanaro]
> ...
> The whole thing makes sense to me if it avoids the possible two
> PyDict_GetItem calls in the LOAD_GLOBAL opcode. As I understand it, if
> accessed inside a function, LOAD_GLOBAL could be implemented
> something like this:
>
>     case LOAD_GLOBAL:
>         cell = func_cells[oparg];
>         if (cell->objptr) x = cell->objptr;
>         else x = cell->cellptr->objptr;
>         if (x == NULL) {
>             ... error recovery ...
>             break;
>         }
>         Py_INCREF(x);
>         continue;
>
> This looks a lot better to me (no complex function calls).

Something much like that. Guido added code to the PEP (280). My suggested modifications reduce it to:

    case LOAD_GLOBAL_CELL:
        cell = func_cells[oparg];
        x = cell->objptr;
        if (x != NULL) {
            Py_INCREF(x);
            continue;
        }
        ... error recovery ...
        break;

Another difference is hiding in the "... error recovery ..." elisions. In Guido's scheme, this must also include code to deal with the possibility that a global went away and thereby uncovered a builtin that popped into existence after the module globals were initialized. Then it's still a non-error case, but the cell->cellptr has gotten out of synch with reality. In the variation, the caches are never allowed to get out of synch, so "... error recovery ..." there should really be "... error reporting ...": you can't get there in the variant unless NameError is certain.

Hmm: We *all* seem to be missing a PUSH(x), so all of our schemes are dead wrong . Speaking of which, why does LOAD_FAST waste time checking against NULL twice?!
case LOAD_FAST: x = GETLOCAL(oparg); if (x == NULL) { format_exc_check_arg( PyExc_UnboundLocalError, UNBOUNDLOCAL_ERROR_MSG, PyTuple_GetItem(co->co_varnames, oparg) ); break; } Py_INCREF(x); PUSH(x); if (x != NULL) continue; break; I'll fix that ... From tim.one@comcast.net Tue Feb 12 04:45:06 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 11 Feb 2002 23:45:06 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <200202111628.g1BGSxF20159@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Ping] >> 1. There are going to be lots of global-cell objects. >> Perhaps they should get their own allocator and free list. [Guido] > Yes. No no no no no, and 5*-1 beats a +1 even from the BDFL . Vanilla pymalloc is perfect for this: many small objects. A custom free list for cells is a waste of code, because a cell never goes away until the module does: cells will not "churn". We'll get a lot of them, but most of them will stay alive until the program ends, so the tiny performance gain you may be able to get from a thoroughly specialized free list "in theory" will never be realized in practice. From barry@zope.com Tue Feb 12 05:09:06 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 12 Feb 2002 00:09:06 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path Message-ID: <15464.41842.664484.307330@anthem.wooz.org> I have a bit of a dilemma when it comes to sys.path and the location of the site-packages directory. The problem comes when someone is using Mailman 2.1 with Python 2.2. The latter comes with the email package, which is in the standard library. Through some contributions, my standalone email package now supports multibyte character sets in RFC-compliant ways (e.g. splitting long headers correctly). The question is, how do I get the updated package to Python 2.2 users? The standalone email package is a simple distutils thingie with a directory and a bunch of .py files. distutils sticks this in site-packages. 
But an "import email" will always get the standard library version instead of the site-packages version because site.py /appends/ site-packages to sys.path instead of prepending it. I can work around this by adding my own path-hacking code before any import of email.* modules. This is a bit ugly because now it means that the proper functioning of the application depends on import order, and that's nasty. So the question is: why does site.py append site-packages instead of prepending it to sys.path? If there's a valid reason, I don't remember it, and I'm currently blind to any valuable use case. If there's no good reason, it would seem to me that the following use case is better served by prepending: - We want to provide an enhanced version, or a fixed version of a module or package. Distribute it w/distutils and do a normal install. As long as you don't start Python w/ -S, you'll always get the improved version. Don't want the improved version? Start Python w/ -S or just don't ever install the new package. I'm mostly looking for rationale right now, before I try to decide whether it's something worth debating and/or changing. Thanks, -Barry From tim.one@comcast.net Tue Feb 12 05:09:55 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 12 Feb 2002 00:09:55 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <200202120430.g1C4UhV29560@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Guido] > ... > (I'd add this to PEP 280 but I'll wait for Tim to shoot holes in it > first. :-) At first sight this scheme made sense and was very appealing. Alas, I've been staring at the computer non-stop for 11 hours, my eyes are losing focus, and I just remembered I forgot to eat today. IOW, you'll have to wait for Tuesday to get shot . For now, I really liked that the indirection vector fills in adaptively, to speed accesses that are actually getting made. I don't know that that's an objective advantage, but I loved the image. 
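The append-versus-prepend behaviour Barry asks about in the sys.path thread above can be reproduced in a few lines. This is only an illustrative sketch: the module name demo_email, the helper first_match and the two temporary directories are made up for the demonstration and merely stand in for the standard library and site-packages.

```python
import importlib
import os
import sys
import tempfile


def first_match(search_path, name="demo_email"):
    """Import `name` with `search_path` placed ahead of the normal sys.path."""
    saved = list(sys.path)
    sys.modules.pop(name, None)          # forget any earlier import
    importlib.invalidate_caches()
    try:
        sys.path[:] = search_path + saved
        return importlib.import_module(name).ORIGIN
    finally:
        sys.path[:] = saved              # restore the interpreter's own path
        sys.modules.pop(name, None)


root = tempfile.mkdtemp()
stdlib_dir = os.path.join(root, "stdlib")        # stand-in for the standard library
site_dir = os.path.join(root, "site-packages")   # stand-in for site-packages
for directory, origin in ((stdlib_dir, "stdlib"), (site_dir, "site-packages")):
    os.makedirs(directory)
    with open(os.path.join(directory, "demo_email.py"), "w") as handle:
        handle.write("ORIGIN = %r\n" % origin)

# site-packages appended after the library directory (what site.py does):
appended = first_match([stdlib_dir, site_dir])
# site-packages prepended (what Barry is asking about):
prepended = first_match([site_dir, stdlib_dir])
print(appended, prepended)
```

The first directory on the path that contains the module wins, which is why an add-on package installed into site-packages cannot shadow a standard library package of the same name under the current ordering.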
From oren-py-d@hishome.net Tue Feb 12 07:05:14 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 12 Feb 2002 09:05:14 +0200
Subject: [Python-Dev] patch: speed up name access by up to 80%
In-Reply-To: <15464.24161.581535.548441@gondolin.digicool.com>; from jeremy@zope.com on Mon, Feb 11, 2002 at 07:14:25PM -0500
References: <20020211220954.A12061@hishome.net> <3C6844F9.954E1DA2@metaslash.com> <20020211225538.GA93506@hishome.net> <3C685A7A.4D39A35D@metaslash.com> <15464.24161.581535.548441@gondolin.digicool.com>
Message-ID: <20020212090514.A22361@hishome.net>

On Mon, Feb 11, 2002 at 07:14:25PM -0500, Jeremy Hylton wrote:
> So simple benchmark programs are a lot more interesting.
>
> I'd pick pystone, test_hmac, and test_htmlparser.

test_htmlparser (x100):  0m29.950s  0m29.730s
test_hmac (x1000):       0m16.480s  0m15.720s  (lower is better)
pystone:                 11261.3    11494.3    (higher is better)

A small, but measurable improvement. You can see below that most accesses
are still to fastlocals and, of course, the code has some real work to do
other than looking up names.

test_htmlparser:
  362331 fastlocal non-dictionary lookups
  60106 inline dictionary lookups
  10554 fast dictionary lookups
  151 slow dictionary lookups

test_hmac:
  13959 fastlocal non-dictionary lookups
  9920 inline dictionary lookups
  7548 fast dictionary lookups
  240 slow dictionary lookups

pystone:
  1447094 fastlocal non-dictionary lookups
  502190 inline dictionary lookups
  111549 fast dictionary lookups
  111 slow dictionary lookups

Does anyone have an example of a program that relies on a lot of global and
builtin name accesses?

Meanwhile I'm going to start working on LOAD_ATTR.
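The lookup counts above are easier to compare as proportions. The short calculation below is just arithmetic on the posted numbers; the names counts and summarize are hypothetical, and nothing here comes from the patch itself.

```python
# Lookup counts as posted: (fastlocal, inline, fast, slow) per benchmark.
counts = {
    "test_htmlparser": (362331, 60106, 10554, 151),
    "test_hmac": (13959, 9920, 7548, 240),
    "pystone": (1447094, 502190, 111549, 111),
}


def summarize(fastlocal, inline, fast, slow):
    """Return the fastlocal share of all lookups and the inline:fast ratio."""
    total = fastlocal + inline + fast + slow
    return {
        "fastlocal_share": fastlocal / total,
        "inline_to_fast": inline / fast,
    }


summary = {name: summarize(*row) for name, row in counts.items()}
for name, stats in summary.items():
    print("%-16s fastlocal %5.1f%%  inline:fast %.1f:1"
          % (name, 100 * stats["fastlocal_share"], stats["inline_to_fast"]))
```

The share of fastlocal accesses bounds how much any speedup of global and builtin lookups can show up in the overall timings.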
if-the-evidence-doesn't-fit-the-theory...-ly yours, Oren From ping@lfw.org Tue Feb 12 07:45:30 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Tue, 12 Feb 2002 01:45:30 -0600 (CST) Subject: [Python-Dev] Zest Message-ID: A few people have expressed interest in the mail-archiving project i mentioned on Developer's Day (see http://lfw.org/python/pydev-sample for some sample output). I've registered a SourceForge project named "Zest" under which i'll be doing this work, and created a mailing list for anyone who wants to discuss it. Please direct any thoughts and comments about Zest to zest-devel@lists.sf.net. I don't expect to be blabbing a great deal on the list until i've got a better prototype ready, but it's good to have a place for the project to reside, so that ideas and conversations don't get lost. (Thanks for the prod, Barry.) If you're interested, please join the list: http://lists.sf.net/lists/listinfo/zest-devel Sorry for the diversion. -- ?!ng From rsc@plan9.bell-labs.com Tue Feb 12 08:44:19 2002 From: rsc@plan9.bell-labs.com (Russ Cox) Date: Tue, 12 Feb 2002 03:44:19 -0500 Subject: [Python-Dev] a different approach to argument parsing Message-ID: [Hi. I'm responsible for the Plan 9 port of Python; I typically just lurk here.] Regarding the argument parsing discussion, it seems like many of the "features" of the various argument parsing packages are aimed at the fact that in C (whence this all originated) the original getopt interface wasn't so great. To use getopt, you end up specifying the argument set twice: once to the parser and then once when processing the list of returned results. Packages like Optik make this a little better by letting you wrap up the actual processing in some form and hand that to the parser too. Still, you have to wrap up your argument parsing into little actions; the getopt style processing loop is usually a bit clearer. 
Ultimately, I find getopt unsatisfactory because of the duplication; and I
find Optik and other similar packages unsatisfactory because of the
contortions you have to go through to invoke them. I don't mean to pick on
Optik, since many others appear to behave in similar ways, but it seems to
be the yardstick. For concreteness, I'd much rather write:

    if o=='-n' or o=='--num':
        ncopies = opt.optarg(opt.needtype(int))

than:

    parser.add_option("-n", "--num", action="store", type="int", dest="ncopies")

The second strikes me as clumsy at best.

The Plan 9 argument parser (for C) avoids these problems by making the
parser itself small enough to be a collection of preprocessor macros.
Although the implementation is ugly, the external interface that
programmers see is trivial. A modified version of the example at
http://optik.sourceforge.net would be rendered:

    char *usagemessage =
    "usage: example [-f FILE] [-h] [-q] who where\n"
    "\n"
    " -h show this help message\n"
    " -f FILE write report to FILE\n"
    " -q don't print status messages to stdout\n";

    void
    usage(void)
    {
        write(2, usagemessage, strlen(usagemessage));
        exits("usage");
    }

    void
    main(int argc, char **argv)
    {
        ...
        ARGBEGIN{
        case 'f':
            report = EARGF(usage());
            break;
        case 'q':
            verbose = 0;
            break;
        case 'h':
        default:
            usage();
        }ARGEND
        if(argc != 2)
            usage();
        ...

[This is documented at http://plan9.bell-labs.com/magic/man2html/2/ARGBEGIN,
for anyone who is curious.]

Notice that the argument parsing machinery only gets the argument
parameters in one place, and is kept so simple because it is driven by what
happens in the actions: if I run "example -frsc" and the f option case
doesn't call EARGF() to fetch the "rsc", the next iteration through the
loop will be for option 'r'; a priori there's no way to tell.

Now that Python has generators, it is easy to do a similar sort of thing,
so that the argument parsing can be kept very simple.
The running example would be written using the attached argument parser as:

    usagemessage=\
    '''usage: example.py [-h] [-f FILE] [-n N] [-q] who where
    -h, --help show this help message
    -f FILE, --file=FILE write report to FILE
    -n N, --num=N print N copies of the report
    -q, --quiet don't print status messages to stdout
    '''

    def main():
        opt = OptionParser(usage=usagemessage)
        report = 'default.file'
        ncopies = 1
        verbose = 1
        for o in opt:
            if o=='-f' or o=='--file':
                report = opt.optarg()
            elif o=='-n' or o=='--num':
                ncopies = opt.optarg(opt.typecast(int, 'integer'))
            elif o=='-q' or o=='--quiet':
                verbose = 0
            else:
                opt.error('unknown option '+o)
        if len(opt.args()) != 2:
            opt.error('incorrect argument count')
        print 'report=%s, ncopies=%s verbose=%s' % (report, ncopies, verbose)
        print 'arguments: ', opt.args()

It's fairly clear what's going on, and the option parser itself is very
simple too. While it may not have all the bells and whistles that some
packages do, I think its simplicity makes most of them irrelevant. It or
something like it might be the right approach to take to present a simpler
interface.

The simplicity of the interface has the benefit that users (potentially
anyone who writes a Python program) don't have to learn a lot of stuff to
parse their command-line arguments. Suppose I want to write a program with
an option that takes two arguments instead of one. Given the Optik-style
example it's not at all clear how to do this. Given the above example,
there's one obvious thing to try: call opt.optarg() twice. That sort of
thing.

Addressing the benchmark set by Optik:

[1]
> * it ties short options and long options together, so once you
>   define your options you never have to worry about the fact that
>   -f and --file are the same

Here the code does that for you, and if you want to use some other
convention, you're not tied to anything. (You do have to tie -f and --file
in the usage message too, see answer to [3].)

[2]
> * it's strongly typed: if you say option --foo expects an int,
>   then Optik makes sure the user supplied a string that can be
>   int()'ified, and supplies that int to you

There are plenty of ways you could consider adding this. The easiest is
what I did in the example. The optarg argument fetcher takes a function to
transform the argument before returning. Here, our function calls
opt.error() if the argument cannot be converted to an int. The added bells
and whistles that Optik adds (choice sets, etc.) can be added in this
manner as well, as external functions that the parser doesn't care about,
or as internally-supplied helper functions that the user can call if he
wants.

[3]
> * it automatically generates full help based on snippets of
>   help text you supply with each option

This is the one shortcoming: you have to write the usage message yourself.
I feel that the benefit of having much clearer argument parsing makes it
worth bearing this burden. Also, tools like Optik have to work fairly hard
to present the usage message in a reasonable manner, and if it doesn't do
what you want you either have to write extension code or just write your
own usage message anyway. I'd rather give this up and get the rest of the
benefits.

[4]
> * it has a wide range of "actions" -- ie. what to do with the
>   value supplied with each option. Eg. you can store that value
>   in a variable, append it to a list, pass it to an arbitrary
>   callback function, etc.

Here the code provides the widest possible range of actions: you run
arbitrary code for each option, and it's all in one place rather than
scattered.

[5]
> * you can add new types and actions by subclassing -- how to
>   do this is documented and tested

The need for new actions is obviated by not having actions at all. The need
for new types could be addressed by the argument transformer, although I'm
not really happy with that and wouldn't mind seeing it go away.
In particular,

    ncopies = opt.optarg(opt.typecast(int, 'integer'))

seems a bit more convoluted and slightly ad hoc compared to the
straightforward:

    try:
        ncopies = int(opt.optarg())
    except ValueError:
        opt.error(opt.curopt+' requires an integer argument')

especially when the requirements get complicated, like the integer has to
be prime. Perhaps a hybrid is best, using a collection of standard
transformers for the common cases and falling back on actual code for the
tough ones.

[6]
> * it's dead easy to implement simple, straightforward, GNU/POSIX-
>   style command-line options, but using callbacks you can be as
>   insanely flexible as you like

Here, ditto, except you don't have to use callbacks in order to be as
insanely flexible as you like.

[7]
> * provides lots of mechanism and only a tiny bit of policy (namely,
>   the --help and (optionally) --version options -- and you can
>   trash that convention if you're determined to be anti-social)

In this version there is very little mechanism (no need for lots), and no
policy. It would be easy enough to add the --help and --version hacks as a
standard subclass.

Anyhow, there it is. I've attached the code for the parser, which I just
whipped up tonight. If people think this is a promising thing to explore
and someone else wants to take over exploring, great. If yes promising but
no takers, I'm willing to keep at it.
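
The hybrid mentioned under [5] above (standard transformers for the common cases, ordinary code for the tough ones) might look roughly like the following. It is a sketch only, written in current Python rather than the 2.2 of the attached parser, and the names typecast, as_int and as_prime are hypothetical; they are not part of the attached opt.py.

```python
class OptionError(Exception):
    """Raised when an option argument fails validation."""


def typecast(convert, description):
    """Build a transformer that converts a raw argument or reports an error."""
    def transform(option, raw):
        try:
            return convert(raw)
        except ValueError:
            raise OptionError("%s requires %s argument" % (option, description))
    return transform


# A "standard transformer" for the common case.
as_int = typecast(int, "an integer")


def as_prime(option, raw):
    """A tough case: reuse the standard transformer, then add plain code."""
    n = as_int(option, raw)
    if n < 2 or any(n % d == 0 for d in range(2, int(n ** 0.5) + 1)):
        raise OptionError("%s requires a prime argument" % option)
    return n


ncopies = as_int("-n", "3")
stride = as_prime("--stride", "7")
```

The point of the split is that the complicated requirement stays ordinary code, while the common conversions need no try/except at each call site.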

Russ

--- opt.py
from __future__ import generators
import sys, copy

class OptionError(Exception):
    pass

class OptionParser:
    def __init__(self, argv=sys.argv, usage=None):
        self.argv0 = argv[0]
        self.argv = argv[1:]
        self.usage = usage

    def __iter__(self):
        # this assumes the "
        while self.argv:
            if self.argv[0]=='-' or self.argv[0][0]!='-':
                break
            a = self.argv.pop(0)
            if a=='--':
                break
            if a[0:2]=='--':
                i = a.find('=')
                if i==-1:
                    self.curopt = a
                    yield self.curopt
                    self.curopt = None
                else:
                    self.curarg = a[i+1:]
                    self.curopt = a[0:i]
                    yield self.curopt
                    if self.curarg: # wasn't fetched with optarg
                        self.error(self.curopt+' does not take an argument')
                    self.curopt = None
                continue
            self.curarg = a[1:]
            while self.curarg:
                a = self.curarg[0:1]
                self.curarg = self.curarg[1:]
                self.curopt = '-'+a
                yield self.curopt
                self.curopt = None

    def optarg(self, fn=lambda x:x):
        if self.curarg:
            ret = self.curarg
            self.curarg=''
        else:
            try:
                ret = self.argv.pop(0)
            except IndexError:
                self.error(self.curopt+' requires argument')
        return fn(ret)

    def _typecast(self, t, x, desc=None):
        try:
            return t(x)
        except ValueError:
            d = desc
            if d == None:
                d = str(t)
            self.error(self.curopt+' requires '+d+' argument')

    def typecast(self, t, desc=None):
        return lambda x: self._typecast(t, x, desc)

    def args(self):
        return self.argv

    def error(self, msg):
        if self.usage != None:
            sys.stderr.write('option error: '+msg+'\n\n'+self.usage)
            sys.stderr.flush()
            sys.exit(0)
        else:
            raise OptionError(), msg

########
import sys

usagemessage=\
'''usage: example.py [-h] [-f FILE] [-n N] [-q] who where
-h, --help show this help message
-f FILE, --file=FILE write report to FILE
-n N, --num=N print N copies of the report
-q, --quiet don't print status messages to stdout
'''

def main():
    opt = OptionParser(usage=usagemessage)
    report = 'default.file'
    ncopies = 1
    verbose = 1
    for o in opt:
        if o=='-f' or o=='--file':
            report = opt.optarg()
        elif o=='-n' or o=='--num':
            ncopies = opt.optarg(opt.typecast(int, 'integer'))
        elif o=='-q' or o=='--quiet':
            verbose = 0
        else:
            opt.error('unknown option '+o)
    if len(opt.args()) != 2:
        opt.error('incorrect argument count')
    print 'report=%s, ncopies=%s verbose=%s' % (report, ncopies, verbose)
    print 'arguments: ', opt.args()

if __name__=='__main__':
    main()

From martin@v.loewis.de Tue Feb 12 09:39:44 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 12 Feb 2002 10:39:44 +0100
Subject: [Python-Dev] Order that site-packages is added to sys.path
In-Reply-To: <15464.41842.664484.307330@anthem.wooz.org>
References: <15464.41842.664484.307330@anthem.wooz.org>
Message-ID:

barry@zope.com (Barry A. Warsaw) writes:

> So the question is: why does site.py append site-packages instead of
> prepending it to sys.path?

I think the rationale was that you are precisely not supposed to override
any of the standard modules. It was considered a good thing that if you do
"import string" in some version of Python, you know exactly what you will
get.

There is currently one exception to that rule, which is the xml module and
PyXML: the standard xml module allows being replaced by add-on (later,
better) packages. However, there have been complaints that this is so: One
of Paul Prescod's applications would break if PyXML was installed, since
PyXML performed some stricter argument checking in certain cases.

The same problem would occur more frequently if you have site-packages in
front of the path: The add-on package may behave worse than the standard
package in some cases (especially after installing a Python bugfix
release); this problem is hard to track.

In the specific case, I'd propose the following strategy:

- Get the fixes to the Email package into the 2.2 maintenance branch, in
  addition to getting them into the trunk. This assumes that the patches
  really do fix bugs and are suitable for the general public etc.

- If Python 2.2.1 is released before Mailman 2.1, you are done: Just tell
  your users that they need 2.2.1 or 2.1.2, but cannot use 2.2 (or need to
  live with limitations in MIME processing).
- If this is not possible, rename the email package inside mailman (e.g. xemail). It then appears that the standard library package is not suitable for mailman, so just ignore its presence, and use your own (under a different name). - As a compromise, you might consider falling back to the email package if you determine it is good enough at installation time, by playing with xemail.__init__.__path__, or even replacing xemail with email in the same way that xml is replaced with _xmlplus. Regards, Martin From martin@v.loewis.de Tue Feb 12 09:45:36 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 12 Feb 2002 10:45:36 +0100 Subject: [Python-Dev] a different approach to argument parsing In-Reply-To: References: Message-ID: "Russ Cox" writes: > Anyhow, there it is. I've attached the code for the parser, which I > just whipped up tonight. If people think this is a promising thing to > explore and someone else wants to take over exploring, great. > If yes promising but no takers, I'm willing to keep at it. I think it is quite promising; I share your views on option parsing, and like an iterative interface myself very much. Regards, Martin From paul@prescod.net Tue Feb 12 10:23:47 2002 From: paul@prescod.net (Paul Prescod) Date: Tue, 12 Feb 2002 02:23:47 -0800 Subject: [Python-Dev] a different approach to argument parsing References: Message-ID: <3C68ED33.9244EDA0@prescod.net> I like the general direction but one thing makes me a little confused... Russ Cox wrote: > >... > > for o in opt: > if o=='-n' or o=='--num': > ncopies = opt.optarg(opt.needtype(int)) How does "opt" know that I am looking for the arguments to the --num command line argument and not the --file one? I guess I would expect an interface more like: for o, value in opt: if o=='-n' or o=='--num': ncopies = optparser.needtype(value, 'integer') Paul Prescod From mal@lemburg.com Tue Feb 12 10:36:00 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Tue, 12 Feb 2002 11:36:00 +0100 Subject: [Python-Dev] Order that site-packages is added to sys.path References: <15464.41842.664484.307330@anthem.wooz.org> Message-ID: <3C68F010.4736AE2F@lemburg.com> "Barry A. Warsaw" wrote: > > I have a bit of a dilemma when it comes to sys.path and the location > of the site-packages directory. > > The problem comes when someone is using Mailman 2.1 with Python 2.2. > The latter comes with the email package, which is in the standard > library. Through some contributions, my standalone email package now > supports multibyte character sets in RFC-compliant ways > (e.g. splitting long headers correctly). The question is, how do I > get the updated package to Python 2.2 users? > > The standalone email package is a simple distutils thingie with a > directory and a bunch of .py files. distutils sticks this in > site-packages. But an "import email" will always get the standard > library version instead of the site-packages version because site.py > /appends/ site-packages to sys.path instead of prepending it. > > I can work around this by adding my own path-hacking code before any > import of email.* modules. This is a bit ugly because now it means > that the proper functioning of the application depends on import > order, and that's nasty. Why not put the updated email package into the Mailman package (is it a package?) ? That way you can update whatever part you want from the Python lib or replace it with something else. > So the question is: why does site.py append site-packages instead of > prepending it to sys.path? If there's a valid reason, I don't > remember it, and I'm currently blind to any valuable use case. I guess this is done for the same reason that e.g. /usr/local is last in PATH on Unix: system top level programs and libs should always have top priority. 
Otherwise, a user could easily override a system program/lib by placing a new version into the local dir which then gets picked up by other system programs. I'd suggest to better be explicit about what you do and to put the new code in the package (which is completely under your control). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Tue Feb 12 10:52:05 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 12 Feb 2002 11:52:05 +0100 Subject: [Python-Dev] Order that site-packages is added to sys.path References: <15464.41842.664484.307330@anthem.wooz.org> Message-ID: <3C68F3D5.A50F1347@lemburg.com> "Martin v. Loewis" wrote: > > - As a compromise, you might consider falling back to the email > package if you determine it is good enough at installation time, by > playing with xemail.__init__.__path__, or even replacing xemail with > email in the same way that xml is replaced with _xmlplus. Those kind of hacks should not be needed if Barry puts his own email package inside the Mailman package. All local imports will pick up his version automatically; even though I'd suggest to use explicit imports for it in the Mailman code to avoid magical problems ;-) Hacking __path__ should really only be the last resort... it (usually) breaks installers, gives importers a hard time, etc. We should not consider this good practice even though it may be needed sometimes (e.g. by PyXML). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Tue Feb 12 11:40:11 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Tue, 12 Feb 2002 12:40:11 +0100 Subject: [Python-Dev] patch: speed up name access by up to 80% References: <20020211220954.A12061@hishome.net> Message-ID: <3C68FF1B.C3F449DB@lemburg.com> Some other things you might want to try: * Inline small dictionary tables in the PyObject struct and only revert to external tables for larger ones. (I have an old patch for this one which you might want to update) * Optimize perfect hashings. Sometimes (hopefully most of the times) Python will generate a perfect hashing for a set of attributes. In that case, it could set a flag in the dictionary object to be able to use a faster lookup function. BTW, could you run pybench against your patch ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From oren-py-d@hishome.net Tue Feb 12 13:29:47 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 12 Feb 2002 08:29:47 -0500 Subject: [Python-Dev] patch: speed up name access by up to 80% In-Reply-To: <3C68FF1B.C3F449DB@lemburg.com> References: <20020211220954.A12061@hishome.net> <3C68FF1B.C3F449DB@lemburg.com> Message-ID: <20020212132947.GA13065@hishome.net> On Tue, Feb 12, 2002 at 12:40:11PM +0100, M.-A. Lemburg wrote: > > * Inline small dictionary tables in the PyObject struct and only > revert to external tables for larger ones. (I have an old patch > for this one which you might want to update) > > * Optimize perfect hashings. Sometimes (hopefully most of the times) > Python will generate a perfect hashing for a set of attributes. > In that case, it could set a flag in the dictionary object to > be able to use a faster lookup function. 
Interesting, but I am exploring other directions now: attribute access, hints associated to negative entries that should speed up the next lookup in the chain and getting the inline/fast ratio from 3:1 up to 10:1 or higher. > BTW, could you run pybench against your patch ? 18331152 fastlocal non-dictionary lookups 416661 inline dictionary lookups 131509 fast dictionary lookups 200 slow dictionary lookups With 97% of accesses using fastlocals it's not going to have any significant effect. Oren From guido@python.org Tue Feb 12 13:45:21 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 12 Feb 2002 08:45:21 -0500 Subject: [Python-Dev] a different approach to argument parsing In-Reply-To: Your message of "Tue, 12 Feb 2002 02:23:47 PST." <3C68ED33.9244EDA0@prescod.net> References: <3C68ED33.9244EDA0@prescod.net> Message-ID: <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net> OK, I was wrong when I expected there wouldn't be much traffic. The design of an option parser alternative does *not* belong on python-dev. Please get this discussion off the list NOW and move it elsewhere. You can come back here when you've agreed on a solution. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Feb 12 13:53:21 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 12 Feb 2002 08:53:21 -0500 Subject: [Python-Dev] patch: speed up name access by up to 80% In-Reply-To: Your message of "Tue, 12 Feb 2002 12:40:11 +0100." <3C68FF1B.C3F449DB@lemburg.com> References: <20020211220954.A12061@hishome.net> <3C68FF1B.C3F449DB@lemburg.com> Message-ID: <200202121353.g1CDrLq30278@pcp742651pcs.reston01.va.comcast.net> > * Inline small dictionary tables in the PyObject struct and only > revert to external tables for larger ones. (I have an old patch > for this one which you might want to update) I may be missing some context, but AFAIK we already do this. 
See dictobject.h: the last item in struct _dictobject is PyDictEntry ma_smalltable[PyDict_MINSIZE]; --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue Feb 12 14:05:05 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 12 Feb 2002 15:05:05 +0100 Subject: [Python-Dev] patch: speed up name access by up to 80% References: <20020211220954.A12061@hishome.net> <3C68FF1B.C3F449DB@lemburg.com> <200202121353.g1CDrLq30278@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C692111.5C39D266@lemburg.com> Guido van Rossum wrote: > > > * Inline small dictionary tables in the PyObject struct and only > > revert to external tables for larger ones. (I have an old patch > > for this one which you might want to update) > > I may be missing some context, but AFAIK we already do this. See > dictobject.h: the last item in struct _dictobject is > > PyDictEntry ma_smalltable[PyDict_MINSIZE]; Nice :-) I must have missed that addition (or simply forgotten about it). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Tue Feb 12 14:15:12 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 12 Feb 2002 15:15:12 +0100 Subject: [Python-Dev] SSL support in _socket Message-ID: <3C692370.D21EF15D@lemburg.com> I have a problem with SSL support in _socket and the way setup.py does the autodetection: even though SSL may be installed on the system, it seems that they changed the exposed APIs between patch level releases. As a result, _socket compiles but the import fails on platforms which have the wrong OpenSSL version installed. setup.py then simply removes _socket from the extension list and builds Python without socket support which is a really Bad Thing since _socket without SSL support compiles just fine. What can we do about this ? 
Since auto-detection is happening rather early in setup.py it doesn't seem possible to apply some fallback scheme depending on extra knowledge for the various modules. Perhaps we should simply let setup.py build two extensions: _socket (without SSL) and _socketssl (with SSL) ?! If the _socketssl build or import fails for some reason, Python could still pick up the _socket extension in socket.py. Comments ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Tue Feb 12 14:25:24 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 12 Feb 2002 09:25:24 -0500 Subject: [Python-Dev] SSL support in _socket In-Reply-To: Your message of "Tue, 12 Feb 2002 15:15:12 +0100." <3C692370.D21EF15D@lemburg.com> References: <3C692370.D21EF15D@lemburg.com> Message-ID: <200202121425.g1CEPO330489@pcp742651pcs.reston01.va.comcast.net> > Perhaps we should simply let setup.py build two extensions: > _socket (without SSL) and _socketssl (with SSL) ?! If the > _socketssl build or import fails for some reason, Python > could still pick up the _socket extension in socket.py. +1 --Guido van Rossum (home page: http://www.python.org/~guido/) From david@boddie.net Tue Feb 12 15:08:20 2002 From: david@boddie.net (David Boddie) Date: Tue, 12 Feb 2002 15:08:20 +0000 Subject: [Python-Dev] Re: RFC: Option Parsing Libraries Message-ID: <20020212150948.0E0492AA62@wireless-084-136.tele2.co.uk> Paul Prescod wrote: > Greg Ward has proposed to add his Optik module to the Python library. 
> > * http://mail.python.org/pipermail/python-dev/2002-February/019934.html > > Optik is a parser for command line options with the features described > here: > > * http://mail.python.org/pipermail/python-dev/2002-February/019937.html > > If you have a competitive library, or suggestions for changes to Optik, > please forward your comments to python-dev mailing list > (python-dev@python.org). If a long-running conversation ensues, you may > need to join the list to participate. Discussions in comp.lang.python > will not be considered unless you at least forward a reference to > python-dev. If you propose a competitor to Optik, please describe how it > compares to the feature list described above. I have been working on a library which I hope begins to unify the presentation of arguments to the programmer with the syntax that is presented to the user. In the form of the list in the first reference above: cmdsyntax.py (Feb 2002) Author: David Boddie URL: http://www-solar.mcs.st-and.ac.uk/~davidb/Software/Python/cmdsyntax/ * An attempt at OO design, with limited functionality allowing you to: 1. Set up a syntax object using a syntax definition. 2. Supply arguments from sys.argv or a string and retrieve either: 2a. A dictionary containing values corresponding to the required arguments. 2b. A list of possible matches with the definition. * Arguments passed must conform to a familiar looking syntax definition. For example, the string "infile [-o outfile]" indicates that one argument is necessary and that another may be specified using the -o switch. * Short and long options are allowed. The short options cannot accept arguments as in the example "-ffoo", but lists of options such as "-avx" are supported and accept combinations of these options in any order. Long options support both the "--no-value" and "--name=value" variants. 
* Command arguments may be specified, which must be matched exactly by user input, as in the case "add"|"remove" value which requires that either the command "add" or "remove" be given with a following argument. * Arguments/options may be grouped using brackets. * No type information is specified, so arguments are not typed before being presented to the programmer. * Excessive method used to match arguments with the syntax definition: all possible definitions are generated then arguments are matched against each one. * No ability to catch remaining unspecified arguments. * No license. * Needs more testing. I'm not proposing this as a competitor to Optik, but I'm happy to donate any ideas and code to the effort. David From nas@python.ca Tue Feb 12 15:13:37 2002 From: nas@python.ca (Neil Schemenauer) Date: Tue, 12 Feb 2002 07:13:37 -0800 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: ; from tim.one@comcast.net on Mon, Feb 11, 2002 at 11:31:39PM -0500 References: <15463.55207.217946.237969@12-248-41-177.client.attbi.com> Message-ID: <20020212071336.A32363@glacier.arctrix.com> Tim Peters wrote: > Speaking of which, why does LOAD_FAST waste time checking against NULL > twice?! If you would have approved my patch it would be fixed already.
one-small-banana-left-ly y'rs Neil From gward@python.net Tue Feb 12 15:27:24 2002 From: gward@python.net (Greg Ward) Date: Tue, 12 Feb 2002 10:27:24 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path In-Reply-To: <15464.41842.664484.307330@anthem.wooz.org> References: <15464.41842.664484.307330@anthem.wooz.org> Message-ID: <20020212152724.GA24891@gerg.ca> On 12 February 2002, Barry A. Warsaw said: > The standalone email package is a simple distutils thingie with a > directory and a bunch of .py files. distutils sticks this in > site-packages. But an "import email" will always get the standard > library version instead of the site-packages version because site.py > /appends/ site-packages to sys.path instead of prepending it. > > I can work around this by adding my own path-hacking code before any > import of email.* modules. This is a bit ugly because now it means > that the proper functioning of the application depends on import > order, and that's nasty. Looong ago, I tried to persuade Guido that giving the Distutils the power to override standard library modules would, on rare occasions, be a good and useful thing. (Yet another idea stolen from Perl's MakeMaker, which can do precisely that. Sometimes, it's useful.) Guess who won? Greg -- Greg Ward - just another Python hacker gward@python.net http://starship.python.net/~gward/ A closed mouth gathers no foot. From paul@prescod.net Tue Feb 12 15:45:11 2002 From: paul@prescod.net (Paul Prescod) Date: Tue, 12 Feb 2002 07:45:11 -0800 Subject: [Python-Dev] a different approach to argument parsing References: <3C68ED33.9244EDA0@prescod.net> <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C693886.28EA8595@prescod.net> Guido van Rossum wrote: > > OK, I was wrong when I expected there wouldn't be much traffic. The > design of an option parser alternative does *not* belong on > python-dev. Please get this discussion of the list NOW and move it > elsewhere. 
You can come back here when you've agreed on a solution. Hey, I just did the announcement. I'm not the ringleader. If someone (e.g. at pythonlabs) can set up a mailman for us then I'll send out another announcement telling people about it. But I don't intend to become the point man for option parsing! I could also set up a yahoogroups list but those are somewhat annoying in my experience. Paul Prescod From gward@python.net Tue Feb 12 15:50:54 2002 From: gward@python.net (Greg Ward) Date: Tue, 12 Feb 2002 10:50:54 -0500 Subject: [Python-Dev] a different approach to argument parsing In-Reply-To: <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net> References: <3C68ED33.9244EDA0@prescod.net> <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net> Message-ID: <20020212155054.GB24891@gerg.ca> On 12 February 2002, Guido van Rossum said: > OK, I was wrong when I expected there wouldn't be much traffic. The > design of an option parser alternative does *not* belong on > python-dev. Please get this discussion of the list NOW and move it > elsewhere. You can come back here when you've agreed on a solution. I'm about to (try to) create a list on Starship for this: how does getopt-alternatives@python.net sound as a place to discuss this issue? Everyone who has posted on this thread will receive an invitation to join the list. (Assuming I can get Mailman to do my bidding, that is.) Glad I brought the whole thing up though: hopefully something good will emerge! Greg -- Greg Ward - just another Python hacker gward@python.net http://starship.python.net/~gward/ All things are possible -- except skiing through a revolving door. From barry@zope.com Tue Feb 12 16:03:59 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Tue, 12 Feb 2002 11:03:59 -0500 Subject: [Python-Dev] a different approach to argument parsing References: <3C68ED33.9244EDA0@prescod.net> <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net> <20020212155054.GB24891@gerg.ca> Message-ID: <15465.15599.363507.279197@anthem.wooz.org> >>>>> "GW" == Greg Ward writes: GW> I'm about to (try to) create a list on Starship for this: how GW> does getopt-alternatives@python.net sound as a place to GW> discuss this issue? I'm happy to create the list on python.org if you prefer. I'd go for full SIG status: getopt-sig@python.org. Let its charter be short lived. If Greg's willing to be the champion, I'll set this up. -Barry From fdrake@acm.org Tue Feb 12 16:16:27 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 12 Feb 2002 11:16:27 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: References: <200202111628.g1BGSxF20159@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15465.16347.962182.714475@grendel.zope.com> Tim Peters writes: > Vanilla pymalloc is perfect for this: many small objects. A custom free > list for cells is a waste of code, because a cell never goes away until the > module does: cells will not "churn". We'll get a lot of them, but most of > them will stay alive until the program ends, so the tiny performance gain > you may be able to get from a thoroughly specialized free list "in theory" > will never be realized in practice. Have we become convinced that these cells need to be Python objects? I must have missed that. As long as we can keep them simple structures, we should be able to avoid individual allocations for them. It seems we have a fixed number of cells for both module objects and function objects (regardless of whether they are part of the new celldict or the containing function or module), so they can be allocated as an array rather than individually. So, I must be missing something. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From barry@zope.com Tue Feb 12 16:43:24 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 12 Feb 2002 11:43:24 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path References: <15464.41842.664484.307330@anthem.wooz.org> Message-ID: <15465.17964.571257.835907@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> I think the rationale was that you are precisely not supposed MvL> to override any of the standard modules. It was considered a MvL> good thing that if you do "import string" in some version of MvL> Python, you know exactly what you will get. Okay, I can see why that's useful. Let's say there was a way to add stuff to the front of sys.path, such that they could override the standard library. This might work just fine on a single user (or single application) system, but might be very broken on a multiuser or multiapp system ("I know what I'm installing in site-packages, so what's the problem?"). Hopefully, any overrides that were installed would be API compatible with the standard. Such overrides would probably be allowed to fix bugs or add functionality, but not remove functionality. This might still get us into trouble and this path leads to module versioning, etc. I don't want to go there now. I know how to handle my specific case (I've done it before), but just to close the loop, I can't wait for Python 2.2.1 because some of the features I'm depending on are new features, not just bug fixes. I think those will have to wait for Python 2.3 to be safe, so until then, I must distribute a separate package. That's fine, I can live with that. I think "python setup.py install --root blah" will do the trick for me, along with some application specific path-hackery. -Barry From barry@zope.com Tue Feb 12 16:46:26 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Tue, 12 Feb 2002 11:46:26 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path References: <15464.41842.664484.307330@anthem.wooz.org> <3C68F010.4736AE2F@lemburg.com> Message-ID: <15465.18146.700970.932676@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> I guess this is done for the same reason that e.g. /usr/local MAL> is last in PATH on Unix: system top level programs and libs MAL> should always have top priority. Otherwise, a user could MAL> easily override a system program/lib by placing a new version MAL> into the local dir which then gets picked up by other system MAL> programs. Well, hopefully you'd control who can write into /usr/local so that you could trust overrides being installed there. On a single user system, I usually do in fact put /usr/local/bin early in my path specifically because I do want to override older, buggier, system programs. The analogy is similar in the Python situation. When I'm the only person using the system, and I'm in control of everything, being able to override the standard library is a very useful thing to do. When there's less trust in the environment I'm running in, or more sharing of common resources, it can be problematic. -Barry From barry@zope.com Tue Feb 12 16:48:58 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 12 Feb 2002 11:48:58 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path References: <15464.41842.664484.307330@anthem.wooz.org> <20020212152724.GA24891@gerg.ca> Message-ID: <15465.18298.611257.213141@anthem.wooz.org> >>>>> "GW" == Greg Ward writes: GW> Looong ago, I tried to persuade Guido that giving the GW> Distutils the power to override standard library modules GW> would, on rare occasions, be a good and useful thing. (Yet GW> another idea stolen from Perl's MakeMaker, which can do GW> precisely that. Sometimes, it's useful.) Guess who won? 
distutils's --root option could be used to specific a different install directory than site-packages right? So conceivably site.py could prepend some directory onto sys.path, and distutils could be coaxed into installing there rather than site-packages. This might provide a principled way to override Python's standard library when you're really sure that's what you want to do. -Barry From mwh@python.net Tue Feb 12 17:05:02 2002 From: mwh@python.net (Michael Hudson) Date: 12 Feb 2002 17:05:02 +0000 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods Message-ID: <2m6652my0x.fsf@starship.python.net> Some time ago, Gareth McCaughan suggested a syntax for staticmethods. You'd write class C(object): def static(arg) [staticmethod]: return 1 + arg C.static(2) => 3 The way this works is that the above becomes syntactic sugar for roughly: class C(object): def $temp(arg): return 1 + arg static = staticmethod($temp) Anyway, I thought this was a reasonably pythonic idea, so I implemented it, and thought I'd mention it here. Patch at: http://starship.python.net/crew/mwh/hacks/meth-syntax-sugar.diff Some other things that become possible: >>> class D(object): ... def x(self) [property]: ... return "42" ... hello! >>> D().x '42' (the hello! is a debugging printf I haven't taken out yet...) >>> def published(func): ... func.publish = 1 ... return func ... >>> def m() [published]: ... print "hiya!" ... hello! >>> m.publish 1 >>> def hairy_constant() [apply]: ... return math.cos(1 + math.log(34)) ... hello! >>> hairy_constant -0.18495734252481616 >>> def memoize(func): ... cache = {} ... def f(*args): ... try: ... return cache[args] ... except: ... return cache.setdefault(args, func(*args)) ... return f ... >>> def fib(a) [memoize]: ... if a < 2: return 1 ... return fib(a-1) + fib(a-2) ... hello! >>> fib(40) 165580141 # fairly quickly I'm not sure all of these are Good Things (esp. the [apply] one...). 
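For anyone who wants to try the idea without applying the patch, the sugar can be written out by hand, following the expansion given at the top of this message. What follows is only a sketch of the desugared form, written to run on a current Python; it is not the patch itself. The memoize helper is a simplified variant of the one in the session above, and the name wrapper is illustrative.

```python
def memoize(func):
    """Cache results of func, keyed by its positional arguments."""
    cache = {}

    def wrapper(*args):
        # Compute each distinct argument tuple once, then reuse it.
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


class C(object):
    def static(arg):
        return 1 + arg
    # Hand-written equivalent of "def static(arg) [staticmethod]:".
    static = staticmethod(static)


def fib(a):
    if a < 2:
        return 1
    return fib(a - 1) + fib(a - 2)

# Hand-written equivalent of "def fib(a) [memoize]:". The recursive calls
# look up the global name fib, so they go through the cache as well.
fib = memoize(fib)
```

With these definitions C.static(2) gives 3 and fib(40) gives 165580141, matching the session above. The rebinding has to come after the whole function body, which is exactly the distance the proposed syntax is meant to remove.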
OTOH, I think the idea is worth discussion (or squashing by Guido :). Cheers, M. -- For every complex problem, there is a solution that is simple, neat, and wrong. -- H. L. Mencken From jeremy@zope.com Tue Feb 12 17:08:51 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Tue, 12 Feb 2002 12:08:51 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path In-Reply-To: <15465.18298.611257.213141@anthem.wooz.org> References: <15464.41842.664484.307330@anthem.wooz.org> <20020212152724.GA24891@gerg.ca> <15465.18298.611257.213141@anthem.wooz.org> Message-ID: <15465.19491.615551.709595@gondolin.digicool.com> >>>>> "BAW" == Barry A Warsaw writes: BAW> distutils's --root option could be used to specific a different BAW> install directory than site-packages right? So conceivably BAW> site.py could prepend some directory onto sys.path, and BAW> distutils could be coaxed into installing there rather than BAW> site-packages. This might provide a principled way to override BAW> Python's standard library when you're really sure that's what BAW> you want to do. Why don't you use "--root /usr/local/lib/python2.2" and *really* override the standard library? It seems fragile to extend Python with yet more directories to search in a special order so that the interpreter picks up the correct copy of somemodule.py from among the four or five copies installed on the system and on the path. 
Jeremy From gward@python.net Tue Feb 12 17:17:52 2002 From: gward@python.net (Greg Ward) Date: Tue, 12 Feb 2002 12:17:52 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path In-Reply-To: <15465.19491.615551.709595@gondolin.digicool.com> References: <15464.41842.664484.307330@anthem.wooz.org> <20020212152724.GA24891@gerg.ca> <15465.18298.611257.213141@anthem.wooz.org> <15465.19491.615551.709595@gondolin.digicool.com> Message-ID: <20020212171752.GA25558@gerg.ca> On 12 February 2002, Jeremy Hylton said: > Why don't you use "--root /usr/local/lib/python2.2" and *really* > override the standard library? No: --root just lets you replace / with something else. It's mainly so you can build an RPM (eg.) without being superuser. Your example would install to /usr/local/lib/python2.2/usr/local/lib/python2.2/site-packages ...which is probably not what you meant. The distutils install command *is* pretty flexible; if someone cares to sit down and figure it out, I'm sure this is possible. It's just not documented or obvious. Greg -- Greg Ward - Linux weenie gward@python.net http://starship.python.net/~gward/ "He's dead, Jim. You get his tricorder and I'll grab his wallet." From jeremy@alum.mit.edu Tue Feb 12 17:29:37 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Tue, 12 Feb 2002 12:29:37 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path In-Reply-To: <20020212171752.GA25558@gerg.ca> References: <15464.41842.664484.307330@anthem.wooz.org> <20020212152724.GA24891@gerg.ca> <15465.18298.611257.213141@anthem.wooz.org> <15465.19491.615551.709595@gondolin.digicool.com> <20020212171752.GA25558@gerg.ca> Message-ID: <15465.20737.379286.281107@gondolin.digicool.com> Perhaps --home then? I know there's some command I've used to install Python packages in my Zope lib/python directory by spelling out its full path. 
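A rough illustration of the --root behaviour Greg describes above: the root is glued in front of the full install path rather than replacing the prefix. The helper below is a hypothetical stand-in written for illustration, not the Distutils code, and it only models POSIX-style paths; /tmp/buildroot is likewise an assumed staging directory.

```python
import posixpath


def apply_root(root, install_dir):
    """Re-anchor an absolute POSIX install path underneath root."""
    # Drop the leading "/" so join() appends to root instead of discarding it.
    return posixpath.join(root, install_dir.lstrip("/"))


site_packages = "/usr/local/lib/python2.2/site-packages"

# Jeremy's suggestion: the whole install path is repeated under the root.
staged = apply_root("/usr/local/lib/python2.2", site_packages)

# The intended use: staging a build in a scratch tree (assumed path).
rpm_staged = apply_root("/tmp/buildroot", site_packages)
```

Here staged comes out as /usr/local/lib/python2.2/usr/local/lib/python2.2/site-packages, the path Greg quotes, while rpm_staged lands under the scratch tree as a packaging build would want.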
Jeremy From jason-dated-1014226523.fe612b@mastaler.com Tue Feb 12 17:35:21 2002 From: jason-dated-1014226523.fe612b@mastaler.com (Jason R. Mastaler) Date: Tue, 12 Feb 2002 10:35:21 -0700 Subject: [Python-Dev] Re: RFC: Option Parsing Libraries In-Reply-To: (Paul Prescod's message of "Mon, 11 Feb 2002 08:54:53 -0800") References: Message-ID: I haven't specifically looked at Optik, but if another option parser is going to be added to the standard lib, there is one thing I'd like it to have which the getopt module currently doesn't: Support for optional arguments. That is, the ability to specify that an option *may* have an argument, and not just that it either must or can't have an argument. I find this limitation in getopt very frustrating. -- http://tmda.sourceforge.net/ From mal@lemburg.com Tue Feb 12 17:45:48 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 12 Feb 2002 18:45:48 +0100 Subject: [Python-Dev] Order that site-packages is added to sys.path References: <15464.41842.664484.307330@anthem.wooz.org> <3C68F010.4736AE2F@lemburg.com> <15465.18146.700970.932676@anthem.wooz.org> Message-ID: <3C6954CC.46B215D8@lemburg.com> "Barry A. Warsaw" wrote: > [distutils --root hackery] Why not use the subpackage approach I suggested ? It keeps the std lib in a sane state (meaning that the std lib installation only depends on the Python installation and no other hacks on top of it). Since you'll have to ship the complete package anyway, I don't see any win in installing over the std email package. If that's what you really want, I'd suggest to provide the updated email package as separate download and then test inside Mailman for the new version. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Tue Feb 12 17:55:23 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Tue, 12 Feb 2002 18:55:23 +0100 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods Message-ID: <3C69570B.CD65B453@lemburg.com> Michael Hudson wrote: > > Some time ago, Gareth McCaughan suggested a syntax for staticmethods. > You'd write > > class C(object): > def static(arg) [staticmethod]: > return 1 + arg > > C.static(2) > => 3 > Certainly looks nice. I'd just use a shorter name for [staticmethod], e.g. [static]. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From barry@zope.com Tue Feb 12 18:03:05 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Tue, 12 Feb 2002 13:03:05 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path References: <15464.41842.664484.307330@anthem.wooz.org> <3C68F010.4736AE2F@lemburg.com> <15465.18146.700970.932676@anthem.wooz.org> <3C6954CC.46B215D8@lemburg.com> Message-ID: <15465.22745.136540.172075@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> "Barry A. Warsaw" wrote: >> [distutils --root hackery] MAL> Why not use the subpackage approach I suggested ? MAL> It keeps the std lib in a sane state (meaning that the std MAL> lib installation only depends on the Python installation and MAL> no other hacks on top of it). MAL> Since you'll have to ship the complete package anyway, I MAL> don't see any win in installing over the std email MAL> package. If that's what you really want, I'd suggest to MAL> provide the updated email package as separate download and MAL> then test inside Mailman for the new version. I'm fine with installing in a Mailman specific location, but I still want to use as much of the distutils machinery as possible. It looks like python setup.py install --home=/some/path gets close enough. This will install the email package into /some/path/lib/python and I can easily arrange for that to be in the right place on sys.path, at least for the mail program and the cgi program. The command line scripts are a bit trickier because you can't wheedle your way into Python's startup machinery without 1) telling your users to setenv PYTHONPATH (yuck) or 2) importing a path-hacking module before any that require the override location. Since I already have to do #2 anyway, this isn't much of a problem, except that some imports will have to be rearranged. It also makes things a little trickier when a user does eventually upgrade to Python 2.3, which will obviate the need for the enhanced package (hopefully). Like everyone else, I'm sure I'll eventually just end up shipping my own complete Python distro to make sure it's got exactly what you need. 
;) -Barry From barry@zope.com Tue Feb 12 18:03:41 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 12 Feb 2002 13:03:41 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path References: <15464.41842.664484.307330@anthem.wooz.org> <20020212152724.GA24891@gerg.ca> <15465.18298.611257.213141@anthem.wooz.org> <15465.19491.615551.709595@gondolin.digicool.com> <20020212171752.GA25558@gerg.ca> Message-ID: <15465.22781.118071.730570@anthem.wooz.org> >>>>> "GW" == Greg Ward writes: GW> The distutils install command *is* pretty flexible; if someone GW> cares to sit down and figure it out, I'm sure this is GW> possible. It's just not documented or obvious. Any hope of actually documenting all this stuff? -Barry From barry@zope.com Tue Feb 12 18:12:45 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 12 Feb 2002 13:12:45 -0500 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods References: <2m6652my0x.fsf@starship.python.net> Message-ID: <15465.23325.41291.966138@anthem.wooz.org> >>>>> "MH" == Michael Hudson writes: MH> Some time ago, Gareth McCaughan suggested a syntax for MH> staticmethods. You'd write MH> class C(object): | def static(arg) [staticmethod]: | return 1 + arg | C.static(2) | => 3 Very interesting! Why the square brackets though? Is that just for visual offset or is there a grammar constraint that requires them? I'd leave them out of the picture, unless you mean to imply that a list is acceptable in that position . 
salt-and-pep-per-ly y'rs, -Barry From skip@pobox.com Tue Feb 12 18:29:37 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 12 Feb 2002 12:29:37 -0600 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods In-Reply-To: <15465.23325.41291.966138@anthem.wooz.org> References: <2m6652my0x.fsf@starship.python.net> <15465.23325.41291.966138@anthem.wooz.org> Message-ID: <15465.24337.8353.562658@beluga.mojam.com> MH> class C(object): MH> def static(arg) [staticmethod]: MH> return 1 + arg BAW> Why the square brackets though? I believe Guido addressed this in his DevDay presentation. The list construct is to allow future extensions without requiring parser changes. Skip From barry@zope.com Tue Feb 12 18:44:11 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 12 Feb 2002 13:44:11 -0500 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods References: <2m6652my0x.fsf@starship.python.net> <15465.23325.41291.966138@anthem.wooz.org> <15465.24337.8353.562658@beluga.mojam.com> Message-ID: <15465.25211.385898.880479@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: MH> s C(object): def static(arg) [staticmethod]: return 1 + arg BAW> Why the square brackets though? SM> I believe Guido addressed this in his DevDay presentation. SM> The list construct is to allow future extensions without SM> requiring parser changes. Okie dokie. -Barry From guido@python.org Tue Feb 12 19:11:15 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 12 Feb 2002 14:11:15 -0500 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods In-Reply-To: Your message of "Tue, 12 Feb 2002 12:29:37 CST." 
<15465.24337.8353.562658@beluga.mojam.com> References: <2m6652my0x.fsf@starship.python.net> <15465.23325.41291.966138@anthem.wooz.org> <15465.24337.8353.562658@beluga.mojam.com> Message-ID: <200202121911.g1CJBFr31684@pcp742651pcs.reston01.va.comcast.net> > MH> class C(object): > MH> def static(arg) [staticmethod]: > MH> return 1 + arg > > BAW> Why the square brackets though? > > I believe Guido addressed this in his DevDay presentation. The list > construct is to allow future extensions without requiring parser changes. It was only one of the many grammar options I proposed, semi-jokingly. --Guido van Rossum (home page: http://www.python.org/~guido/) From DavidA@ActiveState.com Tue Feb 12 19:10:25 2002 From: DavidA@ActiveState.com (David Ascher) Date: Tue, 12 Feb 2002 11:10:25 -0800 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods References: <2m6652my0x.fsf@starship.python.net> Message-ID: <3C6968A1.1ED5158A@activestate.com> Michael Hudson wrote: > > Some time ago, Gareth McCaughan suggested a syntax for staticmethods. > You'd write > > class C(object): > def static(arg) [staticmethod]: > return 1 + arg > > C.static(2) > => 3 Nice! Note that this is quite similar to the [WebMethod] in C#, VB.Net, etc. , and indeed we could have [webmethod] for some variation of a SOAP/RPC interface. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpcondeclaringwebservice.asp From gward@python.net Tue Feb 12 19:47:46 2002 From: gward@python.net (Greg Ward) Date: Tue, 12 Feb 2002 14:47:46 -0500 Subject: [Python-Dev] a different approach to argument parsing In-Reply-To: <15465.15599.363507.279197@anthem.wooz.org> References: <3C68ED33.9244EDA0@prescod.net> <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net> <20020212155054.GB24891@gerg.ca> <15465.15599.363507.279197@anthem.wooz.org> Message-ID: <20020212194745.GA27163@gerg.ca> On 12 February 2002, Barry A. 
Warsaw said: > I'm happy to create the list on python.org if you prefer. I'd go for > full SIG status: getopt-sig@python.org. Let its charter be short lived. > > If Greg's willing to be the champion, I'll set this up. Thanks Barry. getopt-alternatives@python.net is dead, long live getopt-sig@python.org! Join the list at http://mail.python.org/mailman/listinfo/getopt-sig I'll announce this on c.l.py.announce shortly. Greg -- Greg Ward - geek-at-large gward@python.net http://starship.python.net/~gward/ Nostalgia just isn't what it used to be. From tim.one@comcast.net Tue Feb 12 20:17:50 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 12 Feb 2002 15:17:50 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <20020212071336.A32363@glacier.arctrix.com> Message-ID: [Tim] > Speaking of which, why does LOAD_FAST waste time checking against NULL > twice?! [Neil Schemenauer] > If you would have approved my patch it would be fixed already. Heh. If you had entered the patch at priority 9, I might have gotten to it by this summer. At priority 3, we're talking years . I boosted it to 6. Note that the tiny patch I checked in also rearranged the code so that the normal case became the fall-through case: if (normal) do normal stuff else do exceptional stuff Most dumb compilers on platforms that care use a "forward branches probably aren't taken, backward branches probably are" heuristic for setting branch-prediction hints in the machine code; and on platforms that don't care it's usually faster to fall through than to change the program counter anyway.
> one-small-banana-left-ly y'rs Neil is-that-an-american-or-canadian-banana?-ly y'rs - tim From tim.one@comcast.net Tue Feb 12 20:24:47 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 12 Feb 2002 15:24:47 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <15465.16347.962182.714475@grendel.zope.com> Message-ID: [Fred] > Have we become convinced that these cells need to be Python objects? No, it's just easier that way. The existing dict code maps PyObject* to PyObject*, so we'd have to copy and fiddle *all* of the dict code if a celldict wants to map to anything other than a PyObject*. > I must have missed that. As long as we can keep them simple > structures, we should be able to avoid individual allocations for > them. It seems we have a fixed number of cells for both module > objects and function objects (regardless of whether they are part of > the new celldict or the containing function or module), so they can be > allocated as an array rather than individually. > > So, I must be missing something. cells don't live in function objects; a function object only has a vector of pointers *to* cells, and that is indeed a contiguous, fixed-size array. cells are the values in celldicts, and that's the only place they appear, and celldicts can grow dynamically (import fred; fred.brandnew = 1). From mal@lemburg.com Tue Feb 12 20:42:22 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 12 Feb 2002 21:42:22 +0100 Subject: [Python-Dev] Order that site-packages is added to sys.path References: <15464.41842.664484.307330@anthem.wooz.org> <3C68F010.4736AE2F@lemburg.com> <15465.18146.700970.932676@anthem.wooz.org> <3C6954CC.46B215D8@lemburg.com> <15465.22745.136540.172075@anthem.wooz.org> Message-ID: <3C697E2E.34174D26@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "MAL" == M writes: > > MAL> "Barry A. Warsaw" wrote: > >> [distutils --root hackery] > > MAL> Why not use the subpackage approach I suggested ? 
> > MAL> It keeps the std lib in a sane state (meaning that the std > MAL> lib installation only depends on the Python installation and > MAL> no other hacks on top of it). > > MAL> Since you'll have to ship the complete package anyway, I > MAL> don't see any win in installing over the std email > MAL> package. If that's what you really want, I'd suggest to > MAL> provide the updated email package as separate download and > MAL> then test inside Mailman for the new version. > > I'm fine with installing in a Mailman specific location, but I still > want to use as much of the distutils machinery as possible. > > It looks like > > python setup.py install --home=/some/path > > gets close enough. No, no, no :-) What I am suggesting is to put the email package *inside* the Mailman package: Mailman/__init__.py ... email/__init__.py ... And then use "from Mailman import email" in Mailman source code. That's clean, doesn't interfere with the std lib and it's all your's ;-) (meaning that you have complete control over what email does in the Mailman context). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From barry@zope.com Tue Feb 12 20:57:42 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 12 Feb 2002 15:57:42 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path References: <15464.41842.664484.307330@anthem.wooz.org> <3C68F010.4736AE2F@lemburg.com> <15465.18146.700970.932676@anthem.wooz.org> <3C6954CC.46B215D8@lemburg.com> <15465.22745.136540.172075@anthem.wooz.org> <3C697E2E.34174D26@lemburg.com> Message-ID: <15465.33222.548705.100220@anthem.wooz.org> >>>>> "MAL" == M writes: >> install --home=/some/path gets close enough. MAL> No, no, no :-) MAL> What I am suggesting is to put the email package *inside* the MAL> Mailman package: MAL> Mailman/__init__.py | ... 
| email/__init__.py | ... What I didn't say was s|/some/path|/path/to/Mailman| so we're saying (nearly) the same thing. MAL> And then use "from Mailman import email" in Mailman source MAL> code. That's clean, doesn't interfere with the std lib MAL> and it's all yours ;-) (meaning that you have complete MAL> control over what email does in the Mailman context). I could do this (and may) or I may use something like from Mailman.pythonlib import email, which is my normal place to put override modules. It's moderately more appealing to put Mailman.pythonlib on sys.path and just leave my "import email"'s alone. I know there are arguments against doing it that way, but I don't want to have to change dozens of files. -Barry From aahz@rahul.net Tue Feb 12 21:24:32 2002 From: aahz@rahul.net (Aahz Maruch) Date: Tue, 12 Feb 2002 13:24:32 -0800 (PST) Subject: [Python-Dev] Order that site-packages is added to sys.path In-Reply-To: <15465.33222.548705.100220@anthem.wooz.org> from "Barry A. Warsaw" at Feb 12, 2002 03:57:42 PM Message-ID: <20020212212433.3E8A4E8CF@waltz.rahul.net> Barry A. Warsaw wrote: > > It's moderately more appealing to put Mailman.pythonlib on sys.path > and just leave my "import email"'s alone. I know there are arguments > against doing it that way, but I don't want to have to change dozens > of files. For shame, Barry, isn't that what Python is for? ;-) -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. 
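[Editorial sketch, not part of the original thread.] Barry's second option above, putting an override directory ahead of the standard library on sys.path so that existing "import email" statements pick up the private copy, can be illustrated roughly as follows. This is a minimal sketch in present-day Python 3, not Mailman's actual code: the helper names (prefer_directory, write_module, demo) and the module name demo_pkg_for_override are made up for the illustration, and two temporary directories stand in for the standard library and for Mailman/pythonlib.

```python
import importlib
import os
import sys
import tempfile


def prefer_directory(directory):
    """Move `directory` to the front of sys.path so its modules win."""
    if directory in sys.path:
        sys.path.remove(directory)
    sys.path.insert(0, directory)


def write_module(directory, name, version):
    """Create a tiny module exposing a VERSION string (demo helper only)."""
    with open(os.path.join(directory, name + ".py"), "w") as handle:
        handle.write("VERSION = %r\n" % version)


def demo():
    # Hypothetical stand-ins: one directory plays the standard library,
    # the other plays an override location such as Mailman/pythonlib.
    stock_dir = tempfile.mkdtemp()
    override_dir = tempfile.mkdtemp()
    write_module(stock_dir, "demo_pkg_for_override", "stock")
    write_module(override_dir, "demo_pkg_for_override", "override")

    sys.path.append(stock_dir)
    prefer_directory(override_dir)
    importlib.invalidate_caches()

    # The import statement is unchanged; only the search order differs.
    module = importlib.import_module("demo_pkg_for_override")
    return module.VERSION
```

The trade-off is the one raised in the thread: no source files change, but every import in the process now sees the override first, which is exactly the behaviour the "from Mailman import email" spelling avoids.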
From nas@python.ca Tue Feb 12 21:27:24 2002 From: nas@python.ca (Neil Schemenauer) Date: Tue, 12 Feb 2002 13:27:24 -0800 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: ; from tim.one@comcast.net on Tue, Feb 12, 2002 at 03:17:50PM -0500 References: <20020212071336.A32363@glacier.arctrix.com> Message-ID: <20020212132724.A1443@glacier.arctrix.com> Tim Peters wrote: > if (normal) > do normal stuff > else do exceptional stuff > > Most dumb compilers on platforms that care use a "forward branches probably > aren't taken, backward branches probably are" heuristic for setting > branch-prediction hints in the machine code; and on platforms that don't > care it's usually faster to fall through than to change the program counter > anyway. I seem to remember someone saying that GCC generated better code for: if (exceptional) { do exceptional things break / return / goto } do normal things Is GCC in the dumb category? Also, the Linux kernel is starting to use this set of macros more often: /* Somewhere in the middle of the GCC 2.96 development cycle, we * implemented a mechanism by which the user can annotate likely * branch directions and expect the blocks to be reordered * appropriately. Define __builtin_expect to nothing for earlier * compilers. */ #if __GNUC__ == 2 && __GNUC_MINOR__ < 96 #define __builtin_expect(x, expected_value) (x) #endif #define likely(x) __builtin_expect((x),1) #define unlikely(x) __builtin_expect((x),0) For example: if (likely(normal)) do normal stuff else do exceptional stuff I don't have GCC >= 2.96 otherwise I would have tried sprinkling some of those macros in ceval and testing the effect. Neil From barry@zope.com Tue Feb 12 21:26:16 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Tue, 12 Feb 2002 16:26:16 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path References: <15465.33222.548705.100220@anthem.wooz.org> <20020212212433.3E8A4E8CF@waltz.rahul.net> Message-ID: <15465.34936.922554.202330@anthem.wooz.org> >>>>> "AM" == Aahz Maruch writes: AM> For shame, Barry, isn't that what Python is for? ;-) Naw, it's what elisp is for . -Barry From James_Althoff@i2.com Tue Feb 12 21:38:24 2002 From: James_Althoff@i2.com (James_Althoff@i2.com) Date: Tue, 12 Feb 2002 13:38:24 -0800 Subject: [Python-Dev] re: syntactic sugar idea for {static,class}methods Message-ID: I would think that specifying a list (as in [staticmethod]) would be very desirable so that you could do a sequence of transformations, not just one. Michael's examples seem to suggest elements of an Aspect-oriented approach to things. If you have several relevant "Aspect wrappers", then you might want to apply each cascaded in sequence. If, using previous examples, I want a "static" method that is also "memoized" and "SOAPed" I could write: def mymethod(arg) [staticmethod,memoize,webmethod]: or some such combination that is presumably well-defined. Jim From Jack.Jansen@oratrix.nl Tue Feb 12 22:11:44 2002 From: Jack.Jansen@oratrix.nl (Jack Jansen) Date: Tue, 12 Feb 2002 23:11:44 +0100 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods In-Reply-To: <2m6652my0x.fsf@starship.python.net> Message-ID: <7FFEC9B0-2005-11D6-A45A-003065517236@oratrix.nl> On Tuesday, February 12, 2002, at 06:05 PM, Michael Hudson wrote: > Some time ago, Gareth McCaughan suggested a syntax for staticmethods. > You'd write > > class C(object): > def static(arg) [staticmethod]: > return 1 + arg > > C.static(2) > => 3 At some point in the past, when the actual implementation wasn't even finished, I suggested to Guido to use class C(object): def static(class, arg): return 1 + arg as the syntactic sugar. 
I think he wasn't against it at the time, but somehow found the actual implementation more important:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From tim.one@comcast.net Tue Feb 12 23:04:44 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 12 Feb 2002 18:04:44 -0500 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: <20020212132724.A1443@glacier.arctrix.com> Message-ID: [Neil Schemenauer] > I seem to remember someone saying that GCC generated better code for: > > if (exceptional) { > do exceptional things > break / return / goto > } > do normal things > > Is GCC in the dumb category? Yes, any compiler that doesn't do branch prediction based on *semantic* analysis is dirt dumb. A simple example of semantic prediction is "comparing a pointer to NULL is probably going to yield false". Ditto comparing a number for equality with 0. I'd like to see a reference for the pattern above; it goes against the very common "forward branches usually aren't taken" heuristic. Note that Vladimir applied that gimmick to an extreme in obmalloc.c's malloc() function. > Also, the Linux is starting to use this set of macros more often: ["likely" and "unlikely"] They're late to the party. Cray had a "percent true" directive 20 years ago, allowing for 48 bits of precision in specifying how (un)likely . > ... > I don't have GCC >= 2.96 otherwise I would have tried sprinkling some of > those macros in ceval and testing the effect. Maybe more interesting: One of the folks at Zope Corp reported getting significant speedups by using some gcc option that can feed real-life branch histories back into the compiler. Less work and less error-prone than guessing annotations. 
From gward@python.net Tue Feb 12 23:41:44 2002 From: gward@python.net (Greg Ward) Date: Tue, 12 Feb 2002 18:41:44 -0500 Subject: [Python-Dev] SyntaxError tracebacks in 2.2 Message-ID: <20020212234144.GA28828@gerg.ca> Has anyone else noticed that SyntaxError tracebacks no longer include the name of the file where the error occurs? Instead, they just say "". Eg. $ python2.1 foo.py File "foo.py", line 1 foo = ^ SyntaxError: invalid syntax $ python2.2 foo.py File "", line 1 foo = ^ SyntaxError: invalid syntax This is annoying enough that I just filed SF bug #516712: http://sourceforge.net/tracker/index.php?func=detail&aid=516712&group_id=5470&atid=105470 Greg -- Greg Ward - Linux nerd gward@python.net http://starship.python.net/~gward/ There are no stupid questions -- only stupid people. From neal@metaslash.com Tue Feb 12 23:50:23 2002 From: neal@metaslash.com (Neal Norwitz) Date: Tue, 12 Feb 2002 18:50:23 -0500 Subject: [Python-Dev] SyntaxError tracebacks in 2.2 References: <20020212234144.GA28828@gerg.ca> Message-ID: <3C69AA3F.9E1932DD@metaslash.com> Greg Ward wrote: > > Has anyone else noticed that SyntaxError tracebacks no longer include > the name of the file where the error occurs? Instead, they just say > "". Eg. > > $ python2.1 foo.py > File "foo.py", line 1 > foo = > ^ > SyntaxError: invalid syntax > $ python2.2 foo.py > File "", line 1 > foo = > ^ > SyntaxError: invalid syntax > > This is annoying enough that I just filed SF bug #516712: > http://sourceforge.net/tracker/index.php?func=detail&aid=516712&group_id=5470&atid=105470 I believe Martin fixed this. 
With the latest from CVS: [neal@epoch src]$ ./python foo.py File "foo.py", line 1 foo = ^ SyntaxError: invalid syntax Neal From DavidA@ActiveState.com Wed Feb 13 00:16:07 2002 From: DavidA@ActiveState.com (David Ascher) Date: Tue, 12 Feb 2002 16:16:07 -0800 Subject: [Python-Dev] Accessing globals without dict lookup References: Message-ID: <3C69B047.EDE6048D@activestate.com> Tim Peters wrote: > Maybe more interesting: One of the folks at Zope Corp reported getting > significant speedups by using some gcc option that can feed real-life branch > histories back into the compiler. Less work and less error-prone than > guessing annotations. Slightly OT: Has anyone tried compiling Python w/ the Intel C++ compiler? --david From martin@v.loewis.de Wed Feb 13 00:19:44 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 13 Feb 2002 01:19:44 +0100 Subject: [Python-Dev] SSL support in _socket In-Reply-To: <3C692370.D21EF15D@lemburg.com> References: <3C692370.D21EF15D@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > What can we do about this ? The standard solution is to modify Modules/Setup at installation time, to suit your local needs. > Perhaps we should simply let setup.py build two extensions: _socket > (without SSL) and _socketssl (with SSL) ?! If the _socketssl build > or import fails for some reason, Python could still pick up the > _socket extension in socket.py. -1: Instead of avoiding to use an existing OpenSSL installation, it would be much better if the socket module was fixed to work with all existing versions. Of course, without a precise bug report, we cannot know whether this was possible. Regards, Martin From martin@v.loewis.de Wed Feb 13 00:28:04 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 13 Feb 2002 01:28:04 +0100 Subject: [Python-Dev] SyntaxError tracebacks in 2.2 In-Reply-To: <3C69AA3F.9E1932DD@metaslash.com> References: <20020212234144.GA28828@gerg.ca> <3C69AA3F.9E1932DD@metaslash.com> Message-ID: Neal Norwitz writes: > I believe Martin fixed this. Indeed; that's parsetok.c 2.29 and 2.28.8.1. I've closed the report as a duplicate. Regards, Martin From skip@pobox.com Wed Feb 13 03:54:01 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 12 Feb 2002 21:54:01 -0600 Subject: [Python-Dev] Accessing globals without dict lookup In-Reply-To: References: <20020212132724.A1443@glacier.arctrix.com> Message-ID: <15465.58201.826252.981746@12-248-41-177.client.attbi.com> Tim> Maybe more interesting: One of the folks at Zope Corp reported Tim> getting significant speedups by using some gcc option that can feed Tim> real-life branch histories back into the compiler. Less work and Tim> less error-prone than guessing annotations. That would be -fprofile-arcs and -fbranch-probabilities: -fprofile-arcs also makes it possible to estimate branch probabilities, and to calculate basic block execution counts. In general, basic block execution counts do not give enough information to estimate all branch probabilities. When the compiled program exits, it saves the arc execution counts to a file called sourcename.da. Use the compiler option -fbranch-probabilities when recompiling, to optimize using estimated branch probabilities. I fiddled with a bunch of gcc options a few months ago. I finally settled on -O3 -minline-all-stringops -fomit-frame-pointer Skip From tim.one@comcast.net Wed Feb 13 04:55:27 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 12 Feb 2002 23:55:27 -0500 Subject: [Python-Dev] Expat vs Windows Message-ID: Anyone understand what's going on with expat? I noticed pyexpat stopped compiling on Windows a day or two ago, but didn't have time to look at it. 
Today I see it compiles, but generates lots of linker warnings: Creating library ./pyexpat.lib and object ./pyexpat.exp LINK : warning LNK4049: locally defined symbol "_XML_GetSpecifiedAttributeCount" imported LINK : warning LNK4049: locally defined symbol "_XML_Parse" imported LINK : warning LNK4049: locally defined symbol "_XML_ErrorString" imported etc. Are we trying to break away from the SourceForge expat project? Seems a dubious idea, if so. In any case, I can make almost no time for repairing this on Windows, so need someone to explain what we're trying to accomplish here (btw, if someone already explained this on some mailing list, sorry, I'm hundreds of msgs behind the times). From tim.one@comcast.net Wed Feb 13 05:11:46 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 13 Feb 2002 00:11:46 -0500 Subject: [Python-Dev] Expat vs Windows In-Reply-To: Message-ID: [Tim] > Today I see it compiles, but generates lots of linker warnings: > ... Oops -- I don't look far enough. A different part still doesn't compile, at least not in a debug build: --------------------Configuration: pyexpat - Win32 Debug------------------- Compiling... xmlparse.c C:\Code\python\Modules\expat\xmlparse.c(1329) : error C2143: syntax error : missing ';' before 'constant' C:\Code\python\Modules\expat\xmlparse.c(1329) : error C2115: 'return' : incompatible types Error executing cl.exe. pyexpat_d.pyd - 2 error(s), 0 warning(s) It's griping about this: const XML_LChar * XML_ExpatVersion(void) { return VERSION; } From martin@v.loewis.de Wed Feb 13 07:53:58 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 13 Feb 2002 08:53:58 +0100 Subject: [Python-Dev] Expat vs Windows In-Reply-To: References: Message-ID: Tim Peters writes: > Oops -- I don't look far enough. 
A different part still doesn't compile, at > least not in a debug build: That's because the VERSION define in the debug build read /D VERSION="1.95.2" whereas MSVC had wanted it as /D VERSION=\"1.95.2\" I still fail to see the rationale for requiring the backslashes there, or why I have to change every setting twice on Windows (which I forgot in this case); in any case, I moved the VERSION setting into expat.h, so this problem should be gone now. Regards, Martin From tim.one@comcast.net Wed Feb 13 08:00:29 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 13 Feb 2002 03:00:29 -0500 Subject: [Python-Dev] Order that site-packages is added to sys.path In-Reply-To: <15464.41842.664484.307330@anthem.wooz.org> Message-ID: Note that app developers eager to replace standard libraries, and an OS that allowed them to do so, are the causes of the aptly named "DLL Hell" on Windows. It can work fine for a single app, but it's truly hell when multiple apps resort to this, and end users don't have a prayer of sorting out the inevitable, vicious problems. if-you-need-your-own-xxx.py-you-know-where-to-shove-it-ly y'rs - tim From martin@v.loewis.de Wed Feb 13 08:10:47 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 13 Feb 2002 09:10:47 +0100 Subject: [Python-Dev] Expat vs Windows In-Reply-To: References: Message-ID: Tim Peters writes: > Anyone understand what's going on with expat? I noticed pyexpat stopped > compiling on Windows a day or two ago, but didn't have time to look at it. > > Today I see it compiles, but generates lots of linker warnings: > > Creating library ./pyexpat.lib and object ./pyexpat.exp > LINK : warning LNK4049: > locally defined symbol "_XML_GetSpecifiedAttributeCount" imported I cannot reproduce this on my MSVC 6 installation. What does that warning mean? Does it indicate a problem of some sort? > Are we trying to break away from the SourceForge expat project? No, Modules/expat is a literal copy of SF expat 1.95.2, lib/. 
> (btw, if someone already explained this on some mailing list, sorry, > I'm hundreds of msgs behind the times). http://mail.python.org/pipermail/python-dev/2002-February/019974.html [assuming you read this message before catching up with the rest of python-dev] Regards, Martin From tim.one@comcast.net Wed Feb 13 08:15:08 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 13 Feb 2002 03:15:08 -0500 Subject: [Python-Dev] Expat vs Windows In-Reply-To: Message-ID: [Martin v. Loewis] > That's because the VERSION define in the debug build read > > /D VERSION="1.95.2" > > whereas MSVC had wanted it as > > /D VERSION=\"1.95.2\" > > I still fail to see the rationale for requiring the backslashes there, I expect it's the same as under most Unix shells: the cmdline processor chews up unescaped quotes, so if you want quotes to survive in what's passed to argv, you have to escape them. > or why I have to change every setting twice on Windows (which I forgot > in this case); You don't, if you first select "Multiple Configurations ... " from the "Settings for:" dropdown list. That controls which configuration(s) your changes apply to, so if you leave it at, e.g., "Win32 Release", you're explicitly instructing it to apply changes only to the Release build. > in any case, I moved the VERSION setting into expat.h, so this problem > should be gone now. Thank you! I won't get to try it until tomorrow night; maybe the linker warnings will vanish by then too ... From mal@lemburg.com Wed Feb 13 09:22:05 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 13 Feb 2002 10:22:05 +0100 Subject: [Python-Dev] SSL support in _socket References: <3C692370.D21EF15D@lemburg.com> Message-ID: <3C6A303D.E0DDC16A@lemburg.com> "Martin v. Loewis" wrote: > > "M.-A. Lemburg" writes: > > > What can we do about this ? > > The standard solution is to modify Modules/Setup at installation time, > to suit your local needs. 
I thought that Modules/Setup is deprecated and replaced by the auto setup tests in setup.py ? In any case, setup.py will simply remove _socket if it doesn't import correctly and so a casual sys admin or user will lose big if his OpenSSL installation happens to be out of sync with whatever we provide in _socket. > > Perhaps we should simply let setup.py build two extensions: _socket > > (without SSL) and _socketssl (with SSL) ?! If the _socketssl build > > or import fails for some reason, Python could still pick up the > > _socket extension in socket.py. > > -1: Instead of avoiding to use an existing OpenSSL installation, it > would be much better if the socket module was fixed to work with all > existing versions. > > Of course, without a precise bug report, we cannot know whether this > was possible. Some symbols starting with 'RAND_*' are apparently missing from OpenSSL on my notebook. On other occasions (i.e. on RedHat) I found that the system vendor had forgotten to provide a link to the 0.9 version of OpenSSL and instead used 1.0 as version number (which is completely wrong since there is no 1.0 version of OpenSSL). As a result, _socket built on a system with correctly setup libs wouldn't run on this particular RedHat installation. In summary: _socket is just too important to lose if something in the OpenSSL support goes wrong. 
The two build model I suggested fixes this problem elegantly and doesn't cost anything in terms of adding tons of code -- all we need is an #ifdef for the module name in _socketmodule.c -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mwh@python.net Wed Feb 13 10:41:13 2002 From: mwh@python.net (Michael Hudson) Date: 13 Feb 2002 10:41:13 +0000 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods In-Reply-To: barry@zope.com's message of "Tue, 12 Feb 2002 13:12:45 -0500" References: <2m6652my0x.fsf@starship.python.net> <15465.23325.41291.966138@anthem.wooz.org> Message-ID: <2mheolmzp2.fsf@starship.python.net> barry@zope.com (Barry A. Warsaw) writes: > >>>>> "MH" == Michael Hudson writes: > > MH> Some time ago, Gareth McCaughan suggested a syntax for > MH> staticmethods. You'd write > > MH> class C(object): > | def static(arg) [staticmethod]: > | return 1 + arg > > | C.static(2) > | => 3 > > Very interesting! Why the square brackets though? Is that just for > visual offset or is there a grammar constraint that requires them? Um, no big reason; they were what Gareth suggested, so I implemented that. He may have got the idea from the slides from one of Guido's presentations -- it was reading them that reminded me I'd done this and wanted to mention it here. Note, though, that my patch allows an arbitrary number of *expressions* in the square brackets; in principle you can do things like: >>> def h() [apply, (lambda f:(lambda : f() + 1))]: ... return 1 ... and have `h' be 2 (except that this caused an abort at the moment -- I must have missed something in my symtable code). Not sure whether this is a good idea, of course, but allowing arbitrary expressions does actually make the compiling easier. 
Allowing arbitrary expressions without delimiters sounds like a bad idea, both for parsers and people. > I'd leave them out of the picture, unless you mean to imply that a > list is acceptable in that position . Well, it is, at the moment... Cheers, M. -- : exploding like a turd Never had that happen to me, I have to admit. They do that often in your world? -- Eric The Read & Dave Brown, asr From guido@python.org Wed Feb 13 13:01:50 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 13 Feb 2002 08:01:50 -0500 Subject: [Python-Dev] SSL support in _socket In-Reply-To: Your message of "Wed, 13 Feb 2002 10:22:05 +0100." <3C6A303D.E0DDC16A@lemburg.com> References: <3C692370.D21EF15D@lemburg.com> <3C6A303D.E0DDC16A@lemburg.com> Message-ID: <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net> > Some symbols starting with 'RAND_*' are apparently missing from > OpenSSL on my notebook. Yes, this has bitten me too. It's apparently a relatively new API in OpenSSL and the SSL code in socket.c was changed to require it almost as soon as it appeared in OpenSSL. > In summary: _socket is just too important to lose if something > in the OpenSSL support goes wrong. The two build model I suggested > fixes this problem elegantly and doesn't cost anything in > terms of adding tons of code -- all we need is an #ifdef for > the module name in _socketmodule.c Since the SSL support mostly introduces new code that doesn't depend on other socket code (not 100% sure if this is true), can't we make the SSL support a separate module? Then socket.py (which is also used on Unix these days!!!) can glue them together. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Wed Feb 13 13:14:27 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Wed, 13 Feb 2002 14:14:27 +0100 Subject: [Python-Dev] SSL support in _socket References: <3C692370.D21EF15D@lemburg.com> <3C6A303D.E0DDC16A@lemburg.com> <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C6A66B3.3C4AE597@lemburg.com> Guido van Rossum wrote: > > > Some symbols starting with 'RAND_*' are apparently missing from > > OpenSSL on my notebook. > > Yes, this has bitten me too. It's apparently a relatively new API in > OpenSSL and the SSL code in socket.c was changed to require it almost > as soon as it appeared in OpenSSL. > > > In summary: _socket is just too important to lose if something > > in the OpenSSL support goes wrong. The two build model I suggested > > fixes this problem elegantly and doesn't cost anything in > > terms of adding tons of code -- all we need is an #ifdef for > > the module name in _socketmodule.c > > Since the SSL support mostly introduces new code that doesn't depend > on other socket code (not 100% sure if this is true), can't we make > the SSL support a separate module? Then socket.py (which is also used > on Unix these days!!!) can glue them together. Good idea. Checking the code it should be easy to do. I'll look into this later this week. Funny, BTW, that the source file is named socketmodule.c while the resulting DLL is called _socket... I suppose renaming socketmodule.c to _socket.c would be advisable. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Wed Feb 13 13:36:44 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 13 Feb 2002 08:36:44 -0500 Subject: [Python-Dev] SSL support in _socket In-Reply-To: Your message of "Wed, 13 Feb 2002 14:14:27 +0100." 
<3C6A66B3.3C4AE597@lemburg.com> References: <3C692370.D21EF15D@lemburg.com> <3C6A303D.E0DDC16A@lemburg.com> <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net> <3C6A66B3.3C4AE597@lemburg.com> Message-ID: <200202131336.g1DDaiV07604@pcp742651pcs.reston01.va.comcast.net> > Checking the code it should be easy to do. I'll look > into this later this week. Great! > Funny, BTW, that the source file is named socketmodule.c > while the resulting DLL is called _socket... I suppose > renaming socketmodule.c to _socket.c would be advisable. That requires asking the SF sysadmin a favor to move a file, or loses all the CVS history. So who cares. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Wed Feb 13 14:28:53 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 13 Feb 2002 15:28:53 +0100 Subject: [Python-Dev] SSL support in _socket In-Reply-To: <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net> References: <3C692370.D21EF15D@lemburg.com> <3C6A303D.E0DDC16A@lemburg.com> <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > Since the SSL support mostly introduces new code that doesn't depend > on other socket code (not 100% sure if this is true), can't we make > the SSL support a separate module? Then socket.py (which is also used > on Unix these days!!!) can glue them together. +1. Martin From martin@v.loewis.de Wed Feb 13 14:34:26 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 13 Feb 2002 15:34:26 +0100 Subject: [Python-Dev] SSL support in _socket In-Reply-To: <3C6A303D.E0DDC16A@lemburg.com> References: <3C692370.D21EF15D@lemburg.com> <3C6A303D.E0DDC16A@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > I thought that Modules/Setup is deprecated and replaced by the > auto setup tests in setup.py ? Not at all. It is just used less frequently. Personally, I think that is a pity. 
Python binary distributions, by default, on Unix, should build as many extension libraries statically into the interpreter as they can without dragging in too many additional shared libraries. IOW, _socket should be compiled statically into the interpreter, which you cannot do with distutils (by nature). The reason for linking them statically is efficiency: if used, the interpreter won't have to locate them in sys.path, they don't need to be compiled as PIC code, the dynamic linker does not need to bind that many symbols, etc; if not used, they don't consume any additional resources as they are demand-paged from the executable. Static linking is also desirable for frozen applications. For those reasons, I hope that Setup.dist continues to be maintained. Regards, Martin From barry@zope.com Wed Feb 13 14:47:12 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 13 Feb 2002 09:47:12 -0500 Subject: [Python-Dev] SSL support in _socket References: <3C692370.D21EF15D@lemburg.com> <3C6A303D.E0DDC16A@lemburg.com> Message-ID: <15466.31856.880945.17273@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> I thought that Modules/Setup is deprecated and replaced by MAL> the auto setup tests in setup.py ? In any case, setup.py will MAL> simply remove _socket if it doesn't import correctly and so a MAL> casual sys admin or user will lose big if his OpenSSL MAL> installation happens to be out of sync with whatever we MAL> provide in _socket. This is a more general problem with the current setup.py stuff for the standard library. It took me /ages/ to figure out why BerkeleyDB support was broken in Python 2.2 -- not just broken, but nonexistent! "import bsddb" simply failed because the .so wasn't there. I couldn't figure out why that was until I trolled through the build output and realized that setup.py was deleting the .so because it got an import error after building the .so. 
Then I had to figure out how to build the .so and keep it around so I could then learn that it had link problems and from there, I realized why BerkeleyDB support in Python 2.2 is /really/ busted (it tries to be too smart about finding its libraries). It shouldn't have been this difficult to debug. Surely there must be some way to tell setup.py not to delete .so's it can't import so we have a prayer of finding the real problems. -Barry From barry@zope.com Wed Feb 13 14:49:59 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 13 Feb 2002 09:49:59 -0500 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods References: <2m6652my0x.fsf@starship.python.net> <15465.23325.41291.966138@anthem.wooz.org> <2mheolmzp2.fsf@starship.python.net> Message-ID: <15466.32023.403192.162135@anthem.wooz.org> >>>>> "MH" == Michael Hudson writes: MH> Not sure whether this is a good idea, of course, but allowing MH> arbitrary expressions does actually make the compiling easier. MH> Allowing arbitrary expressions without delimiters sounds like MH> a bad idea, both for parsers and people. >> I'd leave them out of the picture, unless you mean to imply >> that a list is acceptable in that position . MH> Well, it is, at the moment... Well, that's pretty neat! Maybe FAST, but neat. :) -Barry From mwh@python.net Wed Feb 13 14:57:27 2002 From: mwh@python.net (Michael Hudson) Date: 13 Feb 2002 14:57:27 +0000 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods In-Reply-To: barry@zope.com's message of "Wed, 13 Feb 2002 09:49:59 -0500" References: <2m6652my0x.fsf@starship.python.net> <15465.23325.41291.966138@anthem.wooz.org> <2mheolmzp2.fsf@starship.python.net> <15466.32023.403192.162135@anthem.wooz.org> Message-ID: <2m8z9x5t0o.fsf@starship.python.net> barry@zope.com (Barry A. Warsaw) writes: > >>>>> "MH" == Michael Hudson writes: > >> I'd leave them out of the picture, unless you mean to imply > >> that a list is acceptable in that position . 
> > MH> Well, it is, at the moment... > > Well, that's pretty neat! Maybe FAST, but neat. :) No, you're going to have to explain that. (Googling for "FAST" isn't terribly enlightening...). Cheers, M. -- well, take it from an old hand: the only reason it would be easier to program in C is that you can't easily express complex problems in C, so you don't. -- Erik Naggum, comp.lang.lisp From mwh@python.net Wed Feb 13 14:59:27 2002 From: mwh@python.net (Michael Hudson) Date: 13 Feb 2002 14:59:27 +0000 Subject: [Python-Dev] SSL support in _socket In-Reply-To: barry@zope.com's message of "Wed, 13 Feb 2002 09:47:12 -0500" References: <3C692370.D21EF15D@lemburg.com> <3C6A303D.E0DDC16A@lemburg.com> <15466.31856.880945.17273@anthem.wooz.org> Message-ID: <2m66515sxc.fsf@starship.python.net> barry@zope.com (Barry A. Warsaw) writes: > It shouldn't have been this difficult to debug. Surely there must be > some way to tell setup.py not to delete .so's it can't import so we > have a prayer of finding the real problems. Maybe it just shouldn't install the shared libs if they fail to import? Cheers, M. -- It's actually a corruption of "starling". They used to be carried. Since they weighed a full pound (hence the name), they had to be carried by two starlings in tandem, with a line between them. -- Alan J Rosenthal explains "Pounds Sterling" on asr From mal@lemburg.com Wed Feb 13 16:33:45 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 13 Feb 2002 17:33:45 +0100 Subject: [Python-Dev] setup.py auto-conf (SSL support in _socket) References: <3C692370.D21EF15D@lemburg.com> <3C6A303D.E0DDC16A@lemburg.com> <15466.31856.880945.17273@anthem.wooz.org> <2m66515sxc.fsf@starship.python.net> Message-ID: <3C6A9569.B494377C@lemburg.com> Michael Hudson wrote: > > barry@zope.com (Barry A. Warsaw) writes: > > > It shouldn't have been this difficult to debug. 
Surely there must be > > some way to tell setup.py not to delete .so's it can't import so we > > have a prayer of finding the real problems. > > Maybe it just shouldn't install the shared libs if they fail to > import? I'm not sure why setup.py is trying to be smart in the first place. A warning is certainly a good idea, but then setup.py should let the user decide what to do about the problem, IMHO. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From gmccaughan@synaptics-uk.com Wed Feb 13 17:11:57 2002 From: gmccaughan@synaptics-uk.com (Gareth McCaughan) Date: Wed, 13 Feb 2002 17:11:57 +0000 (GMT) Subject: [Python-Dev] syntactic sugar idea for {static,class}methods Message-ID: <200202131712.RAA29416@synaptics-uk.com> Michael Hudson wrote (replying to Barry Warsaw): > > Very interesting! Why the square brackets though? Is that just for > > visual offset or is there a grammar constraint that requires them? > > Um, no big reason; they were what Gareth suggested, so I implemented > that. He may have got the idea from the slides from one of Guido's > presentations -- it was reading them that reminded me I'd done this > and wanted to mention it here. Four reasons for the brackets. 1. Easier for the parser, I think. 2. Visually distinctive. 3. For me, it "reads" better than it would without the brackets. 4. Generalizes to a sequence of transformations. #4 is much the most important of these in my mind. One drawback of allowing an arbitrary list of transformations is that it might not be completely clear what order they're done in. I conjecture that most people will have the same intuition as I do about this, namely that the first-listed transformation is applied first. (It would be less obvious if the list came before the name of the definiendum instead of after.) 
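A minimal sketch of that first-listed-first reading, written in present-day Python. The proposed `def name(args) [t1, t2]:` syntax is not valid Python, so a hypothetical helper, `apply_transforms`, stands in for it; `tag` and `base` are likewise illustrative names and not anything from this thread:

```python
from functools import reduce


def apply_transforms(func, transforms):
    """Apply each transformation in listing order: the first listed runs first."""
    return reduce(lambda wrapped, transform: transform(wrapped), transforms, func)


def tag(label):
    """Build a transformation that wraps a function's result in 'label(...)'."""
    def transform(func):
        def wrapper(*args):
            return "%s(%s)" % (label, func(*args))
        return wrapper
    return transform


def base():
    return "f"


# Hypothetical equivalent of: def base() [tag("first"), tag("second")]: ...
wrapped = apply_transforms(base, [tag("first"), tag("second")])
print(wrapped())  # second(first(f))
```

Under this reading the last-listed transformation ends up outermost, so `[memoize, staticmethod]` would correspond to `staticmethod(memoize(f))`.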
Oh, and for the record: My suggestion was made long before I ever saw Guido's slides. :-) -- Gareth McCaughan From paul@prescod.net Wed Feb 13 17:33:56 2002 From: paul@prescod.net (Paul Prescod) Date: Wed, 13 Feb 2002 09:33:56 -0800 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods References: <200202131712.RAA29416@synaptics-uk.com> Message-ID: <3C6AA384.64D27FD0@prescod.net> Gareth McCaughan wrote: > >... > 4. Generalizes to a sequence of transformations. For me, this is crucial. A future version of Spark would probably put its parse annotations in the parens. Various type checking systems would probably do the same: def t_whitespace(self, s)[ grammar(r' \s+'), type(Node)]: pass This is going to happen so we need to be confident that we like this use of the syntax. I've been waiting for something like this for a while. Paul Prescod From barry@zope.com Wed Feb 13 19:32:12 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 13 Feb 2002 14:32:12 -0500 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods References: <2m6652my0x.fsf@starship.python.net> <15465.23325.41291.966138@anthem.wooz.org> <2mheolmzp2.fsf@starship.python.net> <15466.32023.403192.162135@anthem.wooz.org> <2m8z9x5t0o.fsf@starship.python.net> Message-ID: <15466.48956.191807.871000@anthem.wooz.org> >>>>> "MH" == Michael Hudson writes: MH> barry@zope.com (Barry A. Warsaw) writes: >> "MH" == Michael Hudson writes: >> I'd leave them out of the picture, unless you mean to imply >> that a list is acceptable in that position . >> MH> Well, it is, at the moment... Well, that's pretty neat! >> Maybe FAST, but neat. :) MH> No, you're going to have to explain that. (Googling for MH> "FAST" isn't terribly enlightening...). It stands for "fascinating and stomach turning", a reference to a docstring-based mechanism John Aycock used for his parser technology. :) -Barry From barry@zope.com Wed Feb 13 19:33:13 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Wed, 13 Feb 2002 14:33:13 -0500 Subject: [Python-Dev] SSL support in _socket References: <3C692370.D21EF15D@lemburg.com> <3C6A303D.E0DDC16A@lemburg.com> <15466.31856.880945.17273@anthem.wooz.org> <2m66515sxc.fsf@starship.python.net> Message-ID: <15466.49017.607224.678650@anthem.wooz.org> >>>>> "MH" == Michael Hudson writes: >> It shouldn't have been this difficult to debug. Surely there >> must be some way to tell setup.py not to delete .so's it can't >> import so we have a prayer of finding the real problems. MH> Maybe it just shouldn't install the shared libs if they fail MH> to import? That would be an improvement because at least there'd be an artifact you can poke at after the build process is complete. -Barry From Anthony Baxter Thu Feb 14 00:55:52 2002 From: Anthony Baxter (Anthony Baxter) Date: Thu, 14 Feb 2002 11:55:52 +1100 Subject: [Python-Dev] SSL support in _socket In-Reply-To: Message from "M.-A. Lemburg" of "Wed, 13 Feb 2002 10:22:05 BST." <3C6A303D.E0DDC16A@lemburg.com> Message-ID: <200202140055.g1E0tq420840@burswood.off.ekorp.com> The whole subject of socket and SSL support came up at a lunchtime chat on developers day (and my jetlagged brain has totally failed to supply me the names of who I was talking to at the time...). Wouldn't it be better to rip the SSL support out entirely, and provide a way to hook the transport layer stuff on top of the standard socket object? Anthony -- Anthony Baxter It's never to late to have a happy childhood. From martin@v.loewis.de Thu Feb 14 01:02:35 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 14 Feb 2002 02:02:35 +0100 Subject: [Python-Dev] SSL support in _socket In-Reply-To: <200202140055.g1E0tq420840@burswood.off.ekorp.com> References: <200202140055.g1E0tq420840@burswood.off.ekorp.com> Message-ID: Anthony Baxter writes: > The whole subject of socket and SSL support came up at a lunchtime > chat on developers day (and my jetlagged brain has totally failed to > supply me the names of who I was talking to at the time...). Wouldn't > it be better to rip the SSL support out entirely, and provide a way > to hook the transport layer stuff on top of the standard socket > object? With OpenSSL? How do you make OpenSSL's internals use the Python socket object? Regards, Martin From jeremy@zope.com Thu Feb 14 17:39:26 2002 From: jeremy@zope.com (Jeremy Hylton) Date: Thu, 14 Feb 2002 12:39:26 -0500 Subject: [Python-Dev] SSL support in _socket In-Reply-To: <200202140055.g1E0tq420840@burswood.off.ekorp.com> References: <3C6A303D.E0DDC16A@lemburg.com> <200202140055.g1E0tq420840@burswood.off.ekorp.com> Message-ID: <15467.63054.123422.382424@gondolin.digicool.com> >>>>> "AB" == Anthony Baxter writes: AB> The whole subject of socket and SSL support came up at a AB> lunchtime chat on developers day (and my jetlagged brain has AB> totally failed to supply me the names of who I was talking to at AB> the time...). Wouldn't it be better to rip the SSL support out AB> entirely, and provide a way to hook the transport layer stuff on AB> top of the standard socket object? It is certainly attractive to focus future development on a separate C extension module. The current SSL support was included because we thought it would be nice to allow people to open https URLs. The code itself is problematic for many reasons, not least of which is its very minimal feature set. But getting the right Python interface for a large library like OpenSSL is a big task. 
I think it's better suited for 3rd party libraries than the core (and such libraries do exist, though I've never used them). Jeremy From mal@lemburg.com Thu Feb 14 17:45:40 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 14 Feb 2002 18:45:40 +0100 Subject: [Python-Dev] SSL support in _socket References: <3C6A303D.E0DDC16A@lemburg.com> <200202140055.g1E0tq420840@burswood.off.ekorp.com> <15467.63054.123422.382424@gondolin.digicool.com> Message-ID: <3C6BF7C4.5386DA52@lemburg.com> Jeremy Hylton wrote: > > >>>>> "AB" == Anthony Baxter writes: > > AB> The whole subject of socket and SSL support came up at a > AB> lunchtime chat on developers day (and my jetlagged brain has > AB> totally failed to supply me the names of who I was talking to at > AB> the time...). Wouldn't it be better to rip the SSL support out > AB> entirely, and provide a way to hook the transport layer stuff on > AB> top of the standard socket object? > > It is certainly attractive to focus future development on a separate C > extension module. The current SSL support was included because we > thought it would be nice to allow people to open https URLs. The code > itself is problematic for many reasons, not least of which is its very > minimal feature set. But getting the right Python interface for a > large library like OpenSSL is a big task. I think it's better suited > for 3rd party libraries than the core (and such libraries do exist, > though I've never used them). FYI, I'm moving the SSL out of _socket and into _ssl.c. socket.py will then try to import _ssl, but move along if it cannot import that module for some reason. For true SSL support, you should look at M2Crypto. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Thu Feb 14 18:43:15 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Thu, 14 Feb 2002 19:43:15 +0100 Subject: [Python-Dev] The Python of Pythagoras? Message-ID: <3C6C0543.D6F57864@lemburg.com> Pythaguidoras ?! http://greatserpentmound.org/articles/python.html -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From Greg.Wilson@baltimore.com Thu Feb 14 18:42:02 2002 From: Greg.Wilson@baltimore.com (Greg Wilson) Date: Thu, 14 Feb 2002 13:42:02 -0500 Subject: [Python-Dev] student projects Message-ID: <930BBCA4CEBBD411BE6500508BB3328F523333@nsamcanms1.ca.baltimore.com> I'm going to be supervising one-term programming projects for several (3-4) senior Computer Science majors starting in September. They'll have 5-6 hours a week for 13 weeks to (a) learn their way around whatever technology is thrown at them, (b) build something worth building, and (c) write it up. So: any little itches anyone on this list would like to see scratched? Any little add-ons or uses for Jabber, SOAP, etc. that would only take you a weekend or two, but you've never quite gotten around to building? Thanks, Greg p.s. please reply to me directly; if there's enough interest, I'll put together a summary and re-post.
From trentm@ActiveState.com Thu Feb 14 22:58:12 2002 From: trentm@ActiveState.com (Trent Mick) Date: Thu, 14 Feb 2002 14:58:12 -0800 Subject: [Python-Dev] Is Barry receiving email at barry@zope.com? PEP number please. Message-ID: <20020214145812.F25577@ActiveState.com> Barry, Could I have a PEP number for my logging system proposal please? Here is what I have put together so far. Others, Feel free to send me comments on this if you like. I will officially post a request for comment when I get a PEP number for it. Trent ------------------------------------------------------------------------------ PEP: XXX Title: A Logging System Version: $Revision$ Last-Modified: $Date$ Author: trentm@activestate.com (Trent Mick) Python-Version: 2.3 Status: Draft Type: Standards Track Created: 4-Feb-2002 Post-History: Abstract This PEP describes a proposed logging package for Python's standard library. Basically the system involves the user creating one or more logging objects on which methods are called to log debugging notes/general information/warnings/errors/etc. Different logging 'levels' can be used to distinguish important messages from trivial ones. A registry of named singleton logger objects is maintained so that (1) different logical logging streams (or 'channels') exist (say, one for 'zope.zodb' stuff and another for 'mywebsite'-specific stuff); and (2) one does not have to pass logger object references around. The system is configurable at runtime. This configuration mechanism allows one to tune the level and type of logging done while not touching the application itself. Motivation If a single logging mechanism is enshrined in the standard library, 1) logging is more likely to be done 'well', and 2) multiple libraries will be able to be integrated into larger applications which can be logged reasonably coherently.
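The registry of named loggers described in the Abstract can be sketched with the `logging` module that later shipped in Python's standard library. That module may differ in detail from this draft, and the channel names, format string and in-memory stream below are only illustrative choices:

```python
import io
import logging

# Two lookups of the same channel name return the same logger object,
# so no logger reference has to be passed between modules.
zodb_log = logging.getLogger("zope.zodb")
assert logging.getLogger("zope.zodb") is zodb_log

# A separate channel for application-specific messages.
site_log = logging.getLogger("mywebsite")
assert site_log is not zodb_log

# Route one channel to an in-memory stream and set its threshold at runtime.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s - %(message)s"))
site_log.addHandler(handler)
site_log.setLevel(logging.WARNING)
site_log.propagate = False  # keep this sketch's output out of the root logger

site_log.info("trivial note")      # below the threshold, discarded
site_log.error("something broke")  # at or above the threshold, kept

print(stream.getvalue(), end="")   # ERROR mywebsite - something broke
```

Only the threshold and the handler were changed here; the calls that emit messages stay the same, which is the kind of runtime tuning the Abstract has in mind.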
Influences This proposal was put together after having somewhat studied the following logging packages: o java.util.logging in JDK 1.4 (a.k.a. JSR047) [1] o log4j [2] These two systems are *very* similar. o the Syslog package from the Protomatter project [3] o MAL's mx.Log package [4] This proposal will basically look like java.util.logging with a smattering of log4j. Simple Example This shows a very simple example of how the logging package can be used to generate simple logging output on stdout.
--------- mymodule.py -------------------------------
import logging
log = logging.getLogger("MyModule")

def doit():
    log.debug("doin' stuff")
    # do stuff ...
-----------------------------------------------------
--------- myapp.py ----------------------------------
import mymodule, logging
log = logging.getLogger("MyApp")

log.info("start my app")
try:
    mymodule.doit()
except Exception, e:
    log.error("There was a problem doin' stuff.")
log.info("end my app")
-----------------------------------------------------
> python myapp.py
0    [myapp.py:4] INFO  MyApp - start my app
36   [mymodule.py:5] DEBUG MyModule - doin' stuff
51   [myapp.py:9] INFO  MyApp - end my app
^^   ^^^^^^^^^^^^ ^^^^  ^^^^^   ^^^^^^^^^^
|    |            |     |       `-- message
|    |            |     `-- logging name/channel
|    |            `-- level
|    `-- location
`-- time
NOTE: Not sure exactly what the default format will look like yet. Control Flow [Note: excerpts from Java Logging Overview. [5]] Applications make logging calls on *Logger* objects. Loggers are organized in a hierarchical namespace and child Loggers may inherit some logging properties from their parents in the namespace. Notes on namespace: Logger names fit into a "dotted name" namespace, with dots (periods) indicating subnamespaces. The namespace of logger objects therefore corresponds to a single tree data structure.
"" is the root of the namespace "Zope" would be a child node of the root "Zope.ZODB" would be a child node of "Zope" These Logger objects allocate *LogRecord* objects which are passed to *Handler* objects for publication. Both Loggers and Handlers may use logging *levels* and (optionally) *Filters* to decide if they are interested in a particular LogRecord. When it is necessary to publish a LogRecord externally, a Handler can (optionally) use a *Formatter* to localize and format the message before publishing it to an I/O stream. Each Logger keeps track of a set of output Handlers. By default all Loggers also send their output to their parent Logger. But Loggers may also be configured to ignore Handlers higher up the tree. The APIs are structured so that calls on the Logger APIs can be cheap when logging is disabled. If logging is disabled for a given log level, then the Logger can make a cheap comparison test and return. If logging is enabled for a given log level, the Logger is still careful to minimize costs before passing the LogRecord into the Handlers. In particular, localization and formatting (which are relatively expensive) are deferred until the Handler requests them. Levels The logging levels, in increasing order of importance, are: DEBUG INFO WARN ERROR FATAL ALL This is consistent with log4j and Protomatter's Syslog and not with JSR047 which has a few more levels and some different names. Implementation-wise: these are just integer constants, to allow simple comparison of importance. See "What Logging Levels?" below for a debate on what standard levels should be defined. Loggers Each Logger object keeps track of a log level (or threshold) that it is interested in, and discards log requests below that level. The *LogManager* maintains a hierarchical namespace of named Logger objects. Generations are denoted with dot-separated names: Logger "foo" is the parent of Loggers "foo.bar" and "foo.baz".
The main logging method is:
    class Logger:
        def log(self, level, msg, *args):
            """Log 'msg % args' at logging level 'level'."""
            ...
however convenience functions are defined for each logging level:
        def debug(self, msg, *args): ...
        def info(self, msg, *args): ...
        def warn(self, msg, *args): ...
        def error(self, msg, *args): ...
        def fatal(self, msg, *args): ...
XXX How to define a nice convenience function for logging an exception? mx.Log has something like this, doesn't it? XXX What about a .raising() convenience function? How about:
        def raising(self, exception, level=ERROR): ...
It would create a log message describing an exception that is about to be raised. I don't like that 'level' is not first when it *is* first for .log(). Handlers Handlers are responsible for doing something useful with a given LogRecord. The following core Handlers will be implemented: - StreamHandler: A handler for writing to a file-like object. - FileHandler: A handler for writing to a single file or set of rotating files. More standard Handlers may be implemented if deemed desirable and feasible. Other interesting candidates: - SocketHandler: A handler for writing to remote TCP ports. - CreosoteHandler: A handler for writing to UDP packets, for low-cost logging. Jeff Bauer already had such a system [5]. - MemoryHandler: A handler that buffers log records in memory (JSR047). - SMTPHandler: Akin to log4j's SMTPAppender. - SyslogHandler: Akin to log4j's SyslogAppender. - NTEventLogHandler: Akin to log4j's NTEventLogAppender. Formatters A Formatter is responsible for converting a LogRecord to a string representation. A Handler may call its Formatter before writing a record. The following core Formatters will be implemented: - Formatter: Provide printf-like formatting, perhaps akin to log4j's PatternAppender. Other possible candidates for implementation: - XMLFormatter: Serialize a LogRecord according to a specific schema.
Could copy the schema from JSR047's XMLFormatter or log4j's XMLAppender. - HTMLFormatter: Provide a simple HTML output of log information. (See log4j's HTMLAppender.) Filters A Filter can be called by a Logger or Handler to decide if a LogRecord should be logged. JSR047 and log4j have slightly different filtering interfaces. The former is simpler:
    class Filter:
        def isLoggable(self):
            """Return a boolean."""
The latter is modeled after Linux's ipchains (where Filters can be chained with each filter either 'DENY'ing, 'ACCEPT'ing, or being 'NEUTRAL' on each check). I would probably favor the former because it is simpler and I don't immediately see the need for the latter. No filter implementations are currently proposed (other than the do-nothing base class) because I don't have enough experience to know what kinds of filters would be common. Users can always subclass Filter for their own purposes. Log4j includes a few filters that might be interesting. Configuration Note: Configuration for the proposed logging system is currently under-specified. The main benefit of a logging system like this is that one can control how much and what logging output one gets from an application without changing that application's source code. Log4j and Syslog provide for configuration via an external XML file. Log4j and JSR047 provide for configuration via Java properties (similar to -D #define's to a C/C++ compiler). All three provide for configuration via API calls. Configuration includes the following: - What logging level a logger should be interested in. - What handlers should be attached to which loggers. - What filters should be attached to which handlers and loggers. - Specifying attributes specific to certain Handlers and Filters. - Defining the default configuration. - XXX Add others. In general each application will have its own requirements for how a user may configure logging output. One application (e.g.
distutils) may want to control logging levels via '-q,--quiet,-v,--verbose' options to setup.py. Zope may want to configure logging via certain environment variables (e.g. 'STUPID_LOG_FILE' :). Komodo may want to configure logging via its preferences system. This PEP proposes to clearly document the API for configuring each of the above listed configurable elements and to define a reasonable default configuration. This PEP does not propose to define a general XML or .ini file configuration schema and the backend to parse it. It might, however, be worthwhile to define an abstraction of the configuration API to allow the expressiveness of Syslog configuration. Greg Wilson made this argument: In Protomatter [Syslog], you configure by saying "give me everything that matches these channel+level combinations", such as "server.error" and "database.*". The log4j "configure by inheritance" model, on the other hand, is very clever, but hard for non-programmers to manage without a GUI that essentially reduces it to Protomatter's. Case Scenarios This section presents a few usage scenarios which will be used to help decide how best to specify the logging API. (1) A short simple script. This script does not have many lines. It does not heavily use any third party modules (i.e. the only code doing any logging would be the main script). Only one logging channel is really needed and thus, the channel name is unnecessary. The user doesn't want to bother with logging system configuration much. (2) Medium sized app with C extension module. Includes a few Python modules and a main script. Employs, perhaps, a few logging channels. Includes a C extension module which might want to make logging calls as well. (3) Distutils. A large number of Python packages/modules. Perhaps (but not necessarily) a number of logging channels are used. Specifically needs to facilitate controlling verbosity levels via simple command line options to 'setup.py'. (4) Large, possibly multi-language, app. E.g.
Zope or (my experience) Komodo. (I don't expect this logging system to deal with any cross-language issues but it is something to think about.) Many channels are used. Many developers involved. People providing user support are possibly not the same people who developed the application. Users should be able to generate log files (i.e. configure logging) while reproducing a bug to send back to developers. Implementation XXX Details to follow consensus that this proposal is a good idea. What Logging Levels? The following are the logging levels defined by the systems I looked at: - log4j: DEBUG, INFO, WARN, ERROR, FATAL - syslog: DEBUG, INFO, WARNING, ERROR, FATAL - JSR047: FINEST, FINER, FINE, CONFIG, INFO, WARNING, SEVERE - zLOG (used by Zope):
    TRACE=-300   -- Trace messages
    DEBUG=-200   -- Debugging messages
    BLATHER=-100 -- Somebody shut this app up.
    INFO=0       -- For things like startup and shutdown.
    PROBLEM=100  -- This isn't causing any immediate problems, but deserves attention.
    WARNING=100  -- A wishy-washy alias for PROBLEM.
    ERROR=200    -- This is going to have adverse effects.
    PANIC=300    -- We're dead!
- mx.Log: SYSTEM_DEBUG SYSTEM_INFO SYSTEM_UNIMPORTANT SYSTEM_MESSAGE SYSTEM_WARNING SYSTEM_IMPORTANT SYSTEM_CANCEL SYSTEM_ERROR SYSTEM_PANIC SYSTEM_FATAL The current proposal is to copy log4j. XXX I suppose I could see adding zLOG's "TRACE" level, but I am not sure of the usefulness of others. Static Logging Methods (as per Syslog)? Both zLOG and Syslog provide module-level logging functions rather than (or in addition to) logging methods on a created Logger object. XXX Is this something that is deemed worth including? Pros: - It would make the simplest case shorter:
    import logging
    logging.error("Something is wrong")
instead of
    import logging
    log = logging.getLogger("")
    log.error("Something is wrong")
Cons: - It provides more than one way to do it.
- It encourages logging without a channel name, because this mechanism would likely be implemented by implicitly logging on the root (and nameless) logger of the hierarchy. References [1] java.util.logging http://java.sun.com/j2se/1.4/docs/guide/util/logging/ [2] log4j: a Java logging package http://jakarta.apache.org/log4j/docs/index.html [3] Protomatter's Syslog http://protomatter.sourceforge.net/1.1.6/index.html http://protomatter.sourceforge.net/1.1.6/javadoc/com/protomatter/syslog/syslog-whitepaper.html [4] MAL mentions his mx.Log logging module: http://mail.python.org/pipermail/python-dev/2002-February/019767.html [5] Jeff Bauer's Mr. Creosote http://starship.python.net/crew/jbauer/creosote/ Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil fill-column: 70 End: -- Trent Mick TrentM@ActiveState.com From DavidA@ActiveState.com Fri Feb 15 01:06:52 2002 From: DavidA@ActiveState.com (David Ascher) Date: Thu, 14 Feb 2002 17:06:52 -0800 Subject: [Python-Dev] Reminder: Python track at OSCON -- Deadline March 1! Message-ID: <3C6C5F2C.9E9B1B79@activestate.com> Reminder: The O'Reilly Open Source Convention (July 22-26, 2002 -- San Diego, CA) is accepting proposals for tutorials, talks, panels, and lightning talks. See the Call for Participation in the Python and Zope track on python.org. Proposals are due by March 1, so don't wait a moment longer! Details available at: CFP URL: http://www.python.org/workshops/oscon2002/cfp.html form: http://conferences.oreillynet.com/cs/os2002/create/e_sess Cheers, -- David Ascher [Guido's the program chair, but he's on the road, so I'm filling in] PS: Feel free to resend this to whatever mailing lists may be interested. From barry@zope.com Fri Feb 15 04:10:45 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 14 Feb 2002 23:10:45 -0500 Subject: [Python-Dev] Is Barry receiving email at barry@zope.com? PEP number please. 
References: <20020214145812.F25577@ActiveState.com> Message-ID: <15468.35397.821775.509858@anthem.wooz.org> >>>>> "TM" == Trent Mick writes: TM> Barry, Could I have a PEP number for my logging system TM> proposal please? Here is what I have put together so far. PEP 282. Spell checked and formatted, and checked in. (Remember that I usually batch up PEP stuff and handle them about once a week, so please be patient if I don't get to it for a day or two.) -Barry From MR MICHEAL ADAM" ATTN: THE PRESIDENT/CEO Dear Sir / Madam, I am Dr. Mrs. Marian Abacha, wife to the late Nigerian Head of state, General Sani Abacha who died on the 8th of June 1998 while still on active service for our Country. I am contacting you with the hope that you will be of great assistance to me, I currently have within my reach the sum of 76MILLION U.S dollars cash which l intend to use for investment purposes outside Nigeria. This money came as a result of a payback contract deal between my husband and a Russian firm in our country's multi-billion dollar Ajaokuta steel plant. The Russian partners returned my husband's share being the above sum after his death. Presently, the new civilian Government has intensified their probe into my husband's financial resources, which has led to the freezing of all our accounts, local and foreign, the revoking of all our business licenses and the arrest of my First son. In view of this I acted very fast to withdraw this money from one of our finance houses before it was closed down. I have deposited the money in a security vault for safe keeping with the help of very loyal officials of my late husband. No record is known about this fund by the government because there is no documentation showing that we received such funds. Due to the current situation in the country and government attitude to my financial affairs, I cannot make use of this money within. 
Bearing in mind that you may assist me, 20% of the total amount will be paid to you for your assistance, while 5% will be set aside for expenses incurred by the parties involved and this will be paid before sharing. Half of my75% will be paid in to my account on your instruction once the money hits your account, while the other half will be invested by your humble self in any viable business venture you deem fit, with you as manager of the invested funds. Remunerations, during the investment period will be on a 50/50 basis. Your URGENT response is needed. All correspondence must be through my lawyer,fax:234-1-4709814. Attentioned to my attorney (HAMZA IBU). Please do not forget to include your direct tel/fax line for easy reach. I hope I can trust you with my family's last financial hope.Regards Dr. Mrs. Marian Sani Abacha. C/o HAMZA IBU (counsel) URGENT AND CONFIDENTIAL MR. MICHEAL ADAM FAX: 234-1-7590900 Attn: The Chief Executive Officer REQUEST FOR URGENT AND CONFIDENTIAL BUSINESS RELATIONSHIP Please permit me to introduce myself to you, my names are Mr. MICHEAL ADAM a Petroleum Engineer with the Nigerian National Petroleum Corporation and a member of the contract award committee of the above corporation, which is under, The Federal Ministry of Petroleum and Natural Resources. CONFIDENTIAL THE SOURCE OF THE FUND IS AS FOLLOWS: With the assistance of some senior officials of the Federal Ministry of Finance and Office of the Accountant General of the Federation, we want to quietly transfer the sum of Nineteen Million US Dollars only ($19m US Dollars only) out of my country Nigeria. This US$19 M US Dollar was quietly over-estimated on the contract for Turn around Maintenance (TAM) of Port Harcourt petrochemical refinery in Nigeria (SOUTHERN NIGERIA) and the Rehabilitation of Petroleum Pipelines, Depot and Jetties. 
The actual contract value of this said project was US$171M US Dollars, but my colleagues and I deliberately increased the contract to our own benefit to the tune of $190M US Dollars, of which the over-estimated value of US$19M US Dollars belongs to us and this amount is what we want to secretly transfer into your personal or company account for safe keeping and sharing. The Federal Government and the Federal Ministry of Petroleum and Natural Resources have approved the total sum of US$190 Million US Dollars. The project has been completed and commissioned by the Federal Government and the original contractors have been paid their Contractual sum and what is left now is the US$19Million US Dollars. Under this circumstance and upon your acceptance we will register You/your Company as a sub-contractor to the original contractors with my corporation, so that this fund can be transferred into your account without hitch whatsoever. Our reasons of soliciting your assistance to transfer this fund to your account is owing to the policy of the Federal Government of Nigeria, the code conduct debars us civil servants (Government Workers) from operating a foreign account, hence we seeking your assistance. After several deliberations with my colleagues, we decided to give you 25% as your entitlement for your assistance for providing your account, while 70% will be for us and the remaining 5% would be used to offset all local and foreign expenses that might be incurred during this transaction. However this is based on the ground that you would assure me of the following: 1 That after the successful transfer of the $19m us dollars into your account, you will give us our own fare share of 70% without running away with the money or setting on it to our detriment. 2 That you will treat this business with utmost secrecy, Confidentiality, understanding and sincerity, which this business demands. 
From oren-py-d@hishome.net Fri Feb 15 09:25:55 2002 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 15 Feb 2002 04:25:55 -0500 Subject: [Python-Dev] syntactic sugar idea for {static,class}methods In-Reply-To: <200202131712.RAA29416@synaptics-uk.com> References: <200202131712.RAA29416@synaptics-uk.com> Message-ID: <20020215092555.GA12028@hishome.net> On Wed, Feb 13, 2002 at 05:11:57PM +0000, Gareth McCaughan wrote: > One drawback of allowing an arbitrary list of transformations > is that it might not be completely clear what order they're done in. > I conjecture that most people will have the same intuition > as I do about this, namely that the first-listed transformation > is applied first. (It would be less obvious if the list came > before the name of the definiendum instead of after.) The modifier order [memoize, staticmethod] sounds more like the sentence "foo is a memoized staticmethod" - at least in English it does. In French, Hebrew and several other languages it's the other way around, but Python is definitely English-oriented. So, do adjectives come before or after the noun in Dutch?
:-)

    Oren

From martin@v.loewis.de Fri Feb 15 09:36:20 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: Fri, 15 Feb 2002 10:36:20 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
Message-ID: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>

I have a patch that makes the Tcl object API available to _tkinter, in
the sense that Tcl invocations don't necessarily return strings, but
return Tcl objects (or appropriately converted Python objects). This
both helps to improve efficiency, and to improve correctness of Tkinter
applications since less type guessing is needed.

For backward compatibility, there is an option on the tkapp object to
determine whether strings or objects are returned. This is on by
default when using Tkinter, but can be turned off through setting
Tkinter.want_objects to 0. I found that IDLE still works fine when
using Tcl objects.

Do I need to write a PEP for this change, should I post a patch to SF,
or should I just apply the change to the CVS?

Regards,
Martin

From gmccaughan@synaptics-uk.com Fri Feb 15 10:09:24 2002
From: gmccaughan@synaptics-uk.com (Gareth McCaughan)
Date: Fri, 15 Feb 2002 10:09:24 +0000 (GMT)
Subject: Re[2]: [Python-Dev] syntactic sugar idea for {static,class}methods
In-Reply-To: <20020215092555.GA12028@hishome.net>
References: <200202131712.RAA29416@synaptics-uk.com> <20020215092555.GA12028@hishome.net>
Message-ID: <200202151010.KAA03070@synaptics-uk.com>

On Fri, 15 Feb 2002 04:25:55 -0500, Oren Tirosh wrote:

> On Wed, Feb 13, 2002 at 05:11:57PM +0000, Gareth McCaughan wrote:
> > One drawback of allowing an arbitrary list of transformations
> > is that it might not be completely clear what order they're done in.
> > I conjecture that most people will have the same intuition
> > as I do about this, namely that the first-listed transformation
> > is applied first. (It would be less obvious if the list came
> > before the name of the definiendum instead of after.)
>
> The modifier order [memoize, staticmethod] sounds more like the sentence
> "foo is a memoized staticmethod" - at least in English it does. In French,
> Hebrew and several other languages it's the other way around, but Python
> is definitely English-oriented.

Interesting. I read it more as: "Define a function, then memoize it
and make it a static method".

> So, do adjectives come before or after the noun in Dutch? :-)

I don't think they do. :-)

By the way, the fact that adjectives go before nouns in English
is one reason why I don't read "def foo() [wibblify]" as if
"wibblify" is an adjective. It can't be: it comes after the noun.

PS. Court martial. C sharp. Letters patent. Bother. :-)

--
g

From mwh@python.net Fri Feb 15 10:29:59 2002
From: mwh@python.net (Michael Hudson)
Date: 15 Feb 2002 10:29:59 +0000
Subject: Re[2]: [Python-Dev] syntactic sugar idea for {static,class}methods
In-Reply-To: Gareth McCaughan's message of "Fri, 15 Feb 2002 10:09:24 +0000 (GMT)"
References: <200202131712.RAA29416@synaptics-uk.com> <20020215092555.GA12028@hishome.net> <200202151010.KAA03070@synaptics-uk.com>
Message-ID: <2m3d03xck8.fsf@starship.python.net>

Gareth McCaughan writes:

> On Fri, 15 Feb 2002 04:25:55 -0500, Oren Tirosh wrote:
> > The modifier order [memoize, staticmethod] sounds more like the sentence
> > "foo is a memoized staticmethod" - at least in English it does. In French,
> > Hebrew and several other languages it's the other way around, but Python
> > is definitely English-oriented.
>
> Interesting. I read it more as: "Define a function, then memoize it
> and make it a static method".

That's what my patch does, too, but I can't remember whether this was
by accident or design :-/.

Incidentally, I'm not sure

class C:
    def s():
        print 1
    s = memoize(staticmethod(s))

would actually work (s would have type 'function'). I guess memoize
could be made cleverer than the version I posted.

Cheers,
M.

--
"declare"?
my bogometer indicates that you're really programming
in some other language and trying to force Common Lisp into your
mindset. this won't work.
    -- Erik Naggum, comp.lang.lisp

From jacobs@penguin.theopalgroup.com Fri Feb 15 14:57:19 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 15 Feb 2002 09:57:19 -0500 (EST)
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <3C6C5F2C.9E9B1B79@activestate.com>
Message-ID:

[I tried to post this on SourceForge, but as usual, it hates my guts]

I have been hacking on ways to make lighter-weight Python objects using the
__slots__ mechanism that came with Python 2.2 new-style classes. Everything
has gone swimmingly until I noticed that slots do not get pickled/cPickled
at all!

Here is a simple test case:

  import pickle,cPickle

  class Test(object):
    __slots__ = ['x']
    def __init__(self):
      self.x = 66666

  test = Test()

  pickle_str = pickle.dumps( test )
  cpickle_str = cPickle.dumps( test )

  untest = pickle.loads( pickle_str )
  untestc = cPickle.loads( cpickle_str )

  print untest.x   # raises AttributeError
  print untestc.x  # raises AttributeError

Clearly, this is incorrect behavior. The problem is due to a change in object
reflection semantics. Previously (before type-class unification), a
standard Python object instance always contained a __dict__ that listed all
of its attributes. Now, with __slots__, some classes that do store
attributes have no __dict__ or one that only contains what did not fit into
slots.

Unfortunately, there is no trivial way to know what slots a particular class
instance really has. This is because the __slots__ list in classes and
instances can be mutable! Changing these lists _does not_ change the object
layout at all, so I am unsure why they are not stored as tuples and the
'__slots__' attribute is not made read-only.
To be pedantic, the C
implementation does have an immutable and canonical list(s) of slots, though
they are well buried within the C extended type implementation.

So, IMHO this bug needs to be fixed in two steps:

First, I propose that class and instance __slots__ be made read-only and the
lists made immutable. Otherwise, pickle, cPickle, and any others that want to
use reflection will be SOL. There is certainly good precedent in several
places for this change (e.g., __bases__, __mro__, etc.) I can submit a fairly
trivial patch to do so. This change requires Guido's input, since I am
guessing that I am simply not privy to the method, only the madness.

Second, after the first issue is resolved, pickle and cPickle must then be
modified to iterate over an instance's __slots__ (or possibly its class's)
and store any attributes that exist. i.e., some __slots__ can be empty and
thus should not be pickled. I can also whip up patches for this, though I'll
wait to see how the first issue shakes out.

Regards,
-Kevin

PS: There may be another problem when one class inherits from another
    and both have a slot with the same name.

    e.g.:
      class Test(object):
        __slots__ = ['a']

      class Test2(Test):
        __slots__ = ['a']

      test=Test()
      test2=Test2()
      test2.__class__ = Test

    This code results in this error:

      Traceback (most recent call last):
        File "", line 1, in ?
      TypeError: __class__ assignment: 'Test' object layout differs from 'Test2'

    However, Test2's slot 'a' entirely masks Test's slot 'a'. So, either
    there should be some complex attribute access scheme to make both slots
    available OR (in my mind, the preferable solution) slots with the same
    name can simply be re-used or coalesced. Now that I think about it,
    this has implications for pickling objects as well. I'll likely leave
    this patch for Guido -- it tickles some fairly hairy bits of typeobject.

    Cool stuff, but the rabbit hole just keeps getting deeper and deeper....
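
[Editorial sketch, not part of Kevin's message.] The second step proposed
above (walk the slots and store only the ones that are set) can be
approximated by hand with __getstate__/__setstate__. The code below is a
minimal illustration in present-day Python 3 rather than 2.2; the names
slot_names and SlotPickleMixin are hypothetical helpers invented for this
example, and it ignores special slots such as '__dict__' and '__weakref__'
as well as name-mangled private slots. As far as I know, the default pickle
protocols in current Python releases already cope with slotted instances,
so this is only meant to make the idea concrete.

```python
import pickle


def slot_names(cls):
    """Collect slot names declared anywhere in the class hierarchy."""
    names = []
    for klass in cls.__mro__:
        slots = klass.__dict__.get("__slots__", ())
        if isinstance(slots, str):  # a lone string names a single slot
            slots = (slots,)
        for name in slots:
            if name not in names:
                names.append(name)
    return names


class SlotPickleMixin(object):
    """Hypothetical helper: pickle only the slots that are actually set."""
    __slots__ = ()

    def __getstate__(self):
        # Empty slots raise AttributeError on access, so hasattr() skips them.
        return {name: getattr(self, name)
                for name in slot_names(type(self))
                if hasattr(self, name)}

    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)


class Test(SlotPickleMixin):
    __slots__ = ("x", "y")

    def __init__(self):
        self.x = 66666  # "y" is deliberately left unset


untest = pickle.loads(pickle.dumps(Test()))
print(untest.x)             # 66666
print(hasattr(untest, "y")) # False
```

Declaring __slots__ as a tuple here also sidesteps the mutability concern
raised in the first step, since a tuple cannot be appended to after the
class is built.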
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19    E-mail: jacobs@theopalgroup.com
Fax: (216) 986-0714           WWW: http://www.theopalgroup.com

From barry@zope.com Fri Feb 15 15:03:07 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 15 Feb 2002 10:03:07 -0500
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>
Message-ID: <15469.9003.4790.636511@anthem.wooz.org>

>>>>> "MvL" == Martin v Loewis writes:

    MvL> Do I need to write a PEP for this change, should I post a
    MvL> patch to SF, or should I just apply the change to the CVS?

As this is an enhancement to a library, I don't think you'd need a PEP.

-Barry

From guido@python.org Fri Feb 15 15:34:28 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 15 Feb 2002 10:34:28 -0500
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: Your message of "Fri, 15 Feb 2002 10:36:20 +0100." <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>
References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>
Message-ID: <200202151534.g1FFYSK25793@pcp742651pcs.reston01.va.comcast.net>

> I have a patch that makes the Tcl object API available to _tkinter, in
> the sense that Tcl invocations don't necessarily return strings, but
> return Tcl objects (or appropriately converted Python objects). This
> both helps to improve efficiency, and to improve correctness of Tkinter
> applications since less type guessing is needed.

Cool!

> For backward compatibility, there is an option on the tkapp object to
> determine whether strings or objects are returned. This is on by
> default when using Tkinter, but can be turned off through setting
> Tkinter.want_objects to 0. I found that IDLE still works fine when
> using Tcl objects.
>
> Do I need to write a PEP for this change, should I post a patch to SF,
> or should I just apply the change to the CVS?

My only worry is about breaking old apps.
I'd like to see a patch on SF.
No PEP is needed IMO.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik@pythonware.com Fri Feb 15 15:15:10 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 15 Feb 2002 16:15:10 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>
Message-ID: <04dc01c1b636$d7324650$0900a8c0@spiff>

martin wrote:

> For backward compatibility, there is an option on the tkapp object to
> determine whether strings or objects are returned. This is on by
> default when using Tkinter

"on" as in "return strings" or "return objects" ?

I doubt it's a good idea to change the return type without
any warning.

> Do I need to write a PEP for this change, should I post a patch to SF,
> or should I just apply the change to the CVS?

if the default is "use old behaviour", check it in.

if you insist on changing the return types, post it to SF.

From john_coppola_r_s@yahoo.com Fri Feb 15 17:57:12 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Fri, 15 Feb 2002 09:57:12 -0800 (PST)
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To:
Message-ID: <20020215175712.96785.qmail@web11807.mail.yahoo.com>

Hi Kevin. I'm with you: slots have great potential.
First of all, I think it is a very bad idea that slots
could be mutable.

Here is something I wrote on the python-list. (Python list is too noisy,
with too many dumb questions from people not thinking before tackling
a problem -- with the exception of an interesting thread on licensing,
and a Spanish message.)

Anyway, here is some food for thought on slots. Perhaps long-winded and
already talked about, but here it is. (If you want to skip the intro and
get to the meat and potatoes, go to the code and read my remarks.)

- - - - - -

Hello Python developers! I have been reading about Python 2.2's new
features and I have to say there's lots of good stuff going on.
The feature I dedicate this message to is slots. This ubiquitous new
feature has the potential to boost performance of python quite
considerably, and it is the first time python has ever hinted at, for
lack of a better word, "being static". I used this word loosely. What I
mean by this is that typically one could set attributes arbitrarily on
python objects, but now cometh slots: if __slots__ is defined, one can
set only the attribute names listed in it.

In the introductory essay on the new python object system, "Unifying
types and classes in 2_2", I found the benevolent dictator's answer for
the existence of __slots__ unsatisfactory (or at least the example). As
implied from the introductory essay, __slots__ was used as a means to
eliminate a double dictionary for the dictionary subclass in the example
(please don't be offended -- SO what). That is one application of slots,
but there is a more powerful way to use this new feature.

First let me discuss some notational issues I have with __slots__. It is
possible to use any object that supports iteration as the __slots__
object (huh does my memory serve me correctly?). I feel that this is not
a good idea. Consider the following example:

class Foo(object):
    __slots__=['a','b','c']

foo=Foo()

Bad, but you could do it.

foo.__class__.__slots__.append('d')   #Ouch!

Suppose someone wrote code to initialize some instance like so:

for attr in foo.__class__.__slots__:
    setattr(foo,attr,0)

Granted this is decidedly bad code but the problem this creates is that
attribute "d" does not exist in foo or any instance of Foo before or
after the "append event". So an attribute error will be raised.

Slot attributes are read only, thus del foo.__slots__ is not possible. So
neither should __slots__.append be legal. So the answer is to enforce
that __slots__ is a tuple, or for a single attribute a string, when
compiling objects.

Another issue I have with __slots__ is that there
exists better notation to do the same thing.
And part
of python is to maintain a clean coding style. Why not
introduce a new keyword into the language so
"static attribute sets" are created like so:

class Foo(object):
    attribute a
    attribute b
    attribute c

Perhaps this syntax could be expanded further to
include strict typing:

class Foo(object):
    attribute a is str
    attribute b is float
    attribute c is int

I feel that this notation is much more lucid and could lead to even more
interesting notation:

class Foo(object):
    attribute a is str
    attribute b is float
    attribute c is property
    attribute g is property of float

I hope that it is understood from this context that "is" compares the
type() to the trailing type in the declaration. The meaning of the
modifier "of" in the last line is a bit ambiguous at this moment.
Perhaps it enforces the type passed into fset, and the return value of
fget.

The programmer doesn't need to know what the underlying __slots__
structure looks like, unless they are doing some heavy duty C. In which
case they could probably handle it in all its grand ugliness.

Is the python interpreter __slots__ aware? Instead of the usual variable
name lookup into __dict__ (which is not supported by instances whose
class implements __slots__), a slot attribute could be directly
referenced via pointer indirection. Does the interpreter substitute
references for slot attributes?

For that matter any namespace could utilize this new approach, even the
local namespace of a code segment. Slots could be built implicitly for a
local namespace. This could lead to a significant performance boost. Or
perhaps a JIT of sorts for python. Cause all you need is a bit of type
info, and those segments could be accurately translated into native code.

------

--- Kevin Jacobs wrote:
> [I tried to post this on SourceForge, but as usual,
> it hates my guts]
>
> I have been hacking on ways to make lighter-weight
> Python objects using the
> __slots__ mechanism that came with Python 2.2
> new-style class.
Everything > has gone swimmingly until I noticed that slots do > not get pickled/cPickled > at all! > > Here is a simple test case: > > import pickle,cPickle > class Test(object): > __slots__ = ['x'] > def __init__(self): > self.x = 66666 > > test = Test() > > pickle_str = pickle.dumps( test ) > cpickle_str = cPickle.dumps( test ) > > untest = pickle.loads( pickle_str ) > untestc = cPickle.loads( cpickle_str ) > > print untest.x # raises AttributeError > print untextc.x # raises AttributeError > > Clearly, this is incorrect behavior. The problem is > due a change in object > reflection semantics. Previously (before type-class > unification), a > standard Python object instance always contained a > __dict__ that listed all > of its attributes. Now, with __slots__, some > classes that do store > attributes have no __dict__ or one that only > contains what did not fit into > slots. > > Unfortunately, there is no trivial way to know what > slots a particular class > instance really has. This is because the __slots__ > list in classes and > instances can be mutable! Changing these lists > _does not_ change the object > layout at all, so I am unsure why they are not > stored as tuples and the > '__slots__' attribute is not made read-only. To be > pedantic, the C > implementation does have an immutable and canonical > list(s) of slots, though > they are well buried within the C extended type > implementation. > > So, IMHO this bug needs to be fixed in two steps: > > First, I propose that class and instance __slots__ > read-only and the lists > made immutable. Otherwise, pickle, cPickle, and any > others that want to use > reflection will be SOL. There is certainly good > precedent in several places > for this change (e.g., __bases__, __mro__, etc.) I > can submit a fairly > trivial patch to do so. This change requires > Guido's input, since I am > guessing that I am simply not privy to the method, > only the madness. 
> > Second, after the first issue is resolved, pickle > and cPickle must then be > modified to iterate over an instance's __slots__ (or > possibly its class's) > and store any attributes that exist. i.e., some > __slots__ can be empty and > thus should not be pickled. I can also whip up > patches for this, though I'll > wait to see how the first issue shakes out. > > Regards, > -Kevin > > PS: There may be another problem when when one > class inherits from another > and both have a slot with the same name. > > e.g.: > class Test(object): > __slots__ = ['a'] > > class Test2(Test): > __slots__ = ['a'] > > test=Test() > test2=Test2() > test2.__class__ = Test > > This code results in this error: > > Traceback (most recent call last): > File "", line 1, in ? > TypeError: __class__ assignment: 'Test' object > layout differs from 'Test2' > > However, Test2's slot 'a' entirely masks Test's > slot 'a'. So, either > there should be some complex attribute access > scheme to make both slots > available OR (in my mind, the preferable > solution) slots with the same > name can simply be re-used or coalesced. Now > that I think about it, > this has implications for pickling objects as > well. I'll likely leave > this patch for Guido -- it tickles some fairly > hairy bits of typeobject. > > Cool stuff, but the rabbit hole just keeps > getting deeper and deeper.... > > -- > Kevin Jacobs > The OPAL Group - Enterprise Systems Architect > Voice: (216) 986-0710 x 19 E-mail: > jacobs@theopalgroup.com > Fax: (216) 986-0714 WWW: > http://www.theopalgroup.com > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev __________________________________________________ Do You Yahoo!? Got something to say? Say it better with Yahoo! 
Video Mail
http://mail.yahoo.com

From jacobs@penguin.theopalgroup.com Fri Feb 15 18:08:39 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 15 Feb 2002 13:08:39 -0500 (EST)
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <20020215175712.96785.qmail@web11807.mail.yahoo.com>
Message-ID:

On Fri, 15 Feb 2002, john coppola wrote:
> Hi Kevin. I'm with you slots have great potential.
> First of all, I think it is a very bad idea that slots
> could be mutable.

They don't have to be mutable -- it's more a relic of the implementation.

e.g.:

  class A(object):
    __slots__ = ('a','b')

  A.__slots__.append('c')

  Traceback (most recent call last):
    File "", line 1, in ?
  AttributeError: 'tuple' object has no attribute 'append'

Of course, I think that this is something of a mistake, though I'll reserve
final judgment until after I've heard Guido's reasoning. It's clear that he
has bigger plans for both the syntax and semantics (especially if you heard
some of his off-hand remarks at IPC10).

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19    E-mail: jacobs@theopalgroup.com
Fax: (216) 986-0714           WWW: http://www.theopalgroup.com

From skip@pobox.com Fri Feb 15 19:13:34 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 15 Feb 2002 13:13:34 -0600
Subject: [Python-Dev] IRC?
Message-ID: <15469.24030.981884.932642@12-248-41-177.client.attbi.com>

Is there a set time people congregate on IRC (#python-dev @
irc.openprojects.net)?
Skip

From mclay@nist.gov Fri Feb 15 19:19:19 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 15 Feb 2002 14:19:19 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <20020215175712.96785.qmail@web11807.mail.yahoo.com>
References: <20020215175712.96785.qmail@web11807.mail.yahoo.com>
Message-ID: <200202151923.g1FJNVJ5017966@email.nist.gov>

On Friday 15 February 2002 12:57 pm, john coppola wrote:

> Another issue I have with __slots__ is that there
> exists better notation to do the same thing. And part
> of python is to maintain a clean coding style. Why not
> introduce a new keyword into the language so
> "static attribute sets" are created like so:
>
> class Foo(object):
>     attribute a
>     attribute b
>     attribute c

Guido suggested several alternative syntaxes during his presentation on
developers day at the Python conference. His slides are online from
python.org. He used the word "slot" instead of "attribute". Both names have
advantages. He questioned the use of the term slot and was looking for an
appropriate substitute. It has the advantage of being short and easy to
type. Using the term "attribute" would be more descriptive. I find it very
long to type and somewhat visually distracting.

>
> Perhaps this syntax could be expanded further to
> include strict typing:
>
> class Foo(object):
>     attribute a is str
>     attribute b is float
>     attribute c is int

I suggested a method for adding optional static typing to slots and
submitted a patch[1] last November. My patch eliminated the special
__slots__ member and replaced it with a special method to a class called
addmember() that could be used to add slot definitions to a class. A
docstring, an optional type, and a default value could be specified as
parameters in the addmembers() method. If a type, or tuple of types, was
defined, the tp_descr_set call in C would first check the type of the object
being assigned to the member name using the isinstance() builtin.
If isinstance failed an exception would be triggered.

While my approach was patterned after the property() builtin, the
Python Labs crowd didn't like the notation and rejected the patch. (I don't
think it helped that the feature freeze was about to start and Guido was out
on paternity leave.)

Here was an example of how the addmember() method worked:

>>> class B(object):
        """class B's docstring
        """
        a = addmember(types=int, default=56, doc="a docstring")
        b = addmember(types=int, doc="b's docstring")
        c = addmember(types=(int,float), default=5.0, doc="c docstring")
        d = addmember(types=(str,float), default="ham", doc="d docstring")

>>> b = B()
>>> b.a
56
>>> b.d
'ham'

[1]http://sourceforge.net/tracker/index.php?func=detail&aid=480562&group_id=5470&atid=305470

I think most everyone would agree that adding optional static typing could
enable some interesting optimizations, but not everyone at the conference
was supportive of the encroachment of static typing on their favorite
dynamically typed language.

Greg Ward presented a proposed syntax for optional static typing at the end
of the optimization session on developer's day. Greg had made a
presentation on Grouch, his postprocessing type checker for ZODB, during the
lightning talks. The proposed syntax for optional static typing was a meld
of Guido's proposed new syntax for slots, my addmembers() work, and Greg's
Grouch. Guido wasn't keen on the docstring notation, but otherwise seemed to
be receptive of the syntax and the idea of adding optional static typing.

Here is what Greg presented:

# ======================================================================
# WHAT IF...
#
# Grouch's type syntax were integrated into Python?

class Thing:
    slot name : string
        "The name of the thing"

class Animal (Thing):
    slot num_legs : int = 4
        "Number of legs this animal has"
    slot furry : boolean = true
        "Does this animal have full-body fur?"

class Vegetable (Thing):
    # hypothetical constraint (is this type info?)
    slot colour : string oneof ("red", "green", "blue")
        "Colour of this vegetable's external surface"

class Mineral (Thing):
    slot atoms : [(atom:Atom, n:int)]
        """Characterize the proportion of elements in this mineral's
        crystal lattice: each tuple records the relative count of
        that atom in the lattice."""
    # possible constraints on this attribute:
    #   n > 0
    #   atom in AtomSet (collection of all atoms)
    # how to code these? do we even *want* syntax
    # for arbitrary constraints? or just require that
    # you code a set_atoms() method? (or a property
    # modifier, or something)

>>> class B(object):
        """Docstring of class B
        """
        slot a: int = 56
            """a docstring
            """
            precondition:
                if b > 3: default=3
        slot b: int
            """Docstring of b
            """
        slot c: int | float = 5.0
            """Docstring of attribute c
            """
        slot d: str | float = "spam"
            """Docstring of d
            """
            postcondition:
                if self.b = 42.0

>>> b = B()
>>> b.a

From mal@lemburg.com Fri Feb 15 19:35:44 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Feb 2002 20:35:44 +0100
Subject: [Python-Dev] IRC?
References: <15469.24030.981884.932642@12-248-41-177.client.attbi.com>
Message-ID: <3C6D6310.E69758D2@lemburg.com>

Skip Montanaro wrote:
>
> Is there a set time people congregate on IRC (#python-dev @
> irc.openprojects.net)?

Not really. I think that IRC is mostly a waste of time unless you
have something serious to talk about (e.g. a meeting, specific
problem, etc.), maybe just me, though.

It would probably help with some tough problems though, e.g.
porting issues, etc. Sort of like online support :-)

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From fdrake@acm.org Fri Feb 15 19:35:20 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 15 Feb 2002 14:35:20 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <200202151923.g1FJNVJ5017966@email.nist.gov>
References: <20020215175712.96785.qmail@web11807.mail.yahoo.com> <200202151923.g1FJNVJ5017966@email.nist.gov>
Message-ID: <15469.25336.313450.27375@grendel.zope.com>

Michael McLay writes:
 > While my approach was patterned after the property() builtin, the
 > Python Labs crowd didn't like the notation and rejected the

I'll note as well that at least some of us, if not all, don't like the
property() syntax as well. My current favorite was one of Guido's
proposals at Python 10:

    class Foo(object):
        property myprop:
            """A computed property on Foo objects."""

            def __get__(self):
                return ...
            def __set__(self):
                ...
            def __delete__(self):
                ...

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From mclay@nist.gov Fri Feb 15 19:46:10 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 15 Feb 2002 14:46:10 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <15469.25336.313450.27375@grendel.zope.com>
References: <20020215175712.96785.qmail@web11807.mail.yahoo.com> <200202151923.g1FJNVJ5017966@email.nist.gov> <15469.25336.313450.27375@grendel.zope.com>
Message-ID: <200202151950.g1FJoMJ5006922@email.nist.gov>

On Friday 15 February 2002 02:35 pm, Fred L. Drake, Jr. wrote:
> Michael McLay writes:
>  > While my approach was patterned after the property() builtin, the
>  > Python Labs crowd didn't like the notation and rejected the
>
> I'll note as well that at least some of us, if not all, don't like the
> property() syntax as well. My current favorite was one of Guido's
> proposals at Python 10:

I agree with you on this being a better notation. It unclutters the class
definition. Had Guido suggested the alternative slot syntaxes back at the
start of November I would have used one of the alternative syntaxes instead
of creating a new builtin function.
BTW, adding a builtin function is a pain.
The trick of counting the number of parameters to determine behavior caused
strange things to happen during the testing of the addmember function.

> class Foo(object):
>     property myprop:
>         """A computed property on Foo objects."""
>
>         def __get__(self):
>             return ...
>         def __set__(self):
>             ...
>         def __delete__(self):

Is someone working on an implementation of this?

From fdrake@acm.org Fri Feb 15 20:00:44 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 15 Feb 2002 15:00:44 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <200202151950.g1FJoMJ5006922@email.nist.gov>
References: <20020215175712.96785.qmail@web11807.mail.yahoo.com> <200202151923.g1FJNVJ5017966@email.nist.gov> <15469.25336.313450.27375@grendel.zope.com> <200202151950.g1FJoMJ5006922@email.nist.gov>
Message-ID: <15469.26860.234987.122278@grendel.zope.com>

Michael McLay writes:
 > Is someone working on an implementation of this?

Not that I know of.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From tim.one@comcast.net Fri Feb 15 20:13:32 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 15 Feb 2002 15:13:32 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <200202151950.g1FJoMJ5006922@email.nist.gov>
Message-ID:

[Michael McLay]
> ...
> Had Guido suggested the alternative slot syntaxes back at the start of
> November I would have used one of the alternative syntaxes instead
> of creating a new builtin function.

Guido said several times during 2.2 development that he didn't like using
builtin functions for some of the new class features, but that syntax issues
were off the table before 2.3 because there wasn't time to address them for
2.2. He didn't repeat this every time it was brought up, though; you can't
imagine how pressed for time we all were, and Guido especially.
>> class Foo(object): >> property myprop: >> """A computed property on Foo objects.""" >> >> def __get__(self): >> return ... >> def __set__(self): >> ... >> def __delete__(self): > Is someone working on an implementation of this? Not within PythonLabs at present. From john_coppola_r_s@yahoo.com Fri Feb 15 20:56:50 2002 From: john_coppola_r_s@yahoo.com (john coppola) Date: Fri, 15 Feb 2002 12:56:50 -0800 (PST) Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__ In-Reply-To: <200202151921.g1FJLdJ5016693@email.nist.gov> Message-ID: <20020215205650.44487.qmail@web11808.mail.yahoo.com> I'm definitely in preference to slot versus attribute. But may be we could just use def. class Foo: def a def b def c def somemethod(self): pass The distinction being the def statement is "unbound" so to speak. humm.... the code is not clear enough. class Foo: slot a slot b slot c def foo(): pass Better, much better. class Foo: slot a is str slot b is float slot c is property of float def foo(self): pass I like this. It reads like sentences. (remember, 'of' modified property to ensure that the fget return type would be correct and fset passed in object would be correct) I didn't much care for Groucho's notation, particularly if a slot could have multiple types, why bother assigning types to it at all. It should definitely be singular. One to one. The concept of slot does not need to be solely related to attributes within a class. Why not reserve slots for a methods, classes within a module, modules imported within modules. Then it will be easier to see the overall picture. Does the slot pattern relate to every object in python? I think it does. That's when the real benefit comes in. If python could utilize this pattern in every aspect, the big performance boost will occur. In a strange way slots has made python even more dynamic than it ever was. Prior to slots, objects had a static c structure. 
Slots enable variability in another dimension for the underlying C struct. On the outside it looks like python is becoming static, but what's really going on under the hood is quite the contrary. Definitely more burden on the compiler to build correct references and make the correct substitutions. From john_coppola_r_s@yahoo.com Fri Feb 15 21:03:10 2002 From: john_coppola_r_s@yahoo.com (john coppola) Date: Fri, 15 Feb 2002 13:03:10 -0800 (PST) Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__ In-Reply-To: <15469.25336.313450.27375@grendel.zope.com> Message-ID: <20020215210310.45339.qmail@web11808.mail.yahoo.com> This is beautiful! It's like an inner class. Perfect! Encapsulated delegate. I can't wait for Python 3.0. > class Foo(object): > property myprop: > """A computed property on Foo objects.""" > > def __get__(self): > return ... > def __set__(self): > ... > def __delete__(self): > ... > -Fred > > -- > Fred L. Drake, Jr. > PythonLabs at Zope Corporation From fdrake@acm.org Fri Feb 15 21:04:33 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 15 Feb 2002 16:04:33 -0500 Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__ In-Reply-To: <20020215210310.45339.qmail@web11808.mail.yahoo.com> References: <15469.25336.313450.27375@grendel.zope.com> <20020215210310.45339.qmail@web11808.mail.yahoo.com> Message-ID: <15469.30689.678773.846329@grendel.zope.com> john coppola writes: > This is beautiful! It's like an inner class. > Perfect! Encapsulated delegate. > I can't wait for Python 3.0. I can't wait to work on it! -Fred -- Fred L. Drake, Jr.
PythonLabs at Zope Corporation From jacobs@penguin.theopalgroup.com Fri Feb 15 21:09:01 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Fri, 15 Feb 2002 16:09:01 -0500 (EST) Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__ In-Reply-To: <15469.30689.678773.846329@grendel.zope.com> Message-ID: On Fri, 15 Feb 2002, Fred L. Drake, Jr. wrote: > john coppola writes: > > This is beautiful! It's like an inner class. > > Perfect! Encapsulated delegate. > > I can't wait for Python 3.0. > > I can't wait to work on it! In the meantime, does anyone have any comments on the original bug report(s)? Thanks, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From fdrake@acm.org Fri Feb 15 21:10:17 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 15 Feb 2002 16:10:17 -0500 Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__ In-Reply-To: <15469.25336.313450.27375@grendel.zope.com> References: <20020215175712.96785.qmail@web11807.mail.yahoo.com> <200202151923.g1FJNVJ5017966@email.nist.gov> <15469.25336.313450.27375@grendel.zope.com> Message-ID: <15469.31033.656478.828919@grendel.zope.com> Fred L. Drake, Jr. writes: [describing a suggested property syntax] > class Foo(object): > property myprop: > """A computed property on Foo objects.""" > > def __get__(self): > return ... Perhaps it was obvious to everyone else, but it just occurred to me that this lends itself to inheriting descriptor types:

    class ReadOnly(object):
        def __get__(self):
            raise NotImplementedError("sub-class must override this!")

        def __set__(self):
            raise AttributeError("read-only attribute")

        def __delete__(self):
            raise AttributeError("read-only attribute")


    class Foo(object):
        property myprop(ReadOnly):
            def __get__(self):
                return ...

-Fred -- Fred L. Drake, Jr.
PythonLabs at Zope Corporation From martin@v.loewis.de Fri Feb 15 20:13:43 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 15 Feb 2002 21:13:43 +0100 Subject: [Python-Dev] PEP needed? Introducing Tcl objects In-Reply-To: <04dc01c1b636$d7324650$0900a8c0@spiff> References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de> <04dc01c1b636$d7324650$0900a8c0@spiff> Message-ID: "Fredrik Lundh" writes: > > For backward compatibility, there is an option on the tkapp object to > > determine whether strings or objects are returned. This is on by > > default when using Tkinter > > "on" as in "return strings" or "return objects" ? In Tkinter, it returns objects by default. > I doubt it's a good idea to change the return type without any > warning. This is not as bad as it sounds. For most functions, the return type does not change at all. Consider def winfo_depth(self): """Return the number of bits per pixel.""" return getint(self.tk.call('winfo', 'depth', self._w)) 'winfo depth' will return a Tcl int in Tcl, which is currently converted to a string in _tkinter, then converted back to an int. With the change, tk.call will already return an int, so the getint invocation becomes a no-op. For others, a conversion into string will continue to return the value that it currently returns: >>> l=Tkinter.Label() >>> l.config("foreground")[3] >>> str(_) 'Black' I would expect that few if any applications will be affected; those would need to change the default after import Tkinter. > if the default is "use old behaviour", check it in. > > if you insist on changing the return types, post it to SF. I'd like to change the return types. If that is not acceptable, I'd like to produce a DeprecationWarning if Tkinter is imported and the new-style behaviour (return objects) is not enabled. Regards, Martin From DavidA@ActiveState.com Fri Feb 15 23:17:17 2002 From: DavidA@ActiveState.com (David Ascher) Date: Fri, 15 Feb 2002 15:17:17 -0800 Subject: [Python-Dev] PEP needed? 
Introducing Tcl objects References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de> <04dc01c1b636$d7324650$0900a8c0@spiff> Message-ID: <3C6D96FD.20A4AA85@activestate.com> While we're on the topic of Tkinter, I got an email from Jeff Hobbs (Tcl guy at AS) re: Tkinter. He suspects that in: Py_BEGIN_ALLOW_THREADS PyThread_acquire_lock(tcl_lock, 1); tcl_tstate = tstate; result = Tcl_DoOneEvent(TCL_DONT_WAIT); tcl_tstate = NULL; PyThread_release_lock(tcl_lock); if (result == 0) Sleep(20); Py_END_ALLOW_THREADS The Sleep() call is a perf problem. If anyone wants to discuss it with Jeff, I've cc'ed him here. building bridges, --da From goodger@users.sourceforge.net Fri Feb 15 23:38:04 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Fri, 15 Feb 2002 18:38:04 -0500 Subject: [Python-Dev] spice for PEP 282 Message-ID: Here's some spice for the logger recipe. Please season to taste. Abstract ======== The dps.utils.Reporter class (http://docstring.sf.net/dps/utils.py) implements a logger but with *multiple* thresholds per category (stream/channel). Similarly to log4j, there's a "warninglevel" threshold, which determines if a message gets sent to the warning stream (sys.stderr). There's also an "errorlevel" threshold which determines if a message is converted into a *raised exception*, potentially halting processing. And a "debug" flag which turns debug messages on or off independently of the "warninglevel" threshold. I suggest that the Python stdlib logging module adopt some of these features. Background ========== I've been working on the DPS & reStructuredText projects (soon to be merged and officially renamed to "Docutils") on and off for some months now. Docutils will parse texts (files or docstrings) into DOM-like document trees, then convert them to HTML etc. Early on I saw the need to insert "system_message" feedback elements of different levels into the doctree and implemented dps.utils.Reporter. 
I included thresholds for logging to sys.stderr and raising exceptions, initially with only one setting (like log4j with only the root "category" set). This Reporter class has been very successful.

As a pointed reminder of how wheels are continually reinvented, I learned about log4j (just before the python-dev effort got underway; I probably read the same message that got Guido started). I already had 4 message levels (what log4j called "logging priorities"), and log4j's notion of "logging categories" seemed a powerful one, so I retrofitted the dps.utils.Reporter class with support for categories. I also added a "debug" category, which I had been handling separately.

From the revised PEP 258 (not yet checked in to the Python CVS):

    When the parser encounters an error in markup, it inserts a system message (DTD element 'system_message'). There are five levels of system messages:

    - Level-0, "DEBUG": an internal reporting issue. There is no effect on the processing. Level-0 system messages are handled separately from the others.
    - Level-1, "INFO": a minor issue that can be ignored. There is no effect on the processing. Typically level-1 system messages are not reported.
    - Level-2, "WARNING": an issue that should be addressed. If ignored, there may be unpredictable problems with the output.
    - Level-3, "ERROR": an error that should be addressed. If ignored, the output will contain errors.
    - Level-4, "SEVERE": a severe error that must be addressed. Typically level-4 system messages are turned into exceptions which halt processing. If ignored, the output will contain severe errors.

    Although the initial message levels were devised independently, they have a strong correspondence to VMS error condition severity levels [9]; the names in quotes for levels 1 through 4 were borrowed from VMS. Error handling has since been influenced by the log4j project [10].

    ...
[9] http://www.openvms.compaq.com:8000/73final/5841/ 5841pro_027.html#error_cond_severity [10] http://jakarta.apache.org/log4j/ Here's the docstring of dps.utils.Reporter: Info/warning/error reporter and ``system_message`` element generator. Five levels of system messages are defined, along with corresponding methods: `debug()`, `info()`, `warning()`, `error()`, and `severe()`. There is typically one Reporter object per process. A Reporter object is instantiated with thresholds for generating warnings and errors (raising exceptions), a switch to turn debug output on or off, and an I/O stream for warnings. These are stored in the default reporting category, '' (zero-length string). Multiple reporting categories may be set, each with its own warning and error thresholds, debugging switch, and warning stream. Categories are hierarchically-named strings that look like attribute references: 'spam', 'spam.eggs', 'neeeow.wum.ping'. The 'spam' category is the ancestor of 'spam.bacon.eggs'. Unset categories inherit stored values from their closest ancestor category that has been set. When a system message is generated, the stored values from its category (or ancestor if unset) are retrieved. The system message level is compared to the thresholds stored in the category, and a warning or error is generated as appropriate. Debug messages are produced iff the stored debug switch is on. Message output is sent to the stored warning stream. The Point ========= I submit that the priority/level spectrum is not continuous. There is a break between "debug" and "info": DEBUG -/- INFO - WARNING - ERROR - FATAL/SEVERE In the Docutils application, and (I think) in general, debug logging is better treated separately from info/warning/error logging. Debug logging is often used by the developer but only rarely used by the end-user. However, depending on the type of application, info/warning/error logging can be useful to the end-user. Compilers, parsers, and filters are such applications. 
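A minimal sketch of the two-threshold idea from the Reporter docstring above: one threshold for writing to the warning stream, a second for raising an exception, and a debug switch that is independent of both. The names `MiniReporter`, `SystemMessage`, `warning_level`, and `error_level` are hypothetical and chosen for illustration only; this is not the dps.utils.Reporter code, and categories are left out.

```python
import io
import sys

# Level numbers follow the five system-message levels quoted from PEP 258.
DEBUG, INFO, WARNING, ERROR, SEVERE = range(5)
LEVEL_NAMES = ["DEBUG", "INFO", "WARNING", "ERROR", "SEVERE"]


class SystemMessage(Exception):
    """Hypothetical exception raised when a message reaches the error threshold."""


class MiniReporter:
    """Hypothetical reporter with separate warning and error thresholds."""

    def __init__(self, warning_level=WARNING, error_level=SEVERE,
                 debug=False, stream=None):
        self.warning_level = warning_level
        self.error_level = error_level
        self.debug = debug
        self.stream = stream if stream is not None else sys.stderr

    def report(self, level, text):
        line = "%s: %s" % (LEVEL_NAMES[level], text)
        if level == DEBUG:
            # Debug output depends only on the switch, not on the thresholds.
            if self.debug:
                self.stream.write(line + "\n")
            return line
        if level >= self.warning_level:
            self.stream.write(line + "\n")
        if level >= self.error_level:
            raise SystemMessage(line)
        return line


# Usage: capture output in a buffer instead of sys.stderr.
buffer = io.StringIO()
reporter = MiniReporter(stream=buffer)
reporter.report(INFO, "below the warning threshold, not written")
reporter.report(WARNING, "written to the stream")
```

With the defaults assumed here, only WARNING and above reach the stream and only SEVERE raises; lowering both thresholds gives a strict run in which WARNING already halts processing.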
In addition, having a separate threshold for warning output (typical logging) and error generation (raising exceptions) has been very useful to the Docutils application, and may be useful in general for a logging module. When I run the test suite, I run with warnings and errors turned off (I get feedback from system_message elements added to the doctree). Real processing typically runs with "WARNING" and higher generating warning output, and "SEVERE" (FATAL) raising exceptions. A "make sure this input has absolutely no problems whatsoever" run might have thresholds set lower, so "INFO" is reported and "WARNING" and higher turn into exceptions. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From martin@v.loewis.de Sat Feb 16 00:09:31 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 16 Feb 2002 01:09:31 +0100 Subject: [Python-Dev] PEP needed? Introducing Tcl objects In-Reply-To: <3C6D96FD.20A4AA85@activestate.com> References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de> <04dc01c1b636$d7324650$0900a8c0@spiff> <3C6D96FD.20A4AA85@activestate.com> Message-ID: David Ascher writes: > The Sleep() call is a perf problem. It certainly is, but it is also necessary to have. Regards, Martin From john_coppola_r_s@yahoo.com Sat Feb 16 00:34:47 2002 From: john_coppola_r_s@yahoo.com (john coppola) Date: Fri, 15 Feb 2002 16:34:47 -0800 (PST) Subject: [Python-Dev] property syntax In-Reply-To: <15469.31033.656478.828919@grendel.zope.com> Message-ID: <20020216003447.37144.qmail@web11807.mail.yahoo.com> Hey Fred. It might not be a good idea to nest the "property class" like an inner class. It may be plausible that property objects are reusable between classes. As implied by this syntax, they wouldn't be reusable. Another point is that they may be very large.
Which would be messy. I did a bit of brainstorming. One purpose of the type objects is to provide a means to coerce one object to another. So here is the pattern. Just as str(MyObject) requires __str__ and len(MyObject) requires __len__ (or any of the factory functions, for that matter), the property factory function would require that your object support __get__, __set__, and __delete__. That's it. So instead of property(fget, fset, fdel), you would have property(AnyObjectSupportingAboveInterface). The property factory function differs from the others in that it will only check for the existence of these methods and will not execute the code within them. Instead, it sets a flag on the object indicating that it is active. It will be necessary to check every object on every set or get. Not too bad, though. How time-consuming are two if statements? Is this making sense? John Coppola --- "Fred L. Drake, Jr." wrote: > > Fred L. Drake, Jr. writes: > [describing a suggested property syntax] > > class Foo(object): > > property myprop: > > """A computed property on Foo objects.""" > > > > def __get__(self): > > return ... > > Perhaps it was obvious to everyone else, but it just > occurred to me > that this lends itself to inheriting descriptor > types: > > > class ReadOnly(object): > def __get__(self): > raise NotImplementedError("sub-class must > override this!") > > def __set__(self): > raise AttributeError("read-only attribute") > > def __delete__(self): > raise AttributeError("read-only attribute") > > > class Foo(object): > property myprop(ReadOnly): > def __get__(self): > return ... > > > > -Fred > > -- > Fred L. Drake, Jr. > PythonLabs at Zope Corporation __________________________________________________ Do You Yahoo!? Yahoo!
Sports - Coverage of the 2002 Olympic Games http://sports.yahoo.com From trentm@ActiveState.com Sat Feb 16 00:43:33 2002 From: trentm@ActiveState.com (Trent Mick) Date: Fri, 15 Feb 2002 16:43:33 -0800 Subject: [Python-Dev] PEP 282: A Logging System -- comments please Message-ID: <20020215164333.A31903@ActiveState.com> Howdy all, I would appreciate any comments you might have on this proposal for adding a logging system to the Python Standard Library. This PEP is still an early draft so please forward your comments just to me directly for now. Thanks, Trent ----------------------------------------------------------- PEP: 282 Title: A Logging System Version: $Revision: 1.1 $ Last-Modified: $Date: 2002/02/15 04:09:17 $ Author: trentm@activestate.com (Trent Mick) Status: Draft Type: Standards Track Created: 4-Feb-2002 Python-Version: 2.3 Post-History: Abstract This PEP describes a proposed logging package for Python's standard library. Basically the system involves the user creating one or more logging objects on which methods are called to log debugging notes/general information/warnings/errors/etc. Different logging 'levels' can be used to distinguish important messages from trivial ones. A registry of named singleton logger objects is maintained so that 1) different logical logging streams (or 'channels') exist (say, one for 'zope.zodb' stuff and another for 'mywebsite'-specific stuff) 2) one does not have to pass logger object references around. The system is configurable at runtime. This configuration mechanism allows one to tune the level and type of logging done while not touching the application itself. Motivation If a single logging mechanism is enshrined in the standard library, 1) logging is more likely to be done 'well', and 2) multiple libraries will be able to be integrated into larger applications which can be logged reasonably coherently. 
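To make the Abstract's "registry of named singleton logger objects" concrete, here is a short sketch using the `logging` module that later shipped in the standard library (Python 2.3 onward). It is written in present-day Python and the shipped API is not guaranteed to match this draft in every detail. The channel names come from the Abstract; `ListHandler` and `handle_request` are made up for illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that collects formatted messages in a list."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append("%s %s - %s" % (record.levelname, record.name,
                                            record.getMessage()))


def handle_request():
    # No logger object is passed in; the registry returns it by name.
    logging.getLogger("mywebsite").warning("slow request: %d ms", 1200)


# Two logical channels, each with its own handler.
site_log = logging.getLogger("mywebsite")
db_log = logging.getLogger("zope.zodb")
site_handler = ListHandler()
db_handler = ListHandler()
site_log.addHandler(site_handler)
db_log.addHandler(db_handler)
db_log.setLevel(logging.ERROR)   # only important messages on this channel

handle_request()
db_log.info("trivial, dropped by the ERROR threshold")
db_log.error("could not open storage")
```

Because loggers are looked up by name, library code and application code can share a channel without importing each other, which is the second benefit the Abstract lists.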
Influences

    This proposal was put together after having somewhat studied the
    following logging packages:

        o java.util.logging in JDK 1.4 (a.k.a. JSR047) [1]
        o log4j [2]
            These two systems are *very* similar.
        o the Syslog package from the Protomatter project [3]
        o MAL's mx.Log package [4]

    This proposal will basically look like java.util.logging with a
    smattering of log4j.

Simple Example

    This shows a very simple example of how the logging package can be
    used to generate simple logging output on stdout.

        --------- mymodule.py -------------------------------
        import logging
        log = logging.getLogger("MyModule")

        def doit():
            log.debug("doin' stuff")
            # do stuff ...
        -----------------------------------------------------

        --------- myapp.py ----------------------------------
        import mymodule, logging
        log = logging.getLogger("MyApp")

        log.info("start my app")
        try:
            mymodule.doit()
        except Exception, e:
            log.error("There was a problem doin' stuff.")
        log.info("end my app")
        -----------------------------------------------------

        > python myapp.py
        0    [myapp.py:4] INFO  MyApp - start my app
        36   [mymodule.py:5] DEBUG MyModule - doin' stuff
        51   [myapp.py:9] INFO  MyApp - end my app
        ^^   ^^^^^^^^^^^^ ^^^^  ^^^^^   ^^^^^^^^^^
        |    |            |     |       `-- message
        |    |            |     `-- logging name/channel
        |    |            `-- level
        |    `-- location
        `-- time

    NOTE: Not sure exactly what the default format will look like yet.

Control Flow

    [Note: excerpts from Java Logging Overview. [5]]

    Applications make logging calls on *Logger* objects. Loggers are
    organized in a hierarchical namespace and child Loggers may
    inherit some logging properties from their parents in the
    namespace.

    Notes on namespace: Logger names fit into a "dotted name"
    namespace, with dots (periods) indicating sub-namespaces. The
    namespace of logger objects therefore corresponds to a single tree
    data structure.
"" is the root of the namespace "Zope" would be a child node of the root "Zope.ZODB" would be a child node of "Zope" These Logger objects allocate *LogRecord* objects which are passed to *Handler* objects for publication. Both Loggers and Handlers may use logging *levels* and (optionally) *Filters* to decide if they are interested in a particular LogRecord. When it is necessary to publish a LogRecord externally, a Handler can (optionally) use a *Formatter* to localize and format the message before publishing it to an I/O stream. Each Logger keeps track of a set of output Handlers. By default all Loggers also send their output to their parent Logger. But Loggers may also be configured to ignore Handlers higher up the tree. The APIs are structured so that calls on the Logger APIs can be cheap when logging is disabled. If logging is disabled for a given log level, then the Logger can make a cheap comparison test and return. If logging is enabled for a given log level, the Logger is still careful to minimize costs before passing the LogRecord into the Handlers. In particular, localization and formatting (which are relatively expensive) are deferred until the Handler requests them. Levels The logging levels, in increasing order of importance, are: DEBUG INFO WARN ERROR FATAL ALL This is consistent with log4j and Protomatter's Syslog and not with JSR047 which has a few more levels and some different names. Implementation-wise: these are just integer constants, to allow simple comparison of importance. See "What Logging Levels?" below for a debate on what standard levels should be defined. Loggers Each Logger object keeps track of a log level (or threshold) that it is interested in, and discards log requests below that level. The *LogManager* maintains a hierarchical namespace of named Logger objects. Generations are denoted with dot-separated names: Logger "foo" is the parent of Loggers "foo.bar" and "foo.baz". 
    The main logging method is:

        class Logger:
            def log(self, level, msg, *args):
                """Log 'msg % args' at logging level 'level'."""
                ...

    however convenience functions are defined for each logging level:

            def debug(self, msg, *args): ...
            def info(self, msg, *args): ...
            def warn(self, msg, *args): ...
            def error(self, msg, *args): ...
            def fatal(self, msg, *args): ...

    XXX How to define a nice convenience function for logging an
        exception? mx.Log has something like this, doesn't it?

    XXX What about a .raising() convenience function? How about:

            def raising(self, exception, level=ERROR): ...

        It would create a log message describing an exception that is
        about to be raised. I don't like that 'level' is not first
        when it *is* first for .log().

Handlers

    Handlers are responsible for doing something useful with a given
    LogRecord. The following core Handlers will be implemented:

        - StreamHandler: A handler for writing to a file-like object.
        - FileHandler: A handler for writing to a single file or set
          of rotating files.

    More standard Handlers may be implemented if deemed desirable and
    feasible. Other interesting candidates:

        - SocketHandler: A handler for writing to remote TCP ports.
        - CreosoteHandler: A handler for writing to UDP packets, for
          low-cost logging. Jeff Bauer already had such a system [5].
        - MemoryHandler: A handler that buffers log records in memory
          (JSR047).
        - SMTPHandler: Akin to log4j's SMTPAppender.
        - SyslogHandler: Akin to log4j's SyslogAppender.
        - NTEventLogHandler: Akin to log4j's NTEventLogAppender.

Formatters

    A Formatter is responsible for converting a LogRecord to a string
    representation. A Handler may call its Formatter before writing a
    record. The following core Formatters will be implemented:

        - Formatter: Provide printf-like formatting, perhaps akin to
          log4j's PatternAppender.

    Other possible candidates for implementation:

        - XMLFormatter: Serialize a LogRecord according to a specific
          schema. Could copy the schema from JSR047's XMLFormatter or
          log4j's XMLAppender.
        - HTMLFormatter: Provide a simple HTML output of log
          information. (See log4j's HTMLAppender.)

Filters

    A Filter can be called by a Logger or Handler to decide if a
    LogRecord should be logged.

    JSR047 and log4j have slightly different filtering interfaces. The
    former is simpler:

        class Filter:
            def isLoggable(self):
                """Return a boolean."""

    The latter is modeled after Linux's ipchains (where Filters can
    be chained with each filter either 'DENY'ing, 'ACCEPT'ing, or
    being 'NEUTRAL' on each check). I would probably favor the former
    because it is simpler and I don't immediately see the need for the
    latter.

    No filter implementations are currently proposed (other than the
    do-nothing base class) because I don't have enough experience to
    know what kinds of filters would be common. Users can always
    subclass Filter for their own purposes. Log4j includes a few
    filters that might be interesting.

Configuration

    Note: Configuration for the proposed logging system is currently
    under-specified.

    The main benefit of a logging system like this is that one can
    control how much and what logging output one gets from an
    application without changing that application's source code.

    Log4j and Syslog provide for configuration via an external XML
    file. Log4j and JSR047 provide for configuration via Java
    properties (similar to -D #define's to a C/C++ compiler). All
    three provide for configuration via API calls.

    Configuration includes the following:

        - What logging level a logger should be interested in.
        - What handlers should be attached to which loggers.
        - What filters should be attached to which handlers and
          loggers.
        - Specifying attributes specific to certain Handlers and
          Filters.
        - Defining the default configuration.
        - XXX Add others.

    In general each application will have its own requirements for how
    a user may configure logging output. One application
    (e.g. distutils) may want to control logging levels via
    '-q,--quiet,-v,--verbose' options to setup.py.
Zope may want to configure logging via certain environment variables (e.g. 'STUPID_LOG_FILE' :). Komodo may want to configure logging via its preferences system. This PEP proposes to clearly document the API for configuring each of the above listed configurable elements and to define a reasonable default configuration. This PEP does not propose to define a general XML or .ini file configuration schema and the backend to parse it. It might, however, be worthwhile to define an abstraction of the configuration API to allow the expressiveness of Syslog configuration. Greg Wilson made this argument: In Protomatter [Syslog], you configure by saying "give me everything that matches these channel+level combinations", such as "server.error" and "database.*". The log4j "configure by inheritance" model, on the other hand, is very clever, but hard for non-programmers to manage without a GUI that essentially reduces it to Protomatter's. Case Scenarios This section presents a few usage scenarios which will be used to help decide how best to specify the logging API. (1) A short simple script. This script does not have many lines. It does not heavily use any third party modules (i.e. the only code doing any logging would be the main script). Only one logging channel is really needed and thus, the channel name is unnecessary. The user doesn't want to bother with logging system configuration much. (2) Medium sized app with C extension module. Includes a few Python modules and a main script. Employs, perhaps, a few logging channels. Includes a C extension module which might want to make logging calls as well. (3) Distutils. A large number of Python packages/modules. Perhaps (but not necessarily) a number of logging channels are used. Specifically needs to facilitate the controlling verbosity levels via simple command line options to 'setup.py'. (4) Large, possibly multi-language, app. E.g. Zope or (my experience) Komodo. 
        (I don't expect this logging system to deal with any
        cross-language issues but it is something to think about.)
        Many channels are used. Many developers involved. People
        providing user support are possibly not the same people who
        developed the application. Users should be able to generate
        log files (i.e. configure logging) while reproducing a bug to
        send back to developers.

Implementation

    XXX Details to follow consensus that this proposal is a good idea.

What Logging Levels?

    The following are the logging levels defined by the systems I
    looked at:

        - log4j: DEBUG, INFO, WARN, ERROR, FATAL
        - syslog: DEBUG, INFO, WARNING, ERROR, FATAL
        - JSR047: FINEST, FINER, FINE, CONFIG, INFO, WARNING, SEVERE
        - zLOG (used by Zope):
            TRACE=-300   -- Trace messages
            DEBUG=-200   -- Debugging messages
            BLATHER=-100 -- Somebody shut this app up.
            INFO=0       -- For things like startup and shutdown.
            PROBLEM=100  -- This isn't causing any immediate problems,
                            but deserves attention.
            WARNING=100  -- A wishy-washy alias for PROBLEM.
            ERROR=200    -- This is going to have adverse effects.
            PANIC=300    -- We're dead!
        - mx.Log:
            SYSTEM_DEBUG
            SYSTEM_INFO
            SYSTEM_UNIMPORTANT
            SYSTEM_MESSAGE
            SYSTEM_WARNING
            SYSTEM_IMPORTANT
            SYSTEM_CANCEL
            SYSTEM_ERROR
            SYSTEM_PANIC
            SYSTEM_FATAL

    The current proposal is to copy log4j. XXX I suppose I could see
    adding zLOG's "TRACE" level, but I am not sure of the usefulness
    of others.

Static Logging Methods (as per Syslog)?

    Both zLOG and Syslog provide module-level logging functions rather
    than (or in addition to) logging methods on a created Logger
    object. XXX Is this something that is deemed worth including?

    Pros:
        - It would make the simplest case shorter:

            import logging
            logging.error("Something is wrong")

          instead of

            import logging
            log = logging.getLogger("")
            log.error("Something is wrong")

    Cons:
        - It provides more than one way to do it.
- It encourages logging without a channel name, because this mechanism would likely be implemented by implicitly logging on the root (and nameless) logger of the hierarchy. References [1] java.util.logging http://java.sun.com/j2se/1.4/docs/guide/util/logging/ [2] log4j: a Java logging package http://jakarta.apache.org/log4j/docs/index.html [3] Protomatter's Syslog http://protomatter.sourceforge.net/1.1.6/index.html http://protomatter.sourceforge.net/1.1.6/javadoc/com/protomatter/syslog/syslog-whitepaper.html [4] MAL mentions his mx.Log logging module: http://mail.python.org/pipermail/python-dev/2002-February/019767.html [5] Jeff Bauer's Mr. Creosote http://starship.python.net/crew/jbauer/creosote/ Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil fill-column: 70 End: -- Trent Mick TrentM@ActiveState.com From loth@users.sourceforge.net Sat Feb 16 01:05:00 2002 From: loth@users.sourceforge.net (Burton Radons) Date: Fri, 15 Feb 2002 17:05:00 -0800 Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__ References: <20020215175712.96785.qmail@web11807.mail.yahoo.com> <200202151923.g1FJNVJ5017966@email.nist.gov> <15469.25336.313450.27375@grendel.zope.com> Message-ID: <3C6DB03C.8080101@users.sourceforge.net> Fred L. Drake, Jr. wrote: > Michael McLay writes: > > While my approach was patterened after the property() builtin, the > > Python Labs crowd didn't like the notation and rejected the > > I'll note as well that at least some of us, if not all, don't like the > property() syntax as well. My current favorite was one of Guido's > proposals at Python 10: > > > class Foo(object): > property myprop: > """A computed property on Foo objects.""" > > def __get__(self): > return ... > def __set__(self): > ... > def __delete__(self): > ... 
What's wrong with:

    class Foo(object):
        class myprop_class (object):
            """A computed property on Foo objects."""

            def __get__(subclass, self, klass):
                return ...
            def __set__(subclass, self, value):
                ...
            def __delete__(subclass, self):
                ...

        myprop = myprop_class ()

It's a grand total of _two_ lines more than your example, and has more extension possibilities to boot.

While we're discussing strange usages of blocks, one feature I've always wanted has been subblocks passed to functions. It could be done using a new argument prefix "@" (for example). If the block is not otherwise taken (if it is in an if statement, for example), it can be followed by ":" and a normal block; this block is then put in the argument as a PyCodeObject (I think). The argument can also be given a value normally. The code object also has a few new methods for our convenience. So to implement your example above:

    def field (@block):
        dict = block.exec_save_locals ()
        # Execute the block and return the locals dictionary rather than destroy it.

        fget = dict.get ("__get__", None)
        fset = dict.get ("__set__", None)
        fdel = dict.get ("__delete__", None)
        fdoc = dict.get ("__doc__", None)

        return property (fget, fset, fdel, fdoc)

Now that we have that, we do your example:

    class Foo(object):
        myprop = field ():
            """A computed property on Foo objects."""

            def __get__(self):
                return ...
            def __set__(self, value):
                ...
            def __delete__(self):
                ...

There are other capabilities, but as I've never had a language that can do this I wouldn't know how many pragmatic possibilities there are. The advantage over Guido's method is that his suggestion solves a single problem and has no use outside of it, while mine, at least on the face of it, could be applied in other ways.
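
For illustration only (not part of the original message): the nested-class variant from the start of this message can be filled in as a small runnable sketch. The `_myprop` storage key, the default of 0, and the argument names `descr`, `instance` and `owner` are assumptions chosen for the example; the message itself leaves the method bodies as "...".

```python
class Foo(object):
    class myprop_class(object):
        """A computed property on Foo objects."""

        # The first argument is the descriptor object itself; `instance`
        # is the Foo object being accessed (None for class access).
        def __get__(descr, instance, owner=None):
            if instance is None:          # accessed as Foo.myprop
                return descr
            # Hypothetical storage slot; 0 is an arbitrary default.
            return instance.__dict__.get("_myprop", 0)

        def __set__(descr, instance, value):
            instance.__dict__["_myprop"] = value

        def __delete__(descr, instance):
            instance.__dict__.pop("_myprop", None)

    myprop = myprop_class()


foo = Foo()
default_value = foo.myprop    # nothing stored yet
foo.myprop = 42
stored_value = foo.myprop
del foo.myprop
after_delete = foo.myprop     # back to the default
```

Because the descriptor is an ordinary class, it could in principle be reused or subclassed, which seems to be the extensibility the message alludes to.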
From mclay@nist.gov Sat Feb 16 02:43:33 2002 From: mclay@nist.gov (Michael McLay) Date: Fri, 15 Feb 2002 21:43:33 -0500 Subject: [Python-Dev] PEP 282: A Logging System -- comments please In-Reply-To: <20020215164333.A31903@ActiveState.com> References: <20020215164333.A31903@ActiveState.com> Message-ID: <200202160247.g1G2lkJ5029529@email.nist.gov> On Friday 15 February 2002 07:43 pm, Trent Mick wrote: > Howdy all, > > I would appreciate any comments you might have on this proposal for adding > a logging system to the Python Standard Library. This PEP is still an early > draft so please forward your comments just to me directly for now. I scanned the PEP and didn't find a reference to the logging package supporting logging over a network. > Influences > > This proposal was put together after having somewhat studied the > following logging packages: > > o java.util.logging in JDK 1.4 (a.k.a. JSR047) [1] > o log4j [2] > These two systems are *very* similar. > o the Syslog package from the Protomatter project [3] > o MAL's mx.Log package [4] > > This proposal will basically look like java.util.logging with a > smattering of log4j. Marshal Rose submitted RFC3195[1] to the IETF for a syslog protocol. The specification is defined as a profile on top of the BEEP framework. The format of the messages are encoded in XML. Here is an example of an "entry" element. C: <.....eeeek! [1] http://www.beepcore.org/beepcore/docs/rfc3195.html. From aahz@rahul.net Sat Feb 16 03:22:02 2002 From: aahz@rahul.net (Aahz Maruch) Date: Fri, 15 Feb 2002 19:22:02 -0800 (PST) Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find In-Reply-To: <3C6DB03C.8080101@users.sourceforge.net> from "Burton Radons" at Feb 15, 2002 05:05:00 PM Message-ID: <20020216032202.711AFE8C4@waltz.rahul.net> Burton Radons wrote: > > What's wrong with: > > class Foo(object): > class myprop_class (object): > """A computed property on Foo objects.""" > > def __get__(subclass, self, klass): > return ... 
> def __set__(subclass, self, value): > ... > def __delete__(subclass, self): > ... How about this: class Foo(object): class myprop(property): ... -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From martin@v.loewis.de Sat Feb 16 09:44:54 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: Sat, 16 Feb 2002 10:44:54 +0100 Subject: [Python-Dev] sendall patches not in 2.2? Message-ID: <200202160944.g1G9isc01555@mira.informatik.hu-berlin.de> I wonder why Anthony's Grand sendall patch never found its way into Python 2.2, see http://aspn.activestate.com/ASPN/Mail/Message/Python-checkins/956422 http://mail.python.org/pipermail/python-bugs-list/2001-December/009299.html https://sourceforge.net/tracker/index.php?func=detail&aid=516715&group_id=5470&atid=305470 Unless there are any objections, I'll forward this patch to 2.2 (not sure what to do with imaplib, since that has been taken care of with an explicit loop in Python meanwhile). Regards, Martin From JeffH@ActiveState.com Sat Feb 16 10:01:39 2002 From: JeffH@ActiveState.com (Jeff Hobbs) Date: Sat, 16 Feb 2002 02:01:39 -0800 Subject: [Python-Dev] PEP needed? Introducing Tcl objects In-Reply-To: Message-ID: <001b01c1b6d0$ee28f940$ba03a8c0@activestate.ca> > From: martin@mira [mailto:martin@mira]On Behalf Of Martin v. Loewis ... > David Ascher writes: > > > The Sleep() call is a perf problem. > > It certainly is, but it is also necessary to have. Why? I suspect if you inverted the control behavior to run the Tcl event loop as it's designed and trigger signals with Tcl_AsyncMark, you would have no problem. Alternatively, you could do Tcl_CreateEventSource, of if threading is really necessary, build Tcl with threads and use Tcl_ThreadQueueEvent. 
It has all the APIs to approach this from several different angles without having to toss a gratuitous Sleep in there that does nothing more than have people scratch their head and wonder why Tkinter appears so slow.

BTW, I know you were tying into Tk before Tk was properly thread-safe, but those issues have been addressed (although it is highly recommended to stick to using Tk in one thread as things like X aren't guaranteed to be thread-safe).

Jeff

From martin@v.loewis.de Sat Feb 16 12:29:02 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 16 Feb 2002 13:29:02 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <001b01c1b6d0$ee28f940$ba03a8c0@activestate.ca>
References: <001b01c1b6d0$ee28f940$ba03a8c0@activestate.ca>
Message-ID:

"Jeff Hobbs" writes:

> > It certainly is, but it is also necessary to have.
>
> Why? I suspect if you inverted the control behavior to run the
> Tcl event loop as it's designed and trigger signals with
> Tcl_AsyncMark, you would have no problem. Alternatively, you
> could do Tcl_CreateEventSource, of if threading is really
> necessary, build Tcl with threads and use Tcl_ThreadQueueEvent.

Let me first state what problem I think this Sleep call solves: it allows a thread switch to occur, by blocking the thread so that the OS knows that it should schedule a different thread. Otherwise, this thread would hold the tcl lock essentially forever, since releasing the tcl lock would be immediately followed by regaining it. Some thread implementations won't allow, in this case, other threads blocked for the tcl lock to run.

In the light of this rationale, can you please explain what Tcl_AsyncMark is and how it would avoid the problem, or what effect calling Tcl_CreateEventSource would have, or how Tcl_ThreadQueueEvent would help?
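
For illustration only (not part of the original message, and not the actual _tkinter code): the rationale above can be sketched in Python as a polling loop that holds a lock while it works and then sleeps briefly with the lock released, so that a thread blocked on the lock gets a chance to run. The names `tcl_lock`, `event_loop` and `worker`, the 1 ms sleep and the iteration cap are all assumptions made for the sketch.

```python
import threading
import time

tcl_lock = threading.Lock()        # stand-in for the Tcl lock
worker_ran = threading.Event()


def event_loop(max_iterations=1000):
    """Poll for 'events' under the lock, yielding between polls."""
    for _ in range(max_iterations):
        with tcl_lock:
            pass                   # one pending event would be handled here
        if worker_ran.is_set():
            break
        # Block briefly with the lock released so the OS can schedule
        # a thread that is waiting for it.
        time.sleep(0.001)


def worker():
    """Another thread that needs the same lock once."""
    with tcl_lock:
        worker_ran.set()


loop_thread = threading.Thread(target=event_loop)
worker_thread = threading.Thread(target=worker)
loop_thread.start()
worker_thread.start()
worker_thread.join(timeout=5)
loop_thread.join(timeout=5)
worker_got_lock = worker_ran.is_set()
```

Whether a loop without the sleep would actually starve the worker depends on the platform's lock and scheduler, which is the point made in the message; the sketch only shows the yielding pattern.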
> It has all the APIs to approach this from several different > angles without have to toss a gratuitous Sleep in there that > does nothing more than have people scratch their head and > wonder why Tkinter appears so slow. It does more than that: it avoids people thinking that their threads have blocked indefinitely, for no good reason. > BTW, I know you were tying into Tk before Tk was properly > thread-safe, but those issues have been addressed (although > it is highly recommended to stick to using Tk in one thread > as things like X aren't guaranteed to be thread-safe). Let's assume thread-safety of X is not a problem (as it isn't in most current installations). Are you then saying that Tk is thread-safe? What is the minimum Tk version that makes this guarantee? Where is this documented? I'm all in favour of getting rid of the Tcl lock. Regards, Martin From guido@python.org Sat Feb 16 16:23:09 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 16 Feb 2002 11:23:09 -0500 Subject: [Python-Dev] sendall patches not in 2.2? In-Reply-To: Your message of "Sat, 16 Feb 2002 10:44:54 +0100." <200202160944.g1G9isc01555@mira.informatik.hu-berlin.de> References: <200202160944.g1G9isc01555@mira.informatik.hu-berlin.de> Message-ID: <200202161623.g1GGN9130315@pcp742651pcs.reston01.va.comcast.net> > I wonder why Anthony's Grand sendall patch never found its way into > Python 2.2, see > > http://aspn.activestate.com/ASPN/Mail/Message/Python-checkins/956422 > http://mail.python.org/pipermail/python-bugs-list/2001-December/009299.html > https://sourceforge.net/tracker/index.php?func=detail&aid=516715&group_id=5470&atid=305470 > > Unless there are any objections, I'll forward this patch to 2.2 (not > sure what to do with imaplib, since that has been taken care of with > an explicit loop in Python meanwhile). An oversight! This should go into 2.2.1 definitely. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Feb 16 16:28:01 2002 From: guido@python.org (Guido van Rossum) Date: Sat, 16 Feb 2002 11:28:01 -0500 Subject: [Python-Dev] PEP needed? Introducing Tcl objects In-Reply-To: Your message of "Sat, 16 Feb 2002 02:01:39 PST." <001b01c1b6d0$ee28f940$ba03a8c0@activestate.ca> References: <001b01c1b6d0$ee28f940$ba03a8c0@activestate.ca> Message-ID: <200202161628.g1GGS1x30329@pcp742651pcs.reston01.va.comcast.net> > > > The Sleep() call is a perf problem. > > > > It certainly is, but it is also necessary to have. > > Why? I suspect if you inverted the control behavior to run the > Tcl event loop as it's designed and trigger signals with > Tcl_AsyncMark, you would have no problem. Alternatively, you > could do Tcl_CreateEventSource, of if threading is really > necessary, build Tcl with threads and use Tcl_ThreadQueueEvent. > It has all the APIs to approach this from several different > angles without have to toss a gratuitous Sleep in there that > does nothing more than have people scratch their head and > wonder why Tkinter appears so slow. Jeff, I really hope you can help us with this. I know it's a twisted mess. Years ago, I asked Ousterhout's help, but he was already too busy to pay attention to a competing language designer. :-( I hope that it's possible to do something better with Tcl/Tk 8.3 that doesn't require the sleep and maintains the existing _tkinter API / semantics. > BTW, I know you were tying into Tk before Tk was properly > thread-safe, but those issues have been addressed (although > it is highly recommended to stick to using Tk in one thread > as things like X aren't guaranteed to be thread-safe). Are they solved in Tcl/Tk 8.3? I'd be happy to require that version. I'm not (yet) happy to require an alpha/beta of 9.0 or whatever the Tcl community is now working at. 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mal@lemburg.com Sat Feb 16 18:29:46 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 16 Feb 2002 19:29:46 +0100
Subject: [Python-Dev] SSL support in _socket
References: <3C692370.D21EF15D@lemburg.com> <3C6A303D.E0DDC16A@lemburg.com> <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net> <3C6A66B3.3C4AE597@lemburg.com> <200202131336.g1DDaiV07604@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C6EA51A.E8F58B0C@lemburg.com>

Guido van Rossum wrote:
>
> > Checking the code it should be easy to do. I'll look
> > into this later this week.
>
> Great!

Done -- wasn't that easy after all, because the ssl object relies on the socket object.

Please review and test. The header file chaos at the top of socketmodule.* looks scary. It works fine on Linux, but I have no idea what the situation is on other platforms.

Side-note: I've added the "inter-module dynamic C API linking via Python trick" from the mx tools to the _socket module. _ssl only uses it to get at the type object, but the support can easily be extended if this should be needed for more C APIs from _socket.

Also note: the non-Unix build process files need to be updated.

> > Funny, BTW, that the source file is named socketmodule.c
> > while the resulting DLL is called _socket... I suppose
> > renaming socketmodule.c to _socket.c would be advisable.
>
> That requires asking the SF sysadmin a favor to move a file, or loses
> all the CVS history. So who cares.

I have left out this step. Perhaps Barry knows a way to rename the socketmodule.* files without losing the history?!
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@comcast.net Sun Feb 17 03:41:53 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 16 Feb 2002 22:41:53 -0500 Subject: [Python-Dev] SSL support in _socket In-Reply-To: <3C6EA51A.E8F58B0C@lemburg.com> Message-ID: [MAL] > Side-note: I've added the "inter-module dynamic C API linking > via Python trick" from the mx tools to the _socket module. _ssl > only uses it to get at the type object, but the support can easily > be extended if this should be needed for more C APIs from > _socket. > > Also note: the non-Unix build process files need to be updated. I don't know what "inter-module dynamic C API linking via Python trick" means, but the Windows build doesn't compile anymore despite that it didn't and doesn't support SSL. I suspect it's because "inter-module" wrt sockets is really "cross-DLL" on Windows, and clever tricks are going to bite hard because of that. It's griping here: static PyTypeObject PySocketSock_Type = { C:\Code\python\Modules\socketmodule.c(1768) : error C2491: 'PySocketSock_Type' : definition of dllimport data not allowed and here: &PySocketSock_Type, C:\Code\python\Modules\socketmodule.c(2650) : error C2099: initializer is not a constant The changes to socketmodule.h pretty much baffle me. Why is the body of the function PySocketModule_ImportModuleAndAPI included in the header file? Why is the body of this function skipped unless PySocket_BUILDING_SOCKET is defined? All in all, this appears to be an extremely confusing way to define a function named PySocketModule_ImportModuleAndAPI in the new _ssl.c alone. So why isn't the function just defined in _ssl.c directly? There appears no reason to put it in the header file, and it's confusing there. 
This shows signs of adapting a complicated framework to a situation too simple to require most of what the framework does. If so, since there is no other use of this framework in Python, and the framework isn't documented in the Python codebase, the framework should be tossed, and something as simple as possible done instead. I can't make more time to sort this out now. It would help if the code were made more transparent (see last paragraph), so it consumed less time to figure out what it's intending to do. In the meantime, the Windows build will remain broken. From andymac@pcug.org.au Sun Feb 17 05:34:16 2002 From: andymac@pcug.org.au (Andrew MacIntyre) Date: Sun, 17 Feb 2002 16:34:16 +1100 (EST) Subject: [Python-Dev] OS/2 EMX port build directory committed Message-ID: I have committed PC/os2emx and its contents. If disaster results, please cc this e-mail address, as I haven't been able to get into my main e-mail account (ISP equipment problems). Andrew I MacIntyre "These thoughts are mine alone ..." Email: andymac@bullseye.apana.org.au (preferred) | Snail: PO Box 370 andymac@pcug.org.au (alternate) | Belconnen ACT 2616 andrew.macintyre@aba.gov.au (work) | Australia From andymac@pcug.org.au Sun Feb 17 05:43:08 2002 From: andymac@pcug.org.au (Andrew MacIntyre) Date: Sun, 17 Feb 2002 16:43:08 +1100 (EST) Subject: [Python-Dev] %#x/%#X format conversion patches for review Message-ID: Following on from discussion of the patch I uploaded containing OS/2 EMX changes to the Python core, I have uploaded patches to Objects/stringobject.c and Objects/unicodeobject.c to simplify the mess of dealing with these format conversions in the face of Python's preferences vs the C standard, C standard violations, and bugs. 
http://sf.net/tracker/?func=detail&aid=450266&group_id=5470&atid=305470 (this is my "Python core" OS/2 EMX patch in the Patch manager) I have assigned the patch to Martin von Loewis; if they should be moved to a separate patch for future reference, please let me know. Andrew I MacIntyre "These thoughts are mine alone ..." Email: andymac@bullseye.apana.org.au (preferred) | Snail: PO Box 370 andymac@pcug.org.au (alternate) | Belconnen ACT 2616 andrew.macintyre@aba.gov.au (work) | Australia From andymac@pcug.org.au Sun Feb 17 05:47:09 2002 From: andymac@pcug.org.au (Andrew MacIntyre) Date: Sun, 17 Feb 2002 16:47:09 +1100 (EST) Subject: [Python-Dev] Re: %#x/%#X format conversion patches for review In-Reply-To: Message-ID: Sorry, the correct patch URL is http://sf.net/tracker/?func=detail&aid=450267&group_id=5470&atid=305470 Andrew I MacIntyre "These thoughts are mine alone ..." Email: andymac@bullseye.apana.org.au (preferred) | Snail: PO Box 370 andymac@pcug.org.au (alternate) | Belconnen ACT 2616 andrew.macintyre@aba.gov.au (work) | Australia On Sun, 17 Feb 2002, Andrew MacIntyre wrote: > Following on from discussion of the patch I uploaded containing OS/2 EMX > changes to the Python core, I have uploaded patches to > Objects/stringobject.c and Objects/unicodeobject.c to simplify the mess of > dealing with these format conversions in the face of Python's preferences > vs the C standard, C standard violations, and bugs. 
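
For reference (not part of the original message): a small sketch of the conversions under discussion as they behave at the Python level. The results noted in the comments describe current Python; the 2002-era C code being patched is not shown here.

```python
value = 255

lower = "%#x" % value    # alternate form: "0x" prefix, lowercase digits
upper = "%#X" % value    # alternate form: "0X" prefix, uppercase digits
plain = "%x" % value     # no prefix without the '#' flag
```

The prefix follows the case of the conversion character, which is the Python-level behaviour the patches aim to guarantee regardless of the platform's C library.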
From tim.one@comcast.net Sun Feb 17 06:06:57 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 17 Feb 2002 01:06:57 -0500 Subject: [Python-Dev] %#x/%#X format conversion patches for review In-Reply-To: Message-ID: [Andrew MacIntyre] > Following on from discussion of the patch I uploaded containing OS/2 EMX > changes to the Python core, I have uploaded patches to > Objects/stringobject.c and Objects/unicodeobject.c to simplify the mess > of dealing with these format conversions in the face of Python's > preferences vs the C standard, C standard violations, and bugs. > > http://sf.net/tracker/?func=detail&aid=450266&group_id=5470&atid=305470 > (this is my "Python core" OS/2 EMX patch in the Patch manager) There are 4 patch files attached to that report, and while I may have missed them, I didn't see any changes to stringobject or unicodeobject in any of them. Which patch contains these changes? Or are these changes in some other patch submission? From tim.one@comcast.net Sun Feb 17 06:11:06 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 17 Feb 2002 01:11:06 -0500 Subject: [Python-Dev] OS/2 EMX port build directory committed In-Reply-To: Message-ID: [Andrew MacIntyre] > I have committed PC/os2emx and its contents. If disaster results, Didn't hurt the Windows build, so there's no disaster from my POV. Thanks! > please cc this e-mail address, Which e-mail address? You listed three addresses below, and none of them are labelled "this" . > as I haven't been able to get into my main e-mail account (ISP equipment > problems). > > Andrew I MacIntyre "These thoughts are mine alone ..." 
> Email: andymac@bullseye.apana.org.au (preferred) | Snail: PO Box 370 > andymac@pcug.org.au (alternate) | > Belconnen ACT 2616 > andrew.macintyre@aba.gov.au (work) | Australia From tim.one@comcast.net Sun Feb 17 06:19:12 2002 From: tim.one@comcast.net (Tim Peters) Date: Sun, 17 Feb 2002 01:19:12 -0500 Subject: [Python-Dev] Re: %#x/%#X format conversion patches for review In-Reply-To: Message-ID: [Andrew MacIntyre] > Sorry, the correct patch URL is > http://sf.net/tracker/?func=detail&aid=450267&group_id=5470&atid=305470 Thanks. The changes to {string,unicode}object.c are decidedly non-scary. Check 'em in if you like. I think we're now officially at the point where our code would be smaller and simpler if we implemented sprintf entirely by ourselves instead of fighting platform sprintf quirks <0.5 wink>. From andymac@pcug.org.au Sun Feb 17 07:32:22 2002 From: andymac@pcug.org.au (Andrew MacIntyre) Date: Sun, 17 Feb 2002 18:32:22 +1100 (EST) Subject: [Python-Dev] OS/2 EMX port build directory committed In-Reply-To: Message-ID: On Sun, 17 Feb 2002, Tim Peters wrote: > [Andrew MacIntyre] > > I have committed PC/os2emx and its contents. If disaster results, > > Didn't hurt the Windows build, so there's no disaster from my POV. Thanks! I like good news! > > please cc this e-mail address, > > Which e-mail address? You listed three addresses below, and none of them > are labelled "this" . "this address" was supposed to imply the "from address" of the message, sorry. I've now closed patch #450265. Andrew I MacIntyre "These thoughts are mine alone ..." Email: andymac@bullseye.apana.org.au (preferred) | Snail: PO Box 370 andymac@pcug.org.au (alternate) | Belconnen ACT 2616 andrew.macintyre@aba.gov.au (work) | Australia From mal@lemburg.com Sun Feb 17 12:38:07 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Sun, 17 Feb 2002 13:38:07 +0100 Subject: [Python-Dev] SSL support in _socket References: Message-ID: <3C6FA42F.928A0062@lemburg.com> Tim Peters wrote: > > [MAL] > > Side-note: I've added the "inter-module dynamic C API linking > > via Python trick" from the mx tools to the _socket module. _ssl > > only uses it to get at the type object, but the support can easily > > be extended if this should be needed for more C APIs from > > _socket. > > > > Also note: the non-Unix build process files need to be updated. > > I don't know what "inter-module dynamic C API linking via Python trick" > means, but the Windows build doesn't compile anymore despite that it didn't > and doesn't support SSL. I suspect it's because "inter-module" wrt sockets > is really "cross-DLL" on Windows, and clever tricks are going to bite hard > because of that. No it's not (and that's the main advantage of the "trick"). Some explanation: The _ssl module needs access to the type object defined in the _socket module. Since cross-DLL linking introduces a lot of problems on many platforms, the "trick" is to wrap the C API of a module in a struct which then gets exported to other modules via a PyCObject. The code in socketmodule.c defines this struct (which currently only contains the type object reference, but could very well also include other C APIs needed by other modules) and exports it as PyCObject via the module dictionary under the name "CAPI". Other modules can now include the socketmodule.h file which defines the needed C APIs to import and set up a static copy of this struct in the importing module. After initialization, the importing module can then access the C APIs from the _socket module by simply referring to the static struct, e.g. /* Load _socket module and its C API; this sets up the global PySocketModule */ if (PySocketModule_ImportModuleAndAPI()) return; ... 
if (!PyArg_ParseTuple(args, "O!|zz:ssl", PySocketModule.Sock_Type, (PyObject*)&Sock, &key_file, &cert_file)) return NULL; (Perhaps I should copy the above explanation into the source files ?!) > It's griping here: > > static PyTypeObject PySocketSock_Type = { > C:\Code\python\Modules\socketmodule.c(1768) : error C2491: > 'PySocketSock_Type' : definition of dllimport data not allowed > > and here: > > &PySocketSock_Type, > C:\Code\python\Modules\socketmodule.c(2650) : error C2099: > initializer is not a constant Ah, you're right, the export of the type object is not needed anymore since this is now done using the PyCObject. Sorry, my bad. > The changes to socketmodule.h pretty much baffle me. Why is the body of the > function PySocketModule_ImportModuleAndAPI included in the header file? Why > is the body of this function skipped unless PySocket_BUILDING_SOCKET is > defined? All in all, this appears to be an extremely confusing way to > define a function named PySocketModule_ImportModuleAndAPI in the new _ssl.c > alone. So why isn't the function just defined in _ssl.c directly? There > appears no reason to put it in the header file, and it's confusing there. The reason for putting the code in the header file is to avoid duplication of code. The import API is needed by all modules wishing to use the C API of the socket module. Currently, only _ssl needs this, but I think it would be a good strategy to extend this technique to other modules as well (esp. the array module would be a good candidate). > This shows signs of adapting a complicated framework to a situation too > simple to require most of what the framework does. If so, since there is no > other use of this framework in Python, and the framework isn't documented in > the Python codebase, the framework should be tossed, and something as simple > as possible done instead. I don't think it's overly complicated. 
It's been in use in mxDateTime and various database modules including mxODBC for many years and I haven't received any complaints about it in the last few years. It would be nice if we could integrate better support for it into the Python core. Then we wouldn't need the header file source code definition anymore. IMHO, it's a very useful way of doing cross-DLL "linking" in a platform independent manner. Note that the whole idea originated from a discussion I had with Jim Fulton some years ago. As I understand, the PyCObject was invented for just this purpose. > I can't make more time to sort this out now. It would help if the code were > made more transparent (see last paragraph), so it consumed less time to > figure out what it's intending to do. In the meantime, the Windows build > will remain broken. As I read the checkins, you've remove the type object export. I am curious why the test_socket still fails on Windows though. Both test_socket and test_socket_ssl work just fine on Linux. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From ping@lfw.org Mon Feb 18 04:27:02 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Sun, 17 Feb 2002 22:27:02 -0600 (CST) Subject: [Python-Dev] Global name lookup schemes Message-ID: Okay, i spent another afternoon drawing silly pictures full of boxes and arrows. I swear, i'm going to be seeing pointers in my dreams tonight. Here are figures representing my current understanding of the various schemes on the table: Jeremy 1: the dlict scheme http://lfw.org/python/jeremy1.gif http://lfw.org/python/jeremy1.tif http://lfw.org/python/jeremy1.ai Jeremy, i think i'm still somewhat unclear -- notice the two question marks in the figure. What kind of animal is the cache? I assumed that the invalidation info lives in an array parallel to the dlict's array. Is this right? 
Guido 1: the original cellptr/objptr scheme http://lfw.org/python/guido1.gif http://lfw.org/python/guido1.tif http://lfw.org/python/guido1.ai Ping 1: guido1 + a tweak to always use two dereferencing steps http://lfw.org/python/ping1.gif http://lfw.org/python/ping1.tif http://lfw.org/python/ping1.ai Tim 1: timdicts and cells with shadow flags http://lfw.org/python/tim1.gif http://lfw.org/python/tim1.tif http://lfw.org/python/tim1.ai GIFs are small versions, TIFs are big versions, AIs are Adobe Illustrator source files. Please examine, send me corrections, discuss, enjoy... :) Still to do: Guido 2: the globals_vector scheme Skip 1: the global-tracking scheme (I don't actually know yet what in this diagram would be different from the way things work today. Statically, Skip's picture is mostly the same; it's the runtime behaviour that's different. Still, it's probably good to have a reference picture of today's data structures anyway.) -- ?!ng From tim.one@comcast.net Mon Feb 18 05:19:26 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 18 Feb 2002 00:19:26 -0500 Subject: [Python-Dev] SSL support in _socket In-Reply-To: <3C6FA42F.928A0062@lemburg.com> Message-ID: [M.-A. Lemburg] > Some explanation: > > The _ssl module needs access to the type object defined in > the _socket module. Since cross-DLL linking introduces a lot of > problems on many platforms, the "trick" is to wrap the > C API of a module in a struct which then gets exported to > other modules via a PyCObject. > > The code in socketmodule.c defines this struct (which currently > only contains the type object reference, but could very > well also include other C APIs needed by other modules) > and exports it as PyCObject via the module dictionary > under the name "CAPI". > > Other modules can now include the socketmodule.h file > which defines the needed C APIs to import and set up > a static copy of this struct in the importing module. 
> > After initialization, the importing module can then > access the C APIs from the _socket module by simply > referring to the static struct, e.g. > > /* Load _socket module and its C API; this sets up the global > PySocketModule */ > if (PySocketModule_ImportModuleAndAPI()) > return; > > ... > if (!PyArg_ParseTuple(args, "O!|zz:ssl", > > PySocketModule.Sock_Type, > > (PyObject*)&Sock, > &key_file, &cert_file)) > return NULL; > > (Perhaps I should copy the above explanation into the source > files ?!) I don't know. I really don't have time to try and understand this, but I can tell you I spent a lot of time staring at the code just trying to fix the part that didn't work, and it was slow and painful going. Without deep understanding, I can only repeat that all this machinery *seems* to be overkill in this specific case; and since there is no other case in the Python core, a mass of overly general machinery in the Python core seems out of place. > ... > Ah, you're right, the export of the type object is not > needed anymore since this is now done using the PyCObject. > Sorry, my bad. No problem -- that part turned out to be easy, once I found it. > ... > The reason for putting the code in the header file is > to avoid duplication of code. The import API is needed by > all modules wishing to use the C API of the socket module. But in this specific case you confirm that there is only one client: > Currently, only _ssl needs this, but I think it would be > a good strategy to extend this technique to other modules > as well (esp. the array module would be a good candidate). Possibly, but it's overly elaborate in this specific case. If it needed to be hypergeneral (and it doesn't here), it seems it would be better to make the code template *more* general, so that every importer of every module could include a common (e.g.) PyImportModuleAndApi.h header file one or more times, after setting a pile of #defines to specialize it to the module at hand. 
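
For illustration only (not part of the original message): a loose Python analogy of the mechanism being debated, in which a provider module publishes a table of shared objects under the name "CAPI" and a client fetches that table by name at start-up instead of linking against the provider's symbols. The real code does this in C with a struct wrapped in a PyCObject; the module name `demo_socket`, the `_CAPI` class and the helper `import_module_and_api` are hypothetical.

```python
import importlib
import sys
import types

# --- Exporting side: hypothetical stand-in for the _socket extension ---
provider = types.ModuleType("demo_socket")


class Sock(object):
    """Stand-in for the socket type object."""


class _CAPI(object):
    """Table of objects the provider shares with other modules."""
    Sock_Type = Sock


provider.CAPI = _CAPI()
sys.modules[provider.__name__] = provider   # make it importable by name


# --- Importing side: hypothetical stand-in for _ssl ---
def import_module_and_api(module_name):
    """Import the provider by name and return its published API table."""
    module = importlib.import_module(module_name)
    return module.CAPI


api = import_module_and_api("demo_socket")
is_sock = isinstance(Sock(), api.Sock_Type)
```

The client depends only on the module name and the "CAPI" attribute, which is what makes the approach independent of platform-specific cross-DLL linking.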
> I don't think it's overly complicated.

You've confirmed that it is in this specific case, and that's the only case there is in the codebase all the Python developers work with.

...
> As I read the checkins, you've remove the type object export.

Well, I removed the DL_IMPORT. The problem was more that it wasn't exported, and now it doesn't need to be imported or exported.

> I am curious why the test_socket still fails on Windows
> though. Both test_socket and test_socket_ssl work just fine on
> Linux.

test_socket was a red herring. Merely trying to import socket died with NameError on Windows. That got fixed too, and the non-SSL socket tests on Windows worked fine then.

From john_coppola_r_s@yahoo.com Mon Feb 18 08:41:06 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Mon, 18 Feb 2002 00:41:06 -0800 (PST)
Subject: [Python-Dev] new property factory arguments
Message-ID: <20020218084106.26444.qmail@web11807.mail.yahoo.com>

Hello python developers. After discussions with Fred about defining how property objects are created, I decided to give it a whirl myself. After about a minute of piddling, I came up with something I thought would be a hack but is quite interesting. Ultimately, this is what I came up with.

    class Foo(object):
        def __get__(self, container):
            print "this is get"
            print "self:", type(self)
            print "container:", type(container)
            print

        def __set__(self, container, value):
            print "this is set"
            print "self:", type(self)
            print "container:", type(container)
            print "value:", value
            print

        def __del__(self, container):
            print "this is del"
            print "self:", type(self)
            print "container:", type(container)
            print

    class Spam(object):
        x=property(Foo())

I feel this has the benefit of encapsulating the x property from the Spam object. If coupling is needed, then access to Spam can be obtained via container. This was the interesting hack I was talking about.
I first tried it without container and got an error, but then I decided to see what that second argument was like hummm, interesting and I liked it. What's really cool is that Foo can be used by a completely separate class. Perhaps Foo is a singleton for a DB connection. A single connection could be created in __new__, and other attribute details created in __init__. So class Spam is decoupled from what is going on in Foo. Whereas the former syntax, this was not possible. I've attached a descrobject.diff file to this email as well as testprop.py. (I've never tried sending an attachment to python-dev, I hope it works.) Enjoy, John Coppola __________________________________________________ Do You Yahoo!? Yahoo! Sports - Coverage of the 2002 Olympic Games http://sports.yahoo.com --0-1827853335-1014021666=:25719 Content-Type: text/plain; name="descrobject.diff" Content-Description: descrobject.diff Content-Disposition: inline; filename="descrobject.diff" *** PythonOrig/Python-2.2/Objects/descrobject.c Sat Dec 15 00:00:30 2001 --- PythonDev/Python-2.2/Objects/descrobject.c Sun Feb 17 21:52:55 2002 *************** *** 1003,1024 **** } static int ! property_init(PyObject *self, PyObject *args, PyObject *kwds) { ! PyObject *get = NULL, *set = NULL, *del = NULL, *doc = NULL; ! static char *kwlist[] = {"fget", "fset", "fdel", "doc", 0}; ! propertyobject *gs = (propertyobject *)self; ! ! if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOOO:property", ! kwlist, &get, &set, &del, &doc)) ! return -1; ! ! if (get == Py_None) ! get = NULL; ! if (set == Py_None) ! set = NULL; ! if (del == Py_None) ! del = NULL; Py_XINCREF(get); Py_XINCREF(set); --- 1003,1023 ---- } static int ! property_init(PyObject *self, PyObject *args, PyObject *kw) { ! PyObject *get=NULL, *set=NULL, *del=NULL, *doc=NULL, *arg=NULL; ! static char *kwlist[] = {"object", 0}; ! propertyobject *gs = (propertyobject *)self; ! if (!PyArg_ParseTupleAndKeywords(args,kw,"|O:property",kwlist,&arg)) ! return -1; ! ! 
get = PyObject_GetAttrString(arg,"__get__"); ! set = PyObject_GetAttrString(arg,"__set__"); ! del = PyObject_GetAttrString(arg,"__del__"); ! doc = PyObject_GetAttrString(arg,"__doc__"); ! if (get == Py_None) get = NULL; ! if (set == Py_None) set = NULL; ! if (del == Py_None) del = NULL; Py_XINCREF(get); Py_XINCREF(set); *************** *** 1034,1049 **** } static char property_doc[] = ! "property(fget=None, fset=None, fdel=None, doc=None) -> property attribute\n" "\n" ! "fget is a function to be used for getting an attribute value, and likewise\n" ! "fset is a function for setting, and fdel a function for del'ing, an\n" ! "attribute. Typical use is to define a managed attribute x:\n" "class C(object):\n" ! " def getx(self): return self.__x\n" ! " def setx(self, value): self.__x = value\n" ! " def delx(self): del self.__x\n" ! " x = property(getx, setx, delx, \"I'm the 'x' property.\")"; static int property_traverse(PyObject *self, visitproc visit, void *arg) --- 1033,1050 ---- } static char property_doc[] = ! "property(object) -> property attribute\n" "\n" ! "__get__ is a function to be used for getting an attribute value, and\n" ! "likewise __set__ is a function for setting, and __del__ a function for\n" ! "del'ing, an attribute. Typical use is to define a managed attribute x:\n" "class C(object):\n" ! " def __get__(self, container): return self.__x\n" ! " def __set__(self, container, value): self.__x = value\n" ! " def __del__(self, container): del self.__x\n" ! "\n" ! "class D(object):\n" ! 
" x = property(object=C(), \"I'm the 'x' property.\")"; static int property_traverse(PyObject *self, visitproc visit, void *arg) --0-1827853335-1014021666=:25719 Content-Type: text/plain; name="TESTPROP.PY" Content-Description: TESTPROP.PY Content-Disposition: inline; filename="TESTPROP.PY" __doc__="""\ class Foo(object): def __get__(self, container): print "this is get" print "self:", type(self) print "container:", type(container) print def __set__(self, container, value): print "this is set" print "self:", type(self) print "container:", type(container) print "value:", value print def __del__(self, container): print "this is del" print "self:", type(self) print "container:", type(container) print __doc__ = "this is doc" """ exec __doc__ class Spam(object): x=property(Foo()) print __doc__ a=Spam() a.x=5 print "getting:", a.x del a.x --0-1827853335-1014021666=:25719-- From jason@jorendorff.com Mon Feb 18 09:24:47 2002 From: jason@jorendorff.com (Jason Orendorff) Date: Mon, 18 Feb 2002 03:24:47 -0600 Subject: [Python-Dev] new property factory arguments In-Reply-To: <20020218084106.26444.qmail@web11807.mail.yahoo.com> Message-ID: With minor changes, this works already. class Foo(property): def __get__(self, container, _type=None): print "this is get" print "self:", self print "container:", container print def __set__(self, container, value): print "this is set" print "self:", self print "container:", container print "value:", value print def __delete__(self, container): print "this is del" print "self:", type(self) print "container:", type(container) print class Spam(object): x = Foo() ## Jason Orendorff http://www.jorendorff.com/ From mal@lemburg.com Mon Feb 18 11:04:29 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 18 Feb 2002 12:04:29 +0100 Subject: [Python-Dev] SSL support in _socket References: Message-ID: <3C70DFBD.6D0DB861@lemburg.com> Tim Peters wrote: > > [M.-A. 
Lemburg] > > Some explanation: > > > > The _ssl module needs access to the type object defined in > > the _socket module. Since cross-DLL linking introduces a lot of > > problems on many platforms, the "trick" is to wrap the > > C API of a module in a struct which then gets exported to > > other modules via a PyCObject. > > > > The code in socketmodule.c defines this struct (which currently > > only contains the type object reference, but could very > > well also include other C APIs needed by other modules) > > and exports it as PyCObject via the module dictionary > > under the name "CAPI". > > > > Other modules can now include the socketmodule.h file > > which defines the needed C APIs to import and set up > > a static copy of this struct in the importing module. > > > > After initialization, the importing module can then > > access the C APIs from the _socket module by simply > > referring to the static struct, e.g. > > > > /* Load _socket module and its C API; this sets up the global > > PySocketModule */ > > if (PySocketModule_ImportModuleAndAPI()) > > return; > > > > ... > > if (!PyArg_ParseTuple(args, "O!|zz:ssl", > > > > PySocketModule.Sock_Type, > > > > (PyObject*)&Sock, > > &key_file, &cert_file)) > > return NULL; > > > > (Perhaps I should copy the above explanation into the source > > files ?!) > > I don't know. I really don't have time to try and understand this, but I > can tell you I spent a lot of time staring at the code just trying to fix > the part that didn't work, and it was slow and painful going. Without deep > understanding, I can only repeat that all this machinery *seems* to be > overkill in this specific case; and since there is no other case in the > Python core, a mass of overly general machinery in the Python core seems out > of place. The idea of using the above framework was to get the discussion started and then perhaps extend this kind of support to other modules as well, e.g. 
to be able to create and access types from other modules at C level. Note that the framework only seem to be overkill at the moment (since it only exports one symbol). As soon as you add more APIs to the API struct, things look different -- e.g. a socket constructor at C level would be nice to have. > > ... > > Ah, you're right, the export of the type object is not > > needed anymore since this is now done using the PyCObject. > > Sorry, my bad. > > No problem -- that part turned out to be easy, once I found it. You should have just thrown the error message in my Inbox. > > ... > > The reason for putting the code in the header file is > > to avoid duplication of code. The import API is needed by > > all modules wishing to use the C API of the socket module. > > But in this specific case you confirm that there is only one client: > > > Currently, only _ssl needs this, but I think it would be > > a good strategy to extend this technique to other modules > > as well (esp. the array module would be a good candidate). > > Possibly, but it's overly elaborate in this specific case. If it needed to > be hypergeneral (and it doesn't here), it seems it would be better to make > the code template *more* general, so that every importer of every module > could include a common (e.g.) PyImportModuleAndApi.h header file one or more > times, after setting a pile of #defines to specialize it to the module at > hand. Right. > > I don't think it's overly complicated. > > You've confirmed that it is in this specific case, and that's the only case > there is in the codebase all the Python developers work with. Yeah, well, ok :-) You have to get the ball rolling somehow ;-) > > I am curious why the test_socket still fails on Windows > > though. Both test_socket and test_socket_ssl work just fine on > > Linux. > > test_socket was a red herring. Merely trying to import socket died with > NameError on Windows. That got fixed too, and the non-SLL socket tests on > Windows worked fine then. 
Thanks. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Mon Feb 18 11:17:17 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 18 Feb 2002 12:17:17 +0100 Subject: [Python-Dev] new property factory arguments References: <20020218084106.26444.qmail@web11807.mail.yahoo.com> Message-ID: <3C70E2BD.3E72C6CA@lemburg.com> john coppola wrote: > > Hello python developers. After discussions with Fred > about defining how property objects are created, I > decided to give it a whirl myself. After about minute > of piddling, I came up with something I thought would > be a hack but is quite interesting. > ... > ! property_init(PyObject *self, PyObject *args, PyObject *kwds) > { > ! PyObject *get = NULL, *set = NULL, *del = NULL, *doc = NULL; > ! static char *kwlist[] = {"fget", "fset", "fdel", "doc", 0}; > ! propertyobject *gs = (propertyobject *)self; > ! > ! if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOOO:property", > ! kwlist, &get, &set, &del, &doc)) > ! return -1; > ... > --- 1003,1023 ---- > } > > static int > ! property_init(PyObject *self, PyObject *args, PyObject *kw) > { > ! PyObject *get=NULL, *set=NULL, *del=NULL, *doc=NULL, *arg=NULL; > ! static char *kwlist[] = {"object", 0}; > ! propertyobject *gs = (propertyobject *)self; > ! if (!PyArg_ParseTupleAndKeywords(args,kw,"|O:property",kwlist,&arg)) > ! return -1; > ! > ! get = PyObject_GetAttrString(arg,"__get__"); > ! set = PyObject_GetAttrString(arg,"__set__"); > ! del = PyObject_GetAttrString(arg,"__del__"); > ! doc = PyObject_GetAttrString(arg,"__doc__"); Wouldn't this break the documented API ? If so, I'd suggest to provide a second constructor which exposes the new signature instead. Should be easy to do in Python... 
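A rough sketch of what such a second constructor could look like in pure Python, leaving the documented property(fget, fset, fdel, doc) signature untouched. This is only an illustration under stated assumptions: it is written for present-day Python 3, the helper name property_from_object is hypothetical, and the handler is assumed to expose plain get/set/delete methods (not __get__/__set__/__del__ as in the patch, since __del__ is already the finalizer hook).

```python
def property_from_object(handler):
    """Build an ordinary property from a handler object (hypothetical helper).

    Assumption: the handler exposes plain methods named get, set and delete,
    each taking the owning instance ("container") as its first argument.
    A missing method leaves that operation unsupported, as with property().
    """
    return property(
        getattr(handler, "get", None),
        getattr(handler, "set", None),
        getattr(handler, "delete", None),
        getattr(handler, "__doc__", None),
    )


class Handler:
    """I'm the 'x' property."""

    def __init__(self):
        self.values = {}  # per-container storage, keyed by id() for brevity

    def get(self, container):
        return self.values[id(container)]

    def set(self, container, value):
        self.values[id(container)] = value

    def delete(self, container):
        del self.values[id(container)]


class Spam:
    x = property_from_object(Handler())


spam = Spam()
spam.x = 5
print(spam.x)  # prints 5
del spam.x
```

Because the helper only delegates to the existing property type, code that relies on the four-argument form keeps working.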
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Mon Feb 18 11:25:26 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 18 Feb 2002 12:25:26 +0100 Subject: [Python-Dev] Global name lookup schemes References: Message-ID: <3C70E4A6.57CD7E1E@lemburg.com> [Very nice pictures] Way cool, Ping ! Does AI provide tools for simplifying these kind of diagrams or did you do it all by hand ? Perhaps Guido ought to add these to the PEP as external reference ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From Jack.Jansen@oratrix.com Mon Feb 18 11:35:40 2002 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Mon, 18 Feb 2002 12:35:40 +0100 Subject: [Python-Dev] SSL support in _socket In-Reply-To: Message-ID: On Monday, February 18, 2002, at 06:19 , Tim Peters wrote: > I don't know. I really don't have time to try and understand this, > but I > can tell you I spent a lot of time staring at the code just trying to > fix > the part that didn't work, and it was slow and painful going. Without > deep > understanding, I can only repeat that all this machinery *seems* to be > overkill in this specific case; and since there is no other case in the > Python core, a mass of overly general machinery in the Python core > seems out > of place. Well... The MacOS toolbox modules have a similar requirement (but currently implemented in a different way, see pymactoolboxglue.c if you're interested in the gory details) and various extension packages (such as Numeric) also have their own implementation of something similar. And there's packages like VTK which currently do hard cross-dll linking which could benefit from such a scheme. 
Maybe someone should try and come up with a list of requirements for inter-extension-module communication and PEP it? -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From mal@lemburg.com Mon Feb 18 11:53:35 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 18 Feb 2002 12:53:35 +0100 Subject: [Python-Dev] SSL support in _socket References: Message-ID: <3C70EB3F.249B55E0@lemburg.com> Jack Jansen wrote: > > On Monday, February 18, 2002, at 06:19 , Tim Peters wrote: > > I don't know. I really don't have time to try and understand this, > > but I > > can tell you I spent a lot of time staring at the code just trying to > > fix > > the part that didn't work, and it was slow and painful going. Without > > deep > > understanding, I can only repeat that all this machinery *seems* to be > > overkill in this specific case; and since there is no other case in the > > Python core, a mass of overly general machinery in the Python core > > seems out > > of place. > > Well... The MacOS toolbox modules have a similar requirement (but > currently implemented in a different way, see pymactoolboxglue.c if > you're interested in the gory details) and various extension packages > (such as Numeric) also have their own implementation of something > similar. > > And there's packages like VTK which currently do hard cross-dll linking > which could benefit from such a scheme. > > Maybe someone should try and come up with a list of requirements for > inter-extension-module communication and PEP it? Good idea. I can have a go at this next weekend. 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From ping@lfw.org Mon Feb 18 12:15:09 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Mon, 18 Feb 2002 06:15:09 -0600 (CST) Subject: [Python-Dev] Global name lookup schemes In-Reply-To: <3C70E4A6.57CD7E1E@lemburg.com> Message-ID: On Mon, 18 Feb 2002, M.-A. Lemburg wrote: > Way cool, Ping ! Does AI provide tools for simplifying these > kind of diagrams or did you do it all by hand ? I spent all day on it. Illustrator was not much help, sadly. It's so lame that it can't even keep arrowheads stuck to arrows, and its grid-snapping behaviour is mysterious and unpredictable. Unfortunately it's the only tool i have that's remotely good enough for the job (works with my Wacom tablet, fast navigation, multiple undo). > Perhaps Guido ought to add these to the PEP as external > reference ?! I would like him to, once we have made sure they are accurate. -- ?!ng From mal@lemburg.com Mon Feb 18 13:49:52 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 18 Feb 2002 14:49:52 +0100 Subject: [Python-Dev] Global name lookup schemes References: Message-ID: <3C710680.A06C59D@lemburg.com> Ka-Ping Yee wrote: > > On Mon, 18 Feb 2002, M.-A. Lemburg wrote: > > Way cool, Ping ! Does AI provide tools for simplifying these > > kind of diagrams or did you do it all by hand ? > > I spent all day on it. Illustrator was not much help, sadly. Ouch... and I thought someone has finally come up with a great tool for doing technical diagrams. Oh well; I'll stick with Corel Draw then. > It's so lame that it can't even keep arrowheads stuck to arrows, > and its grid-snapping behaviour is mysterious and unpredictable. > Unfortunately it's the only tool i have that's remotely good > enough for the job (works with my Wacom tablet, fast navigation, > multiple undo). 
Hmm, you sure did a great job on the diagrams given this environment.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/

From john_coppola_r_s@yahoo.com Mon Feb 18 14:20:28 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Mon, 18 Feb 2002 06:20:28 -0800 (PST)
Subject: [Python-Dev] new property factory arguments
In-Reply-To:
Message-ID: <20020218142028.9929.qmail@web11806.mail.yahoo.com>

--- Jason Orendorff wrote:
> With minor changes, this works already.
> class Foo(property):
>     def __get__(self, container, _type=None):
>         print "this is get"
>         print "self:", self
. . .

I didn't subclass from property?  I do believe that with my example, any
object, new or old, could be used as a property.  And by looking at the
code, property_init clearly did not include GetAttr calls for __get__,
__set__, __del__.  In fact, there is no reason to include __delete__; why
not use __del__ instead?

If you send any ole Python class instance to property, your code fails.
That's the need for the change.

Without my patch...

class Bar(object):  # <== object!
    def __get__(self,container,tp=None): print "get"
    def __set__(self,container,value): print "set"
    def __delete__(self,container): print "del"

>>> class Foo(object):
        x=property(Bar())  #performs coercion
>>> a=Foo()
>>> a.x
Fails!

With my patch, it works.  In fact, I feel very strongly that the old syntax
should be removed.  Better now than later.
property(fget,fset,fdel,fdoc) does not make much sense in the new
object-oriented world of Python.

Sincerely,
John Coppola

__________________________________________________
Do You Yahoo!?
Yahoo!
Sports - Coverage of the 2002 Olympic Games http://sports.yahoo.com From gward@python.net Mon Feb 18 15:18:09 2002 From: gward@python.net (Greg Ward) Date: Mon, 18 Feb 2002 10:18:09 -0500 Subject: [Python-Dev] PEP 282: A Logging System -- comments please In-Reply-To: <200202160247.g1G2lkJ5029529@email.nist.gov> References: <20020215164333.A31903@ActiveState.com> <200202160247.g1G2lkJ5029529@email.nist.gov> Message-ID: <20020218151809.GA1335@gerg.ca> On 15 February 2002, Michael McLay said: > I scanned the PEP and didn't find a reference to the logging package > supporting logging over a network. I think the right way to handle that would be to pass a file-like object to the logging framework, which it then write()'s to. That should work just fine for stream (TCP) sockets; I *think* it would work for datagram (UDP) sockets too, but I'd want to test it first. Greg -- Greg Ward - Unix weenie gward@python.net http://starship.python.net/~gward/ Eschew obfuscation! From Jack.Jansen@oratrix.com Mon Feb 18 15:38:20 2002 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Mon, 18 Feb 2002 16:38:20 +0100 Subject: [Python-Dev] Global name lookup schemes In-Reply-To: <3C710680.A06C59D@lemburg.com> Message-ID: <88F73897-2485-11D6-B2BF-0030655234CE@oratrix.com> On Monday, February 18, 2002, at 02:49 , M.-A. Lemburg wrote: > Ka-Ping Yee wrote: >> >> On Mon, 18 Feb 2002, M.-A. Lemburg wrote: >>> Way cool, Ping ! Does AI provide tools for simplifying these >>> kind of diagrams or did you do it all by hand ? >> >> I spent all day on it. Illustrator was not much help, sadly. > > Ouch... and I thought someone has finally come up with a great > tool for doing technical diagrams. Oh well; I'll stick with Corel > Draw then. I've heard very good things of OmniGraffle. Never used it myself, but their web browser OmniWeb absolutely rooooooooooolz! Check out www.omnigroup.com. MacOSX only, of course. 
-- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From fdrake@zope.com Mon Feb 18 15:51:14 2002 From: fdrake@zope.com (Fred Drake) Date: Mon, 18 Feb 2002 10:51:14 -0500 Subject: [Python-Dev] PEP 282: A Logging System -- comments please In-Reply-To: <20020218151809.GA1335@gerg.ca> Message-ID: On 15 February 2002, Michael McLay said: > I scanned the PEP and didn't find a reference to the logging package > supporting logging over a network. On Mon, 18 Feb 2002 10:18:09 -0500 Greg Ward wrote: > I think the right way to handle that would be to pass a > file-like object > to the logging framework, which it then write()'s to. It seems to me that it should be trivial to map from the logging API to syslog; it already provides for logging to remote systems as well as filtering. I'm sure this is already hashed out in the PEP though, so I should be reading that instead of commenting here. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From jacobs@penguin.theopalgroup.com Mon Feb 18 16:29:14 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Mon, 18 Feb 2002 11:29:14 -0500 (EST) Subject: [Python-Dev] Meta-reflections Message-ID: Hello all, I've been meta-reflecting a lot lately: reflecting on reflection. My recent post on __slots__ not being picklable (and the resounding lack of response to it) inspired me to try my hand at channeling Guido and reverse- engineer some of the design decisions that went into the new-style class system. Unfortunately, the more I dug into the code, the more philosophical my questions became. So, I've written up some questions that help lay bare some of basic design questions that I've been asking myself and that you should be aware of. While there are several subtle issues I could raise, I do want some feedback on some simple and fundamental ones first. 
Please don't disqualify yourself from commenting because you haven't read the code or used the new features yet. I've written my examples assuming only a basic and cursor understanding of the new Python 2.2 features. [In this discussion I am only going to talk about native Python classes, not C-extension or native Python types (e.g., ints, lists, tuples, strings, cStringIO, etc.)] 1) Should class instances explicitly/directly know all of their attributes? Before Python 2.2, all object instances contained a __dict__ attribute that mapped attribute names to their values. This made pickling and some other reflection tasks fairly easy. e.g.: class Foo: def __init__(self): self.a = 1 self.b = 2 class Bar(Foo): def __init__(self): Foo.__init__(self) self.c = 3 bar = Bar() print bar.__dict__ > {'a': 1, 'c': 3, 'b': 2} I am aware that there are situations where this simple case does not hold (e.g., when implementing __setattr__ or __getattr__), but let's ignore those for now. Rather, I will concentrate on how this classical Python idiom interacts with the new slots mechanism. Here is the above example using slots: e.g.: class Foo(object): __slots__ = ['a','b'] def __init__(self): self.a = 1 self.b = 2 class Bar(Foo): __slots__ = ['c'] def __init__(self): Foo.__init__(self) self.c = 3 bar = Bar() print bar.__dict__ > AttributeError: 'Bar' object has no attribute '__dict__' We can see that the class instance 'bar' has no __dict__ attribute. This is because the slots mechanism allocates space for attribute storage directly inside the object, and thus does not use (or need) a per-object instance dictionary to store attributes. Of course, it is possible to request that a per-instance dictionary by inheriting from a new-style class that does not list any slots. e.g. 
   continuing from above:

       class Baz(Bar):
         def __init__(self):
           Bar.__init__(self)
           self.d = 4
           self.e = 5

       baz = Baz()
       print baz.__dict__
       > {'e': 5, 'd': 4}

   We have now created a class that has __dict__, but it only contains the
   attributes not stored in slots!  So, should class instances explicitly
   know their attributes?  Or more precisely, should class instances
   always have a __dict__ attribute that contains their attributes?  Don't
   worry, this does not mean that we cannot also have slots, though it
   does have some other implications.  Keep reading...


2) Should attribute access follow the same resolution order rules as
   methods?

       class Foo(object):
         __slots__ = ['a']
         def __init__(self):
           self.a = 1

       class Bar(Foo):
         __slots__ = ('a',)
         def __init__(self):
           Foo.__init__(self)
           self.a = 2

       bar = Bar()
       print bar.a
       > 2
       print super(Bar,bar).a     # this doesn't actually work
       > 2 or 1?

   Don't worry -- this isn't a proposal and no, this doesn't actually work.
   However, the current implementation only narrowly escapes this trap:

       print bar.__class__.a.__get__(bar)
       > 2
       print bar.__class__.__base__.a.__get__(bar)
       > AttributeError: a

   Ok, let me explain what just happened.  Slots are implemented via the
   new descriptor interface.  In short, descriptor objects are properties
   and support __get__ and __set__ methods.  The slot descriptors are told
   the offset within an object instance where the PyObject* lives, and
   proxy operations for it.  So getting and setting slots involves:

       # print bar.a
       a_descr = bar.__class__.a
       print a_descr.__get__(bar)

       # bar.a = 1
       a_descr = bar.__class__.a
       a_descr.__set__(bar, 1)

   So, above we get an attribute error when trying to access the 'a' slot
   from Bar since it was never initialized.
   However, with a little ugliness you can do the following:

       # Get the descriptors for Foo.a and Bar.a
       a_foo_descr = bar.__class__.__base__.a
       a_bar_descr = bar.__class__.a
       a_foo_descr.__set__(bar,1)
       a_bar_descr.__set__(bar,2)

       print bar.a
       > 2
       print a_foo_descr.__get__(bar)
       > 1
       print a_bar_descr.__get__(bar)
       > 2

   In other words, the namespace for slots is not really flat, although
   there is no simple way to access these hidden attributes since method
   resolution order rules are not invoked by default.


3) Should __slots__ be immutable?

   The __slots__ attribute of a new-style class lists all of the slots
   defined by that class.  It is represented as whatever sequence type was
   given when the class was declared:

       print Foo.__slots__
       > ['a']
       print Bar.__slots__
       > ('a',)

   This allows us to do things like:

       Foo.__slots__.append('b')
       foo = Foo()
       foo.b = 42
       > AttributeError: 'Foo' object has no attribute 'b'

   So modifying the slots does not do what one may expect.  This is because
   slot descriptors and the space for slots are only allocated when the
   classes are created (i.e., when they are inherited from 'object', or
   from an object that descends from 'object').


4) Should __slots__ be flat?

   Bar.__slots__ only lists the slots specifically requested in Bar, even
   though it inherits from Foo, which has its own slots.  Which would be
   the preferable behavior?

       class Foo(object):
         __slots__ = ('a','b')

       class Bar(Foo):
         __slots__ = ('c','d')

       print Bar.__slots__
       > ('c','d')            # current behavior
       or
       > ('a','b','c','d')    # alternate behavior

   Clearly, this issue goes back to the ideas addressed in question 1.  If
   slot descriptors are not stored in a per-instance dictionary, then the
   assumptions on how to do object reflection must change.
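To make the "alternate behavior" in question 4 concrete, here is a minimal sketch of a flattened slot listing computed by walking the class hierarchy. It assumes present-day Python 3 rather than 2.2, and the helper name all_slot_names is hypothetical. It illustrates the idea only and does not describe what the interpreter does.

```python
def all_slot_names(cls):
    """Return slot names declared anywhere in cls's MRO, base classes first.

    Hypothetical helper: __slots__ itself stays per-class; this only
    computes the "flat" view discussed in question 4.
    """
    names = []
    for klass in reversed(cls.__mro__):
        # Look in the class's own namespace so inherited __slots__ are not
        # counted twice.
        slots = klass.__dict__.get("__slots__", ())
        if isinstance(slots, str):  # a bare string declares a single slot
            slots = (slots,)
        for name in slots:
            if name not in names and name not in ("__dict__", "__weakref__"):
                names.append(name)
    return names


class Foo(object):
    __slots__ = ('a', 'b')


class Bar(Foo):
    __slots__ = ('c', 'd')


print(Bar.__slots__)        # the per-class view
print(all_slot_names(Bar))  # the flattened view
```

A name declared in both a base class and a subclass appears once in this listing, even though the shadowed base-class slot still occupies its own storage, as shown in question 2.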
However, which version of the following code do you prefer to print all attributes of a given object: Old style or if descriptors are stored in obj.__dict__: if hasattr(obj,'__dict__'): print ''.join([ '%s=%s' % nameval for nameval in obj.__dict__ ]) Currently in Python 2.2 (and still not quite correct): def print_slot_attrs(obj,cls=None): if not cls: cls = obj.__class__ for name,obj in cls.__dict__.items() if str(type(obj)) == "": if hasattr(obj, name): print "%s=%s" % (name,getattr(obj, name)) for base in cls.__bases__: print_slot_attrs(obj,base) if hasattr(obj,'__dict__'): print [ '%s=%s' % nameval for nameval in obj.__dict__ ] print_slot_attrs(obj) Flat and immutable slot namespace: a = [ '%s=%s' % nameval for nameval in obj.__dict__ ] a += [ '%s=%s' % (name,val) for name,val in obj.__slots__ \ if hasattr(obj, name) ] print ''.join(a) So, which one of these do you want to support or explain to a new user? Thanks, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From john_coppola_r_s@yahoo.com Mon Feb 18 16:46:55 2002 From: john_coppola_r_s@yahoo.com (john coppola) Date: Mon, 18 Feb 2002 08:46:55 -0800 (PST) Subject: [Python-Dev] Meta-reflections In-Reply-To: Message-ID: <20020218164655.38970.qmail@web11806.mail.yahoo.com> I haven't even finished reading this yet. This is good stuff! --- Kevin Jacobs wrote: > Hello all, > > I've been meta-reflecting a lot lately: reflecting > on reflection. > > My recent post on __slots__ not being picklable (and > the resounding lack of > response to it) inspired me to try my hand at > channeling Guido and reverse- > engineer some of the design decisions that went into > the new-style class > system. Unfortunately, the more I dug into the > code, the more philosophical > my questions became. 
So, I've written up some > questions that help lay bare > some of basic design questions that I've been asking > myself and that you > should be aware of. > > While there are several subtle issues I could raise, > I do want some feedback > on some simple and fundamental ones first. Please > don't disqualify yourself > from commenting because you haven't read the code or > used the new features > yet. I've written my examples assuming only a basic > and cursor > understanding of the new Python 2.2 features. > > [In this discussion I am only going to talk about > native Python classes, > not C-extension or native Python types (e.g., > ints, lists, tuples, > strings, cStringIO, etc.)] > > 1) Should class instances explicitly/directly know > all of their attributes? > > Before Python 2.2, all object instances > contained a __dict__ attribute > that mapped attribute names to their values. > This made pickling and > some other reflection tasks fairly easy. > > e.g.: > > class Foo: > def __init__(self): > self.a = 1 > self.b = 2 > > class Bar(Foo): > def __init__(self): > Foo.__init__(self) > self.c = 3 > > bar = Bar() > print bar.__dict__ > > {'a': 1, 'c': 3, 'b': 2} > > I am aware that there are situations where this > simple case does not > hold (e.g., when implementing __setattr__ or > __getattr__), but let's > ignore those for now. Rather, I will > concentrate on how this classical > Python idiom interacts with the new slots > mechanism. Here is the above > example using slots: > > e.g.: > > class Foo(object): > __slots__ = ['a','b'] > def __init__(self): > self.a = 1 > self.b = 2 > > class Bar(Foo): > __slots__ = ['c'] > def __init__(self): > Foo.__init__(self) > self.c = 3 > > bar = Bar() > print bar.__dict__ > > AttributeError: 'Bar' object has no > attribute '__dict__' > > We can see that the class instance 'bar' has no > __dict__ attribute. 
> This is because the slots mechanism allocates > space for attribute > storage directly inside the object, and thus > does not use (or need) a > per-object instance dictionary to store > attributes. Of course, it is > possible to request that a per-instance > dictionary by inheriting from a > new-style class that does not list any slots. > e.g. continuing from > above: > > class Baz(Bar): > def __init__(self): > Bar.__init__(self) > self.d = 4 > self.e = 5 > > baz = Baz() > print baz.__dict__ > > {'e': 5, 'd': 4} > > We have now created a class that has __dict__, > but it only contains the > attributes not stored in slots! So, should > class instances explicitly > know their attributes? Or more precisely, > should class instances > always have a __dict__ attribute that contains > their attributes? Don't > worry, this does not mean that we cannot also > have slots, though it > does have some other implications. Keep > reading... > > > 2) Should attribute access follow the same > resolution order rules as > methods? > > class Foo(object): > __slots__ = ['a'] > self.a > def __init__(self): > self.a = 1 > > class Bar(Foo): > __slots__ = ('a',) > def __init__(self): > Foo.__init__(self) > self.a = 2 > > bar = Bar() > print bar.a > > 2 > print super(Bar,bar).a # this doesn't > actually work > > 2 or 1? > > Don't worry -- this isn't a proposal and no, > this doesn't actually work. > However, the current implementation only > narrowly escapes this trap: > > print bar.__class__.a.__get__(bar) > > 2 > print bar.__class__.__base__.a.__get__(bar) > > AttributeError: a > > Ok, let me explain what just happened. Slots > are implemented via the > new descriptor interface. In short, descriptor > objects are properties > and support __get__ and __set__ methods. The > slot descriptors are told > the offset within an object instance the > PyObject* lives and proxy > operations for them. 
> So getting and setting slots involves:
>
>     # print bar.a
>     a_descr = bar.__class__.a
>     print a_descr.__get__(bar)
>
>     # bar.a = 1
>     a_descr = bar.__class__.a
>     a_descr.__set__(bar, 1)
>
> So, above we get an attribute error when trying to access the 'a' slot
> from Bar since it was never initialized. However, with a little
> ugliness you can do the following:
>
>     # Get the descriptors for Foo.a and Bar.a
>     a_foo_descr = bar.__class__.__base__.a
>     a_bar_descr = bar.__class__.a
>     a_foo_descr.__set__(bar,1)
>     a_bar_descr.__set__(bar,2)
>
>     print bar.a
>     > 2
>     print a_foo_descr.__get__(bar)
>     > 1
>     print a_bar_descr.__get__(bar)
>     > 2
>
> In other words, the namespace for slots is not really flat, although
> there is no simple way to access these hidden attributes

=== message truncated ===

From DavidA@ActiveState.com Mon Feb 18 19:01:51 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Mon, 18 Feb 2002 11:01:51 -0800
Subject: [Python-Dev] Meta-reflections
References:
Message-ID: <3C714F9F.841DB772@activestate.com>

I think that you're making useful points, but I think that it's worth stepping even further back and deciding what the reflection API should be like from a "what's it for" POV?

This relates to much of the discussion about what dir() should do on new-style classes, as well as why some Python objects have 'members', some have 'methods', etc.

In my opinion, __dict__ is mostly an implementation detail, and it makes sense to me that the slot names don't show up in there (after all, it's not a dictionary!).

What I'd propose is that the inspect module grow some "abstract" reflection APIs which make it possible for folks who don't need to know about implementation details to get away with it.

Looking at it, maybe it already has everything we need.
I'm not quite sure why inspect.getmembers is called that, but maybe I'm the only one who's not sure what 'members' mean in Python. --david From jacobs@penguin.theopalgroup.com Mon Feb 18 19:33:04 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Mon, 18 Feb 2002 14:33:04 -0500 (EST) Subject: [Python-Dev] Meta-reflections In-Reply-To: <3C714F9F.841DB772@activestate.com> Message-ID: On Mon, 18 Feb 2002, David Ascher wrote: > I think that you're making useful points, but I think that it's worth > stepping even further back and deciding what the reflection API should > be like from a "what's it for" POV? Exactly! However, having a meta-discussion on meta-reflection is a little too abstract for the disinterested to jump in on. However most people who read python-dev use and come to rely on using __dict__ as The Python Reflection API for instance attributes. > This relates to much of the discussion about what dir() should do on > new-style classes, as well as why some Python objects have 'members', > some have 'methods', etc. Sure, except that I've _NEVER_ assumed dir() was anything more than a quick-and-dirty ultra-high level hack that was occaisonally useful for doing reflection. One call does not a reflection API make. > In my opinon, __dict__ is mostly an implementation detail, and it makes > sense to me that the slot names dont' show up in there (after all, it's > not a dictionary!). I think so too, though I don't want to ram my own views down people's throats on the matter. However, it is potentially valid to view __dict__ as the one true reflection API for getting access to all attributes. This isn't too outlandish since it effectively is in Python 2.2. Pickle and cPickle 2.2 (among several dozen other examples I've found) are currently implemented assuming this. If we wanted to keep this existing API we could support reflection on slots by extending object instances with only slot attributes to share a common read-only __dict__. 
New style class instances with per-instance __dict__'s should start with a mutable copy when instantiated. For the record, I don't think this is the right way to go, even though it is a valid way way of defining the Python reflection API. > What I'd propose is that the inspect module grow some "abstract" > reflection APIs which make it possible for folks who don't need to know > about implementation details to get away with it. Great idea! I've already got a stack of suggestions and patches that clean up other various bits of it. However, there was an unstated and important question left out of my last e-mail: We need to decide if slots are really 'attributes' or "something else". The "something else" being akin to __set/getattr__ virtual attributes, pure properties, and other techniques that will almost always require explicit hooks to into reflection APIs. My preference is the former, that slot declarations simply affect the allocation choices made by an object, and not the semantics of what may be done with that object (modulo issues when per-instance dicts are not allocated). Thanks, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From john_coppola_r_s@yahoo.com Mon Feb 18 20:01:17 2002 From: john_coppola_r_s@yahoo.com (john coppola) Date: Mon, 18 Feb 2002 12:01:17 -0800 (PST) Subject: [Python-Dev] Meta-reflections In-Reply-To: <3C714F9F.841DB772@activestate.com> Message-ID: <20020218200117.32087.qmail@web11808.mail.yahoo.com> [ David Ascher wrote:] > I think that you're making useful points, but I > think that it's worth stepping even further back > and deciding what the reflection API should > be like from a "what's it for" POV? > This relates to much of the discussion about what > dir() should do on > new-style classes, as well as why some Python > objects have 'members', > some have 'methods', etc. 

I think his points were more than useful. His examples expose serious flaws with the use of slots, which I hardly see as satisfactory behavior. Particularly, are slot attributes to be treated like MRO or not? Hummm... Does each class have a separate set of slot attributes?

    class A:
        __slots__=('foo','bar')

    class B(A):
        __slots__=('foo','spam')

What are we supposed to expect here? I believe things would be greatly simplified if the inheritance tree was traversed and all slot attributes were concatenated and regarded as unique for a given instance. So in the above example we would expect (breadth first then depth), __slots__ =('foo', 'spam', 'bar').

A note on dir... Since we have no other introspection tools aside from dir, I believe that, as of this revision of Python, dir should correctly display slot attributes along with dict attributes. Does "dir" stand for directory or does it mean __dict__.keys()? I believe this is an implementation detail. Dir should look up every attribute, but now we need two additional functions: slotdir(), dictdir(). Maybe better names: slots(), dictdir().

Another argument for why dir should be changed is that cPickle and pickle would probably function correctly as a result of changing dir's implementation to reveal slot attributes.

But even on another note which digs deeper into the philosophy of __slots__, why not use slots for class methods? Can this already be done? It can be used everywhere, modules, imported modules, classes, instances, etc.

> In my opinion, __dict__ is mostly an implementation detail, and it makes
> sense to me that the slot names don't show up in there (after all, it's
> not a dictionary!).
>
> What I'd propose is that the inspect module grow some "abstract"
> reflection APIs which make it possible for folks who don't need to know
> about implementation details to get away with it.
>
> Looking at it, maybe it already has everything we need.
> I'm not quite sure why inspect.getmembers is called that, but
> maybe I'm the only one who's not sure what 'members' mean in Python.
>
> --david

From john_coppola_r_s@yahoo.com Mon Feb 18 20:07:33 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Mon, 18 Feb 2002 12:07:33 -0800 (PST)
Subject: [Python-Dev] Re: comp.lang.python in English? (was Re: Why Python is like BASIC)
In-Reply-To:
Message-ID: <20020218200733.73832.qmail@web11803.mail.yahoo.com>

--- François Pinard wrote:
>
> Exactement. Bravo! Bien dit! :-)

I agree with François and I'm English-speaking.

-john

From mal@lemburg.com Mon Feb 18 20:19:47 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Feb 2002 21:19:47 +0100
Subject: [Python-Dev] Re: comp.lang.python in English? (was Re: Why Python is like BASIC)
References: <20020218200733.73832.qmail@web11803.mail.yahoo.com>
Message-ID: <3C7161E3.F195BE73@lemburg.com>

john coppola wrote:
>
> --- François Pinard wrote:
> >
> > Exactement. Bravo! Bien dit! :-)
>
> I agree with François and I'm English-speaking.

Either my mailer is broken or I am missing some context here... what does this have to do with python-dev ?

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From john_coppola_r_s@yahoo.com Mon Feb 18 20:49:49 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Mon, 18 Feb 2002 12:49:49 -0800 (PST)
Subject: [Python-Dev] inadvertantly posted to wrong discussion
In-Reply-To: <3C7161E3.F195BE73@lemburg.com>
Message-ID: <20020218204949.12674.qmail@web11805.mail.yahoo.com>

It was completely an accident. Sorry.

From DavidA@ActiveState.com Mon Feb 18 21:30:30 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Mon, 18 Feb 2002 13:30:30 -0800
Subject: [Python-Dev] Global name lookup schemes
References:
Message-ID: <3C717276.9FF18C31@activestate.com>

Ka-Ping Yee wrote:
>
> On Mon, 18 Feb 2002, M.-A. Lemburg wrote:
> > Way cool, Ping ! Does AI provide tools for simplifying these
> > kind of diagrams or did you do it all by hand ?
>
> I spent all day on it. Illustrator was not much help, sadly.
> It's so lame that it can't even keep arrowheads stuck to arrows,
> and its grid-snapping behaviour is mysterious and unpredictable.

FYI, a language that I really enjoyed working with for illustrations is MetaPost:

http://cm.bell-labs.com/who/hobby/MetaPost.html
http://www.tug.org/metapost.html

It's a really cool language, and like a lot of drawing languages, debugging is pretty. =)

I've wanted to bridge Python and MetaPost, but never found the time (or frankly, a good excuse).

--da

From martin@v.loewis.de Mon Feb 18 21:31:22 2002
From: martin@v.loewis.de (Martin v.
Loewis)
Date: 18 Feb 2002 22:31:22 +0100
Subject: [Python-Dev] Meta-reflections
In-Reply-To:
References:
Message-ID:

Kevin Jacobs writes:

> 1) Should class instances explicitly/directly know all of their attributes?

Since types are classes, this is the same question as "should type instances know all their attributes?" I don't think they should, in general: For example, there is no way to find out whether a string object has an interned pointer, and I don't think there should be.

The __slots__ aren't really different here. In fact, if you do

    class Spam(object):
        __slots__ = ('a','b')

    s = Spam()
    s.a = {}
    del Spam.a

you lose access to s.a, even though it is still available (I guess it is actually a bug that cyclic garbage collection won't find cycles involving slots).

> 2) Should attribute access follow the same resolution order rules as
> methods?

Yes, I think so.

> 4) Should __slots__ be flat?

Yes. They should also be a property of the type, not a member of the dict of the type, and they should be a tuple of member objects, not a list of strings. It might be reasonable to call this property __members__.

> > ('c','d') # current behavior
> or
> > ('a','b','c','d') # alternate behavior

Neither, nor; assuming you meant Bar to inherit from Foo, it should be

(, , , )

Regards,
Martin

From ping@lfw.org Mon Feb 18 21:44:06 2002
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 18 Feb 2002 15:44:06 -0600 (CST)
Subject: [Python-Dev] Global name lookup schemes
In-Reply-To: <3C717276.9FF18C31@activestate.com>
Message-ID:

On Mon, 18 Feb 2002, David Ascher wrote:
> FYI, a language that I really enjoyed working with for illustrations is
> MetaPost:

I'd rather discuss the *diagrams* on this list than the diagram-making tools. :) (You can send your suggestions for better tools to me individually, if you like, and i'll summarize later if there's interest.)

So what do you all think of the various global name lookup proposals?

-- ?!ng

From tim.one@comcast.net Mon Feb 18 22:25:58 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 18 Feb 2002 17:25:58 -0500
Subject: [Python-Dev] Global name lookup schemes
In-Reply-To:
Message-ID:

[Ping]
> I'd rather discuss the *diagrams* on this list than the diagram-making
> tools. :)

Unless you write a new tool in Python .

> ...
> So what do you all think of the various global name lookup proposals?

I expect reality has once again managed to extinguish most post-Conference euphoria. I spent 200% of my "free time" this weekend doing research for PSF board issues, and still haven't gotten to even reading about Oren's (IIRC) dict gimmicks. Guido is off traveling. Jeremy is in the midst of moving. Skip is too busy approving posts of mine that stinking SpamCop rejects since I involuntarily switched ISPs.

So I'm glad we harassed Guido into at least starting a PEP ...

once-upon-a-time-ly y'rs - tim

From nas@python.ca Mon Feb 18 22:58:06 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 18 Feb 2002 14:58:06 -0800
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: ; from tim.one@comcast.net on Mon, Feb 18, 2002 at 05:25:58PM -0500
References:
Message-ID: <20020218145806.A26111@glacier.arctrix.com>

I've been working on Skip's rattlesnake and have made some progress. Right now I'm trying to hack the compiler package and am looking for a good reference on code generation. I have the "New Dragon book" as well as "Essentials of Programming Languages" but neither one seems to be telling me what I want to know. Any suggestions?

Neil

From simon@netthink.co.uk Mon Feb 18 23:12:17 2002
From: simon@netthink.co.uk (Simon Cozens)
Date: Mon, 18 Feb 2002 23:12:17 +0000
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <20020218145806.A26111@glacier.arctrix.com>
References: <20020218145806.A26111@glacier.arctrix.com>
Message-ID: <20020218231217.GA5118@netthink.co.uk>

Neil Schemenauer:
> good reference on code generation.
I have the "New Dragon book" as well > as "Essentials of Programming Lanuages" but neither one seem to be > telling me want I want to know. Any suggestions? You could try Appel: Modern Compiler Implementation in {C,ML,Java}. -- "How should I know if it works? That's what beta testers are for. I only coded it." (Attributed to Linus Torvalds, somewhere in a posting) From josh.winters@webstream.net Mon Feb 18 23:37:02 2002 From: josh.winters@webstream.net (josh.winters@webstream.net) Date: Mon, 18 Feb 2002 18:37:02 -0500 Subject: [Python-Dev] We would like to possibly get some info on your company Message-ID: Hello, We would like to possibly get some info on your company in an effort to explore the ways that we might be able to work together. We may be able to save you money and offer you the benefits of our reseller program. We have been developing and hosting web sites since 1997. We offer design, programming, hosting and webcasting and videoconferencing. We support Linux, NT and AS400. Please forward this to the proper party, or respond to the address below. You can also visit our web site at http://webstream.net for more information on our services. If by e-mail: josh.winters@webstream.net If by mail: WebStream Internet Solutions Outsourcing/Purchasing 2200 W.Commercial Blvd. Suite 204 Ft. Lauderdale, FL 33309 USA Thank you very much. Sincerely, Josh Winters josh.winters@webstream.net http://webstream.net Design * Programming * Hosting * WebCasting * Since 1997 From JeffH@ActiveState.com Tue Feb 19 00:25:57 2002 From: JeffH@ActiveState.com (Jeff Hobbs) Date: Mon, 18 Feb 2002 16:25:57 -0800 Subject: [Python-Dev] PEP needed? Introducing Tcl objects In-Reply-To: <200202161628.g1GGS1x30329@pcp742651pcs.reston01.va.comcast.net> Message-ID: <009201c1b8dc$00eb6fb0$ba03a8c0@activestate.ca> > I hope that it's possible to do something better with Tcl/Tk 8.3 that > doesn't require the sleep and maintains the existing _tkinter API / > semantics. 

I guess I would need to get a better understanding of why it was designed with the sleep in the first place. Martin mentioned that it allows a thread switch to occur, but a shorter sleep interval would have done the same. Tk 8.1+ has been thread-safe, but only in 8.3 have people been pushing it a little harder (most users of threads are Tcl-only). However, there are different models between Python and Tcl threading, and perhaps that is a reason another method wasn't attempted early.

Anyway, as to Martin's questions:

> In the light of this rationale, can you please explain what
> Tcl_AsyncMark is and how it would avoid the problem, or what effect
> calling Tcl_CreateEventSource would have, or how Tcl_ThreadQueueEvent
> would help?

Tcl_AsyncMark is what you would call if you left the while loop looking more like:

    [global or tls]
    static stopRequested = 0;

    [in func]
    while (!stopRequested && foundEvent) {
        foundEvent = Tcl_DoOneEvent(TCL_ALL_EVENTS);
    }

And whenever a signal occurs, you would do:

    ProcessSignal() {
        stopRequested = 1;
        Tcl_AsyncMark(asyncHandler);
    }

The asyncHandler then has its own callback routine of your choosing. Now this might not be what you want, as this is more the design for single-threaded systems that want an event loop.

There is also the Tcl_CreateEventSource route. This allows you to provide a proc that gets called in addition to the internal Tcl one for processing events. This is most often used when tying together event sources like Tk and Gtk, or Tk and MFC, ...

You may simply need to call Tcl_SetMaxBlockTime. This will prevent Tcl from indefinitely blocking when no events are received. This may be the simplest solution to create the same effect as the Sleep, but without any other negative effects.

Most all of these are described in fairly good detail here:
http://www.tcl.tk/man/tcl8.3/TclLib/Notifier.htm

Jeff

From gward@python.net Tue Feb 19 01:14:19 2002
From: gward@python.net (Greg Ward)
Date: Mon, 18 Feb 2002 20:14:19 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To:
References:
Message-ID: <20020219011419.GB5121@gerg.ca>

On 18 February 2002, Kevin Jacobs said:
> My recent post on __slots__ not being picklable (and the resounding lack of
> response to it)

Certainly caught my attention, but I had nothing to add.

> 1) Should class instances explicitly/directly know all of their attributes?

I'm not sure you're asking the right question. If you're concerned with introspection, shouldn't the question be: "Should arbitrary code be able to find out the set of attributes associated with a given object?"

The Pythonic answer is clearly yes. And if "attribute" means "something that follows a dot", then you can do this using dir(). Unfortunately, the expansion of dir() to include methods means it's no longer very useful for getting just instance attributes, whether they're in a __dict__ or some other method.

So the obvious answer is to use vars(), which works on classic classes and __slots__-less new-style classes. (I think vars(x) is just a more sociable way to spell x.__dict__.) But it bombs on classes with __slots__:

    >>> class C(object):
    ...     __slots__ = ['a', 'b']
    ...
    >>> c = C()
    >>> vars(c)
    Traceback (most recent call last):
      File "", line 1, in ?
    TypeError: vars() argument must have __dict__ attribute

Uh-oh. This is a problem.

> 3) Should __slots__ be immutable?

Yes, definitely. Clearly __slots__ is a property of the type (class), not of the instance, and once the class is defined, that's it. (Or that should be it.) It looks as though you can modify __slots__, but it has no effect; that's mildly bogus.

> 4) Should __slots__ be flat?

Hmmmm... probably. That's certainly consistent with "... once the class is defined, that's it".
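
(An illustrative aside, not part of the original post.) The vars() failure above, and the "flat" view of slots being discussed, can be sketched roughly as follows. This is only a sketch under stated assumptions: it is written for a modern Python 3 interpreter rather than the 2.2 under discussion, and the helper names slot_names() and instance_attributes() are hypothetical, not an existing or proposed API. It walks the class's MRO, gathers the names each class declares in __slots__, and merges them with whatever per-instance __dict__ exists.

```python
def slot_names(obj):
    """Hypothetical helper: slot names declared anywhere in obj's class hierarchy."""
    names = []
    for klass in type(obj).__mro__:
        declared = klass.__dict__.get('__slots__', ())
        if isinstance(declared, str):
            declared = (declared,)  # a bare string declares a single slot
        for name in declared:
            if name not in names and name not in ('__dict__', '__weakref__'):
                names.append(name)
    return names


def instance_attributes(obj):
    """Hypothetical helper: one flat mapping of slot and __dict__ attributes."""
    attrs = {}
    for name in slot_names(obj):
        try:
            attrs[name] = getattr(obj, name)
        except AttributeError:
            pass  # slot declared but never assigned
    attrs.update(getattr(obj, '__dict__', {}))
    return attrs


class Foo(object):
    __slots__ = ['a', 'b']

    def __init__(self):
        self.a = 1
        self.b = 2


class Bar(Foo):
    __slots__ = ['c']

    def __init__(self):
        Foo.__init__(self)
        self.c = 3


class Baz(Bar):
    # No __slots__ here, so instances get a __dict__ in addition to the slots.
    def __init__(self):
        Bar.__init__(self)
        self.d = 4


try:
    vars(Bar())
except TypeError as exc:
    print("vars() failed:", exc)

print(slot_names(Bar()))
print(instance_attributes(Baz()))
```

Note that this treats a slot name redeclared in a subclass as a single attribute, which is one possible answer to the "flat" question, not necessarily the one the implementation gives.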
Greg -- Greg Ward - geek-at-large gward@python.net http://starship.python.net/~gward/ If you and a friend are being chased by a lion, it is not necessary to outrun the lion. It is only necessary to outrun your friend. From tim.one@comcast.net Mon Feb 18 23:51:52 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 18 Feb 2002 18:51:52 -0500 Subject: [Python-Dev] Rattlesnake progress In-Reply-To: <20020218145806.A26111@glacier.arctrix.com> Message-ID: [Neil Schemenauer] > I've been working on Skip's rattlesnake and have made some progress. Cool! I encourage this. That and 2 dollars will buy you a cup of coffee. > Right now I'm trying to hack the compiler package and am looking for a > good reference on code generation. I have the "New Dragon book" as well > as "Essentials of Programming Lanuages" but neither one seem to be > telling me want I want to know. Any suggestions? Write it in Python because you won't be happy with your first 2 or 3 attempts. Compiler books have historically been very heavy on front end issues, in large part because the theory of parsing is well understood. What relatively little you can find on back end issues tends to be sketchy and severely limited by the author's personal experience. In large part this is because almost no interesting optimization problem can be solved in linear time (whether it's optimal instruction selection, optimal register assignment, optimal instruction ordering, ...), so real-life back ends are a mountain of idiosyncratic heuristics. Excellent advice that almost nobody follows <0.5 wink>: choose a flexible intermediate representation, then structure all your transformations as independent passes, such that the output of every pass is acceptable as the input to every pass. 
Then keep each pass focused, as simple as possible (for example, if a transformation may create regions of dead code, don't dare try to clean it up in the same pass, or contort the logic even a little bit to try to avoid creating dead code -- instead let it create all the dead code it wants, and (re)invoke a "remove dead code" pass afterwards). *Because* back ends are a mountain of idiosyncratic heuristics, this design lets you add new ones, remove old ones, and reorder them with minimal pain. One compiler I read about (but didn't have the pleasure of using) actually allowed you to specify the sequence of back end transformations on the cmdline, using a regular expression notation where, e.g. (hoist dead)+ meant "run the hoist pass followed by the dead code removal pass, one or more times, until a fixed point is reached". Since none of that told you want to know either , what do you want to know? Sounds like a legit topic for python-dev. From dan@dberlin.org Tue Feb 19 01:50:57 2002 From: dan@dberlin.org (Daniel Berlin) Date: Mon, 18 Feb 2002 20:50:57 -0500 (EST) Subject: [Python-Dev] Rattlesnake progress In-Reply-To: <20020218231217.GA5118@netthink.co.uk> Message-ID: On Mon, 18 Feb 2002, Simon Cozens wrote: > Neil Schemenauer: > > good reference on code generation. I have the "New Dragon book" as well > > as "Essentials of Programming Lanuages" but neither one seem to be > > telling me want I want to know. Any suggestions? > > You could try Appel: Modern Compiler Implementation in {C,ML,Java}. When you get to optimizations, you want Advanced Compiler Design and Implementation by Muchnick. And/Or Building an Optimizing Compiler by Morgan. From jacobs@penguin.theopalgroup.com Tue Feb 19 01:51:48 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Mon, 18 Feb 2002 20:51:48 -0500 (EST) Subject: [Python-Dev] Meta-reflections In-Reply-To: Message-ID: On 18 Feb 2002, Martin v. 
Loewis wrote:
> Kevin Jacobs writes:
>
> > 1) Should class instances explicitly/directly know all of their attributes?
>
> Since types are classes, this is the same question as "should type
> instances know all their attributes?" I don't think they should, in
> general: For example, there is no way to find out whether a string
> object has an interned pointer, and I don't think there should be.

I explicitly made note that my discussion of slots was in the context of native new-style Python classes and not C-types, even ones that can be used as base classes for other new-style classes. We will always need to hide C implementation details behind Python objects, but we are not talking about reflection on such hidden state. My belief is that slots should be treated as much as possible like normal attributes and not as "hidden object state".

> class Spam(object):
>     __slots__ = ('a','b')
>
> s = Spam()
> s.a = {}
> del Spam.a
>
> you lose access to s.a, even though it is still available (I guess it
> is actually a bug that cyclic garbage collection won't find cycles
> involving slots).

Not exactly -- the semantics are the same as regular attributes in this case. Continuing your example, you can then do

    s.a = 5

so access to the slot is not lost, only to the value.

> > 2) Should attribute access follow the same resolution order rules as
> > methods?
>
> Yes, I think so.

Ouch! This implies a great deal more than you may be thinking of. For example, do you really want to be able to do this:

    class Foo(object):
        __slots__ = ('a',)

    class Bar(Foo):
        __slots__ = ('a',)

    bar = Bar()
    bar.a = 1
    super(Bar, bar).a = 2
    print bar.a
    > 1

This violates the traditional Python idiom of having a flat namespace for attributes, even in the presence of inheritance. This has very profound implications to Python semantics and performance.

> > 4) Should __slots__ be flat?
>
> Yes.
> They should also be a property of the type, not a member of the
> dict of the type, and they should be a tuple of member object, not a
> list of strings. It might be reasonable to call this property
> __members__.
>
> > > ('c','d') # current behavior
> > or
> > > ('a','b','c','d') # alternate behavior
>
> Neither, nor; assuming you meant Bar to inherit from Foo, it should be
>
> (, ,
> , )

An interesting idea that I had not considered. Currently the slot descriptor objects do not directly expose the name or type of the object except in the repr. This could easily be fixed.

However this brings up another issue. The essence of a slot (or, more correctly, a slot descriptor) is to store an offset into a PyObject* that represents a value within an object. The name to which the slot is bound is not the intrinsic and defining characteristic. So it would be somewhat illogical to mandate static name bindings to slots. This supports the notion of rebinding slot names during object inheritance (this is already partially implemented), or storing the descriptor objects in a __slots__ tuple and providing an interface to query and reset the name binding for each of them.

Comments? Thoughts?

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19
E-mail: jacobs@theopalgroup.com
Fax: (216) 986-0714
WWW: http://www.theopalgroup.com

From tim.one@comcast.net Tue Feb 19 04:57:22 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 18 Feb 2002 23:57:22 -0500
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <009201c1b8dc$00eb6fb0$ba03a8c0@activestate.ca>
Message-ID:

[Jeff Hobbs]
> I guess I would need to get a better understanding of why it was
> designed with the sleep in the first place. Martin mentioned
> that it allows a thread switch to occur, but a shorter sleep
> interval would have done the same.

I believe Martin was correct in large part.
The other part is that, without a sleep at all, we would have a pure busy loop here, competing for cycles non-stop with every process on the box. About the length of the sleep, do note that Sleep(20) sleeps 20 milliseconds here (not seconds), and that the sleep is skipped so long as Tcl_DoOneEvent() says it's finding things to do. IOW, Tcl gets all the cycles it can it eat so long as it says it's busy, and doesn't generally wait more than about 0.02 seconds for another chance after it runs out of work to do. Back when 20 was first picked, machines were slow enough that an utterly idle Tkinter app in the background still showed up as consuming a measurable percentage of a CPU, thanks to this not-so-busy loop. We could afford to make the sleep shorter on faster boxes, but I'm not sure I buy the argument that we're making Tcl/Tk look sluggish. The reason we hate the loop is more that it's a miserably ugly hack. From martin@v.loewis.de Tue Feb 19 08:43:21 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 19 Feb 2002 09:43:21 +0100 Subject: [Python-Dev] PEP needed? Introducing Tcl objects In-Reply-To: References: Message-ID: Tim Peters writes: > I believe Martin was correct in large part. The other part is that, without > a sleep at all, we would have a pure busy loop here, competing for cycles > non-stop with every process on the box. Avoiding the wait to be busy is probably the #1 reason for the sleep. The alternative to avoid a busy wait would be to do Tcl_DoOneEvent with TCL_ALL_EVENTS, however, once Tcl becomes idle, this will block, depriving any other thread of the opportunity to invoke Tcl. Of Jeff's options, invoking Tcl_SetMaxBlockTime seemed to be most promising: I want Tcl_DoOneEvent to return after 20ms, to give other Tcl threads a chance. 
So I invented the patch

Index: _tkinter.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/_tkinter.c,v
retrieving revision 1.123
diff -u -r1.123 _tkinter.c
--- _tkinter.c 26 Jan 2002 20:21:50 -0000 1.123
+++ _tkinter.c 19 Feb 2002 08:34:17 -0000
@@ -1676,7 +1967,11 @@
 {
     int threshold = 0;
 #ifdef WITH_THREAD
+    Tcl_Time blocktime = {0, 20000};
     PyThreadState *tstate = PyThreadState_Get();
+    ENTER_TCL
+    Tcl_SetMaxBlockTime(&blocktime);
+    LEAVE_TCL
 #endif
 
     if (!PyArg_ParseTuple(args, "|i:mainloop", &threshold))
@@ -1688,16 +1983,15 @@
            !errorInCmd)
     {
         int result;
+
 
 #ifdef WITH_THREAD
         Py_BEGIN_ALLOW_THREADS
         PyThread_acquire_lock(tcl_lock, 1);
         tcl_tstate = tstate;
-        result = Tcl_DoOneEvent(TCL_DONT_WAIT);
+        result = Tcl_DoOneEvent(0);
         tcl_tstate = NULL;
         PyThread_release_lock(tcl_lock);
-        if (result == 0)
-            Sleep(20);
         Py_END_ALLOW_THREADS
 #else
         result = Tcl_DoOneEvent(0);

However, it does not work. The script

    import Tkinter
    import thread
    import time

    c = 0
    l = Tkinter.Label(text = str(c))
    l.pack()

    def doit():
        global c
        while 1:
            c+=1
            l['text']=str(c)
            time.sleep(1)

    thread.start_new(doit, ())
    l.tk.mainloop()

ought to continuously increase the counter in the label (once a second), but doesn't, at least not on Linux, using Tcl 8.3.3. In the strace output, it appears that it first does a select call with a timeout, but that is followed by one without time limit before Tcl_DoOneEvent returns.

Jeff, any ideas as to why this is happening?

Regards,
Martin

From mwh@python.net Tue Feb 19 09:50:21 2002
From: mwh@python.net (Michael Hudson)
Date: 19 Feb 2002 09:50:21 +0000
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: martin@v.loewis.de's message of "19 Feb 2002 09:43:21 +0100"
References:
Message-ID: <2m3czx3in6.fsf@starship.python.net>

martin@v.loewis.de (Martin v.
Loewis) writes: [schniiiip] > ought to continously increase the counter in the label (once a > second), but doesn't, atleast not on Linux, using Tcl 8.3.3. In the > strace output, it appears that it first does a select call with a > timeout, but that is followed by one without time limit before > Tcl_DoOneEvent returns. > > Jeff, any ideas as to why this is happening? Well, at least this one is easy. From the link Jeff posted: Information provided to Tcl_SetMaxBlockTime is only used for the next call to Tcl_WaitForEvent; it is discarded after Tcl_WaitForEvent returns. The next time an event wait is done each of the event sources' setup procedures will be called again, and they can specify new information for that event wait. so you need to move the Tcl_SetMaxBlockTime inside the while loop. It certainly looks to this novice as if 8.3 provides enough hooks to do what we want, but... Cheers, M. -- Like most people, I don't always agree with the BDFL (especially when he wants to change things I've just written about in very large books), ... -- Mark Lutz, http://python.oreilly.com/news/python_0501.html From simon@netthink.co.uk Tue Feb 19 10:35:24 2002 From: simon@netthink.co.uk (Simon Cozens) Date: Tue, 19 Feb 2002 10:35:24 +0000 Subject: [Python-Dev] Rattlesnake progress In-Reply-To: References: <20020218231217.GA5118@netthink.co.uk> Message-ID: <20020219103524.GB8249@netthink.co.uk> Daniel Berlin: > > You could try Appel: Modern Compiler Implementation in {C,ML,Java}. > > When you get to optimizations, you want Advanced Compiler Design and > Implementation by Muchnick. > > And/Or Building an Optimizing Compiler by Morgan. Yeah. See also http://www.perldoc.com/readinglist.pl Don't-be-put-off-by-the-domain-name-ly yrs, Simon -- You are in a maze of little twisting passages, all different. 
From kjetilja@cs.uit.no Tue Feb 19 13:04:44 2002
From: kjetilja@cs.uit.no (Kjetil Jacobsen)
Date: 19 Feb 2002 14:04:44 +0100
Subject: [Python-Dev] asyncore.poll behaviour
Message-ID: <1014123884.20195.93.camel@tac-ce1.cs.UiT.No>

hello,

in python2.2 the semantics for asyncore.poll (which uses the select()
system call) is different than for asyncore.poll3 (which uses the poll()
system call) when an EINTR exception occurs.

in asyncore.poll3, the pollset is correctly reset to an empty list, but
in asyncore.poll this is not done, which in turn causes a lot of strange
things to happen when an EINTR occurs (spurious handler invocations and
so on).

i've tried to upload the patch to sourceforge, but the patch manager has
not responded for me the last couple of days so i'm sending it here
instead. the fix is a simple one-liner which makes the semantics of
asyncore.poll and asyncore.poll3 similar:

*** /usr/local/lib/python2.2/asyncore.py        Wed Jan 30 15:51:00 2002
--- asyncore.py Wed Jan 30 16:19:28 2002
***************
*** 80,85 ****
--- 80,86 ----
          except select.error, err:
              if err[0] != EINTR:
                  raise
+             r, w, e = [], [], []

          if DEBUG:
              print r,w,e

btw, the asyncore.poll2 function does not seem to have either the
behaviour of asyncore.poll or asyncore.poll3 with respect to handling
of EINTR. perhaps asyncore.poll2 should be removed altogether or just
remapped to asyncore.poll3?

regards,
- kjetil

From dan@dberlin.org Tue Feb 19 13:14:20 2002
From: dan@dberlin.org (Daniel Berlin)
Date: Tue, 19 Feb 2002 08:14:20 -0500 (EST)
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To:
Message-ID:

On Mon, 18 Feb 2002, Tim Peters wrote:

> [Neil Schemenauer]
> > I've been working on Skip's rattlesnake and have made some progress.
>
> Cool! I encourage this. That and 2 dollars will buy you a cup of coffee.
>
> > Right now I'm trying to hack the compiler package and am looking for a
> > good reference on code generation.
I have the "New Dragon book" as well
> > as "Essentials of Programming Languages" but neither one seems to be
> > telling me what I want to know. Any suggestions?
>
> Write it in Python because you won't be happy with your first 2 or 3
> attempts.
>
> Compiler books have historically been very heavy on front end issues, in
> large part because the theory of parsing is well understood. What
> relatively little you can find on back end issues tends to be sketchy and
> severely limited by the author's personal experience. In large part this is
> because almost no interesting optimization problem can be solved in linear
> time (whether it's optimal instruction selection, optimal register
> assignment, optimal instruction ordering, ...), so real-life back ends are a
> mountain of idiosyncratic heuristics.

This is true.
In fact, it's actually worse than "can be solved in linear time", it's
"are currently thought/proved to be NP-hard".

For graph coloring register allocation algorithms, it's even worse (if
you thought that was possible). You can't even approximate the chromatic
number of the graph (IE, the number of colors, and therefore, registers,
it would take to color it) to more than a certain degree in an absurd
time bound.

However, you've missed the middle end, where a lot of interesting
optimizations *can* be done in linear time or n log n time, and where
most people now concentrate their time.

On an SSA graph, you can do at least the following in linear time or
n log n time:

Partial Redundancy Elimination
Conditional Constant Propagation
Copy propagation
Dead store elimination
Dead load elimination
Global code motion
Global value numbering
Store motion
Load motion
Dead code elimination
Lots of loop optimizations
Lots of memory hierarchy optimizations

I've ignored interprocedural optimizations, including various pointer
analyses that are linear time or close to it, because it would be
harder to apply them to python.
> > Excellent advice that almost nobody follows <0.5 wink>: choose a flexible > intermediate representation, then structure all your transformations as > independent passes, such that the output of every pass is acceptable as the > input to every pass. Everyone tries to do this these days, actually. At least, from my working on gcc and looking at the source to tons of compilers each year. You really need more than one level of IR to do serious optimization. Tradeoff between losing valueable info (such as array indexing operations) vs. simplicity of writing optimization passes usually causes people to do some types optimization on higher level IR's (particularly, loop optimizations), while other optimization passes on lower IR's. GCC is moving towards 3 IR's, a language independent tree IR, a mid-level RTL, and the current low-level RTL. > Then keep each pass focused, as simple as possible > (for example, if a transformation may create regions of dead code, don't > dare try to clean it up in the same pass, or contort the logic even a little > bit to try to avoid creating dead code -- instead let it create all the dead > code it wants, and (re)invoke a "remove dead code" pass afterwards). Usually you don't hit this problem inside a single pass like DCE, because they iterate until nothing changes. > One compiler I read about (but didn't have the pleasure of using) actually > allowed you to specify the sequence of back end transformations on the > cmdline, using a regular expression notation where, e.g. > > (hoist dead)+ > > meant "run the hoist pass followed by the dead code removal pass, one or > more times, until a fixed point is reached". > > Since none of that told you want to know either , what do you want to > know? Sounds like a legit topic for python-dev. 
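The pass-pipeline advice quoted above (small single-purpose passes, each accepting the output of any other, re-run until nothing changes, in the spirit of "(hoist dead)+") can be sketched in a few lines. This is only an illustration under stated assumptions: the tuple-based straight-line IR, the two passes, and every name below are hypothetical and are not taken from Rattlesnake, the compiler package, or any real compiler.

```python
# Hypothetical toy IR: a list of (dest, op, a, b) tuples in straight-line,
# single-assignment form. Operands are variable names (str) or int constants.

def propagate_constants(code):
    """Fold 'add' of known constants. May leave dead definitions behind."""
    known = {}
    out = []
    for dest, op, a, b in code:
        a = known.get(a, a)  # substitute operands whose value is known
        b = known.get(b, b)
        if op == "add" and isinstance(a, int) and isinstance(b, int):
            op, a, b = "const", a + b, None
        if op == "const":
            known[dest] = a
        out.append((dest, op, a, b))
    return out


def remove_dead(code, live_out):
    """Drop instructions whose result is never used (backward scan)."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(code):
        if dest in live:
            live.discard(dest)
            live.update(x for x in (a, b) if isinstance(x, str))
            kept.append((dest, op, a, b))
    kept.reverse()
    return kept


def run_to_fixed_point(code, passes):
    """Apply the passes in order, repeatedly, until the IR stops changing."""
    while True:
        before = code
        for one_pass in passes:
            code = one_pass(code)
        if code == before:
            return code


PROGRAM = [
    ("a", "const", 2, None),
    ("b", "const", 3, None),
    ("c", "add", "a", "b"),
    ("unused", "add", "c", "c"),
    ("result", "add", "c", 1),
]

if __name__ == "__main__":
    pipeline = [propagate_constants, lambda c: remove_dead(c, {"result"})]
    print(run_to_fixed_point(PROGRAM, pipeline))
```

Note that the folding pass makes no attempt to clean up after itself; the definitions it makes redundant are left for the separate dead-code pass, which is the division of labour the quoted advice argues for.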
From mwh@python.net Tue Feb 19 14:10:40 2002 From: mwh@python.net (Michael Hudson) Date: 19 Feb 2002 14:10:40 +0000 Subject: [Python-Dev] 2.2.1 issues Message-ID: <2madu5zhnj.fsf@starship.python.net> Well, we have the first 2.2 bugfix that isn't a no-brainer to port to 2.2.1. This is to do with the [ #495401 ] Build troubles: --with-pymalloc bug. As far as understand it, there were two problems. 1) with wide unicode characters, some function in unicodeobject.c to do with interpreting escape codes could write into memory it didn't own. 2) something to do with the handling of "unpaired high surrogates" in the utf-8 codec. Were these problems related? I think they got fixed at the same time, but I may have gotten confused. 1) shouldn't be too much of an issue to get into 2.2.1 (there was some contention about which fix performed better, but for 2.2.1 I don't care too much). 2) is more troublesome, because to fix it properly breaks .pycs, in turn because marshal uses the utf-8 codec to store unicode string constants, and this is a no-no according to PEP 6. Is it possible to worm around 2) by reconstructing valid strings from the bad marshal data, or has information been lost? How severe is the bug? Maybe it would be best to leave it unfixed in 2.2.1. Basically, I guess I'm saying I'm too much of a unicode dunce to understand all the issues involved in fixing this problems in 2.2, so as unofficial bugfix-porter, I'd like someone else (Marc? Martin?) to port these particular fixes. If the mechanics of fiddling with the branch is too much, sending me patches is fine. Cheers, M. -- This is the fixed point problem again; since all some implementors do is implement the compiler and libraries for compiler writing, the language becomes good at writing compilers and not much else! -- Brian Rogoff, comp.lang.functional From mal@lemburg.com Tue Feb 19 14:34:24 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Tue, 19 Feb 2002 15:34:24 +0100 Subject: [Python-Dev] 2.2.1 issues References: <2madu5zhnj.fsf@starship.python.net> Message-ID: <3C726270.7D33E687@lemburg.com> Michael Hudson wrote: > > Well, we have the first 2.2 bugfix that isn't a no-brainer to port to > 2.2.1. This is to do with the > > [ #495401 ] Build troubles: --with-pymalloc > > bug. > > As far as understand it, there were two problems. > > 1) with wide unicode characters, some function in unicodeobject.c to > do with interpreting escape codes could write into memory it didn't > own. > > 2) something to do with the handling of "unpaired high surrogates" in > the utf-8 codec. > > Were these problems related? I think they got fixed at the same time, > but I may have gotten confused. Right. 1) was caused by 2). Both are fixed now. > 1) shouldn't be too much of an issue to get into 2.2.1 (there was some > contention about which fix performed better, but for 2.2.1 I don't > care too much). > > 2) is more troublesome, because to fix it properly breaks .pycs, in > turn because marshal uses the utf-8 codec to store unicode string > constants, and this is a no-no according to PEP 6. > > Is it possible to worm around 2) by reconstructing valid strings from > the bad marshal data, or has information been lost? How severe is the > bug? Maybe it would be best to leave it unfixed in 2.2.1. Well, I posted a message to python-dev or the checkins list about this (don't remember). The situation is basically like this: In Python <= 2.2.0, you could write u = u"\uD800" in a .py file. The first time you import this file, Python will create a .pyc file for it using the broken UTF-8 encoding. The import will succeed. The second time you import the module, Python will try to use the .pyc file. Now reading that file in fails with a UnicodeError and Python also does not revert to the .py file. As a result, modules using unpaired surrogates in Unicode literals are simply broken in Python <= 2.2.0. 
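A minimal sketch of the kind of mismatch described above, assuming a modern Python 3 interpreter rather than the 2.2 series under discussion (whose codec behaved differently): a strict UTF-8 codec rejects a lone high surrogate such as U+D800 in both directions, so bytes produced by a lenient encoder cannot be read back by a strict decoder. The helper names are hypothetical, and the `surrogatepass` error handler merely stands in for the old lenient encoder.

```python
# Illustration only: Python 3 semantics, not the Python 2.2 codec itself.
LONE_HIGH_SURROGATE = "\ud800"


def strict_encode_fails(text):
    """Return True if the strict UTF-8 encoder rejects `text`."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


def lenient_encode(text):
    """Encode while letting lone surrogates through (stand-in for the bug)."""
    return text.encode("utf-8", "surrogatepass")


def strict_decode_fails(data):
    """Return True if the strict UTF-8 decoder rejects `data`."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


if __name__ == "__main__":
    data = lenient_encode(LONE_HIGH_SURROGATE)
    print(strict_encode_fails(LONE_HIGH_SURROGATE), data, strict_decode_fails(data))
```

The point of the sketch is the asymmetry: the write succeeds and the later read fails, which matches the import-works-once behaviour reported for the cached .pyc file.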
The problem with backporting this patch is that in order for Python to properly recompile any broken module, the magic will have to be changed. Question is whether this is a reasonable thing to do in a patch level release... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From nas@python.ca Tue Feb 19 14:51:51 2002 From: nas@python.ca (Neil Schemenauer) Date: Tue, 19 Feb 2002 06:51:51 -0800 Subject: [Python-Dev] Rattlesnake progress In-Reply-To: ; from dan@dberlin.org on Mon, Feb 18, 2002 at 08:50:57PM -0500 References: <20020218231217.GA5118@netthink.co.uk> Message-ID: <20020219065151.A28722@glacier.arctrix.com> Daniel Berlin wrote: > When you get to optimizations, you want Advanced Compiler Design and > Implementation by Muchnick. Right now I'm not planning to do any optimizations (except perhaps limiting the number of registers used). Neil From fdrake@acm.org Tue Feb 19 15:23:07 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 19 Feb 2002 10:23:07 -0500 Subject: [Python-Dev] 2.2.1 issues In-Reply-To: <3C726270.7D33E687@lemburg.com> References: <2madu5zhnj.fsf@starship.python.net> <3C726270.7D33E687@lemburg.com> Message-ID: <15474.28123.180241.360278@grendel.zope.com> M.-A. Lemburg writes: > The problem with backporting this patch is that in order > for Python to properly recompile any broken module, the > magic will have to be changed. Question is whether this > is a reasonable thing to do in a patch level release... Guido can rule as he sees fit, but I don't see any reason *not* to change the magic number. This seems like a pretty important fix to me. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From dan@dberlin.org Tue Feb 19 15:57:59 2002 From: dan@dberlin.org (Daniel Berlin) Date: Tue, 19 Feb 2002 10:57:59 -0500 Subject: [Python-Dev] Rattlesnake progress In-Reply-To: <20020219065151.A28722@glacier.arctrix.com> Message-ID: <72820FED-2551-11D6-9D9B-000393575BCC@dberlin.org> On Tuesday, February 19, 2002, at 09:51 AM, Neil Schemenauer wrote: > Daniel Berlin wrote: >> When you get to optimizations, you want Advanced Compiler Design and >> Implementation by Muchnick. > > Right now I'm not planning to do any optimizations (except perhaps > limiting the number of registers used). > This is, of course, a tricky optimization to do. Limiting registers used involves splitting live ranges at the right places, etc. --Dan From jacobs@penguin.theopalgroup.com Tue Feb 19 16:01:26 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Tue, 19 Feb 2002 11:01:26 -0500 (EST) Subject: [Python-Dev] Rattlesnake progress In-Reply-To: <72820FED-2551-11D6-9D9B-000393575BCC@dberlin.org> Message-ID: On Tue, 19 Feb 2002, Daniel Berlin wrote: > On Tuesday, February 19, 2002, at 09:51 AM, Neil Schemenauer wrote: > > > Daniel Berlin wrote: > >> When you get to optimizations, you want Advanced Compiler Design and > >> Implementation by Muchnick. > > > > Right now I'm not planning to do any optimizations (except perhaps > > limiting the number of registers used). > > > This is, of course, a tricky optimization to do. > Limiting registers used involves splitting live ranges at the right > places, etc. Why limit the number of registers at all? So long as they fit in L1 cache you are golden. If not, no great loss. Of course, this does mean that you will want to have the ability to heap-allocate large register files, though I suspect that frame objects do this already for fast locals (of course, I haven't looked). 
-Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From dan@dberlin.org Tue Feb 19 16:37:22 2002 From: dan@dberlin.org (Daniel Berlin) Date: Tue, 19 Feb 2002 11:37:22 -0500 Subject: [Python-Dev] Rattlesnake progress In-Reply-To: Message-ID: On Tuesday, February 19, 2002, at 11:01 AM, Kevin Jacobs wrote: > On Tue, 19 Feb 2002, Daniel Berlin wrote: >> On Tuesday, February 19, 2002, at 09:51 AM, Neil Schemenauer wrote: >> >>> Daniel Berlin wrote: >>>> When you get to optimizations, you want Advanced Compiler Design and >>>> Implementation by Muchnick. >>> >>> Right now I'm not planning to do any optimizations (except perhaps >>> limiting the number of registers used). >>> >> This is, of course, a tricky optimization to do. >> Limiting registers used involves splitting live ranges at the right >> places, etc. > > Why limit the number of registers at all? So long as they fit in L1 > cache > you are golden. Err, what makes you think this? The largest problem on architectures like x86 is the number of registers. You end up with about 4 usable registers. (hardware register renaming only helps eliminate instruction dependencies, before someone mentions it). Performance quickly drops when you start spilling registers to the stack. In fact, i've seen multiple SPEC regressions of 15% or more caused by a single extra spilled register. Why? Because you have to save it and reload it multiple times. These *kill* pipelines, and instruction scheduling. It's also *much* harder to model the cache hierarchy properly so that you can make sure they'd fit in the l1 cache, than it is to make sure they stay in registers where needed in the first place. Try taking a performance critical loop entirely in registers, and change it to save to and load from memory into a register on every iteration. See how much slower it gets. 
--Dan From mwh@python.net Tue Feb 19 16:50:04 2002 From: mwh@python.net (Michael Hudson) Date: 19 Feb 2002 16:50:04 +0000 Subject: [Python-Dev] Rattlesnake progress In-Reply-To: Daniel Berlin's message of "Tue, 19 Feb 2002 11:37:22 -0500" References: Message-ID: <2mbselpgar.fsf@starship.python.net> Daniel Berlin writes: > On Tuesday, February 19, 2002, at 11:01 AM, Kevin Jacobs wrote: > > > On Tue, 19 Feb 2002, Daniel Berlin wrote: > >> On Tuesday, February 19, 2002, at 09:51 AM, Neil Schemenauer wrote: > >> > >>> Daniel Berlin wrote: > >>>> When you get to optimizations, you want Advanced Compiler Design and > >>>> Implementation by Muchnick. > >>> > >>> Right now I'm not planning to do any optimizations (except perhaps > >>> limiting the number of registers used). > >>> > >> This is, of course, a tricky optimization to do. > >> Limiting registers used involves splitting live ranges at the right > >> places, etc. > > > > Why limit the number of registers at all? So long as they fit in L1 > > cache > > you are golden. > > Err, what makes you think this? > The largest problem on architectures like x86 is the number of registers. > You end up with about 4 usable registers. (hardware register renaming > only helps eliminate instruction dependencies, before someone mentions > it). > Performance quickly drops when you start spilling registers to the stack. I think you misunderstand what Rattlesnake is; AIUI it is (or will/intends to be) a register based VM for Python replacing the current stack based VM -- I think gcc still gets to decide which x86 registers to use... Cheers, M. -- ARTHUR: The ravenours bugblatter beast of Traal ... is it safe? FORD: Oh yes, it's perfectly safe ... it's just us who are in trouble. -- The Hitch-Hikers Guide to the Galaxy, Episode 6 From jeff@hobbs.org Tue Feb 19 16:53:53 2002 From: jeff@hobbs.org (Jeffrey Hobbs) Date: Tue, 19 Feb 2002 08:53:53 -0800 Subject: [Python-Dev] PEP needed? 
Introducing Tcl objects
In-Reply-To:
Message-ID:

Martin,

...
> Of Jeff's options, invoking Tcl_SetMaxBlockTime seemed to be most
> promising: I want Tcl_DoOneEvent to return after 20ms, to give other
> Tcl threads a chance. So I invented the patch
...
> ought to continuously increase the counter in the label (once a
> second), but doesn't, at least not on Linux, using Tcl 8.3.3. In the
> strace output, it appears that it first does a select call with a
> timeout, but that is followed by one without time limit before
> Tcl_DoOneEvent returns.

IIRC, Tcl_SetMaxBlockTime is a one-shot call - it sets the next
block time, not all block times. I'm sure there was a reason
for this, but that was implemented before I was a core guy.
Anyway, I think you just need to try:

-               result = Tcl_DoOneEvent(TCL_DONT_WAIT);
+               Tcl_SetMaxBlockTime(&blocktime);
+               result = Tcl_DoOneEvent(0);

and see if that satisfies the need for responsiveness as well
as not blocking.

Thanks,

Jeff

From aahz@rahul.net Tue Feb 19 17:14:47 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Tue, 19 Feb 2002 09:14:47 -0800 (PST)
Subject: [Python-Dev] 2.2.1 issues
In-Reply-To: <15474.28123.180241.360278@grendel.zope.com> from "Fred L. Drake, Jr." at Feb 19, 2002 10:23:07 AM
Message-ID: <20020219171447.EBA84E8C8@waltz.rahul.net>

Fred L. Drake, Jr. wrote:
> M.-A. Lemburg writes:
>>
>> The problem with backporting this patch is that in order
>> for Python to properly recompile any broken module, the
>> magic will have to be changed. Question is whether this
>> is a reasonable thing to do in a patch level release...
>
> Guido can rule as he sees fit, but I don't see any reason *not* to
> change the magic number. This seems like a pretty important fix to
> me.

The question is not whether it's an important fix, but whether the fix
and its consequences are important enough to warrant changing the magic
number.
It's obviously possible for people to regen their .pyc files by deleting them, so I think we should wait for Guido to say "yes" before bumping the magic number, given that one of the cardinal points of the new bugfix process is that .pyc files will not be regenerated due to a bugfix release. Note carefully that I do agree that it's a serious enough issue to consider the possibility of breaking that rule, but I think we can't afford to pull the trigger without Guido's specific buy-in. We'll also need to think about how we're going to market it if we do bump the magic number. To me, then, the proper question is, "Is this an issue where *automatic* regeneration of .pyc files is sufficiently important?" (I don't know enough to have an opinion myself ;-), but I'll point out that the import failure means that at least it isn't a silent failure -- which I would absolutely agree needs a magic number bump.) -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From pedroni@inf.ethz.ch Tue Feb 19 17:29:55 2002 From: pedroni@inf.ethz.ch (Samuele Pedroni) Date: Tue, 19 Feb 2002 18:29:55 +0100 Subject: [Python-Dev] Meta-reflections References: Message-ID: <017501c1b96b$0c4e93c0$6d94fea9@newmexico> Hi. From: Kevin Jacobs > On 18 Feb 2002, Martin v. Loewis wrote: > > Kevin Jacobs writes: > > > 2) Should attribute access follow the same resolution order rules as > > > methods? > > > > Yes, I think so. > > Ouch! This implies a great deal more than you may be thinking of. For > example, do you really want to be able to do this: > > class Foo(object): > __slots__ = ('a',) > > class Bar(Foo): > __slots__ = ('a',) > > bar = Bar() > bar.a = 1 > super(Bar, bar).a = 2 > print bar.a > > 1 > > This violates the traditional Python idiom of having a flat namespace for > attributes, even in the presence of inheritance. 
This has very profound
> implications to Python semantics and performance.
>

Probably I have not followed the discussion close enough.
The variant with super does not work but

>>> bar=Bar()
>>> bar.a=1
>>> Foo.a.__set__(bar,2)
>>> bar.a
1
>>> Foo.a.__get__(bar)
2
>>>

works.

Slots are already not flat.
They have basically a similar behavior to fields
in JVM object model (and I presume in .NET).

Given that implementing slots with fields is one of the possibilities
for Jython (and some possible Python over .NET), with indeed some
practical advantages [Btw for the moment I don't see obstacles to such
an approach but I have not considered all the details], it is probably
reasonable to keep things as they are.

Consider also:

>>> class Goo(object):
...  __slots__ = ('a',)
...
>>> class Bar(Goo,Foo): pass
...
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: multiple bases have instance lay-out conflict

that helps and reinforces that model.

Samuele.

From jacobs@penguin.theopalgroup.com Tue Feb 19 17:53:03 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Tue, 19 Feb 2002 12:53:03 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <017501c1b96b$0c4e93c0$6d94fea9@newmexico>
Message-ID:

On Tue, 19 Feb 2002, Samuele Pedroni wrote:
> Slots are already not flat.
> They have basically a similar behavior to fields
> in JVM object model (and I presume in .NET).

I agree, but do we want slots to be non-flat? It goes very much against
the traditional Python idiom. In my opinion, I believe that slots should
have exactly the same semantics as normal instance attributes, except
for how/where they are allocated.

> Given that implementing slots with fields is one of the possibilities for
> Jython

This is possible for flat slot namespaces too; just remap new slots to
existing ones when they overlap, instead of allocating a new one.

> Consider also:
>
> >>> class Goo(object):
> ...  __slots__ = ('a',)
> ...
> >>> class Bar(Goo,Foo): pass
> ...
> Traceback (most recent call last): > File "", line 1, in ? > TypeError: multiple bases have instance lay-out conflict > > that helps and reinforces that model. I'll contend that the current implementation is flawed for this and several other reasons I've stated in my previous e-mails. Of course, we're waiting to hear back from Guido when he returns, since his opinion is infinitely more important than mine in this matter. Thanks, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From mal@lemburg.com Tue Feb 19 18:00:24 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 19 Feb 2002 19:00:24 +0100 Subject: [Python-Dev] 2.2.1 issues References: <20020219171447.EBA84E8C8@waltz.rahul.net> Message-ID: <3C7292B8.89E43335@lemburg.com> Aahz Maruch wrote: > > Fred L. Drake, Jr. wrote: > > M.-A. Lemburg writes: > >> > >> The problem with backporting this patch is that in order > >> for Python to properly recompile any broken module, the > >> magic will have to be changed. Question is whether this > >> is a reasonable thing to do in a patch level release... > > > > Guido can rule as he sees fit, but I don't see any reason *not* to > > change the magic number. This seems like a pretty important fix to > > me. > > The question is not whether it's an important fix, but whether the fix > and its consequences are important enough to warrant changing the magic > number. It's obviously possible for people to regen their .pyc files by > deleting them, so I think we should wait for Guido to say "yes" before > bumping the magic number, given that one of the cardinal points of the > new bugfix process is that .pyc files will not be regenerated due to a > bugfix release. We could of course ship the patch level release with the same magic number. Modules that haven't worked before will then start to work. 
Note that we haven't had *any* bug report directly related to this, so it's likely that noone has actually hit this bug in practice. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From nas@python.ca Tue Feb 19 18:20:53 2002 From: nas@python.ca (Neil Schemenauer) Date: Tue, 19 Feb 2002 10:20:53 -0800 Subject: [Python-Dev] Rattlesnake progress In-Reply-To: ; from dan@dberlin.org on Tue, Feb 19, 2002 at 11:37:22AM -0500 References: Message-ID: <20020219102053.A29414@glacier.arctrix.com> Daniel Berlin wrote: > The largest problem on architectures like x86 is the number of > registers. You end up with about 4 usable registers. (hardware > register renaming only helps eliminate instruction dependencies, > before someone mentions it). Performance quickly drops when you start > spilling registers to the stack. I'm not going to be using hardware registers. Bytecode will be generated to run on a virtual machine. I can use a many registers as I want. However, I suspect it would be better to reuse registers rather than have one for every intermediate result. Neil From pedroni@inf.ethz.ch Tue Feb 19 18:25:26 2002 From: pedroni@inf.ethz.ch (Samuele Pedroni) Date: Tue, 19 Feb 2002 19:25:26 +0100 Subject: [Python-Dev] Meta-reflections References: Message-ID: <01a001c1b972$cdf9e040$6d94fea9@newmexico> From: Kevin Jacobs > On Tue, 19 Feb 2002, Samuele Pedroni wrote: > > Slots are already not flat. > > They have basically a similar behavior to fields > > in JVM object model (and I presume in .NET). > > I agree, but do we want slots to be non-flat? It goes very much against the > traditional Python idiom. In my opinion, I believe that slots should have > exactly the same semantics as normal instance attributes, except for > how/where they are allocated. Personally I don't expect slots to behave like attributes. 
I mean, the different naming is a hint. > > Given that implementing slots with fields is one of the possibility for > > Jython > > This is possible for flat slot namespaces too; just remap new slots to > existing ones when they overlap, instead of allocating a new one. Yes, but from the POV of fields this is less natural. There's a trade-off issue here. > > Consider also: > > > > >>> class Goo(object): > > ... __slots__ = ('a',) > > ... > > >>> class Bar(Goo,Foo): pass > > ... > > Traceback (most recent call last): > > File "", line 1, in ? > > TypeError: multiple bases have instance lay-out conflict > > > > that helps and reinforces that model. > > I'll contend that the current implementation is flawed for this and several > other reasons I've stated in my previous e-mails. Of course, we're waiting > to hear back from Guido when he returns, since his opinion is infinitely > more important than mine in this matter. > It is not flawed, it is just single-inheritance-of-struct-like-layout-based. I'm fine with that. To be honest I would find very annoying that what we are about to implement in Jython 2.2 should be somehow radically changed for Jython 2.3. We have not the necessary amount of human resources to happily play that kind of game. I hope and presume that Guido did know what he was designing, and I had that impression too. OTOH I agree that pickle should work for new-style classes too. regards, Samuele Pedroni. From jacobs@penguin.theopalgroup.com Tue Feb 19 17:23:40 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Tue, 19 Feb 2002 12:23:40 -0500 (EST) Subject: [Python-Dev] Meta-reflections In-Reply-To: <01a001c1b972$cdf9e040$6d94fea9@newmexico> Message-ID: On Tue, 19 Feb 2002, Samuele Pedroni wrote: > Personally I don't expect slots to behave like attributes. I mean, > the different naming is a hint. 
For me, slot declarations are a hint that certain attributes should be allocated 'statically' versus the normal Python 'dynamic' attribute allocation. In virtually all other ways I expect them to act like attributes. The question is should the static allocation introduce all the complex scoping rules that come with Java fields or C++ instance variables. If we go by the "principle of least surprise", it seems much better to keep the normal Python attribute rules than those of Java or C++. > > > Given that implementing slots with fields is one of the possibility for > > > Jython > > > > This is possible for flat slot namespaces too; just remap new slots to > > existing ones when they overlap, instead of allocating a new one. > > Yes, but from the POV of fields this is less natural. > There's a trade-off issue here. Less natural for Java maybe, but not for Python. > > I'll contend that the current implementation is flawed for this and several > > other reasons I've stated in my previous e-mails. Of course, we're waiting > > to hear back from Guido when he returns, since his opinion is infinitely > > more important than mine in this matter. > > It is not flawed, it is just single-inheritance-of-struct-like-layout-based. > I'm fine with that. Please read some of my earlier messages. There are other 'warts'. > To be honest I would find very annoying that what we are about > to implement in Jython 2.2 should be somehow radically changed for Jython 2.3. > We have not the necessary amount of human resources > to happily play that kind of game. Well, we are dealing with an implementation that is not documented _at all_. So, in virtually all respects, Jython 2.2 could ignore their existence totally and still function correctly. I hope that you will be pleased by the in-depth discussions on this topic, since it will likely lead to the formulation of refined documentation for many of these very fuzzily defined features. 
As an implementer, that kind of clarification can be invaluable since it means you don't have to guess at the semantics and have to change it later. > I hope and presume that Guido did know what he was designing, > and I had that impression too. > OTOH I agree that pickle should work for new-style classes too. He knew what he was designing, but was focused on achieving other goals with this infrastructure (like class/type unification). I have the feeling that slots were more of an experiment than anything else. Don't get me wrong -- they are insanely useful even in their current state. On the other hand, I don't think they're ready for prime-time until we smooth over the picky semantic issues relating to slot overloading, reflection and renaming. Just look at the Python standard library -- you'll notice that slots are not used anywhere. I predict that we will be using them extensively, especially in the standard library, as soon as they are deemed ready for prime-time. Best regards, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From tismer@tismer.com Tue Feb 19 19:15:31 2002 From: tismer@tismer.com (Christian Tismer) Date: Tue, 19 Feb 2002 20:15:31 +0100 Subject: [Python-Dev] Stackless Design Q. Message-ID: <3C72A453.7080905@tismer.com> Hi friends, my tasklets are flying. Now I am at the point where I'm worst suited for: Design an interface. First of all what it is: We have a "tasklet", a tiny object around chains of frames, with some additional structure that keeps C stack snippets. Ignore the details, tasklets are simply the heart of Stackless' coro/uthread/anything. The tstate maintains two lists of tasklets. One is the list of all running (or better "runnable"?) tasklets. These tasklets are not in "value state", they don't want to transmit a value. They can be scheduled as you like it. 
The other list keeps record of blocked tasklets. These are tasklets which are in "value state"; they are waiting to do some transmission.

Whenever a tasklet calls another tasklet's "transfer" method for data transfer, the following happens:
- the other tasklet is checked to be in blocked state,
- the tasklet is removed from the runnable list,
- it is blocked
- data is transferred
- the other tasklet is unblocked
- the other tasklet is inserted into the runnables
- the other tasklet is run

The "transfer" word comes from Modula II. It implements coroutine behavior.

Then, I have something like the ICON suspend, and I gave it the name "suspend" for now, since yield is in use. Suspend is a one-sided thing, and it is also needed to initiate a blocked state at all. There is a "client" variable that holds a reference to the tasklet that called "transfer". When suspend is called, then we have two cases:
1) There is another tasklet in the client variable. We take the client and call client.transfer(data)
2) There is no client set already. We go into blocked state and wait, until some tasklet transfers to us.

What suspend does is yielding (like in generators), but also initial blocking, providing targets for transfer to jump to.

What I'm missing is name and concept of an opposite method: Finish the data transfer, but leave both partners in runnable state.

Ok, here is a summary of what I have. Please tell me what you think, and what you'd like to change.

stackless module:
-----------------

schedule()
    switch to the next task in the runnable list.

taskoutlet(func)
    call it with a function, and it generates a generator for tasklets.

ret = suspend(value)
    initiates data exchange, see above. The current tasklet gets blocked. If client is set already, a transfer is performed.

Example:

    def demo(n):
        print n

    factory = taskoutlet(demo)
    t = factory(42) # this is now a tasklet with bound arguments.
tasklet methods:
----------------

t.insert()
    inserts t into the appropriate tasklet ring at the "end". If t is in a ring already, it is removed first. The ring is either "runnables" or "blocked", depending on t's state.

t.remove()
    removes t from whatever ring it is in.

t.run()
    inserts t into runnables and switches to it immediately.

ret = t.transfer(value)
    unblocks t, transfers data, blocks myself.

*Wanted*

Again, What is the name of this function/method, and where does it belong? It
- unblocks another tasklet
- transfers data
- does not block anything
- schedules the other tasklet

Or is this a bad design at all? If so, please help me as well.

Thanks in advance - ciao - chris

p.s.: Not all of the above is implemented already.

--
Christian Tismer :^)
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net/
14163 Berlin : PGP key -> http://wwwkeys.pgp.net/
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com/

From tim.one@comcast.net Tue Feb 19 20:01:28 2002 From: tim.one@comcast.net (Tim Peters) Date: Tue, 19 Feb 2002 15:01:28 -0500 Subject: [Python-Dev] Rattlesnake progress In-Reply-To: <20020219102053.A29414@glacier.arctrix.com> Message-ID:

[Neil Schemenauer]
> I'm not going to be using hardware registers. Bytecode will be
> generated to run on a virtual machine. I can use as many registers as I
> want. However, I suspect it would be better to reuse registers rather
> than have one for every intermediate result.

I think your intuition there is darned good. Within a basic block, "the obvious" greedy scheme is provably optimal wrt minimizing the # of temp registers needed by the block: start the block with an initially empty set of temp registers. March down the instructions one at a time.
For each input temp register whose contained value's last use is in the current instruction, return that temp register to the set of free temp registers. Then for each output temp register needed by the current instruction, take one (any) from the set of free temp registers; else if the set is empty invent a new temp register out of thin air (bumping a block high-water mark is an easy way to do this). That part is easy. What's hard (too expensive to be practical, in general) is provably minimizing the overall number of temps *across* basic blocks. Still, look up "graph coloring register assignment" on google and you'll find lots of effective heuristics. For a first cut, punt: just store everything still alive into memory at the end of a basic block. If you're retaining Rattlesnake's idea of treating "the register file" as a superset of the local vrbl vector, mounds of such "stores" will be nops. What's also hard is selecting instructions in such a way as to minimize the number of temp registers needed, ditto ordering instructions toward that end. When you think about those, you realize that minimizing the number of temps is actually a tradeoff, not an absolute good, both affecting and affected by other decisions. A mountain of idiosyncratic heuristics follows soon after . but=you-don't-have-to-solve-everything-at-the-start-ly y'rs - tim From martin@v.loewis.de Tue Feb 19 20:22:16 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 19 Feb 2002 21:22:16 +0100 Subject: [Python-Dev] PEP needed? Introducing Tcl objects In-Reply-To: References: Message-ID: "Jeffrey Hobbs" writes: > IIRC, Tcl_SetMaxBlockTime is a one-short call - it sets the next > block time, not all block times. I'm sure there was a reason for > this, but that was implemented before I was a core guy. 
> Anyway, I think you just need to try:
>
> -        result = Tcl_DoOneEvent(TCL_DONT_WAIT);
> +        Tcl_SetMaxBlockTime(&blocktime);
> +        result = Tcl_DoOneEvent(0);
>
> and see if that satisfies the need for responsiveness as well as
> not blocking.

Thanks, but that won't help. Tcl still performs a blocking select.

Studying the Tcl source, it seems that the SetMaxBlockTime feature is broken in Tcl 8.3. DoOneEvent has

    /*
     * If TCL_DONT_WAIT is set, be sure to poll rather than
     * blocking, otherwise reset the block time to infinity.
     */

    if (flags & TCL_DONT_WAIT) {
        tsdPtr->blockTime.sec = 0;
        tsdPtr->blockTime.usec = 0;
        tsdPtr->blockTimeSet = 1;
    } else {
        tsdPtr->blockTimeSet = 0;
    }

So if TCL_DONT_WAIT is set, the blocktime is 0, otherwise, it is considered not set. It then goes on doing

    if ((flags & TCL_DONT_WAIT) || tsdPtr->blockTimeSet) {
        timePtr = &tsdPtr->blockTime;
    } else {
        timePtr = NULL;
    }
    result = Tcl_WaitForEvent(timePtr);

So if TCL_DONT_WAIT isn't set, it will block; if it is, it will busy-wait. Looks like we lose either way.

In-between, it invokes the setupProcs of each input source, so that they can set a maxblocktime, but I don't think _tkinter should hack itself into that process.

So I don't see a solution on the path of changing how Tcl invokes select.

About thread-safety: Is Tcl 8.3 thread-safe in its standard installation, so that we can just use it from multiple threads? If not, what is the compile-time check to determine whether it is thread-safe? If there is none, I really don't see a solution, and the Sleep must stay.

Regards, Martin

From martin@v.loewis.de Tue Feb 19 20:29:49 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 19 Feb 2002 21:29:49 +0100 Subject: [Python-Dev] 2.2.1 issues In-Reply-To: <3C726270.7D33E687@lemburg.com> References: <2madu5zhnj.fsf@starship.python.net> <3C726270.7D33E687@lemburg.com> Message-ID:

"M.-A. Lemburg" writes:
> Right. 1) was caused by 2).

That wasn't actually the case.
The overwriting of memory was really independent of the error in surrogate processing, and can be fixed independently. > As a result, modules using unpaired surrogates in Unicode > literals are simply broken in Python <= 2.2.0. I think this is unimportant enough to just accept this bug for Python 2.2.x. If people ever run into the problem, well: just don't do this. Unpaired surrogates will be entirely in Unicode 3.2. > The problem with backporting this patch is that in order > for Python to properly recompile any broken module, the > magic will have to be changed. Question is whether this > is a reasonable thing to do in a patch level release... The memory-overwriting problem can be fixed independently, e.g. with https://sourceforge.net/tracker/download.php?group_id=5470&atid=105470&file_id=15248&aid=495401 Regards, Martin From nas@python.ca Tue Feb 19 20:35:07 2002 From: nas@python.ca (Neil Schemenauer) Date: Tue, 19 Feb 2002 12:35:07 -0800 Subject: [Python-Dev] Rattlesnake progress In-Reply-To: ; from tim.one@comcast.net on Tue, Feb 19, 2002 at 03:01:28PM -0500 References: <20020219102053.A29414@glacier.arctrix.com> Message-ID: <20020219123507.B29834@glacier.arctrix.com> Tim Peters wrote: > Within a basic block, "the obvious" greedy scheme is provably optimal wrt > minimizing the # of temp registers needed by the block I already had this part of the plan mostly figured out. Thanks for verifying my thinking however. > What's hard (too expensive to be practical, in general) is provably > minimizing the overall number of temps *across* basic blocks. This was the part that was worrying me. > Still, look up "graph coloring register assignment" on google and > you'll find lots of effective heuristics. For a first cut, punt: > just store everything still alive into memory at the end of a basic > block. Okay, that's easy. 
I suspect it will work fairly well in practice since most functions have a small number of basic blocks and increasing the number of registers is cheap. > If you're retaining Rattlesnake's idea of treating "the register file" > as a superset of the local vrbl vector, mounds of such "stores" will > be nops. I'm planning to keep this idea. There seems to be no good reason to treat local variables any differently than registers. I suppose it would be fairly easy to add a simple peep-hole optimizer that would clean out the redundant stores. When you talked about flexible intermediate code did you have anything in mind? Hmm, perhaps constants can be handled in a similar way. The only way I can think of doing it at the moment is to copy the list of constants into registers when the frame is created. That seems like it could easily end up as a net loss though. > What's also hard is selecting instructions in such a way as to > minimize the number of temp registers needed, ditto ordering > instructions toward that end. But is there really any freedom to do reordering? For example, a BINARY_ADD opcode can end up doing practically anything. Neil From mal@lemburg.com Tue Feb 19 21:21:33 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 19 Feb 2002 22:21:33 +0100 Subject: [Python-Dev] 2.2.1 issues References: <2madu5zhnj.fsf@starship.python.net> <3C726270.7D33E687@lemburg.com> Message-ID: <3C72C1DD.B561052A@lemburg.com> "Martin v. Loewis" wrote: > > "M.-A. Lemburg" writes: > > > Right. 1) was caused by 2). > > That wasn't actually the case. The overwriting of memory was really > independent of the error in surrogate processing, and can be fixed > independently. In that case, it's probably best to just use this patch and leave the UTF-8 fix in 2.3 only. > The memory-overwriting problem can be fixed independently, e.g.
with > > https://sourceforge.net/tracker/download.php?group_id=5470&atid=105470&file_id=15248&aid=495401 -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Tue Feb 19 20:31:25 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 19 Feb 2002 21:31:25 +0100 Subject: [Python-Dev] 2.2.1 issues In-Reply-To: <15474.28123.180241.360278@grendel.zope.com> References: <2madu5zhnj.fsf@starship.python.net> <3C726270.7D33E687@lemburg.com> <15474.28123.180241.360278@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > Guido can rule as he sees fit, but I don't see any reason *not* to > change the magic number. This seems like a pretty important fix to > me. The memory-overwriting problem can be fixed without bumping the pyc magic. The rationale for bumping the pyc magic is pretty weak, IMO, so that aspect should not be propagated to 2.2.1. Regards, Martin From fdrake@acm.org Tue Feb 19 21:46:23 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 19 Feb 2002 16:46:23 -0500 Subject: [Python-Dev] 2.2.1 issues In-Reply-To: References: <2madu5zhnj.fsf@starship.python.net> <3C726270.7D33E687@lemburg.com> <15474.28123.180241.360278@grendel.zope.com> Message-ID: <15474.51119.2069.367137@grendel.zope.com> Martin v. Loewis writes: > The memory-overwriting problem can be fixed without bumping the pyc > magic. The rationale for bumping the pyc magic is pretty weak, IMO, > so that aspect should not be propagated to 2.2.1. I'm happy with that. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From pedroni@inf.ethz.ch Tue Feb 19 22:23:05 2002 From: pedroni@inf.ethz.ch (Samuele Pedroni) Date: Tue, 19 Feb 2002 23:23:05 +0100 Subject: [Python-Dev] Meta-reflections References: Message-ID: <036001c1b994$00dd8360$6d94fea9@newmexico> From: Kevin Jacobs > On Tue, 19 Feb 2002, Samuele Pedroni wrote: > > Personally I don't expect slots to behave like attributes. I mean, > > the different naming is a hint. > > For me, slot declarations are a hint that certain attributes should be > allocated 'statically' versus the normal Python 'dynamic' attribute > allocation. Interesting but for the implementation: class f(file): __slots__ = ('a',) slot a and file.softspace are in the same league, which is not attributes' league. They are struct member and the descriptor logic to access them exploit this. >From the implementation it seems clear that slots and attributes are not interchangeable. On the other hand that means that there cannot be a slot-only future for Python. > > > I'll contend that the current implementation is flawed for this and several > > > other reasons I've stated in my previous e-mails. Of course, we're waiting > > > to hear back from Guido when he returns, since his opinion is infinitely > > > more important than mine in this matter. > > > > It is not flawed, it is just single-inheritance-of-struct-like-layout-based. > > I'm fine with that. > > Please read some of my earlier messages. There are other 'warts'. Yes, but changing the whole impl design is probably not the only solution. I mean this literally. > > To be honest I would find very annoying that what we are about > > to implement in Jython 2.2 should be somehow radically changed for Jython 2.3. > > We have not the necessary amount of human resources > > to happily play that kind of game. > > Well, we are dealing with an implementation that is not documented _at all_. 
The 2.2 type/class unification tutorial has references to __slots__: http://www.python.org/2.2/descrintro.html What is true is that the surface aspects of the undelying design are not documented. > So, in virtually all respects, Jython 2.2 could ignore their existence > totally and still function correctly. False. See above. > I hope that you will be pleased by > the in-depth discussions on this topic, since it will likely lead to the > formulation of refined documentation for many of these very fuzzily defined > features. As an implementer, that kind of clarification can be invaluable > since it means you don't have to guess at the semantics and have to change > it later. This one is insolent. Btw the tutorial contain this: There's no check that prevents you to override an instance variable already defined by a base class using a __slots__ declaration. If you do that, the instance variable defined by the base class is inaccessible (except by retrieving its descriptor directly from the base class; this could be used to rename it). Doing this renders the meaning of your program undefined; a check to prevent this may be added in the future. > > I hope and presume that Guido did know what he was designing, > > and I had that impression too. > > OTOH I agree that pickle should work for new-style classes too. > > He knew what he was designing, but was focused on achieving other goals with > this infrastructure (like class/type unification). I have the feeling that > slots were more of an experiment than anything else. Don't get me wrong -- > they are insanely useful even in their current state. On the other hand, I > don't think they're ready for prime-time until we smooth over the picky > semantic issues relating to slot overloading, reflection and renaming. Just > look at the Python standard library -- you'll notice that slots are not used > anywhere. 
I predict that we will be using them extensively, especially in > the standard library, as soon as they are deemed ready for prime-time. > A possible approach: write a patch implementing your preferred semantics. You can keep it orthogonal from the rest, using a name different than "__slots__", for the first cut. regards, Samuele Pedroni. From greg@cosc.canterbury.ac.nz Wed Feb 20 03:01:34 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 20 Feb 2002 16:01:34 +1300 (NZDT) Subject: [Python-Dev] Stackless Design Q. In-Reply-To: <3C72A453.7080905@tismer.com> Message-ID: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz> > Now I am at the point where I'm worst suited for: > Design an interface. > Please tell me what you think, and what you'd like > to change. It's not clear exactly what you're after here. Are you trying to define the lowest-level interface upon which everything else will be built? If so, I think what you have presented is FAR too complex. It seems to me you need only two things: (1) A constructor for new tasklets: t = tasklet(f) Takes a callable object f of no parameters and returns a tasklet which will execute the code of f. The tasklet is initially suspended and does not execute any of f's code until it is switched to for the first time. (2) A way of switching to another tasklet: t.transfer() Suspends the currently-running tasklet and resumes tasklet t were it last left off. This will either be at the beginning or where it last called the transfer() of another tasklet. All the other stuff you talk about -- passing values between tasklets, rings of runnable tasklets, scheduling policies, etc -- can all be implemented in Python on top of these primitives. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From rushing@nightmare.com Wed Feb 20 07:41:39 2002 From: rushing@nightmare.com (Sam Rushing) Date: 19 Feb 2002 23:41:39 -0800 Subject: [Python-Dev] Re: [Stackless] Stackless Design Q. In-Reply-To: <3C72A453.7080905@tismer.com> References: <3C72A453.7080905@tismer.com> Message-ID: <1014190899.31006.5.camel@fang.nightmare.com> On Tue, 2002-02-19 at 11:15, Christian Tismer wrote: > Again, What is the name of this function/method, and > where does it belong? > It > - unblocks another tasklet > - transfers data > - does not block anything > - schedules the other tasklet > > Or is this a bad design at all? In our current system, this function is called 'schedule()'; and it takes an 'args' tuple. It doesn't transfer control, it just makes the other coro ready to run ASAP. [i.e., next trip through the event loop it will be added to the set of 'runnable' coros].
Here is our coro::condition_variable::wake_one() for context:

    def wake_one (self, args=()):
        for coro in self._waiting:
            try:
                schedule (coro, args)
            except ScheduleError:
                pass
            else:
                self._waiting.pop(0)
                return 1
        else:
            return 0

[ScheduleError is thrown if the coro has already been scheduled]

-Sam

From tim.one@comcast.net Wed Feb 20 07:46:07 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 20 Feb 2002 02:46:07 -0500 Subject: [Python-Dev] Rattlesnake progress In-Reply-To: <20020219123507.B29834@glacier.arctrix.com> Message-ID:

[Tim]
>> Within a basic block, "the obvious" greedy scheme is ...

[Neil Schemenauer]
> I already had this part of the plan mostly figured out. Thanks for
> verifying my thinking however.

You're welcome. Note that you've been pretty mysterious about what it is you do want to know, so I'm more pleased that I channeled you than that I was slightly helpful.

>> What's hard (too expensive to be practical, in general) is provably
>> minimizing the overall number of temps *across* basic blocks.

> This was the part that was worrying me.

It can worry you later just as well. Python isn't C, and the dynamic semantics make it very much harder to prove that a subexpression is, e.g., a loop invariant, or that an instance of "+" won't happen to change the binding of every global, etc (ha! now I see you pointed that out yourself later -- good channeling on your end too). For that reason there's less need to get gonzo at the start. IIRC, the primary motivation for Rattlesnake was to cut eval loop overhead, and it's enough to aim for just that much at the start.
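For readers who want the within-block greedy scheme in concrete form, here is a minimal sketch. It is only an illustration under stated assumptions, not Rattlesnake code: instructions are modeled as (outputs, inputs) pairs of names, each temp is assumed to be written at most once per block, anything still alive at the end of the block is assumed to be stored to memory as suggested for a first cut, and the function name assign_temps is made up for this example. It is written for a current Python 3.

```python
def assign_temps(block):
    """Greedy temp-register assignment for one basic block.

    `block` is a list of (outputs, inputs) pairs of virtual temp names.
    Assumes each virtual temp is written at most once in the block.
    Returns (mapping of virtual temp -> register number, high-water mark).
    """
    # Record the index of the last instruction that reads each name.
    last_use = {}
    for index, (_outputs, inputs) in enumerate(block):
        for name in inputs:
            last_use[name] = index

    free = []        # register numbers available for reuse
    assigned = {}    # virtual temp -> register number
    high_water = 0   # how many registers have been invented so far

    for index, (outputs, inputs) in enumerate(block):
        # Return registers whose value is read for the last time here.
        for name in set(inputs):
            if name in assigned and last_use[name] == index:
                free.append(assigned[name])
        # Take any free register for each output, else invent a new one.
        for name in outputs:
            if free:
                assigned[name] = free.pop()
            else:
                assigned[name] = high_water
                high_water += 1
    return assigned, high_water


# t1 = a + b; t2 = c + d; t3 = t1 * t2; t4 = t3 + e
# (a..e stand for local variables, which are not temps here)
block = [
    (["t1"], ["a", "b"]),
    (["t2"], ["c", "d"]),
    (["t3"], ["t1", "t2"]),
    (["t4"], ["t3", "e"]),
]
registers, needed = assign_temps(block)
print(registers, needed)
```

In this example t1 and t2 must live in different registers, while t3 and t4 can reuse registers freed by earlier temps, so two temp registers suffice for the block.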
>> If you're retaining Rattlesnake's idea of treating "the register file" >> as a superset of the local vrbl vector, mounds of such "stores" will >> be nops. > I'm planning to keep this idea. There seems to be no good reason to > treat local variables any differently than registers. Not now. If you're looking at going on to generate native machine code someday, well, this isn't the only simplification that will bite. > I suppose it would be fairly easy to add a simple peep-hole optimizer > that would clean out the redundant stores. When you talked about > flexible intermediate code did you have anything in mind? That's why I urged you to keep it in Python at the start. IRs come in all sorts of flavors, and AFAICT people more or less stumble into one that works well for them based on what they've done so far (and that was my experience too in my previous lives). You have to expect to rework it as ambitions grow. Note Daniel Berlin's helpful comment that gcc is moving toward 3 IRs now; that's the way these things always go. At the start, I expect I'd represent a Python basic block as an object containing a plain Python list of instruction objects. Then you've got the whole universe of zippy Python builtin list ops to build on when mutating the instruction stream. Note that my focus is to minimize *your* time wrestling with this stuff: implementing fancy data structures is a waste of time at the start. I'd also be delighted to let cyclic gc clean up dead graph structures, so wouldn't spend any time trying, e.g., to craft a gimmick out of weakrefs to avoid hard cycles. You may or may not want to do a survey of popular IRs. Here's a nice *brief* page with links to follow: http://www.math.tau.ac.il/~guy/pa/ir.html I think a lot of this is a matter of taste. For example, some people swear by the "Value Dependence Graphs" that came out of Microsoft Research. I never liked it, and it's hard to explain why ("too complicated").
Static Single Assignment form is more to my tastes, but again it's hard to explain why ("almost but not quite too simple"). Regardless, you can get a lot done at the Rattlesnake level just following your gut intuitions, prodded by what turns out to be too clumsy. As with most other things, reading papers is much more useful after you've wrestled with the problems on your own. There's only one way to do it, but what that way is remains a mystery to me. > ... > But is there really any freedom to do reordering? For example, a > BINARY_ADD opcode to end up doing practically anything. That's right, and that's why you should gently file away but ignore almost all the advice you'll get . Skip kept discovering over and over again just how little his peephole optimizer could get away with doing on Python bytecode. Collapsing jumps to jumps is one of the few safe things you can get away with. BTW, the JUMP_IF_TRUE and JUMP_IF_FALSE opcodes peek at the top of the stack instead of consuming it. As a result, both the fall-through and target instructions are almost always the no-bang-for-the-buck POP_TOP. This always irritated me to death (it's *useful* behavior, IIRC, only for chained comparisons). If Rattlesnake cleans up just that much, it will be a huge win in my eyes, depite not being measurable . From jacobs@penguin.theopalgroup.com Wed Feb 20 10:45:38 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Wed, 20 Feb 2002 05:45:38 -0500 (EST) Subject: [Python-Dev] Meta-reflections In-Reply-To: <036001c1b994$00dd8360$6d94fea9@newmexico> Message-ID: On Tue, 19 Feb 2002, Samuele Pedroni wrote: > From: Kevin Jacobs > > On Tue, 19 Feb 2002, Samuele Pedroni wrote: > > > Personally I don't expect slots to behave like attributes. I mean, > > > the different naming is a hint. > > > > For me, slot declarations are a hint that certain attributes should be > > allocated 'statically' versus the normal Python 'dynamic' attribute > > allocation. 
> > Interesting but for the implementation: > > class f(file): > __slots__ = ('a',) > > slot a and file.softspace are in the same league, > which is not attributes' league. Currently this is true, though only you and Martin v. Loewis have replied agreeing that this should be the case. Everyone else I've spoken to _wants_ slots to act more like instance attributes. > Yes, but changing the whole impl design is probably not the only solution. > I mean this literally. I realize this. That's why I'm trying to build a consensus until some sort of clarity emerges. That's also why I'm asking for feedback on what should be the correct semantics of slots instead of assuming that the current implementation is the One True Bible of slots. If you think that slots are implemented correctly, then I welcome you to work with me to make exactly that case. Unless you (or others) step up to do that, I will continue to feel that the current slot implementation is flawed and will continue to advocate their reform. Unfortunately, repeatedly pointing out that my suggestions are not how they are implemented doesn't advance either of our cases. > > Well, we are dealing with an implementation that is not documented _at all_. > > The 2.2 type/class unification tutorial has references to __slots__: > > http://www.python.org/2.2/descrintro.html > > What is true is that the surface aspects of the undelying design are not > documented. A tutorial is not documentation. It is certainly suggestive of what will eventually be documented, but it is not documented until it is part of the Python Reference Manual. For example, I would not be surprised if large hunks of the descrintro ceases to work after the 'super' and 'property' syntax changes slated for Python 2.3. > > So, in virtually all respects, Jython 2.2 could ignore their existence > > totally and still function correctly. > > False. See above. Don't take my word for it -- ask Guido when he gets back. 
> > I hope that you will be pleased by > > the in-depth discussions on this topic, since it will likely lead to the > > formulation of refined documentation for many of these very fuzzily defined > > features. As an implementer, that kind of clarification can be invaluable > > since it means you don't have to guess at the semantics and have to change > > it later. > > This one is insolent. Please, lets not descend into name calling. I truly believe that I am providing a service to the general Python community by engaging in these discussions. If you feel that it is insolent to question the language implementers just because I am a newcomer and have some controversial issues, then I recommend that you rapidly get used to it. I do it all the time and don't plan to stop. > A possible approach: > write a patch implementing your preferred semantics. > You can keep it orthogonal from the rest, using > a name different than "__slots__", for the first > cut. I fully intend to provide a reference implementation of some of these ideas. In fact, its likely to be a fairly small patch. However, I still don't know what the ideal semantics are. I would very much value your input on the matter, even if on a purely theoretical level. So, lets start with the premise that __attrs__ is a declaration like __slots__ except that: 1) the namespace of these 'attrs' is flat -- repeated names in descendant classes either results in an error or silently re-using the existing slot. This maintains the traditional flat instance namespace of attributes. 2) A complete and immutable list of slots is available as a member of each type to allow for easy and efficient reflection. (though I am also in favor of working on better formal reflection APIs) 3) These 'attrs' are to otherwise have the same semantics as normal __dict__ instance attributes. e.g., they should be automatically picklable, they can have properties assigned to them, etc. 
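As a rough illustration of point 1 only, the "error on repeated names" variant could be prototyped with a metaclass along the following lines. This is a sketch under assumptions, not the proposed patch: it reuses the existing __slots__ declaration rather than a new __attrs__ name, the metaclass name FlatSlots is made up, and it is written for a current Python 3 rather than 2.2.

```python
class FlatSlots(type):
    """Hypothetical metaclass: refuse to redeclare an inherited slot name."""

    def __new__(mcls, name, bases, namespace):
        declared = namespace.get("__slots__", ())
        if isinstance(declared, str):
            declared = (declared,)
        # Collect every slot name already declared anywhere in the bases.
        inherited = set()
        for base in bases:
            for klass in base.__mro__:
                slots = klass.__dict__.get("__slots__", ())
                if isinstance(slots, str):
                    slots = (slots,)
                inherited.update(slots)
        clash = inherited.intersection(declared)
        if clash:
            raise TypeError(
                "slot(s) %s already declared by a base class of %s"
                % (sorted(clash), name)
            )
        return super().__new__(mcls, name, bases, namespace)


class Base(metaclass=FlatSlots):
    __slots__ = ("a",)


class Fine(Base):
    __slots__ = ("b",)      # a new name: accepted


try:
    class Clash(Base):
        __slots__ = ("a",)  # repeats a base-class slot: rejected
except TypeError as exc:
    rejected = True
    print(exc)
else:
    rejected = False
```

The "silently re-use the existing slot" variant would instead drop the clashing names from the new declaration before calling type.__new__.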
Regards, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From ping@lfw.org Wed Feb 20 12:40:49 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Wed, 20 Feb 2002 06:40:49 -0600 (CST) Subject: [Python-Dev] Global variable access schemes Message-ID: I've added diagrams for Guido's more recent proposal, and summarized everything on a web page: http://lfw.org/python/globals.html Check out http://lfw.org/python/guido2a.gif and http://lfw.org/python/guido2b.gif. About Guido2: - I renamed some things -- globals_vector is a structure, not a vector, so i put it in md_cache and used the prefix mc_ for its fields. - When you del a module variable, do you just go through all of mc_names to find the entry to invalidate? (I suppose if you sort mc_names you can binary search.) - It should be possible to add entries in the cache for attributes in other modules, too, right? If we assume that varibles don't get deleted often, it should pay off. Haven't heard anything from anybody about this topic in a while. Has anyone been thinking about it? -- ?!ng From tismer@tismer.com Wed Feb 20 12:44:31 2002 From: tismer@tismer.com (Christian Tismer) Date: Wed, 20 Feb 2002 13:44:31 +0100 Subject: [Python-Dev] Stackless Design Q. References: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz> Message-ID: <3C739A2F.5030502@tismer.com> Greg Ewing wrote: >>Now I am at the point where I'm worst suited for: >>Design an interface. >> > >>Please tell me what you think, and what you'd like >>to change. >> > > It's not clear exactly what you're after here. Are you > trying to define the lowest-level interface upon which > everything else will be built? If so, I think what you > have presented is FAR too complex. The old Stackless with its continuations was at lowest possible level, in a sense. 
What I now try to do is a compromise: I would like to build the simplest possible but powerful set of methods. At the same time, I'd like to keep track of tasklets, since they are now containing vital information about stack state, and I cannot afford to lose one of them, or we'll crash. The little doubly-linked list maintenance is very cheap to do. So my basic idea was to provide what is needed to get uthreads at very high speed, without the need to use Python for the basic machinery. > It seems to me you need only two things: Yes, I need these two things, and some more. > All the other stuff you talk about -- passing values between > tasklets, rings of runnable tasklets, scheduling policies, etc -- > can all be implemented in Python on top of these primitives. Sure it can, with one exception: My tasklets will also support threading, that is they will become auto-scheduled if the user switches this on. But auto-scheduled frames are a different kind of thing than those which are in "waiting for data" state. I need to distinguish them or I will crash. That's the reason why I keep these linked lists. Switching to the wrong tasklet should be rock solid in the kernel, this is nothing that I want people to play with from Python. Thanks a lot anyway, I'll try to make it even simpler. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/ From Paul.Moore@atosorigin.com Wed Feb 20 14:31:39 2002 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Wed, 20 Feb 2002 14:31:39 -0000 Subject: [Python-Dev] Meta-reflections Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F1@UKRUX002.rundc.uk.origin-it.com> > I fully intend to provide a reference implementation of > some of these ideas.
> In fact, it's likely to be a fairly
> small patch. However, I still don't know what the ideal
> semantics are. I would very much value your input on the
> matter, even if on a purely theoretical level. So, let's
> start with the premise that __attrs__ is a declaration like
> __slots__ except that:

It seems relevant to me that your choice of name ("attrs") indicates a
relationship with attributes - which your ideas seem to deny. It is
important to make the distinction entirely clear, otherwise your points
are going to get obscured. I've been having a hard time keeping track of
precisely what you are suggesting (on which basis, thanks for this
summary...)

> 1) the namespace of these 'attrs' is flat -- repeated
> names in descendant classes either results in an error or
> silently re-using the existing slot. This maintains the
> traditional flat instance namespace of attributes.

FWIW, I disagree with this completely. I would expect slots with a
particular name in derived classes to hide the same-named slots in base
classes. Whether or not the base class slot is available via some sort of
"super" shenanigans is less relevant. But hiding semantics is critical.
How do you expect to reliably make it possible to derive from a class with
slots otherwise? Perl gets into this sort of mess with its implementation
of objects.

> 2) A complete and immutable list of slots is available
> as a member of each type to allow for easy and efficient
> reflection. (though I am also in favor of working on
> better formal reflection APIs)

Agreed - up to a point. I don't see a need for a way to distinguish
between slots and "normal" attributes, personally. But I don't do anything
fancy here, so my experience isn't very indicative. I'm more or less happy
with dir() as a start, although I agree that a better formal reflection
API would be helpful. I suspect that such a thing could be built in Python
on top of the existing facilities, however...
> 3) These 'attrs' are to otherwise have the same semantics
> as normal __dict__ instance attributes. e.g., they should
> be automatically picklable, they can have properties
> assigned to them, etc.

I think I agree here. However, if you want slots to behave like normal
attributes, except for the flat namespace, I see no value. Why have the
exception at all?

Hmm, this raises the question of why we have slots at all. If they act
exactly like attributes, why have them? As a user, I perceive them as an
efficiency thing - they save the memory associated with a dictionary, and
are probably faster as well. There can be tradeoffs which you pay for that
efficiency, but that's all. No semantic difference. Actually, that's
pretty much how the descrintro document describes slots. Strange that...

I guess I am saying that I'm happy with slots as designed (documented in
descrintro) - modulo some implementation bugs such as not getting
automatically pickled.

Paul.


From jacobs@penguin.theopalgroup.com Wed Feb 20 13:30:05 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 08:30:05 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F1@UKRUX002.rundc.uk.origin-it.com>
Message-ID:

Paul, thanks for the very constructive feedback!

On Wed, 20 Feb 2002, Moore, Paul wrote:
> > I fully intend to provide a reference implementation of
> > some of these ideas. In fact, it's likely to be a fairly
> > small patch. However, I still don't know what the ideal
> > semantics are. I would very much value your input on the
> > matter, even if on a purely theoretical level. So, let's
> > start with the premise that __attrs__ is a declaration like
> > __slots__ except that:
>
> It seems relevant to me that your choice of name ("attrs") indicates a
> relationship with attributes - which your ideas seem to deny.
You are right -- let's call them 'slotattrs', since they should ideally
have virtually the same semantics as attributes, except they are allocated
like slots.

> > 1) the namespace of these 'attrs' is flat -- repeated
> > names in descendant classes either results in an error or
> > silently re-using the existing slot. This maintains the
> > traditional flat instance namespace of attributes.
>
> FWIW, I disagree with this completely. I would expect slots with a
> particular name in derived classes to hide the same-named slots in base
> classes. Whether or not the base class slot is available via some sort of
> "super" shenanigans is less relevant. But hiding semantics is critical. How
> do you expect to reliably make it possible to derive from a class with slots
> otherwise? Perl gets into this sort of mess with its implementation of
> objects.

Attributes currently have a flat namespace, and the construct that I feel
is most natural would maintain that characteristic. e.g.:

  class Base:
    def __init__(self):
      self.foo = 1

  class Derived(Base):
    def __init__(self):
      Base.__init__(self)
      self.foo = 2 # this is the same foo as in Base

Python already implements a form of data-hiding semantics in a different
way, so I'm not sure it is a good idea to add another ad-hoc method to do
the same thing. The current way to implement data hiding is by using the
namespace munging trick by prefixing symbols with '__', which is then
munged by prepending an underscore and the name of the class.

  class Foo:
    __var = 1

  dir(Foo)
  > ['_Foo__var', '__doc__', '__module__']

> > 2) A complete and immutable list of slots is available
> > as a member of each type to allow for easy and efficient
> > reflection. (though I am also in favor of working on
> > better formal reflection APIs)
>
> Agreed - up to a point. I don't see a need for a way to distinguish between
> slots and "normal" attributes, personally. But I don't do anything fancy
> here, so my experience isn't very indicative.
Without a more formal reflection API, the traditional way to get all
normal dictionary attributes is by using instance.__dict__.keys(). All I'm
proposing is that instance.__slotattrs__ (or possibly
instance.__class__.__slotattrs__) returns a list of objects that reveal
the name of all slots in the instance (including those declared in base
classes). I am not sure what that list should look like, though here are
the current suggestions:

  1) __slotattrs__ = ('a','b')

  2) # slot descriptor objects -- the repr is shown here
     __slotattrs__ = ('', ')

The only issue that concerns me is that I am not sure if the slot to slot
name mapping should be fixed. The intrinsic definition of a slot is a type
and the offset of the slot in the type. The name is just a binding to a
slot descriptor, so it "feels" unnecessary to make that immutable. In
either case, it is not a big issue for me.

> > 3) These 'attrs' are to otherwise have the same semantics
> > as normal __dict__ instance attributes. e.g., they should
> > be automatically picklable, they can have properties
> > assigned to them, etc.
>
> I think I agree here. However, if you want slots to behave like normal
> attributes, except for the flat namespace, I see no value. Why have the
> exception at all?

Attributes currently have a flat namespace? I must not have been clear --
I _do_ want my slotattrs to be allocated like slots, mainly for efficiency
reasons.

> Hmm, this raises the question of why we have slots at all. If they act
> exactly like attributes, why have them? As a user, I perceive them as an
> efficiency thing - they save the memory associated with a dictionary, and
> are probably faster as well. There can be tradeoffs which you pay for that
> efficiency, but that's all. No semantic difference. Actually, that's pretty
> much how the descrintro document describes slots. Strange that...

EXACTLY! I want to use slots (or slotattrs, or whatever you call them) to
be solely an allocation declaration.
For small objects that you expect to create many, many instances of, they
make a huge difference.

I have some results that I measured on various implementations of a class
to represent database rows. The purpose of this class is to give simple
dictionary-like attribute access semantics to tuples returned as a result
of default DB-API relational database queries. The trick is to add the
ability to access fields by name (instead of only by index) without
incurring the overhead of allocating a dictionary for every instance.
Below are results of a benchmark that compares various ways of
implementing this class:

                                          time
                              SIZE       (sec)   Bytes/row
                              --------   ------  ---------
    overhead:                  4,744KB     0.56       -
    tuple:                    18,948KB     2.49      73
    C extension w/ slots:     18,924KB     4.85      73
    native dict*:                117MB    13.50     589
    Python class w/ slots:    18,960KB    17.23      73
    Python class w/o slots:      117MB    24.09     589

    * the native dict implementation does not allow indexed access, and
      is only included as a reference point.

[For more details and discussion of this specific application, please see
this URL: http://opensource.theopalgroup.com/ ]

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com


From Paul.Moore@atosorigin.com Wed Feb 20 15:53:41 2002
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 20 Feb 2002 15:53:41 -0000
Subject: [Python-Dev] Meta-reflections
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F2@UKRUX002.rundc.uk.origin-it.com>

From: Kevin Jacobs [mailto:jacobs@penguin.theopalgroup.com]
> > FWIW, I disagree with this completely. I would expect
> > slots with a particular name in derived classes to hide
> > the same-named slots in base classes. Whether or not the
> > base class slot is available via some sort of "super"
> > shenanigans is less relevant. But hiding semantics
> > is critical.
> > How do you expect to reliably make it
> > possible to derive from a class with slots otherwise?
> > Perl gets into this sort of mess with its implementation
> > of objects.
>
> Attributes currently have a flat namespace, and the
> construct that I feel is most natural would maintain that
> characteristic. e.g.:

Oops. Sorry, I got that completely wrong. OK, I side with silent re-use.
That's what attributes do. [Meta-meta comment: the rule should be to work
"just like" attributes.] So why did you feel the need to separate this
point from your point (3)? It gives the impression that this point differs
from attributes, in contrast to the things mentioned in (3).

> > Agreed - up to a point. I don't see a need for a way
> > to distinguish between slots and "normal" attributes,
> > personally. But I don't do anything fancy here, so my
> > experience isn't very indicative.
>
> Without a more formal reflection API, the traditional
> way to get all normal dictionary attributes is by using
> instance.__dict__.keys().

The official, and supported, way is to use dir(). This was hashed out on
python-dev at the time. As I understand it, dir() always "worked", and was
extended to support slots when they were added. __dict__ clearly only
handles dict-based attributes, and so cannot be extended to include slots.
The official advice on reflection was therefore modified to point out that
dir() and __dict__.keys() were no longer equivalent, and dir() was "the
way" to get the full set. (Whether this advice was included into the
formal documentation, I couldn't confirm, but it was written down -
arguing "if it's not in the documentation, it's not official", is a little
naive, given the new and relatively experimental status of the whole
area...)

> All I'm proposing is that instance.__slotattrs__ (or
> possibly instance.__class__.__slotattrs__) returns a
> list of objects that reveal the name of all slots in the
> instance (including those declared in base classes).
Do you have any reason why you would need to get a list of only slots, or
only dict-based attributes? Given that I'm arguing that the two should
work exactly the same (apart from storage and efficiency details), it
seems unreasonable to want to make the distinction (unless you're doing
something incestuous and low-level, when you're on your own...)

Remember, instance.__dict__['attrname'] is now regarded as incomplete in
the face of slots. Again, I point you to the descrintro document, just
below the discussion of slots, in the paragraphs starting from "The
correct way to get any attribute from self inside __getattribute__ is to
call the base class's __getattribute__ method".

> The only issue that concerns me is that I am not sure if the
> slot to slot name mapping should be fixed.

I haven't melted my brain enough to understand the PEPs, but I believe
that there are ways of doing all sorts of low-level hacking with
descriptors, if you really want to. I don't believe that making this easy
for "normal" users is a good thing.

[BTW, this reminds me of your point on what is documented. I believe that
the PEPs and descrintro count as the canonical documentation of these
features. If they haven't been fully migrated into the Python
documentation set yet, that's a secondary issue. The PEPs *are* the
definition of what was agreed - people had time to comment on the PEPs at
the time. And the descrintro document is the current draft of the
user-level documentation. You can assume it will end up in the manuals.
That's my view...]

> EXACTLY! I want to use slots (or slotattrs, or whatever
> you call them) to be solely an allocation declaration.
> For small objects that you expect to create many, many
> instances of, they make a huge difference.

In which case, you are giving the impression of wanting large changes,
when you really want things to stay as they are (modulo some relatively
non-controversial (IMHO) bugs).
If you read the descrintro document on slots, you will see that it
presents an identical viewpoint. OK, there are some technical
restrictions, which will always apply because of the nature of the
optimisation, but the intention is clear. And it matches yours...

Paul.


From senn@maya.com Wed Feb 20 16:00:37 2002
From: senn@maya.com (Jeff Senn)
Date: Wed, 20 Feb 2002 11:00:37 -0500
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
In-Reply-To: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz> (Greg Ewing's message of "Wed, 20 Feb 2002 16:01:34 +1300 (NZDT)")
References: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz>
Message-ID:

Greg Ewing writes:

> It's not clear exactly what you're after here. Are you
> trying to define the lowest-level interface upon which
> everything else will be built? If so, I think what you
> have presented is FAR too complex.
>
> It seems to me you need only two things:
...
>    t = tasklet(f)
>    t.transfer()

(Sorry if I missed something -- I've been *way* busy lately and haven't
been giving this much attention -- that said...)

But (if I understand the current plan) we will need mechanisms internal to
the Python interpreter to transfer values and maintain blocked/running
state anyway; since when you generate a tasklet and run it:

    t = tasklet(f)
    t.transfer()

That may cause many more tasklets to be generated, run, and destroyed that
you don't ever see ... recursions/function calls in f, and
only-Christian-knows what else... so the transfer value mechanism might as
well be built in.

I haven't thought enough about the "unnamed produce-and-continue function"
to decide how exactly it should work.

I have two concerns in implementing uthreads this way (scheduler in C):

  1 -- there doesn't seem to be any way to "kill" a tasklet

  2 -- the scheduling algorithm will be hard to tune
       (we'll probably *at least* need tasklet priority...)

Maybe there should still be a "timeslice" function so an in-Python
scheduler can be written?
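
A minimal sketch of what an in-Python scheduler of this kind could look
like. It does not use the Stackless tasklet interface discussed in this
thread; as an assumption, plain Python 3 generators stand in for
tasklets, each `yield` plays the role of the "timeslice" boundary, and
the names `worker` and `run_round_robin` are hypothetical.

```python
from collections import deque


def worker(name, steps, log):
    """A stand-in for a tasklet body: do one unit of work, then yield."""
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task finished: drop it from the ring
        ready.append(task)      # still runnable: back of the queue


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

With generators, the "kill" concern has a rough analogue in calling
`close()` on a task before it is rescheduled; priorities could be added
by replacing the deque with a heap. Both remarks are about this stand-in
only, not about how Stackless behaves.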
--
-Jas

--------------------                         www.maya.com
Jeff Senn                         | / / |-/ \ / /|
Chief Technologist                | /|/| |/ o | /-|
Head of R&D                       |       Taming Complexity


From fdrake@acm.org Wed Feb 20 16:04:00 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Feb 2002 11:04:00 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F2@UKRUX002.rundc.uk.origin-it.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F2@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <15475.51440.121712.347215@grendel.zope.com>

Moore, Paul writes:
 > [BTW, this reminds me of your point on what is documented. I believe that
 > the PEPs and descrintro count as the canonical documentation of these
 > features. If they haven't been fully migrated into the Python documentation
 > set yet, that's a secondary issue. The PEPs *are* the definition of what was
 > agreed - people had time to comment on the PEPs at the time. And the
 > descrintro document is the current draft of the user-level documentation.
 > You can assume it will end up in the manuals. That's my view...]

This is my perspective as well. I'm not in a hurry to document relatively
volatile features that may change and (hopefully!) be available using a
nicer syntax in the future.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation


From jacobs@penguin.theopalgroup.com Wed Feb 20 16:32:17 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 11:32:17 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <15475.51440.121712.347215@grendel.zope.com>
Message-ID:

On Wed, 20 Feb 2002, Fred L. Drake, Jr. wrote:
> Moore, Paul writes:
> > [BTW, this reminds me of your point on what is documented. I believe that
> > the PEPs and descrintro count as the canonical documentation of these
> > features. If they haven't been fully migrated into the Python documentation
> > set yet, that's a secondary issue.
> > The PEPs *are* the definition of what was
> > agreed - people had time to comment on the PEPs at the time. And the
> > descrintro document is the current draft of the user-level documentation.
> > You can assume it will end up in the manuals. That's my view...]
>
> This is my perspective as well. I'm not in a hurry to document
> relatively volatile features that may change and (hopefully!) be
> available using a nicer syntax in the future.

Then what is the criterion for deciding when to apply the standard Python
deprecation procedures when things like super() and the __slots__ change?

  # Python 2.3?
  from __future__ import super_as_builtin
  from __future__ import hidden_slots

I had (possibly incorrectly) assumed that the criterion was when it was
officially documented in a release.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com


From A.M. Kuchling Wed Feb 20 16:54:55 2002
From: A.M. Kuchling (Andrew Kuchling)
Date: Wed, 20 Feb 2002 11:54:55 -0500
Subject: [Python-Dev] Parser-SIG created
Message-ID:

Parser SIG: Selection of a parser for the standard library

Description:

    This SIG is for discussing and comparing several different parser
    generators in order to assess which one would be worth including
    in the Python standard library.

Deliverables (in roughly this order):

    1) A list of requirements for a parser generator suitable for
       inclusion.

    2) If no parser meets those requirements, the SIG might work to
       enhance one or more parsers until the requirements are met.
       (It would be nice if this step became a null operation;
       otherwise we might fall prey to creeping scope.)

    3) A recommendation for a parser to include, along with a patch
       against the Python CVS tree. The BDFL can then ignore or
       follow the recommendation and patch as he sees fit.
Martin von Loewis presented a paper at Python10 comparing several
different parser generators in order to assess which one would be worth
adding to the standard library; it will likely serve as the starting point
for discussion. Jonathan Riehl suggested creating a Parser SIG, and I
offered to champion it.

The SIG will aim to complete its task in time for Python 2.3. No schedule
for 2.3 has been officially announced yet, but probably the SIG will have
to complete its mission by May or June 2002.

To join the SIG mailing list, use Mailman at:

    http://mail.python.org/mailman/listinfo/parser-sig/

--amk                                                  (www.amk.ca)
I can see you've been doing the TARDIS up a bit. I don't like it.
    -- The second Doctor, in "The Three Doctors"


From jacobs@penguin.theopalgroup.com Wed Feb 20 17:08:07 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 12:08:07 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F2@UKRUX002.rundc.uk.origin-it.com>
Message-ID:

On Wed, 20 Feb 2002, Moore, Paul wrote:
> Oops. Sorry, I got that completely wrong. OK, I side with silent re-use.
> That's what attributes do. [Meta-meta comment: the rule should be to work
> "just like" attributes.] So why did you feel the need to separate this point
> from your point (3)? It gives the impression that this point differs from
> attributes, in contrast to the things mentioned in (3).

Having a flat namespace (i.e., no hidden slots), and having all
'reachable' slots in a single list are really two separate issues. Right
now, we have this situation:

  class Foo(object):
    __slots__ = ('a','b')

  class Bar(Foo):
    __slots__ = ('c','d')

  bar=Bar()
  print bar.__slots__
  > ('c', 'd')
  print bar.__class__.__base__.__slots__
  > ('a', 'b')

I contend that bar.__slots__ should be:

  > ('a', 'b', 'c', 'd')

I think somewhere along the line I may have mixed up which 'flatness' I
was talking about.

> The official, and supported, way is to use dir().
I agree, but that "official" support has clear limitations. Right now,
there are several examples in the Python standard library where we use
obj.__dict__.keys() -- most significantly in pickle and cPickle. There is
also vars(obj), which may be the better reflection method, though it
currently doesn't know about slots.

> (Whether this advice was included into the formal documentation, I
> couldn't confirm, but it was written down - arguing "if it's not in the
> documentation, it's not official", is a little naive, given the new and
> relatively experimental status of the whole area...)

Naive, maybe, but saying "undocumented" is equivalent to "unsupported
implementation detail" saves us from having to maintain backward
compatibility and follow the official Python deprecation process.

> Do you have any reason why you would need to get a list of only slots, or
> only dict-based attributes?

Yes. Dict-based attributes always have values, while slot-based
attributes can be unset and raise AttributeErrors when trying to access
them. e.g., here is how I would handle pickling (excerpt from pickle.py):

        try:
            getstate = object.__getstate__
        except AttributeError:
            stuff = object.__dict__

            # added to support slots
            if hasattr(object, '__slots__'):
                for slot in object.__slots__:
                    if hasattr(object, slot):
                        stuff[slot] = getattr(object, slot)
        else:
            stuff = getstate()
            _keep_alive(stuff, memo)
        save(stuff)
        write(BUILD)

> Given that I'm arguing that the two should work
> exactly the same (apart from storage and efficiency details), it seems
> unreasonable to want to make the distinction (unless you're doing something
> incestuous and low-level, when you're on your own...)

I'm not suggesting anything more incestuous and low-level than what is
already done in many, many, many places. A larger, more-encompassing
proposal is definitely welcome.

> > EXACTLY! I want to use slots (or slotattrs, or whatever
> > you call them) to be solely an allocation declaration.
> > For small objects that you expect to create many, many
> > instances of, they make a huge difference.
>
> In which case, you are giving the impression of wanting large changes, when
> you really want things to stay as they are (modulo some relatively
> non-controversial (IMHO) bugs). If you read the descrintro document on
> slots, you will see that it presents an identical viewpoint. OK, there are
> some technical restrictions, which will always apply because of the nature
> of the optimisation, but the intention is clear. And it matches yours...

Well, I've not found resounding agreement on the first two of my three
basic issues/bugs I've raised so far:

  1) Flat slot namespaces: Objects should not hide slots when
     inherited by classes implementing the same slot name.

  2) Flat slot descriptions: object.__slots__ being an immutable flat
     tuple of all slot names (including inherited ones), as opposed
     to being a potentially mutable sequence of only the slots
     defined by the most derived class.

  3) First class status for slot reflection: making slots picklable
     by default, returned by vars(object), and made part of other
     relevant reflection APIs and standard implementations.

The good news is that once Guido and others have spoken, I can have
patches that accomplish all of this fairly quickly. I just don't want to
do a lot of unnecessary work if it won't be accepted.
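
The first two issues can be illustrated with a small sketch. It is
written for present-day Python 3 (an assumption; the thread concerns
Python 2.2), and `all_slots` and `slot_state` are hypothetical helper
names, not existing or proposed APIs. It shows a derived-class slot
hiding a same-named base-class slot, and one possible way to compute a
flat tuple of slot names by walking the MRO.

```python
class Base:
    __slots__ = ('foo',)


class Derived(Base):
    # 'foo' allocates a second slot that hides Base's slot of the same name.
    __slots__ = ('foo', 'bar')


def all_slots(cls):
    """Return every slot name declared along cls's MRO, base classes first."""
    names = []
    for klass in reversed(cls.__mro__):
        declared = klass.__dict__.get('__slots__', ())
        if isinstance(declared, str):   # a bare string declares one slot
            declared = (declared,)
        for name in declared:
            if name not in names:
                names.append(name)
    return tuple(names)


def slot_state(obj):
    """Collect the slots that are currently set, skipping unset ones."""
    return {name: getattr(obj, name)
            for name in all_slots(type(obj))
            if hasattr(obj, name)}


o = Derived()
o.foo = 2                                # stored in Derived's slot
Base.foo.__set__(o, 1)                   # stored in the hidden Base slot
hidden = Base.foo.__get__(o, Derived)    # read back through Base's descriptor

print(Derived.__slots__)   # only the names declared by Derived itself
print(all_slots(Derived))  # the flat view across the hierarchy
print(o.foo, hidden)
print(slot_state(o))
```

Filtering with hasattr() is what lets a pickling-style helper cope with
slots that were never assigned, which is the case a plain __dict__ never
presents.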
Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com


From gmcm@hypernet.com Wed Feb 20 18:27:58 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 20 Feb 2002 13:27:58 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To:
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F1@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <3C73A45E.16869.1EABF7B2@localhost>

On 20 Feb 2002 at 8:30, Kevin Jacobs wrote:

> Attributes currently have a flat namespace,

Instance attributes do, but that's a tautology.

> and the
> construct that I feel is most natural would maintain
> that characteristic. e.g.:
>
> class Base:
>   def __init__(self):
>     self.foo = 1
>
> class Derived(Base):
>   def __init__(self):
>     Base.__init__(self)
>     self.foo = 2 # this is the same foo as in
>     Base

But these aren't:

  class Base:
    foo = 1

  class Derived(Base):
    foo = 2

-- Gordon
http://www.mcmillan-inc.com/


From JeffH@ActiveState.com Wed Feb 20 18:33:25 2002
From: JeffH@ActiveState.com (Jeff Hobbs)
Date: Wed, 20 Feb 2002 10:33:25 -0800
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To:
Message-ID: <001101c1ba3d$15e9cda0$ba03a8c0@activestate.ca>

> So if TCL_DONT_WAIT isn't set, it will block; if it is, it will
> busy-wait. Looks like we lose either way.
>
> In-between, it invokes the setupProcs of each input source, so that
> they can set a maxblocktime, but I don't think _tkinter should hack
> itself into that process.

That's correct - I should have looked a bit more into what I did before
(I was always tying in another GUI's event loop). However, I don't see
why you should not consider the extra event source. Tk uses this itself
for X.
It would be something like:

    [in tk setup]
    Tcl_CreateEventSource(TkinterSetupProc, NULL, NULL);

    /*
     *----------------------------------------------------------------------
     *
     * TkinterSetupProc --
     *
     *      This procedure implements the setup part of the Tkinter
     *      event source.  It is invoked by Tcl_DoOneEvent before entering
     *      the notifier to check for events on all displays.
     *
     * Results:
     *      None.
     *
     * Side effects:
     *      The maximum block time will be set to 20000 usecs to ensure that
     *      the notifier returns control to Tcl.
     *
     *----------------------------------------------------------------------
     */

    static void
    TkinterSetupProc(clientData, flags)
        ClientData clientData;      /* Not used. */
        int flags;
    {
        static Tcl_Time blockTime = { 0, 20000 };

        Tcl_SetMaxBlockTime(&blockTime);
    }

In fact, you can look at tk/unix/tkUnixEvent.c to see something similar
already done in Tk.

> About thread-safety: Is Tcl 8.3 thread-safe in its standard
> installation, so that we can just use it from multiple threads? If
> not, what is the compile-time check to determine whether it is
> thread-safe? If there is none, I really don't see a solution, and the

You would compile with --enable-threads (both Tcl and Tk).

Jeff


From jacobs@penguin.theopalgroup.com Wed Feb 20 18:49:51 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 13:49:51 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <3C73A45E.16869.1EABF7B2@localhost>
Message-ID:

On Wed, 20 Feb 2002, Gordon McMillan wrote:
> On 20 Feb 2002 at 8:30, Kevin Jacobs wrote:
> > Attributes currently have a flat namespace,
>
> Instance attributes do, but that's a tautology.

Yes, though one implication of the new slots mechanism in Python 2.2 is
that we now have non-flat per-instance namespaces for slots.
i.e., we would have per-instance slots that would hide other per-instance
slots of the same name from ancestor classes:

  class Base(object):
    __slots__ = ['foo']
    def __init__(self):
      self.foo = 1 # which slot this sets depends on type(self)
                   # if type(self) == Base, then the slot is
                   #   described by Base.foo.
                   # else if type(self) == Derived, then the
                   #   slot is described by Derived.foo

  class Derived(Base):
    __slots__ = ['foo']
    def __init__(self):
      Base.__init__(self)
      self.foo = 2 # this is NOT the same foo as in Base

  o = Derived()
  print o.foo
  > 2
  o.__class__.__base__.foo = 3
  print o.foo
  > 2
  print o.__class__.__base__.foo
  > 3

So slots, as currently implemented, do not act like attributes, and this
whole discussion revolves around whether they should or should not.

Regards,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com


From DavidA@ActiveState.com Wed Feb 20 19:25:42 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Wed, 20 Feb 2002 11:25:42 -0800
Subject: [Python-Dev] Meta-reflections
References:
Message-ID: <3C73F836.5FCDB2EE@activestate.com>

Kevin Jacobs wrote:

> Yes, though one implication of the new slots mechanism in Python 2.2 is that
> we now have have non-flat per-instance namespaces for slots. i.e., we would
> have per-instance slots that would hide other per-instance slots of the same
> name from ancestor classes:
>
>   class Base(object):
>     __slots__ = ['foo']
>     def __init__(self):
>       self.foo = 1 # which slot this sets depends on type(self)
>                    # if type(self) == Base, then the slot is
>                    #   described by Base.foo.
>                    # else if type(self) == Derived, then the
>                    #   slot is described by Derived.foo
>
>   class Derived(Base):
>     __slots__ = ['foo']
>     def __init__(self):
>       Base.__init__(self)
>       self.foo = 2 # this is NOT the same foo as in Base
>
>   o = Derived()
>   print o.foo
>   > 2
>   o.__class__.__base__.foo = 3
>   print o.foo
>   > 2
>   print o.__class__.__base__.foo
>   > 3
>
> So slots, as currently implemented, do not act like attributes, and this
> whole discussion revolves around whether they should or should not.

This example is not a great example of that, since the code above does
exactly the same thing if you delete the lines defining __slots__.
You're modifying class attributes in that case, but I think it's
important to keep the examples which illustrate the problem "pure" and
"realistic".

My take on this thread is that I think it's simply not going to happen
that slots are going to act 100% like attributes except for
performance/memory considerations. It would be nice, but if that had
been possible, then they'd simply be an internal optimization and would
have no syntactic impact.

There are much more shallow ways in which slots aren't like attributes:

>>> class A(object):
...     __slots__ = ('a',)
...
>>> a = A()
>>> a.a = 123  # set a slot on a
>>> A.a = 555  # set a slot on A
>>> a.a        # Way 1: A's slot overrides a's
555
>>> b = A()
>>> b.a
555
>>> del A.a
>>> a.a        # Way 2: deleting the class slot
               # did not uncover the instance slot
AttributeError: 'A' object has no attribute 'a'

--david


From martin@v.loewis.de Wed Feb 20 19:33:01 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 20 Feb 2002 20:33:01 +0100
Subject: [Python-Dev] Meta-reflections
In-Reply-To:
References:
Message-ID:

Kevin Jacobs writes:

> Then what is the criterion for deciding when to apply the standard Python
> deprecation procedures when things like super() and the __slots__ change?
It may be that the change does not need to involve deprecation of
anything; first let's see the new feature, then decide how to deprecate
the existing one.

Regards,
Martin


From JeffH@ActiveState.com Wed Feb 20 19:43:00 2002
From: JeffH@ActiveState.com (Jeff Hobbs)
Date: Wed, 20 Feb 2002 11:43:00 -0800
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To:
Message-ID: <002601c1ba46$ce683f20$ba03a8c0@activestate.ca>

> In-between, it invokes the setupProcs of each input source, so that
> they can set a maxblocktime, but I don't think _tkinter should hack
> itself into that process.

BTW in addition to my last message, you might want to create an
ExitHandler that deletes the event source. Also, you might add more code
to the TkinterSetupProc to only set a block time if multiple threads are
actually used (or only create the event source at that time). This would
make simple Tkinter apps be efficient and snappy all the time.

Jeff


From jacobs@penguin.theopalgroup.com Wed Feb 20 19:49:43 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 14:49:43 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <3C73F836.5FCDB2EE@activestate.com>
Message-ID:

On Wed, 20 Feb 2002, David Ascher wrote:
> Kevin Jacobs wrote:
> This example is not a great example of that, since the code above does
> exactly the same thing if you delete the lines defining __slots__.

True, which is why the current implementation (IMHO) isn't broken; just
flawed. There is, in effect, a flat slot namespace, only by virtue of the
fact that there is no simple and explicit slot resolution syntax. This
basically means that most arguments for using the current implementation
of slots as a data-hiding mechanism over inheritance are very weak unless
significant additional syntax is created.

> You're modifying class attributes in that case, but I think it's
> important to keep the examples which illustrate the problem "pure" and
> "realistic".
Nope -- they aren't class attributes at all, they are per-instance slots with class-level descriptors (with which you expose another bug below). > My take on this thread is that I think it's simply not going to happen > that slots are going to act 100% like attributes except for > performance/memory considerations. It would be nice, but if that had > been possible, then they'd simply be an internal optimization and would > have no syntactic impact. I'd like to know why else you think that? I'm fairly confident that I can submit a patch that accomplishes this (and even fix the following issue). > There are much more shallow ways in which slots aren't like attributes: > > >>> class A(object): > ... __slots__ = ('a',) > ... > >>> a = A() > >>> a.a = 123 # set a slot on a > >>> A.a = 555 # set a slot on A > >>> a.a # Way 1: A's slot overrides a's > 555 > >>> b = A() > >>> b.a > 555 > >>> del A.a > >>> a.a # Way 2: deleting the class slot > # did not uncover the instance slot > AttributeError: 'A' object has no attribute 'a' Ouch! You've discovered another problem with the current implementation. You have effectively removed the slot descriptor from class A and replaced it with a class attribute. In fact, I don't think you can ever re-create the slot descriptor! This is actually the best form of data hiding in pure Python I've seen to date. The fix is to make slot descriptors read-only, like the rest of the immutable class attributes. 
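The sequence quoted above can be replayed as a short script. This is a minimal sketch, assuming a current CPython 3 interpreter rather than the 2.2 release under discussion; the names saved, shadowed, lost and recovered are illustrative only. On that assumption it suggests the slot descriptor can be put back, but only if a reference to it was kept before the class attribute was assigned.

```python
class A(object):
    __slots__ = ('a',)

a = A()
a.a = 123            # stored in the instance's slot

# Keep a reference to the slot descriptor before clobbering it.
saved = A.__dict__['a']

A.a = 555            # replaces the descriptor with a plain class attribute
shadowed = a.a       # lookup now finds the class attribute

del A.a              # removes the class attribute; no descriptor remains
try:
    a.a
    lost = False
except AttributeError:
    lost = True

A.a = saved          # reinstall the original descriptor
recovered = a.a      # the value held in the slot was never destroyed
```

Without the saved reference there is no obvious way to reach the slot again, which matches the "data hiding" remark above.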
Sigh, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From DavidA@ActiveState.com Wed Feb 20 19:57:37 2002 From: DavidA@ActiveState.com (David Ascher) Date: Wed, 20 Feb 2002 11:57:37 -0800 Subject: [Python-Dev] Meta-reflections References: Message-ID: <3C73FFB1.667C2D8B@activestate.com> Kevin: > > My take on this thread is that I think it's simply not going to happen > > that slots are going to act 100% like attributes except for > > performance/memory considerations. It would be nice, but if that had > > been possible, then they'd simply be an internal optimization and would > > have no syntactic impact. > > I'd like to know why else you think that? Ye old poverty of the imagination argument, coupled with the assumption that Guido had time to finish what he'd started I guess =). > I'm fairly confident that I can > submit a patch that accomplishes this (and even fix the following issue). Great! I can't channel Guido very well, but every time that he's talked about slots in my presence, he talked about their main purpose as a memory-savings one. If he didn't have other intents, and if you can limit their impact to pure memory savings, then more power to all of us thanks to you. > Ouch! You've discovered another problem with the current implementation. > You have effectively removed the slot descriptor from class A and replaced > it with a class attribute. In fact, I don't think you can ever re-create > the slot descriptor! Ooh, cool. You're right: after deleting the class attribute: >>> a.a = 1100 Traceback (most recent call last): File "<stdin>", line 1, in ? AttributeError: 'A' object has no attribute 'a' which is a really wacky error message if you look at the case of the letters... 
--david From jacobs@penguin.theopalgroup.com Wed Feb 20 20:09:07 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Wed, 20 Feb 2002 15:09:07 -0500 (EST) Subject: [Python-Dev] Meta-reflections In-Reply-To: <3C73FFB1.667C2D8B@activestate.com> Message-ID: On Wed, 20 Feb 2002, David Ascher wrote: > > > My take on this thread is that I think it's simply not going to happen > > > that slots are going to act 100% like attributes except for > > > performance/memory considerations. It would be nice, but if that had > > > been possible, then they'd simply be an internal optimization and would > > > have no syntactic impact. > > > > I'd like to know why else you think that? > > Ye old poverty of the imagination argument, coupled with the assumption > that Guido had time to finish what he'd started I guess =). This is why I haven't unleashed a patch, even though I pretty much know exactly how to fix all of the problems we've noticed and make slots work as I imagine they should. Some of the things I heard Guido say at IPC10 lead me to believe that he has something up his sleeve wrt slots (specifically some plan about storing dicts in __slots__ that do something nifty). So I'm going to sit on my hands until Guido gets back into town and sets us all straight. Thanks, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From greg@cosc.canterbury.ac.nz Thu Feb 21 00:08:01 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 21 Feb 2002 13:08:01 +1300 (NZDT) Subject: [Python-Dev] Stackless Design Q. In-Reply-To: <3C739A2F.5030502@tismer.com> Message-ID: <200202210008.NAA22179@s454.cosc.canterbury.ac.nz> Christian Tismer : > I'd like to keep track > of tasklets, since they are now containing vitual > information about stack state, and I cannot afford > to loose one of them, or we'll crash. I'm not sure what the problem is here. 
A tasklet isn't going to go away until there are no more references to it anywhere, and once that happens, there is no longer any way of switching to it. > So my basic idea was to provide what is needed to get uthreads at > very high speed, without the need to use Python for the basic > machinery. Well, the higher-level stuff doesn't *have* to be implemented in Python. But I think it should in principle be possible. Then you can experiment with different designs for higher-level facilities in Python, find out which ones are most useful, and re-code those in C later. > My tasklets will also support threading, that is they will become > auto-scheduled if the user switches this on. I'm not sure how this is going to interact with the facility for switching to a specific tasklet. Seems to me that, in the presence of pre-emptive scheduling, it no longer makes sense to do so, since some other tasklet could easily get scheduled a moment later. The most you can do is say "I don't want to run any more now, let some other tasklet have a go". So it appears that we already have two distinct layers of functionality here: a low-level, non-preemptive layer where we explicitly switch from one tasklet to another, and a higher-level, preemptive one where we let the scheduler take care of picking what to run next. These two layers should be clearly separated, with the higher one built strictly on the facilities provided by the lower one. In particular, there should be exactly one way of switching between tasklets, i.e. by calling t.transfer(). Preemptive switching should be done by some kind of signal or event handler which does this. > But auto-scheduled frames are a different kind > of thing than those which are in "waiting for data" > state. I need to distinguish them or I will crash. If you get rid of the idea of passing values between tasklets as part of the switching process, then this distinction disappears. 
I think that value-passing and tasklet-switching are orthogonal activities and would be better decoupled. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Feb 21 00:59:46 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 21 Feb 2002 13:59:46 +1300 (NZDT) Subject: [Stackless] Re: [Python-Dev] Stackless Design Q. In-Reply-To: Message-ID: <200202210059.NAA22187@s454.cosc.canterbury.ac.nz> Jeff Senn : > That may cause many more tasklets to be generated, run, and > destroyed that you don't ever see ... recursions/function calls in > f, and only-Christian-knows what else... so the transfer value > mechanism might as well be built in. I don't understand what you mean. Are you saying that every function call creates a new tasklet? That stack frame == tasklet? If that's the case, then we're back to continuations! But I don't think so, because Christian said that a tasklet contains "a chain of frames", not just one frame. > 2 -- the scheduling algorithm will be hard to tune (we'll probably > *at least* need tasklet priority...) Maybe there should still > be a "timeslice" function so an in-Python scheduler can be written? The huge variety of possible scheduling policies is all the more reason *not* to make scheduling part of the core functionality. The user should be free to implement his own scheduling layer on top of the primitives if he doesn't like what is provided. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From gmcm@hypernet.com Thu Feb 21 02:00:50 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 20 Feb 2002 21:00:50 -0500 Subject: [Stackless] Re: [Python-Dev] Stackless Design Q. In-Reply-To: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz> References: <3C72A453.7080905@tismer.com> Message-ID: <3C740E82.8148.204A91B4@localhost> On 20 Feb 2002 at 16:01, Greg Ewing wrote: [Christian's plea] > It seems to me you need only two things: > > (1) A constructor for new tasklets: > > t = tasklet(f) [snip] > (2) A way of switching to another tasklet: > > t.transfer() [snip] > All the other stuff you talk about -- passing > values between tasklets, rings of runnable tasklets, > scheduling policies, etc -- can all be implemented > in Python on top of these primitives. Unless you've got a way to detect or pass tasklet's through transfer, you don't have enough. -- Gordon http://www.mcmillan-inc.com/ From greg@cosc.canterbury.ac.nz Thu Feb 21 02:08:47 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Thu, 21 Feb 2002 15:08:47 +1300 (NZDT) Subject: [Stackless] Re: [Python-Dev] Stackless Design Q. In-Reply-To: <3C740E82.8148.204A91B4@localhost> Message-ID: <200202210208.PAA22202@s454.cosc.canterbury.ac.nz> Gordon McMillan : > Unless you've got a way to detect or pass tasklet's through transfer, > you don't have enough. You'll have to elaborate. I don't have any idea what you mean by that! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From Paul.Moore@atosorigin.com Thu Feb 21 09:58:36 2002 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Thu, 21 Feb 2002 09:58:36 -0000 Subject: [Python-Dev] Meta-reflections Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F3@UKRUX002.rundc.uk.origin-it.com> From: Kevin Jacobs [mailto:jacobs@penguin.theopalgroup.com] > Having a flat namespace (i.e., no hidden slots), and having > all 'reachable' slots in a single list are really two > separate issues. Right now, we have this situation: > > class Foo(object): > __slots__ = ('a','b') > > class Bar(Foo): > __slots__ = ('c','d') > > bar=Bar() > print bar.__slots__ > > ('c', 'd') > print bar.__class__.__base__.__slots__ > > ('a', 'b') > > I content that bar.__slots__ should be: > > ('a', 'b', 'c', 'd') > > I think somewhere along the line I may have mixed up which > 'flatness' I was> talking about. Um. Be aware that I'm not 100% sure about the "no hidden slots" point. I only support it on the basis of acting the same as attributes. But I'm not sure about it for attributes, either... (although I doubt it will ever change). As for your contention over __slots__, I don't agree. I don't have a strong view, but I feel that __slots__ is really only meant as a way of *defining* the slots (and as such, may be replaced later by better syntax). Think of it as write-only. Modifying it after the class is defined, or reading it, aren't really well defined (and don't need to be). If slot definition had been spelt "slot a" instead of "__slots__ = ['a']", you wouldn't necessarily expect to have a readable attribute containing the list of slots... > > The official, and supported, way is to use dir(). > > I agree, but that "official" support has clear limitations. I'm not sure what you mean. > Right now, there are several examples in the Python standard > library where we use obj.__dict__.keys() -- most significantly > in pickle and cPickle. 
But aren't we agreed that this is the source of a bug (that slots aren't picklable)? > There is also the vars(obj), which may be the better reflection > method, though it currently doesn't know about slots. Possibly you're right. This could easily be raised as a feature request. And possibly even as a bug (vars() should know about slots). > Naive, maybe, but saying "undocumented" is equivalent to "unsupported > implementation detail" saves us from having to maintain backward > compatibility and following the official Python deprecation process. Equally, saying "it's not in the manual, so tough luck" is unreasonably harsh. We need to be reasonable about this. > > Do you have any reason why you would need to get a list of > > only slots, or only dict-based attributes? > > Yes. Dict-based attributes always have values, while > slot-based attributes can be unset and raise AttributeErrors > when trying to access them. Hmm. I could argue this a couple of ways. Slots should contain None when unassigned (no AttributeErrors), or code should be (re-)written to cope with AttributeError from things in dir(). I wouldn't argue it as "slots can raise AttributeError, so we need to treat slots and dict-based attibutes separately, in 2 passes". > here is how I would handle pickling (excerpt from pickle.py): > > try: > getstate = object.__getstate__ > except AttributeError: > stuff = object.__dict__ > > # added to support slots > if hasattr(object.__slots__): > for slot in object.__slots__: > if hasattr(object, slot): > stuff[slot] = getattr(object, slot) > > else: > stuff = getstate() > _keep_alive(stuff, memo) > save(stuff) > write(BUILD) Why not just change the line stuff = object.__dict__ to stuff = [a for a in dir(object) if hasattr(object,a) and not callable(getattr(object,a))] [The hasattr() gets rid of unbound slots - this may be another argument for unbound slots containing None, and the callable() gets rid of methods]. 
Then slots are covered, as well as any future non-dict-based attribute types. If vars(obj) was fixed to include slots, like dir() was, then this could be reduced to "stuff = vars(object)" (modulo protection against AttributeError). > I'm not suggesting anything more incestuous and low-level than what is > already done in many, many, many places. A larger, more-encompassing > proposal is definitely welcome. I'm not sure we need a larger proposal. A smaller one may work better. I'm arguing above for 1. Make unbound slots return None rather than AttributeError 2. Make vars() return slots as well as dict-based attributes 3. Document __dict__ as legacy usage, not slots-aware 4. Fix bugs caused by use of the legacy __dict__ approach 5. Educate users in the new approaches which are slots-aware (dir/vars, calling base class setattr, etc) (and maybe a sixth, don't make __slots__ a reflection API - make it an implementation detail, a bit like __dict__ is now viewed) > Well, I've not found resounding agreement on the first two of > my three basic issues/bugs I've raised so far: > > 1) Flat slot namespaces: Objects should not hiding slots > when inherited by classes implementing the same slot name. You're right - I'm not in "resounding" agreement. I think it's probably better, for consistency with dict-based attributes, but I sort of wish it wasn't. (The fact that I've not hit problems because of the equivalent property of attributes means that I'm probably wrong, and attributes are fine as they are, though.) > 2) Flat slot descriptions: object.__slots__ being an > immutable flat tuple of all slot names (including > inherited ones), as opposed to being a potentially > mutable sequence of only the slots defined by the most > derived class. This I disagree with. I think __slots__ should be immutable. But I'm happy with "don't do that" as the way of implementing immutability, if that's what Guido prefers. 
I definitely don't think __slots__ should return something different when you read it than what you assigned to it in the first place (which is what including inherited slots does). But I don't really think people have any business reading __slots__ in any case (see my arguments above). > 3) First class status for slot reflection: making slots picklable by > default, returned by vars(object), and made part of > other relevant reflection APIs and standard implementations. I agree on this one. Paul. From tismer@tismer.com Thu Feb 21 12:13:28 2002 From: tismer@tismer.com (Christian Tismer) Date: Thu, 21 Feb 2002 13:13:28 +0100 Subject: [Stackless] Re: [Python-Dev] Stackless Design Q. References: <200202210008.NAA22179@s454.cosc.canterbury.ac.nz> Message-ID: <3C74E468.1090403@tismer.com> Greg Ewing wrote: > Christian Tismer : ... >>But auto-scheduled frames are a diffeent kind >>of thing than those which are in "waiting for data" >>state. I need to distinguish them or I will crash. >> > > If you get rid of the idea of passing values between tasklets as part > of the switching process, then this distinction disappears. I think > that value-passing and tasklet-switching are orthogonal activities and > would be better decoupled. Hmm, first I thought you were wrong: Any Python function that calls something, may it be a stackless schedule function or something else, expects a value to be returned. Always and ever. But when I have a scheduler counter built into the Python interpreter loop, then a schedule will happen *between* opcodes. Such a frame is not awaiting data, and therefor not suitable to be switched to by one which is in data transfer. Now I see it: You mean I can make this schedule function behave like a normal function call, that accepts and drops a dummy value? In fact, this would make all tasklets compatible. thinking - thanks - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 
26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/ From jacobs@penguin.theopalgroup.com Thu Feb 21 12:36:14 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Thu, 21 Feb 2002 07:36:14 -0500 (EST) Subject: [Python-Dev] Meta-reflections In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F3@UKRUX002.rundc.uk.origin-it.com> Message-ID: On Thu, 21 Feb 2002, Moore, Paul wrote: > > I agree, but that "official" support has clear limitations. > > I'm not sure what you mean. When you request dir(object), there is a fairly significant amount of work done. Even though it is implemented in C, I can foresee a non-trivial performance hit to a great deal of code. vars(object) is better, though it also has some performance implications. I know that we are talking about Python, and performance is not of paramount importance. But realize that my company produces _extremely_ large Python applications used for financial and business tracking and forecasting. We are acutely aware of bottle-necks in critical paths such as object serialization. I'm just not looking forward to 25% slowdowns in pickling (number pulled out of hat) and I'm sure the Zope guys aren't either... > > Right now, there are several examples in the Python standard > > library where we use obj.__dict__.keys() -- most significantly > > in pickle and cPickle. > > But aren't we agreed that this is the source of a bug (that slots aren't > picklable)? That is not the bug -- if for no other reason, the standard library is free to use implementation specific knowledge. Getting obj.__dict__ is a really slick and efficient way to reflect on all normal instance variables. 
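The limitation at issue in this exchange can be seen in a few lines. The following is a small sketch, assuming a current CPython 3 interpreter; the classes Point and Tagged are hypothetical and not taken from the thread. It shows that reflecting through __dict__ reports only dict-based attributes, and that a slot which was declared but never assigned behaves as a missing attribute.

```python
class Point(object):
    __slots__ = ('x', 'y')

class Tagged(Point):
    # No __slots__ here, so instances of Tagged also get a __dict__.
    pass

p = Tagged()
p.x = 1            # stored in a slot inherited from Point
p.label = 'home'   # stored in the instance __dict__

dict_view = dict(p.__dict__)   # what __dict__-based reflection sees
has_y = hasattr(p, 'y')        # 'y' is declared but was never assigned
```

Here dict_view contains only 'label', so code that pickles p.__dict__ alone would silently drop 'x'.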
> > Naive, maybe, but saying "undocumented" is equivalent to "unsupported > > implementation detail" saves us from having to maintain backward > > compatibility and following the official Python deprecation process. > > Equally, saying "it's not in the manual, so tough luck" is unreasonably > harsh. We need to be reasonable about this. I don't mean to imply that we need to be harsh, though in some classes we do not want to worry about backward compatibility. How do we tell users which features are safe to use, so that they don't write thousands of lines of code that suddenly break when the next Python version is released? Well, not documenting it in the official Python reference manual is a pretty good way. Personally, I'm extremely wary of using anything that isn't. Such features can also be documented in the reference manual and explicitly marked as "subject to change", but that is not the case we are currently dealing with. > > Yes. Dict-based attributes always have values, while > > slot-based attributes can be unset and raise AttributeErrors > > when trying to access them. > > Hmm. I could argue this a couple of ways. Slots should contain None when > unassigned (no AttributeErrors), or code should be (re-)written to cope with > AttributeError from things in dir(). I wouldn't argue it as "slots can raise > AttributeError, so we need to treat slots and dict-based attibutes > separately, in 2 passes". I don't see how filling slots with default values is compatible with the premise that we want slots to act as close to normal instance attributes as possible. I've implemented quite a few things with slots. In fact, I have an experimental branch of a 200k LOC project that re-implements many low level components using slots. The speed-ups and memory savings from doing so are very, very compelling. There are cases where I declare slots that may never be used using any particular instance lifetime. 
I do expect them to raise an AttributeError, just like they would have before they were slots. If I fill the slot, assigning it to None has another very different semantic meaning than an AttributeError. Another good example is pickling. Why would you ever want to pickle empty slots? They have no value, not even a default one, so why waste the processor cycles or the disk space? > Why not just change the line stuff = object.__dict__ to > > stuff = [a for a in dir(object) if hasattr(object,a) and not > callable(getattr(object,a))] Um, because its wrong? I pickle _lots_ of callable objects. It also pickles class-attributes as instance-attributes. Here is a better version that can be used once vars(object) has been fixed: stuff = dict([ (a,getattr(object,a)) for a in vars(object) if hasattr(object,a)]) Note that it does an unnecessary getattr, hasattr, memory allocation and incurs loop overhead on every dict attribute, but otherwise it should work once vars is fixed. > 1. Make unbound slots return None rather than AttributeError I am strongly against this. It doesn't make sense to start supplying implicit default values. Explicit is better than implicit... > 2. Make vars() return slots as well as dict-based attributes Agree. > 3. Document __dict__ as legacy usage, not slots-aware Agree, though __dict__ should still be a valid way of accessing all non-slot instance attributes. Too much legacy code would break if this were not so. > 4. Fix bugs caused by use of the legacy __dict__ approach I'd rephrase that as: fix reflection code that assumes attributes only live in __dict__. > 5. Educate users in the new approaches which are slots-aware > (dir/vars, calling base class setattr, etc) Calling base class setattr? I'm not sure what you mean? 
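One way to picture the slots-aware replacement for "stuff = object.__dict__" debated above is a helper that walks the class hierarchy. This is only a sketch under stated assumptions: instance_state is a hypothetical name, not an existing API, it targets a current CPython 3 interpreter, and it ignores private-name mangling of slot names. It skips unassigned slots, in line with the position that empty slots should not be pickled.

```python
def instance_state(obj):
    """Collect dict-based and slot-based instance attributes of obj."""
    state = {}
    # Walk the MRO because each class lists only the slots it defines itself.
    for klass in type(obj).__mro__:
        names = klass.__dict__.get('__slots__', ())
        if isinstance(names, str):       # __slots__ = 'a' declares one slot
            names = (names,)
        for name in names:
            if name in ('__dict__', '__weakref__'):
                continue
            if hasattr(obj, name):       # skip slots that were never assigned
                state[name] = getattr(obj, name)
    # Ordinary attributes, if the instance has a __dict__ at all.
    state.update(getattr(obj, '__dict__', {}))
    return state


class Foo(object):
    __slots__ = ('a', 'b')

class Bar(Foo):
    __slots__ = ('c', 'd')

bar = Bar()
bar.a = 1
bar.c = 3
state = instance_state(bar)
```

With the Foo/Bar classes from earlier in the thread, state holds 'a' and 'c' only; 'b' and 'd' are left out because they were never set.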
> > 2) Flat slot descriptions: object.__slots__ being an > > immutable flat tuple of all slot names (including > > inherited ones), as opposed to being a potentially > > mutable sequence of only the slots defined by the most > > derived class. > > This I disagree with. I think __slots__ should be immutable. But I'm happy > with "don't do that" as the way of implementing immutability, if that's what > Guido prefers. Not knowing what Guido is planning, all I can say is that he has made __bases__ and __mro__ explicitly immutable. If we now intend to make __slots__ immutable as well, then it is better to do so explicitly. > I definitely don't think __slots__ should return something > different when you read it than what you assigned to it in the first place > (which is what including inherited slots does). But I don't really think > people have any business reading __slots__ in any case (see my arguments > above). By your logic, people don't have any business reading __dict__, but they do. Imagine what would happen if we didn't expose __dict__ in Python 2.3? Thanks, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From David Abrahams" Hi All, The following extension module (AA) is a reduced example of what I'm doing to make extension classes in 2.2. I followed the examples given by typeobject.c. When I "import AA,pdb" I get a crash in GC. Investigating further, I see this makes sense: GC is enabled in class_metatype_object, yet class_type_object does not follow the first rule of objects whose type has GC enabled: "The memory for the object must be allocated using PyObject_GC_New() or PyObject_GC_VarNew()." So, I guess the question is, how does PyBaseObject_Type (also statically allocated) get away with it? TIA, Dave ---------------- // Copyright David Abrahams 2002. 
Permission to copy, use, // modify, sell and distribute this software is granted provided this // copyright notice appears in all copies. This software is provided // "as is" without express or implied warranty, and with no claim as // to its suitability for any purpose. #include <Python.h> PyTypeObject class_metatype_object = { PyObject_HEAD_INIT(0) 0, "Boost.Python.class", PyType_Type.tp_basicsize, 0, 0, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ 0, /* tp_compare */ 0, /* tp_repr */ 0, /* tp_as_number */ 0, /* tp_as_sequence */ 0, /* tp_as_mapping */ 0, /* tp_hash */ 0, /* tp_call */ 0, /* tp_str */ 0, /* tp_getattro */ 0, /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_BASETYPE, /* tp_flags */ 0, /* tp_doc */ 0, /* tp_traverse */ 0, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ 0, /* tp_iter */ 0, /* tp_iternext */ 0, /* tp_methods */ 0, /* tp_members */ 0, /* tp_getset */ 0, // &PyType_Type, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ 0, /* tp_dictoffset */ 0, /* tp_init */ 0, /* tp_alloc */ 0, // PyType_GenericNew /* tp_new */ }; // Get the metatype object for all extension classes. 
PyObject* class_metatype() { if (class_metatype_object.tp_dict == 0) { class_metatype_object.ob_type = &PyType_Type; class_metatype_object.tp_base = &PyType_Type; if (PyType_Ready(&class_metatype_object)) return 0; } Py_INCREF(&class_metatype_object); return (PyObject*)&class_metatype_object; } // Each extension instance will be one of these typedef struct instance { PyObject_HEAD void* objects; } instance; static void instance_dealloc(PyObject* inst) { instance* kill_me = (instance*)inst; inst->ob_type->tp_free(inst); } PyTypeObject class_type_object = { PyObject_HEAD_INIT(0) file://&class_metatype_object) 0, "Boost.Python.instance", sizeof(PyObject), 0, instance_dealloc, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ 0, /* tp_compare */ 0, /* tp_repr */ 0, /* tp_as_number */ 0, /* tp_as_sequence */ 0, /* tp_as_mapping */ 0, /* tp_hash */ 0, /* tp_call */ 0, /* tp_str */ 0, /* tp_getattro */ 0, /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_BASETYPE, /* tp_flags */ 0, /* tp_doc */ 0, /* tp_traverse */ 0, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ 0, /* tp_iter */ 0, /* tp_iternext */ 0, /* tp_methods */ 0, /* tp_members */ 0, /* tp_getset */ 0, file://&PyBaseObject_Type, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ 0, /* tp_dictoffset */ 0, /* tp_init */ PyType_GenericAlloc, /* tp_alloc */ PyType_GenericNew }; PyObject* class_type() { if (class_type_object.tp_dict == 0) { class_type_object.ob_type = (PyTypeObject*)class_metatype(); class_type_object.tp_base = &PyBaseObject_Type; if (PyType_Ready(&class_type_object)) return 0; } Py_INCREF(&class_type_object); return (PyObject*)&class_type_object; } PyObject* make_class() { PyObject* bases, *args, *mt, *result; bases = PyTuple_New(1); PyTuple_SET_ITEM(bases, 0, class_type()); args = PyTuple_New(3); PyTuple_SET_ITEM(args, 0, PyString_FromString("AA")); PyTuple_SET_ITEM(args, 1, bases); 
PyTuple_SET_ITEM(args, 2, PyDict_New()); mt = class_metatype(); result = PyObject_CallObject(mt, args); Py_XDECREF(mt); Py_XDECREF(args); return result; } static PyMethodDef SpamMethods[] = { {NULL, NULL} /* Sentinel */ }; DL_EXPORT(void) initAA() { PyObject *m, *d; m = Py_InitModule("AA", SpamMethods); d = PyModule_GetDict(m); PyDict_SetItemString(d, "AA", make_class()); } +---------------------------------------------------------------+ David Abrahams C++ Booster (http://www.boost.org) O__ == Pythonista (http://www.python.org) c/ /'_ == resume: http://users.rcn.com/abrahams/resume.html (*) \(*) == email: david.abrahams@rcn.com +---------------------------------------------------------------+ From Paul.Moore@atosorigin.com Thu Feb 21 13:19:47 2002 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Thu, 21 Feb 2002 13:19:47 -0000 Subject: [Python-Dev] Meta-reflections Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F6@UKRUX002.rundc.uk.origin-it.com> From: Kevin Jacobs [mailto:jacobs@penguin.theopalgroup.com] > On Thu, 21 Feb 2002, Moore, Paul wrote: > > > I agree, but that "official" support has clear limitations. > > > > I'm not sure what you mean. > > When you request dir(object), there is a fairly significant > amount of work done. [...] > I know that we are talking about Python, and performance is > not of paramount importance. Hmm. I tend to favour "do it right, then do it fast". If there's a performance hit on dir(), why can't it be made faster? If nothing else, as a part of the core, dir() has the right to access __dict__ and __slots__. So there's no a priori reason why dir() should be slower than *any* user-coded way of doing the same. Of course, we *really* want vars() here, as we're otherwise doing work in dir() to get entries that we then throw away. But that's the only issue. Get vars() to work, and if it's too slow, you can argue that it's a bug because "I can get the same results by using the following code, which is faster". 
> I'm just not looking forward to 25% slowdowns in pickling > (number pulled out of hat) and I'm sure the Zope guys > aren't either... There's bound to be some slowdown, from the (new) need to find slots as well as dict-based attributes. I'm happy if you want it minimised. But that's a new point you've raised, which I don't have the expertise to comment on. > That is not the bug -- if for no other reason, the standard > library is free to use implementation specific knowledge. Getting > obj.__dict__ is a really slick and efficient way to reflect > on all normal instance variables. I'm not sure I agree here - it's better if the standard library uses interfaces which are available to the user. And if pickling can be made fast, why shouldn't the machinery that makes this possible be made available to the end user? You could say that this argues in favour of making __dict__ and __slots__ part of the "official" reflection API. My view is that it argues for making the "official" API (which I'm assuming will be vars() for now) efficient enough that people don't need to use __dict__ and __slots__. Encapsulation is good. > I don't see how filling slots with default values is > compatible with the premise that we want slots to act > as close to normal instance attributes as possible. Fair enough. I offered that as one option. Clearly you prefer the other (that what's in dir() and/or __slots__ cannot be guaranteed not to raise AttributeError). I'm happy either way, not having a vested interest in the issue. > > Why not just change the line stuff = object.__dict__ to > > > > stuff = [a for a in dir(object) if hasattr(object,a) and not > > callable(getattr(object,a))] > > Um, because its wrong? Sorry - it was an off-the-top-of-the-head suggestion. But it made my real point, which was that you can do it with dir(). 
> stuff = dict([ (a,getattr(object,a)) for a in vars(object) > if hasattr(object,a)]) > > Note that it does an unnecessary getattr, hasattr, memory > allocation and incurs loop overhead on every dict attribute, > but otherwise it should work once vars is fixed. Efficiency again. I'd have to bow to your greater experience here. Although with pickling, doesn't I/O usually outweigh any performance cost? > > 3. Document __dict__ as legacy usage, not slots-aware > > Agree, though __dict__ should still be a valid way of > accessing all non-slot instance attributes. Too much > legacy code would break if this were not so. That's what I meant. Document it as the historical way of getting at instance attributes. Still available, but code which uses it will not support slots. After all, if you pass classes using slots into code which uses __dict__, things will go wrong. That's just another sort of breakage. Nobody's arguing that __dict__ should go away. Except possibly from the documentation :-) > Calling base class setattr? I'm not sure what you mean? It's in the part of the descrintro document I pointed you at. Traditional implementations of setattr used assignment to self.__dict__['attr'] to avoid infinite recursion. The "new way" discussed in the descrintro document is to call the base class setattr. > By your logic, people don't have any business reading > __dict__, but they do. They don't have any business *any more*. An important distinction. (And it's not anywhere near as black and white as that comment implies - I know that). > Imagine what would happen if we didn't expose __dict__ in Python 2.3? Nothing at all, if we provide an alternative. Except for backward compatibility issues, which there's a well-documented deprecation process to address. Of course, nobody is proposing the removal of __dict__. All I'm suggesting is that we document its limitations, point out better ways, and leave it at that. Paul. 
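
The slots-aware reflection argued over in this message can be illustrated with a minimal sketch. It is written for present-day Python 3 rather than the 2.2 of this thread, and the helper name instance_state is hypothetical; it is not vars(), nor any API proposed here. It merges the instance __dict__ (if there is one) with every slot declared along the MRO, and skips slots that were never assigned instead of raising AttributeError. Name-mangled slots such as __x are not handled.

```python
def instance_state(obj):
    """Collect instance attributes from __dict__ and from declared slots.

    Hypothetical helper for illustration only.
    """
    state = {}
    # Instances of slots-only classes have no __dict__ at all.
    state.update(getattr(obj, "__dict__", {}))
    for cls in type(obj).__mro__:
        slots = cls.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)  # a bare string declares a single slot
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue  # bookkeeping slots, not user data
            try:
                state[name] = getattr(obj, name)
            except AttributeError:
                pass  # slot declared but never assigned
    return state


class C(object):
    __slots__ = ["_a"]


class D(C):
    pass  # no __slots__ here, so D instances also get a __dict__


d = D()
d._a = 1  # stored in the slot inherited from C
d.b = 2   # stored in d.__dict__
print(instance_state(d))    # both attributes are reported
print(instance_state(C()))  # empty: the slot was never assigned
```

Code that reads only d.__dict__ would report b and silently miss _a, which is the breakage the thread is worried about.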
From jacobs@penguin.theopalgroup.com Thu Feb 21 13:38:42 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Thu, 21 Feb 2002 08:38:42 -0500 (EST) Subject: [Python-Dev] A little GC confusion In-Reply-To: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com> Message-ID: On Thu, 21 Feb 2002, David Abrahams wrote: > The following extension module (AA) is a reduced example of what I'm doing > to make extension > classes in 2.2. I followed the examples given by typeobject.c. When I > "import AA,pdb" I get a crash in GC. Investigating further, I see this makes > sense: GC is enabled in class_metatype_object, yet class_type_object does > not follow the first rule of objects whose type has GC enabled: > > "The memory for the object must be allocated using PyObject_GC_New() > or PyObject_GC_VarNew()." > > So, I guess the question is, how does PyBaseObject_Type (also statically > allocated) get away with it? I doesn't have any time to really look at your code, but I thought I'd point out a trick that several extension modules use to protect statically allocated type objects. Here is the code from socketmodule.c: /* static PyTypeObject PySocketSock_Type = { . . . 0, /* set below */ /* tp_alloc */ PySocketSock_new, /* tp_new */ 0, /* set below */ /* tp_free */ }; /* buried in init_socket */ PySocketSock_Type.tp_alloc = PyType_GenericAlloc; PySocketSock_Type.tp_free = _PyObject_Del; This trick ensures that the static type object is never freed. Also, there is a funny-looking line in your PyTypeObject: 0, file://&PyBaseObject_Type, /* tp_base */ -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From tismer@tismer.com Thu Feb 21 13:48:07 2002 From: tismer@tismer.com (Christian Tismer) Date: Thu, 21 Feb 2002 14:48:07 +0100 Subject: [Stackless] Re: [Python-Dev] Stackless Design Q. 
References: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz> Message-ID: <3C74FA97.7080700@tismer.com>

Jeff Senn wrote:

> Greg Ewing writes:
>
>>It's not clear exactly what you're after here. Are you
>>trying to define the lowest-level interface upon which
>>everything else will be built? If so, I think what you
>>have presented is FAR too complex.
>>
>>It seems to me you need only two things:
>>
> ...
>
>> t = tasklet(f)
>> t.transfer()
>>
>
> (Sorry if I missed something -- I've been *way* busy lately and
> haven't been giving this much attention -- that said...)
>
> But (if I understand the current plan) we will need mechanisms
> internal to the Python interpreter to transfer values and maintain
> blocked/running state anyway; since when you generate a tasklet and
> run it:
>
> t = tasklet(f)
> t.transfer()
>
> That may cause many more tasklets to be generated, run, and destroyed
> that you don't ever see ... recursions/function calls in f, and
> only-Christian-knows what else... so the transfer value mechanism
> might as well be built in.

I think all these little things are cheap to implement.

> I haven't thought enough about the "unnamed produce-and-continue
> function" to decide how exactly it should work.

Somebody named it "resume", and together with "suspend" we get a nice pair. On the other hand: I'm not sure whether resume should block its caller. After all the input I got, I'm very undecided whether it is in fact better to forget data transfer completely for now and just make switching primitives which always work.

> I have two concerns in implementing uthreads this way (scheduler in
> C):
>
> 1 -- there doesn't seem to be any way to "kill" a tasklet

Not yet, but I want to provide an exception to kill tasklets. Also it will be possible to just pick it off and drop it, but I'm a little concerned about the C stack inside. This might be the last resort if the exception doesn't work.
> 2 -- the scheduling algorithm will be hard to tune (we'll probably > *at least* need tasklet priority...) Maybe there should still > be a "timeslice" function so an in-Python scheduler can be written? We had the timeslice function, yes. I think to make things simpler this time and just periodically call the scheduler which is written in C. I also have a rough concept of priorities which can be very cheaply implemented. Maybe I implement some default behavior, but allow these objects to be subclassed from Python? ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/ From mwh@python.net Thu Feb 21 13:47:56 2002 From: mwh@python.net (Michael Hudson) Date: 21 Feb 2002 13:47:56 +0000 Subject: [Python-Dev] A little GC confusion In-Reply-To: Kevin Jacobs's message of "Thu, 21 Feb 2002 08:38:42 -0500 (EST)" References: Message-ID: <2mr8nfeyk3.fsf@starship.python.net> Kevin Jacobs writes: > I doesn't have any time to really look at your code, but I thought I'd point > out a trick that several extension modules use to protect statically > allocated type objects. Here is the code from socketmodule.c: > > /* static PyTypeObject PySocketSock_Type = { > . > . > . > 0, /* set below */ /* tp_alloc */ > PySocketSock_new, /* tp_new */ > 0, /* set below */ /* tp_free */ > }; > > /* buried in init_socket */ > PySocketSock_Type.tp_alloc = PyType_GenericAlloc; > PySocketSock_Type.tp_free = _PyObject_Del; > > This trick ensures that the static type object is never freed. Um, I think you'll find this is because PyType_GenericAlloc & _PyObject_Del aren't compile-time constants when _socket is dynamically linked (they're defined in a different dll). Cheers, M. 
-- > so python will fork if activestate starts polluting it? I find it more relevant to speculate on whether Python would fork if the merpeople start invading our cities riding on the backs of giant king crabs. -- Brian Quinlan, comp.lang.python

From David Abrahams" BTW, I haven't been approved for this list yet, so if you could cc: any replies to me directly at david.abrahams@rcn.com it would be greatly appreciated. Thanks, Dave

From jacobs@penguin.theopalgroup.com Thu Feb 21 14:10:54 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 21 Feb 2002 09:10:54 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F6@UKRUX002.rundc.uk.origin-it.com>
Message-ID:

On Thu, 21 Feb 2002, Moore, Paul wrote:
> Hmm. I tend to favour "do it right, then do it fast". If there's a
> performance hit on dir(), why can't it be made faster?
[snip]
> Of course, we *really* want vars() here, as we're otherwise doing work in
> dir() to get entries that we then throw away.

dir(object) simply doesn't do what we want. I've tried several times to write a correct pickler using dir(object) and have always run into problems due to pathological corner-cases. I encourage you to try your hand at it.

In the process I've found another issue with the slots implementation. I'll post the details to python-dev in a separate e-mail.

> > Note that it does an unnecessary getattr, hasattr, memory
> > allocation and incurs loop overhead on every dict attribute,
> > but otherwise it should work once vars is fixed.
>
> Efficiency again. I'd have to bow to your greater experience here. Although
> with pickling, doesn't I/O usually outweigh any performance cost?

I can't speak for everyone's applications, but we frequently pickle to memory or to the operating system buffer-cache, and the pickles don't live long enough to hit the disk.
Thanks, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From thomas.heller@ion-tof.com Thu Feb 21 14:14:03 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 21 Feb 2002 15:14:03 +0100 Subject: [Python-Dev] A little GC confusion References: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com> Message-ID: <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook> From: "David Abrahams" > Hi All, > > The following extension module (AA) is a reduced example of what I'm doing > to make extension > classes in 2.2. I followed the examples given by typeobject.c. When I > "import AA,pdb" I get a crash in GC. For me it crashes after import AA, gc gc.collect() (Win2k Prof, SP1, MSVC6.0) I'm not really sure, but it seems your code does not crash any longer if you remove the Py_TPFLAGS_HAVE_GC from your definition of class_metatype_object. This flag *will* be set by PyType_Ready(); I guess it is inherited from the base type (PyType_Type in your case). Thomas From fdrake@acm.org Thu Feb 21 14:22:01 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 21 Feb 2002 09:22:01 -0500 Subject: [Python-Dev] Re: A little GC confusion In-Reply-To: <0ffe01c1bae2$21033c80$0500a8c0@boostconsulting.com> References: <0ffe01c1bae2$21033c80$0500a8c0@boostconsulting.com> Message-ID: <15477.649.42690.538861@grendel.zope.com> David Abrahams writes: > BTW, I haven't been approved for this list yet, so if you could cc: any > replies to me directly at david.abrahams@rcn.com it would be greatly > appreciated. You have now been approved. The usual list admins are either away on vacation or are having connectivity problems; sorry for the delay. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From thomas.heller@ion-tof.com Thu Feb 21 14:36:14 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 21 Feb 2002 15:36:14 +0100 Subject: [Python-Dev] A little GC confusion References: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com> <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook> Message-ID: <1aea01c1bae5$2088d500$e000a8c0@thomasnotebook> PS: I have code very similar to yours, and a question: Why does your class_type_object have PyPyBase_ObjectType as tp_base? To implement a subtypable type this is not needed IMO, or do I miss something? Thomas From David Abrahams" <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook> Message-ID: <102b01c1bae6$5463de00$0500a8c0@boostconsulting.com> ----- Original Message ----- From: "Thomas Heller" > From: "David Abrahams" > > Hi All, > > > > The following extension module (AA) is a reduced example of what I'm doing > > to make extension > > classes in 2.2. I followed the examples given by typeobject.c. When I > > "import AA,pdb" I get a crash in GC. > > For me it crashes after > import AA, gc > gc.collect() > (Win2k Prof, SP1, MSVC6.0) > > I'm not really sure, but it seems your code does not crash any longer > if you remove the Py_TPFLAGS_HAVE_GC from your definition of class_metatype_object. Yes, I'm aware of that. What I don't understand is how the builtin metatype gets away with Py_TPFLAGS_HAVE_GC when some of its instance types are not even heap-allocated. 
-Dave From thomas.heller@ion-tof.com Thu Feb 21 14:52:11 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 21 Feb 2002 15:52:11 +0100 Subject: [Python-Dev] A little GC confusion References: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com> <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook> <1aea01c1bae5$2088d500$e000a8c0@thomasnotebook> <104301c1bae7$bbb57e50$0500a8c0@boostconsulting.com> Message-ID: <1b2a01c1bae7$5ae47d60$e000a8c0@thomasnotebook> From: "David Abrahams" > From: "Thomas Heller" > > > PS: I have code very similar to yours, and a question: > > > > Why does your class_type_object have PyPyBase_ObjectType as > > tp_base? To implement a subtypable type this is not needed > > IMO, or do I miss something? > > I wanted subclasses of my class_type_object to have the same properties as > new-style classes, so it just made sense to me to do that. > > Python's documentation is... less than complete... so I'm sure I'm missing > something. Sure. Unfortunately, PEP253 still talks of PyType_InitDict instead of PyType_Ready, but we're beyond that already;-) Thomas From thomas.heller@ion-tof.com Thu Feb 21 15:01:16 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 21 Feb 2002 16:01:16 +0100 Subject: [Python-Dev] A little GC confusion References: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com> <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook> <102b01c1bae6$5463de00$0500a8c0@boostconsulting.com> Message-ID: <1b6b01c1bae8$9f6af350$e000a8c0@thomasnotebook> > > I'm not really sure, but it seems your code does not crash any longer > > if you remove the Py_TPFLAGS_HAVE_GC from your definition of > class_metatype_object. > > Yes, I'm aware of that. What I don't understand is how the builtin metatype > gets away with Py_TPFLAGS_HAVE_GC when some of its instance types are not > even heap-allocated. > Hm, I don't understand you, Are you talking about Py_TPFLAGS_HEAPTYPE? 
Thomas From David Abrahams" <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook> <102b01c1bae6$5463de00$0500a8c0@boostconsulting.com> <1b6b01c1bae8$9f6af350$e000a8c0@thomasnotebook> Message-ID: <10f401c1baeb$3f4f5850$0500a8c0@boostconsulting.com> ----- Original Message ----- From: "Thomas Heller" To: "David Abrahams" ; Sent: Thursday, February 21, 2002 10:01 AM Subject: Re: [Python-Dev] A little GC confusion > > > I'm not really sure, but it seems your code does not crash any longer > > > if you remove the Py_TPFLAGS_HAVE_GC from your definition of > > class_metatype_object. > > > > Yes, I'm aware of that. What I don't understand is how the builtin metatype > > gets away with Py_TPFLAGS_HAVE_GC when some of its instance types are not > > even heap-allocated. > > > Hm, I don't understand you, Are you talking about Py_TPFLAGS_HEAPTYPE? No, please re-read my initial posting. Py_TPFLAGS_HAVE_GC places requirements on the allocation method of instances, at least according to the docs. From gmcm@hypernet.com Thu Feb 21 15:45:00 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 21 Feb 2002 10:45:00 -0500 Subject: [Stackless] Re: [Python-Dev] Stackless Design Q. In-Reply-To: <200202210208.PAA22202@s454.cosc.canterbury.ac.nz> References: <3C740E82.8148.204A91B4@localhost> Message-ID: <3C74CFAC.24406.233D20CD@localhost> On 21 Feb 2002 at 15:08, Greg Ewing wrote: > Gordon McMillan : > > > Unless you've got a way to detect or pass tasklet's > > through transfer, you don't have enough. > > You'll have to elaborate. I don't have any idea what > you mean by that! You need a way to refer to "this" tasklet from Python, and pass that to the "other" tasklet. Alternatively, you need "the tasklet that transferred to me". This is implicit in generators; it needs to be explicit to do coroutines. You can't write a scheduler in Python without it - you need the client tasklets to transfer to the scheduler tasklet. 
-- Gordon http://www.mcmillan-inc.com/

From pedroni@inf.ethz.ch Thu Feb 21 16:34:39 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Thu, 21 Feb 2002 17:34:39 +0100
Subject: [Python-Dev] Meta-reflections
References:
Message-ID: <00bb01c1baf5$a8e4b800$6d94fea9@newmexico>

[Kevin Jacobs]
>
> In the process I've found another issue with the slots implementation.
> I'll post the details to python-dev in a separate e-mail.
>

FYI, bugs reported only on python-dev have a high probability of getting lost in the vacuum (Tim often warns against that).

Now a seeming bug is a seeming bug, so I have reported your bug to SF:

http://sourceforge.net/tracker/index.php?func=detail&aid=520644&group_id=5470&atid=105470

In general don't expect that someone will post bugs on your behalf.

regards, Samuele Pedroni.

From jacobs@penguin.theopalgroup.com Thu Feb 21 17:30:56 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 21 Feb 2002 12:30:56 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <00bb01c1baf5$a8e4b800$6d94fea9@newmexico>
Message-ID:

On Thu, 21 Feb 2002, Samuele Pedroni wrote:
> [Kevin Jacobs]
> >
> > In the process I've found another issue with the slots implementation.
> > I'll post the details to python-dev in a separate e-mail.
> >
>
> FYI, bugs reported only on python-dev have a high probability
> of getting lost in the vacuum (Tim often warns against that).
>
> Now a seeming bug is a seeming bug, so I have reported
> your bug to SF:
>
> http://sourceforge.net/tracker/index.php?func=detail&aid=520644&group_id=5470&atid=105470
>
> In general don't expect that someone will post bugs on your behalf.

Thanks. I have a collection of about 8 more bugs that is expanding as I grow my test suite. Before I spray all of them onto SF, I want to hear from Guido, since some of my "bugs" are potentially subjective.

I _have_ tried three times to post a summary-bug to SF and it's not worked (as usual). Is it just me or is SF flaky as hell?
The last time I tried to post a bug, it kicked me out and was "Down for maintenance" for some time after that. Now it won't let me login since it thinks I haven't responded to the new account confirmation e-mail. Grrrrrrrrrr -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From tim.one@comcast.net Thu Feb 21 21:06:47 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 21 Feb 2002 16:06:47 -0500 Subject: [Python-Dev] Meta-reflections In-Reply-To: Message-ID: [Kevin Jacobs] > ... > I have a collection of about ~8 more bugs that is expending as I > grow my test suite. Before I spray all of them onto SF, I want > to hear from Guido, since some of my "bugs" are potentially subjective. The best way to hear from Guido is to post bugs, and suspected bugs, to SourceForge, one bug per report. There's so much verbiage about this now on Python-Dev that I doubt he'll ever be able to make time to catch up with it when he returns. A great advantage of a good bug report is that it's focused and brief. Slots were definitely intended as a memory optimization, and the ways in which they don't act like "regular old attributes" are at best warts. > I _have_ tried three times to post a summary-bug to SF and its not worked > (as usual). Is just me or is SF flaky as hell? The last time I tried to > post a bug, it kicked me out and was "Down for maintenance" for some time > after that. Now it won't let me login since it thinks I haven't > responded to the new account confirmation e-mail. Grrrrrrrrrr It *sounds* like you're getting started with SF. Once it agrees not to hate you , life gets a lot easier. It's not flaky in general, but it does suffer bouts of extreme flakiness from time to time. 
From David Abrahams" Message-ID: <12d501c1bb1e$8778d2e0$0500a8c0@boostconsulting.com>

FWIW, some of my Boost colleagues have been watching SF's future prospects with some suspicion. The financial outlook is worrisome; I submitted a support request in April 2001 that still hasn't been addressed ( http://sourceforge.net/tracker/?func=detail&aid=414066&group_id=1&atid=350001). We're establishing all new services elsewhere, and even moving some old ones. For the long-term health of Python, you might want to make sure you're prepared to move quickly if necessary.

-Dave

----- Original Message -----
From: "Tim Peters"
To: "'Python Dev'"
Sent: Thursday, February 21, 2002 4:06 PM
Subject: RE: [Python-Dev] Meta-reflections

> [Kevin Jacobs]
> > ...
> > I have a collection of about 8 more bugs that is expanding as I
> > grow my test suite. Before I spray all of them onto SF, I want
> > to hear from Guido, since some of my "bugs" are potentially subjective.
>
> The best way to hear from Guido is to post bugs, and suspected bugs, to
> SourceForge, one bug per report. There's so much verbiage about this now on
> Python-Dev that I doubt he'll ever be able to make time to catch up with it
> when he returns. A great advantage of a good bug report is that it's
> focused and brief.
>
> Slots were definitely intended as a memory optimization, and the ways in
> which they don't act like "regular old attributes" are at best warts.
>
> > I _have_ tried three times to post a summary-bug to SF and it's not worked
> > (as usual). Is it just me or is SF flaky as hell? The last time I tried to
> > post a bug, it kicked me out and was "Down for maintenance" for some time
> > after that. Now it won't let me login since it thinks I haven't
> > responded to the new account confirmation e-mail. Grrrrrrrrrr
>
> It *sounds* like you're getting started with SF. Once it agrees not to hate
> you , life gets a lot easier.
It's not flaky in general, but it does > suffer bouts of extreme flakiness from time to time. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > From pedroni@inf.ethz.ch Thu Feb 21 21:13:57 2002 From: pedroni@inf.ethz.ch (Samuele Pedroni) Date: Thu, 21 Feb 2002 22:13:57 +0100 Subject: [Python-Dev] Meta-reflections References: Message-ID: <01ed01c1bb1c$acd05f60$6d94fea9@newmexico> > [Kevin Jacobs] > > ... > > I have a collection of about ~8 more bugs that is expending as I > > grow my test suite. Before I spray all of them onto SF, I want > > to hear from Guido, since some of my "bugs" are potentially subjective. > > The best way to hear from Guido is to post bugs, and suspected bugs, to > SourceForge, one bug per report. There's so much verbiage about this now on > Python-Dev that I doubt he'll ever be able to make time to catch up with it > when he returns. A great advantage of a good bug report is that it's > focused and brief. It's very true. > Slots were definitely intended as a memory optimization, and the ways in > which they don't act like "regular old attributes" are at best warts. > I see, but it seems that the only way to coherently and transparently remove the warts implies that the __dict__ of a new-style class instance with slots should be tied with the instance and cannot be anymore a vanilla dict. Something only Guido can rule about. some-more-verbiage-ly y'rs - Samuele. From tim.one@comcast.net Thu Feb 21 22:41:18 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 21 Feb 2002 17:41:18 -0500 Subject: [Python-Dev] Meta-reflections In-Reply-To: <12d501c1bb1e$8778d2e0$0500a8c0@boostconsulting.com> Message-ID: [David Abrahams] > FWIW, some of my Boost colleagues have been watching SF's future > prospects with some suspicion. 
It's worth a lot, and we do too -- at least in fits, when somebody remembers it's something that's going to kill us someday. > The financial outlook is worrisome; I submitted a > support request in April 2001 that still hasn't been addressed ( > ). Well, that's really a feature request, and *nobody* responds well to witty oblique references to the Odyssey except me . > We're establishing all new services elsewhere, and even moving some old > ones. For the long-term health of Python, you might want to make > sure you're prepared to move quickly if neccessary. We supposedly have a cron job set up to suck down Python's CVS tarball every night (the people who would know if this is currently working are out this week). What I don't think we ever figured out how to do was capture the info in the trackers (bugs, patches, feature requests). That would be a major loss, as well as a chance to forget about 500 people who can't figure out how to use threads on HP-UX, so let's call it a wash . From tim.one@comcast.net Thu Feb 21 22:51:13 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 21 Feb 2002 17:51:13 -0500 Subject: [Python-Dev] Meta-reflections In-Reply-To: <01ed01c1bb1c$acd05f60$6d94fea9@newmexico> Message-ID: [Tim] > Slots were definitely intended as a memory optimization, and the ways in > which they don't act like "regular old attributes" are at best warts. [Samuele Pedroni] > I see, but it seems that the only way to coherently and transparently > remove the warts implies that the __dict__ of a new-style class > instance with slots should be tied with the instance and cannot > be anymore a vanilla dict. Something only Guido can rule about. He'll be happy to . Optimizations aren't always wart-free, and then living with warts is a price paid for benefiting from the optimization. 
I'm sure Guido would consider it "a bug" if slots are ignored by the pickling mechanism, but wouldn't for an instant consider it "a bug" that the set of slots in effect when a class is created can't be dynamically expanded later (this latter is more a sensible restriction than a wart, IMO -- and likely in Guido's too).

From guido@python.org Fri Feb 22 00:28:19 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 21 Feb 2002 19:28:19 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: Your message of "Thu, 21 Feb 2002 17:41:18 EST."
References:
Message-ID: <200202220028.g1M0SJ024961@pcp742651pcs.reston01.va.comcast.net>

> What I don't think we ever figured out how to do was capture the
> info in the trackers (bugs, patches, feature requests). That would
> be a major loss, as well as a chance to forget about 500 people who
> can't figure out how to use threads on HP-UX, so let's call it a
> wash .

From a recent SF mailing to project administrators:

DATA EXPORT
---------------------------
We have added a new tool for project administrators to backup their Project data. It is now possible to export data from the Trackers (bug tracker, support tracker, etc), mailing lists, and forum data in to a single XML text file. This can be done at any time. This is actually not a new feature. The ability to export data was available through March of 2001 until we did a major upgrade of the site, which broke the export scripts. We have now re-worked the code, and it's available again. Enjoy.

http://sourceforge.net/export

SOMEBODY with admin perms should set up a cron job to suck down the nightly XML. It's big! (Are we still sucking down the nightly cvs tarballs? We should!)
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pedroni@inf.ethz.ch Fri Feb 22 00:38:27 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Fri, 22 Feb 2002 01:38:27 +0100
Subject: [Python-Dev] Meta-reflections
References:
Message-ID: <043801c1bb39$3ed8a540$6d94fea9@newmexico>

From: Tim Peters
> [Tim]
> > Slots were definitely intended as a memory optimization, and the ways in
> > which they don't act like "regular old attributes" are at best warts.
>
> [Samuele Pedroni]
> > I see, but it seems that the only way to coherently and transparently
> > remove the warts implies that the __dict__ of a new-style class
> > instance with slots should be tied with the instance and cannot
> > be anymore a vanilla dict. Something only Guido can rule about.
>
> He'll be happy to . Optimizations aren't always wart-free, and then
> living with warts is a price paid for benefiting from the optimization. I'm
> sure Guido would consider it "a bug" if slots are ignored by the pickling
> mechanism, but wouldn't for an instant consider it "a bug" that the set of
> slots in effect when a class is created can't be dynamically expanded later
> (this latter is more a sensible restriction than a wart, IMO -- and likely
> in Guido's too).
>

I was thinking along the line of the C equiv of this:

[Yup the situation of a subclass of a class with slots is more relevant]

class C(object):
    __slots__ = ['_a']

class D(C):
    pass

def allslots(cls):
    # map every slot name declared along the MRO to its descriptor
    mro = list(cls.__mro__)
    mro.reverse()
    allslots = {}
    for c in mro:
        cdict = c.__dict__
        if '__slots__' in cdict:
            for slot in cdict['__slots__']:
                allslots[slot] = cdict[slot]
    return allslots

class slotdict(dict):
    __slots__ = ['_inst','_allslots']

    def __init__(self,inst,allslots):
        self._inst = inst
        self._allslots = allslots

    def __getitem__(self,k):
        if k in self._allslots:
            # self _allslots should be reachable as self._inst.__class__.__allslots__
            # AttributeError should become a KeyError ?
            return self._allslots[k].__get__(self._inst)
        else:
            return dict.__getitem__(self,k)

    def __setitem__(self,k,v):
        if k in self._allslots:
            # self _allslots should be reachable as self._inst.__class__.__allslots__
            # AttributeError should become a KeyError ?
            return self._allslots[k].__set__(self._inst,v)
        else:
            return dict.__setitem__(self,k,v)

    # other methods accordingly

d=D()
d.__dict__ = slotdict(d,allslots(D)) # should be so automagically
# allslots(D) should be probably accessible as d.__class__.__allslots__
# for transparency C.__dict__ should not contain any slot descr
# __allslots__ should be readonly and disallow rebinding
# d.__dict__ should disallow rebinding
# c =C() ; c.__dict__ should return a proxy dict lazily or even more so ...

Lots of things to rule about and trade-offs to consider.

the-more-it's-arbitrary-the-more-you-need-_one_-ruler-ly y'rs - Samuele.

From tim.one@comcast.net Fri Feb 22 01:46:35 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 21 Feb 2002 20:46:35 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <200202220028.g1M0SJ024961@pcp742651pcs.reston01.va.comcast.net>
Message-ID:

[Guido]
> From a recent SF mailing to project administrators:
>
> DATA EXPORT
> ---------------------------

Jeremy (and less so I) played with that in the past (before it was publicized), but hit a brick wall: there seemed to be a cap on how many records it would deliver, and we couldn't brute-force our way around it. Maybe it's better now.

> ...
> SOMEBODY with admin perms should set up a cron job to suck down the
> nightly XML. It's big! (Are we still sucking down the nightly cvs
> tarballs? We should!)

IIRC, Barry was doing that on a home machine, and if so he's not around this week to answer.
From greg@cosc.canterbury.ac.nz Fri Feb 22 02:41:29 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 22 Feb 2002 15:41:29 +1300 (NZDT) Subject: [Python-Dev] A little GC confusion In-Reply-To: Message-ID: <200202220241.PAA22324@s454.cosc.canterbury.ac.nz> Kevin Jacobs : > I doesn't have any time to really look at your code, but I thought I'd point > out a trick that several extension modules use to protect statically > allocated type objects. > 0, /* set below */ /* tp_alloc */ > PySocketSock_new, /* tp_new */ > 0, /* set below */ /* tp_free */ I don't think that has anything to do with protecting the type object. As I understand it, static type objects are protected by having their refcount statically initialised to 1, so that it will never drop to zero. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From Anthony Baxter Fri Feb 22 03:04:57 2002 From: Anthony Baxter (Anthony Baxter) Date: Fri, 22 Feb 2002 14:04:57 +1100 Subject: [Python-Dev] Meta-reflections In-Reply-To: Message from Tim Peters of "Thu, 21 Feb 2002 17:41:18 CDT." Message-ID: <200202220304.g1M34vn21406@burswood.off.ekorp.com> >>> Tim Peters wrote > What I don't think we ever figured out how to do was capture the info in the > trackers (bugs, patches, feature requests). That would be a major loss, as > well as a chance to forget about 500 people who can't figure out how to use > threads on HP-UX, so let's call it a wash . I still think adding a 'Resolution' of "HP/UX" would be a good way to clean up the trackers... Anthony. -- Anthony Baxter It's never to late to have a happy childhood. From fdrake@acm.org Fri Feb 22 03:17:21 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Thu, 21 Feb 2002 22:17:21 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <200202220028.g1M0SJ024961@pcp742651pcs.reston01.va.comcast.net>
References: <200202220028.g1M0SJ024961@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15477.47169.910859.986885@grendel.zope.com>

Guido van Rossum writes:
> SOMEBODY with admin perms should set up a cron job to suck down the
> nightly XML. It's big! (Are we still sucking down the nightly cvs
> tarballs? We should!)

It's failing for me now; I'll submit a support request.

I think the tarballs are being downloaded to the python.org machine;
I'm not sure if they're still landing on Barry's home machine.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From greg@cosc.canterbury.ac.nz Fri Feb 22 03:21:09 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 22 Feb 2002 16:21:09 +1300 (NZDT)
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
In-Reply-To: <3C74CFAC.24406.233D20CD@localhost>
Message-ID: <200202220321.QAA22332@s454.cosc.canterbury.ac.nz>

Gordon McMillan :

> You need a way to refer to "this" tasklet from Python

Yes, that occurred to me as well. Would a built-in function
called current_tasklet() provide what you want?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From fdrake@acm.org Fri Feb 22 03:28:14 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 21 Feb 2002 22:28:14 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <15477.47169.910859.986885@grendel.zope.com>
References: <200202220028.g1M0SJ024961@pcp742651pcs.reston01.va.comcast.net>
            <15477.47169.910859.986885@grendel.zope.com>
Message-ID: <15477.47822.794530.781796@grendel.zope.com>

I wrote:
> It's failing for me now; I'll submit a support request.
http://sourceforge.net/tracker/index.php?func=detail&aid=521302&group_id=1&atid=200001

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From greg@cosc.canterbury.ac.nz Fri Feb 22 03:54:53 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 22 Feb 2002 16:54:53 +1300 (NZDT)
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
In-Reply-To: <3C74FA97.7080700@tismer.com>
Message-ID: <200202220354.QAA22337@s454.cosc.canterbury.ac.nz>

Christian Tismer :

> Now I see it: You mean I can make this schedule function behave
> like a normal function call, that accepts and drops a dummy
> value?

Yes, that's right. (Or more precisely, it would take
no parameters and return None.)

> when I have a scheduler counter built into the Python
> interpreter loop

I can see the attraction of having pre-emption built in this
way -- it would indeed be extremely efficient.

But I think you need to make a decision about whether your
tasklet model is going to be fundamentally pre-emptive or
fundamentally non-pre-emptive, because, as I said before,
the notion of switching to a specific tasklet is incompatible
with pre-emptive scheduling.

If you want to go with a fundamentally pre-emptive model,
I would suggest the following primitives:

  t = tasklet(f)
    Creates a new tasklet executing f. The new tasklet
    is initially blocked.

  t.block()
    Removes tasklet t from the set of runnable tasklets.

  t.unblock()
    Adds tasklet t to the set of runnable tasklets.

  current_tasklet()
    A built-in function which returns the currently
    running tasklet.

Using this model, a coroutine switch would be implemented
using something like

  def transfer(t):
    "Transfer from the currently running tasklet to t."
    t.unblock()
    current_tasklet().block()

although some locking may be needed in there somewhere.
Have to think about that some more.
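
The primitives proposed above can be approximated in plain Python 3 with one OS thread per tasklet. This is only a sketch under stated assumptions, not Stackless code: the class body, the `_wakeups` semaphore, the `_local` storage and the lazy adoption of the main thread are all invented for illustration, and `block()` here only supports a tasklet blocking itself. A counting semaphore is used so that a wake-up arriving before the matching `block()` is not lost, which is one possible answer to the locking concern raised above.

```python
import threading

_local = threading.local()  # per-thread slot holding that thread's tasklet


class tasklet:
    """Thread-backed stand-in for the proposed tasklet type (hypothetical)."""

    def __init__(self, f=None):
        # A count of 0 means "blocked"; every unblock() banks one wake-up.
        self._wakeups = threading.Semaphore(0)
        if f is not None:
            threading.Thread(target=self._run, args=(f,), daemon=True).start()

    def _run(self, f):
        _local.current = self
        self._wakeups.acquire()  # a new tasklet is initially blocked
        f()

    def block(self):
        # Assumption: only the running tasklet may block itself in this sketch.
        if current_tasklet() is not self:
            raise RuntimeError("sketch supports self-blocking only")
        self._wakeups.acquire()

    def unblock(self):
        self._wakeups.release()


def current_tasklet():
    """Return the calling thread's tasklet, adopting plain threads lazily."""
    if not hasattr(_local, "current"):
        _local.current = tasklet()
    return _local.current


def transfer(t):
    "Transfer from the currently running tasklet to t."
    t.unblock()
    current_tasklet().block()


log = []
main = current_tasklet()


def worker():
    log.append("worker runs")
    transfer(main)  # hand control back; the worker stays blocked afterwards


t = tasklet(worker)
log.append("main starts")
transfer(t)
log.append("main resumes")
print(log)
```

Because `unblock()` only banks a wake-up, `transfer()` can call it before blocking the caller without a separate lock; a real interpreter-level scheduler would keep an explicit set of runnable tasklets instead.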
For sending values from one tasklet to another, I think
I'd use an intermediate object to mediate the transfer,
something like a channel in Occam:

  c = channel()

  # tasklet 1 does:
  c.send(value)

  # tasklet 2 does:
  value = c.receive()

Tasklet 1 blocks at the send() until tasklet 2 reaches
the receive(), or vice versa if tasklet 2 reaches the
receive() first. When they're both ready, the value is
transferred and both tasklets are unblocked.

The advantage of this is that it's more symmetrical.
Instead of one tasklet having to know about the
other, they don't know about each other but they
both know about the intermediate object.

> I want to provide an exception to kill tasklets.
> Also it will be possible to just pick it off and drop it,
> but I'm a little concerned about the C stack inside.

As I said before, if there are no references left to a
tasklet, there's no way it can ever be switched to again,
so its C stack is no longer relevant. Unless you can have
return addresses from one C stack pointing into another,
or something... can you?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From tim.one@comcast.net Fri Feb 22 05:01:27 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 22 Feb 2002 00:01:27 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <15477.47169.910859.986885@grendel.zope.com>
Message-ID:

[Guido]
> SOMEBODY with admin perms should set up a cron job to suck down the
> nightly XML.

[Fred]
> It's failing for me now; I'll submit a support request.

It doesn't crap out for me, but this is the entire file I get back:

"""
"""

Yes, I was logged in as an admin at the time. Else I get this:

"""
You are not an admin of this project. Permission denied.
"""

BTW, from the verbal description of what's supposed to happen, it sounds
like it may not include attachments (like patches).

From adam@isdn.net.il Fri Feb 22 07:23:12 2002
From: adam@isdn.net.il (adam)
Date: Fri, 22 Feb 2002 09:23:12 +0200
Subject: [Python-Dev] warning before a legal claim
Message-ID: <001301c1bb71$c9832c00$0101c80a@LocalHost>

This is a multi-part message in MIME format.

------=_NextPart_000_0010_01C1BB82.8C967DE0
Content-Type: text/plain; charset="x-user-defined"
Content-Transfer-Encoding: quoted-printable

warning before a legal claim
Remove us from your announcements list !!

------=_NextPart_000_0010_01C1BB82.8C967DE0
Content-Type: text/html; charset="x-user-defined"
Content-Transfer-Encoding: quoted-printable


------=_NextPart_000_0010_01C1BB82.8C967DE0--

From adam@isdn.net.il Fri Feb 22 07:23:54 2002
From: adam@isdn.net.il (adam)
Date: Fri, 22 Feb 2002 09:23:54 +0200
Subject: [Python-Dev] warning before a legal claim
References:
Message-ID: <001701c1bb71$e30c0980$0101c80a@LocalHost>

warning before a legal claim
Remove us from your announcements list !!

----- Original Message -----
From:
To:
Sent: Friday, February 22, 2002 5:56 AM
Subject: Python-Dev digest, Vol 1 #1903 - 15 msgs

From adam@isdn.net.il Fri Feb 22 07:25:31 2002
From: adam@isdn.net.il (adam)
Date: Fri, 22 Feb 2002 09:25:31 +0200
Subject: [Python-Dev] warning before a legal claim
References:
Message-ID: <002301c1bb72$1c8923a0$0101c80a@LocalHost>

warning before a legal claim
Remove us from your announcements list !!

From adam@isdn.net.il Fri Feb 22 07:29:23 2002
From: adam@isdn.net.il (adam)
Date: Fri, 22 Feb 2002 09:29:23 +0200
Subject: [Python-Dev] warning before a legal claim
References:
Message-ID: <004d01c1bb72$a675fa20$0101c80a@LocalHost>

warning before a legal claim
Remove us from your announcements list

From tismer@tismer.com Fri Feb 22 08:15:19 2002
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 22 Feb 2002 09:15:19 +0100
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
References: <200202220354.QAA22337@s454.cosc.canterbury.ac.nz>
Message-ID: <3C75FE17.1040807@tismer.com>

Greg Ewing wrote:
> Christian Tismer :
...
> I can see the attraction of having pre-emption built in this
> way -- it would indeed be extremely efficient.
>
> But I think you need to make a decision about whether your
> tasklet model is going to be fundamentally pre-emptive or
> fundamentally non-pre-emptive, because, as I said before,
> the notion of switching to a specific tasklet is incompatible
> with pre-emptive scheduling.

Yes. I will go a bit further and adapt the process model
of the Alef language a bit.

> If you want to go with a fundamentally pre-emptive model,
> I would suggest the following primitives:

[blocking stuff - ok]

> Using this model, a coroutine switch would be implemented
> using something like
>
>   def transfer(t):
>     "Transfer from the currently running tasklet to t."
>     t.unblock()
>     current_tasklet().block()
>
> although some locking may be needed in there somewhere.
> Have to think about that some more.
>
> For sending values from one tasklet to another, I think
> I'd use an intermediate object to mediate the transfer,
> something like a channel in Occam:
>
>   c = channel()
>
>   # tasklet 1 does:
>   c.send(value)
>
>   # tasklet 2 does:
>   value = c.receive()
>
> Tasklet 1 blocks at the send() until tasklet 2 reaches
> the receive(), or vice versa if tasklet 2 reaches the
> receive() first. When they're both ready, the value is
> transferred and both tasklets are unblocked.
>
> The advantage of this is that it's more symmetrical.
> Instead of one tasklet having to know about the
> other, they don't know about each other but they
> both know about the intermediate object.

Yes. This all sounds very familiar to me. In private
conversation with Russ Cox, Bell Labs, I learned about
rendezvous techniques which are quite similar.
Having read the Alef user guide which can be found at
http://plan9.bell-labs.com/who/rsc/thread.html
http://plan9.bell-labs.com/who/rsc/ug.pdf

I got the following picture:
(Thanks to Russ Cox, these are his ideas!)

We use a two-level structure.
Toplevel is something similar to threads, processes
in Alef language. These are pre-emptively scheduled
by an internal scheduler that switches after every
n opcodes.
These threads are groups of tasklets, which have
collaborative scheduling between them.

This gives us a lot of flexibility: If people prefer
thread-like behavior, they can use the system provided
approach and just use the toplevel layer with just
one tasklet in it.
Creating new tasklets inside a process then has
coroutine-like behavior.

I'm just busy designing the necessary structures,
things should not get too complicated on the C level.

>>I want to provide an exception to kill tasklets.
>>Also it will be possible to just pick it off and drop it,
>>but I'm a little concerned about the C stack inside.
>>
>
> As I said before, if there are no references left to a
> tasklet, there's no way it can ever be switched to again,
> so its C stack is no longer relevant. Unless you can have
> return addresses from one C stack pointing into another,
> or something... can you?

Well, the problem is that an extension *might* be sitting
inside a tasklet's stack with a couple of allotted objects.
I would assume that the extension frees these objects when
I send an exception to the tasklet. But without that,
I cannot be sure if all resources are freed.

thanks a lot for your help - chris

--
Christian Tismer             :^)
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net/
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com/

From jason@jorendorff.com Fri Feb 22 09:31:55 2002
From: jason@jorendorff.com (Jason Orendorff)
Date: Fri, 22 Feb 2002 03:31:55 -0600
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <1b6b01c1bae8$9f6af350$e000a8c0@thomasnotebook>
Message-ID:

Thomas Heller wrote:
> David Abrahams wrote:
> > What I don't understand is how the builtin metatype
> > gets away with Py_TPFLAGS_HAVE_GC when some of its instance
> > types are not even heap-allocated.
>
> Hm, I don't understand you, Are you talking about Py_TPFLAGS_HEAPTYPE?

I think David is asking about line 1404 of Objects/typeobject.c,
where it says that PyType_Type is Py_TPFLAGS_HAVE_GC.

How can it have GC when many instances are static objects,
not allocated with PyObject_GC_VarNew()?

I don't know the answer.

## Jason Orendorff    http://www.jorendorff.com/

From martin@v.loewis.de Fri Feb 22 10:03:53 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 22 Feb 2002 11:03:53 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <001101c1ba3d$15e9cda0$ba03a8c0@activestate.ca>
References: <001101c1ba3d$15e9cda0$ba03a8c0@activestate.ca>
Message-ID:

"Jeff Hobbs" writes:

> That's correct - I should have looked a bit more into what I did
> before (I was always tying in another GUI's event loop). However,
> I don't see why you should not consider the extra event source.
> Tk uses this itself for X. It would be something like:

That does not work, either. I'm using the patch attached below, and
I'm getting the output

...
setupproc called 729
setupproc called 730
setupproc called 731
setupproc called 732
setupproc called 733
setupproc called 734
setupproc called 735
...

That is, even though the setupproc is called, and even though the
select is not blocking anymore, DoOneEvent does not return (I don't
see the "Event done" messages).

Regards,
Martin

Index: _tkinter.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/_tkinter.c,v
retrieving revision 1.123
diff -u -r1.123 _tkinter.c
--- _tkinter.c 26 Jan 2002 20:21:50 -0000 1.123
+++ _tkinter.c 22 Feb 2002 09:56:13 -0000
@@ -133,6 +139,10 @@
    These locks expand to several statements and brackets; they should not be
    used in branches of if statements and the like.

+   To give other threads a chance to access Tcl while the Tk mainloop is
+   running, an input source is registered with Tcl which results in Tcl
+   not blocking for more than 20ms.
+
 */

 static PyThread_type_lock tcl_lock = 0;
@@ -237,24 +248,6 @@
 /**** Utils ****/

-#ifdef WITH_THREAD
-#ifndef MS_WINDOWS
-
-/* Millisecond sleep() for Unix platforms. */
-
-static void
-Sleep(int milli)
-{
-    /* XXX Too bad if you don't have select().
*/ - struct timeval t; - t.tv_sec = milli/1000; - t.tv_usec = (milli%1000) * 1000; - select(0, (fd_set *)0, (fd_set *)0, (fd_set *)0, &t); -} -#endif /* MS_WINDOWS */ -#endif /* WITH_THREAD */ - - static char * AsString(PyObject *value, PyObject *tmp) { @@ -1671,6 +1948,37 @@ /** Event Loop **/ +#ifdef WITH_THREAD +static int setupproc_registered; +/* + *---------------------------------------------------------------------- + * + * TkinterSetupProc -- + * + * This procedure implements the setup part of the Tkinter + * event source. It is invoked by Tcl_DoOneEvent before entering + * the notifier to check for events on all displays. + * + * Results: + * None. + * + * Side effects: + * The maximum block time will be set to 20000 usecs to ensure that + * the notifier returns control to Tcl. + * + *---------------------------------------------------------------------- + */ + +static void +TkinterSetupProc(ClientData clientData, int flags) +{ + static Tcl_Time blockTime = { 0, 20000 }; + static int i = 0; + printf("setupproc called %d\n",++i); + Tcl_SetMaxBlockTime(&blockTime); +} +#endif + static PyObject * Tkapp_MainLoop(PyObject *self, PyObject *args) { @@ -1682,22 +1990,29 @@ if (!PyArg_ParseTuple(args, "|i:mainloop", &threshold)) return NULL; +#ifdef WITH_THREAD + if (!setupproc_registered) { + Tcl_CreateEventSource(TkinterSetupProc, NULL, NULL); + setupproc_registered = 1; + } +#endif + quitMainLoop = 0; while (Tk_GetNumMainWindows() > threshold && !quitMainLoop && !errorInCmd) { int result; + #ifdef WITH_THREAD Py_BEGIN_ALLOW_THREADS PyThread_acquire_lock(tcl_lock, 1); tcl_tstate = tstate; - result = Tcl_DoOneEvent(TCL_DONT_WAIT); + result = Tcl_DoOneEvent(0); + printf("Event done\n"); tcl_tstate = NULL; PyThread_release_lock(tcl_lock); - if (result == 0) - Sleep(20); Py_END_ALLOW_THREADS #else result = Tcl_DoOneEvent(0); @@ -2033,12 +2364,10 @@ PyThread_acquire_lock(tcl_lock, 1); tcl_tstate = event_tstate; - result = Tcl_DoOneEvent(TCL_DONT_WAIT); + result = 
Tcl_DoOneEvent(0); tcl_tstate = NULL; PyThread_release_lock(tcl_lock); - if (result == 0) - Sleep(20); Py_END_ALLOW_THREADS #else result = Tcl_DoOneEvent(0); From martin@v.loewis.de Fri Feb 22 10:10:01 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 22 Feb 2002 11:10:01 +0100 Subject: [Python-Dev] PEP needed? Introducing Tcl objects In-Reply-To: <002601c1ba46$ce683f20$ba03a8c0@activestate.ca> References: <002601c1ba46$ce683f20$ba03a8c0@activestate.ca> Message-ID: "Jeff Hobbs" writes: > BTW in addition to my last message, you might want to create > an ExitHandler that delete the event source. Also, you might > add more code to the TkinterSetupProc to only set a block time > if multiple threads are actually used (or only create the > event source at that time). This would make simple Tkinter > apps be efficient and snappy all the time. I'm not sure this will be necessary (provided I get this to work at all); after all, all that the timeout will do is to setup the event loop 50 times in a second. Computers should have no problems with that these days; in a snappy Tkinter app, there will be much more than 50 events per second. Furthermore, such a change would not affect snappiness at all, only efficiency (and only slightly so). Regards, Martin From martin@v.loewis.de Fri Feb 22 10:25:09 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 22 Feb 2002 11:25:09 +0100 Subject: [Python-Dev] A little GC confusion In-Reply-To: References: Message-ID: "Jason Orendorff" writes: > I think David is asking about line 1404 of Objects/typeobject.c, > where it says that PyType_Type is Py_TPFLAGS_HAVE_GC. > How can it have GC when many instances are static objects, not > allocated with PyObject_GC_VarNew()? Because the type type implements tp_is_gc (typeobject.c:1378), declaring static type objects as not being gc. In turn, garbage collection will not attempt to look at the GC header of these type objects. 
Regards, Martin From David Abrahams" Message-ID: <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com> ----- Original Message ----- From: "Martin v. Loewis" > "Jason Orendorff" writes: > > > I think David is asking about line 1404 of Objects/typeobject.c, > > where it says that PyType_Type is Py_TPFLAGS_HAVE_GC. > > How can it have GC when many instances are static objects, not > > allocated with PyObject_GC_VarNew()? > > Because the type type implements tp_is_gc (typeobject.c:1378), > declaring static type objects as not being gc. In turn, garbage > collection will not attempt to look at the GC header of these type > objects. Aha! And the implementation is... static int type_is_gc(PyTypeObject *type) { return type->tp_flags & Py_TPFLAGS_HEAPTYPE; } so, wouldn't it make more sense that the Python source always checks Py_TPFLAGS_HEAPTYPE before tp_is_gc? Also, is there any guideline for which type slots get automatically copied from the base type? Since my slots are nearly all zero I expected to inherit most of the slots from type_type. -Dave From David Abrahams" <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com> Message-ID: <150601c1bba1$e22467d0$0500a8c0@boostconsulting.com> ----- Original Message ----- From: "David Abrahams" > ----- Original Message ----- > From: "Martin v. Loewis" > > Because the type type implements tp_is_gc (typeobject.c:1378), > > declaring static type objects as not being gc. In turn, garbage > > collection will not attempt to look at the GC header of these type > > objects. > > > Aha! And the implementation is... > > static int > type_is_gc(PyTypeObject *type) > { > return type->tp_flags & Py_TPFLAGS_HEAPTYPE; > } > > so, wouldn't it make more sense that the Python source always checks > Py_TPFLAGS_HEAPTYPE before tp_is_gc? Nice try, but no cigar I'm afraid: copying the tp_is_gc slot from PyType_Type into my metatype before PyType_Ready() doesn't prevent the crash. Does anyone really understand what happens here? 
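[Editorial sketch] The type_is_gc() check quoted above can be observed from Python. This is a minimal sketch, assuming a modern CPython 3 interpreter: gc.is_tracked() and type.__flags__ did not exist in 2.2, and the bit position used for Py_TPFLAGS_HEAPTYPE is an assumption taken from current CPython headers. The helper names is_heap_type and describe are illustrative only. It shows that a static type such as int is not treated as a GC object, while a class created at run time is.

```python
import gc

# Assumption: Py_TPFLAGS_HEAPTYPE is bit 9, as in current CPython's object.h.
Py_TPFLAGS_HEAPTYPE = 1 << 9


class AA(object):
    """A class created at run time, so its type object is heap-allocated."""


def is_heap_type(tp):
    # Mirrors the test made by type_is_gc(): is the HEAPTYPE flag set?
    return bool(tp.__flags__ & Py_TPFLAGS_HEAPTYPE)


def describe(tp):
    # (name, has HEAPTYPE flag, currently tracked by the cycle collector)
    return (tp.__name__, is_heap_type(tp), gc.is_tracked(tp))


static_report = describe(int)   # int is a statically allocated type object
heap_report = describe(AA)      # AA was allocated by type() at run time

print(static_report)
print(heap_report)
```

This only illustrates the flag test itself; it does not reproduce the metatype crash under discussion, which depends on the C-level slots of the extension type.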
From mwh@python.net Fri Feb 22 13:46:10 2002 From: mwh@python.net (Michael Hudson) Date: 22 Feb 2002 13:46:10 +0000 Subject: [Python-Dev] 2.2.1 issues In-Reply-To: martin@v.loewis.de's message of "19 Feb 2002 21:29:49 +0100" References: <2madu5zhnj.fsf@starship.python.net> <3C726270.7D33E687@lemburg.com> Message-ID: <2mzo21fx3x.fsf@starship.python.net> martin@v.loewis.de (Martin v. Loewis) writes: > "M.-A. Lemburg" writes: > > > Right. 1) was caused by 2). > > That wasn't actually the case. The overwriting of memory was really > independent of the error in surrogate processing, and can be fixed > independently. OK, thanks for the clarification. > > As a result, modules using unpaired surrogates in Unicode > > literals are simply broken in Python <= 2.2.0. > > I think this is unimportant enough to just accept this bug for Python > 2.2.x. If people ever run into the problem, well: just don't do this. > Unpaired surrogates will be entirely in Unicode 3.2. I think you're missing a word in the last sentence? > > The problem with backporting this patch is that in order > > for Python to properly recompile any broken module, the > > magic will have to be changed. Question is whether this > > is a reasonable thing to do in a patch level release... > > The memory-overwriting problem can be fixed independently, e.g. with > > https://sourceforge.net/tracker/download.php?group_id=5470&atid=105470&file_id=15248&aid=495401 Thanks, I've now checked this fix in, and will consider the whole issue to be closed until further notice. Cheers, M. -- That's why the smartest companies use Common Lisp, but lie about it so all their competitors think Lisp is slow and C++ is fast. (This rumor has, however, gotten a little out of hand. :) -- Erik Naggum, comp.lang.lisp From mwh@python.net Fri Feb 22 14:10:10 2002 From: mwh@python.net (Michael Hudson) Date: 22 Feb 2002 14:10:10 +0000 Subject: [Python-Dev] What's blocking 2.2.1? 
Message-ID: <2mwux5fvzx.fsf@starship.python.net> I think I'm caught up on porting fixes from the trunk to the release22-maint branch. Now would be a good time to shout if you think I've missed something (although I might not read my email before Monday). (I've entirely ignored the Mac subtree here. Jack, that's your problem, I'm afraid). There are still some bugs in the trunk that need fixing, though. [ #496873 ] cPickle / time.struct_time loop - I think this one is firmly in Guido's domain. [ #501591 ] dir() doc is old - probably not that hard. are all that are marked as 2.2.1 candidates (apart from two MacOS bugs), but there are probably more. I don't want to trawl through all the 250+ (!) open bugs to look for them if I don't have to, so can I ask people to nominate bugs they know of? Cheers, M. -- We've had a lot of problems going from glibc 2.0 to glibc 2.1. People claim binary compatibility. Except for functions they don't like. -- Peter Van Eynde, comp.lang.lisp From jacobs@penguin.theopalgroup.com Fri Feb 22 15:05:09 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Fri, 22 Feb 2002 10:05:09 -0500 (EST) Subject: [Python-Dev] Meta-reflections In-Reply-To: <043801c1bb39$3ed8a540$6d94fea9@newmexico> Message-ID: On Fri, 22 Feb 2002, Samuele Pedroni wrote: > I was thinking along the line of the C equiv of this: [...code snipped...] [ An updated version of a comment to SF issue: http://sourceforge.net/tracker/?func=detail&atid=105470&aid=520644&group_id=5470 ] Samuele's sltattr.py is an interesting approach, though I am not entirely sure it is sufficient to address all of the problems with slots. Here is a mostly complete list of smaller changes that are somewhat orthogonal to how we address accesses to __dict__: 1) Flatten slot lists: Change obj.__class__.__slots__ to return an immutable list of all slot descriptors in the object (including all those of base classes). The motivation for this is similar in spirit to storing a flattened __mro__. 
The advantages of this change are: a) allows for fast and explicit object reflection that correctly finds all dict attributes, all slot attributes. b) allows reflection implementations (like vars(object) and pickle) to treat dict and slot attrs differently if we choose not to proxy __dict__. This has several advantages, as explained in change #2. Also importantly, this way it is not possible to "lose" descriptors permanently by deleting them from obj.__class__.__dict__. 2) Update reflection API even if we do not choose to proxy __dict__: Alter vars(object) to return a dictionary of all attributes, including both the contents of the non-proxied __dict__ and the valid attributes that result from iterating over __slots__ and evaluating the descriptors. The details of how this is best implemented depend on how we wish to define the behavior of modifying the resulting dictionary. It could be either: a) explicitly immutable, which involves creating proxy objects b) mutable, which involves copying c) undefined, which means implicitly immutable Aside from the questions over the nature of the return type, this implementation (coupled with #1) has distinct advantages. Specifically the native object.__dict__ has a very natural internal representation that pairs attribute names directly with values. In contrast, a fair amount of additional work is needed to extract the slots that store values and create a dictionary of their names and values. Other implementations will require a great deal more work since they would have to traverse though base classes to collecting slot descriptors. 3) Flatten slot inheritance: Update the new-style object inheritance mechanism to re-use slots of the same name, rather than creating a new slot and hiding the old. This makes the inheritance semantics of slots equivalent to those of normal instance attributes and avoids introducing an ad-hoc and obscure method of data hiding. 
4) Update the standard library to use the new reflection API (and make it robust to properties at the same time) if we choose not to proxy __dict__. Virtually all of the changes are simple and involve updating these constructs:

   a) obj.__dict__
   b) obj.__dict__[blah]
   c) obj.__dict__[blah] = x

   (What these will become depends on other factors, including the context and semantics of vars(obj).)

   Here is a fairly complete list of Python 2.2 modules that will need to be updated: copy, copy_reg, inspect, pickle, pydoc, cPickle, Bastion, codeop, dis, doctest, gettext, ihooks, imputil, knee, pdb, profile, rexec, rlcompleter, tempfile, unittest, xmllib, xmlrpclib

5) (NB: potentially controversial and not required) We could alter the descriptor protocol to make slots (and properties) more transparent when the values they reference do not exist. Here is an example to illustrate this:

       class A(object):
           foo = 1

       class B(A):
           __slots__ = ('foo',)

       b = B()
       print b.foo
       > 1 or AttributeError?

   Currently an AttributeError is raised. However, it is a fairly easy change to make AttributeErrors signal that attribute resolution is to continue until either a valid descriptor is evaluated, an instance-attribute is found, or until the resolution fails after searching the meta-type, the type, and the instance dictionary.

I am prepared to submit patches to address each of these issues. However, I do want feedback beforehand, so that I do not waste time implementing something that will never be accepted.

Regards,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19    E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714         WWW:    http://www.theopalgroup.com

From Jack.Jansen@oratrix.com Fri Feb 22 15:07:10 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Fri, 22 Feb 2002 16:07:10 +0100
Subject: [Python-Dev] What's blocking 2.2.1?
In-Reply-To: <2mwux5fvzx.fsf@starship.python.net> Message-ID: On Friday, February 22, 2002, at 03:10 , Michael Hudson wrote: > I think I'm caught up on porting fixes from the trunk to the > release22-maint branch. Now would be a good time to shout if you > think I've missed something (although I might not read my email before > Monday). > > (I've entirely ignored the Mac subtree here. Jack, that's your > problem, I'm afraid). I'll do a quick check to see whether there's anything that is vital for Mac OS X unix Python that has to go in, I'll let you know. I think MacPython will have to be done after the unix/win distribution has been made, but that depends on your timeframe (i.e. if you can hand the Mac/ portion of the tree over to me real soon now I can try and squeeze the time in). -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From fdrake@acm.org Fri Feb 22 15:26:21 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 22 Feb 2002 10:26:21 -0500 Subject: [Python-Dev] A little GC confusion In-Reply-To: References: Message-ID: <15478.25373.94172.389436@grendel.zope.com> "Jason Orendorff" writes: > How can it have GC when many instances are static objects, not > allocated with PyObject_GC_VarNew()? Martin v. Loewis writes: > Because the type type implements tp_is_gc (typeobject.c:1378), > declaring static type objects as not being gc. In turn, garbage > collection will not attempt to look at the GC header of these type > objects. I'm starting to really fear writing the documentation for all this! There are going to be a lot of mostly-inscrutible details to get right, and people are already asking the questions, so it really will need to be written down. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From jafo@tummy.com Fri Feb 22 23:01:27 2002 From: jafo@tummy.com (Sean Reifschneider) Date: Fri, 22 Feb 2002 16:01:27 -0700 Subject: [Python-Dev] patch: speed up name access by up to 80% In-Reply-To: <20020211225538.GA93506@hishome.net>; from oren-py-d@hishome.net on Mon, Feb 11, 2002 at 05:55:38PM -0500 References: <20020211220954.A12061@hishome.net> <3C6844F9.954E1DA2@metaslash.com> <20020211225538.GA93506@hishome.net> Message-ID: <20020222160127.A13114@tummy.com> On Mon, Feb 11, 2002 at 05:55:38PM -0500, Oren Tirosh wrote: >I got strage results comparing to the python2.2 RPM package (some faster, >some slower). I didn't start to get consistent results until I used two The 2.2-3 RPM from python.org does "./configure --prefix=/usr" (unless you enable the pymalloc or ipv6 flags before building, the RPMs up there do not). It then does "make". About the only thing that may be unusual would be that RPM may automatically strip the resulting binary. I'm not doing it manually. Interesting that you're seeing oddness. Note that if you download the SRPM, you can install the .src.rpm and then build it by doing: rpm -bc /usr/src/redhat/SPECS/python-2.2.spec (or other similar location that the spec file would be installed depending on your distribution). Also note that you can have it build a patched version by adding the patch below the "Patch1:" line as "Patch2:", and also add a line below "%patch1" which reads "%patch2" (unless you have to give options such as "-p1" -- see the example for "%patch0"). You can then do a fresh build of the code using the above command. You can also build an RPM by using "rpm -ba [...]". One of the big wins with a packaging system -- reproducability... Sean -- Let us live!!! Let us love!!! Let us share the deepest secrets of our souls!!! You first. Sean Reifschneider, Inimitably Superfluous tummy.com - Linux Consulting since 1995. 
Qmail, KRUD, Firewalls, Python From tim.one@comcast.net Fri Feb 22 23:55:05 2002 From: tim.one@comcast.net (Tim Peters) Date: Fri, 22 Feb 2002 18:55:05 -0500 Subject: [Python-Dev] What's blocking 2.2.1? In-Reply-To: <2mwux5fvzx.fsf@starship.python.net> Message-ID: [Michael Hudson] > I think I'm caught up on porting fixes from the trunk to the > release22-maint branch. Voluminous thanks for the work, Michael! > There are still some bugs in the trunk that need fixing, though. > ... > [ #501591 ] dir() doc is old > - probably not that hard. I just reassigned that one to me; Fred is off today, and it's shallow (the docstring got updated to a correct state when I implemented 2.2 dir() changes, but somehow or other the docs didn't). I'll fix this before I pass out tonight. From martin@v.loewis.de Sat Feb 23 00:47:44 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 23 Feb 2002 01:47:44 +0100 Subject: [Python-Dev] 2.2.1 issues In-Reply-To: <2mzo21fx3x.fsf@starship.python.net> References: <2madu5zhnj.fsf@starship.python.net> <3C726270.7D33E687@lemburg.com> <2mzo21fx3x.fsf@starship.python.net> Message-ID: Michael Hudson writes: > > Unpaired surrogates will be entirely in Unicode 3.2. > > I think you're missing a word in the last sentence? "banned" or "outlawed" is the right word, I guess :-) > Thanks, I've now checked this fix in, and will consider the whole > issue to be closed until further notice. Thanks! If I find the time, I'll review the code. Regards, Martin From martin@v.loewis.de Sat Feb 23 00:43:55 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 23 Feb 2002 01:43:55 +0100 Subject: [Python-Dev] A little GC confusion In-Reply-To: <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com> References: <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com> Message-ID: "David Abrahams" writes: > static int > type_is_gc(PyTypeObject *type) > { > return type->tp_flags & Py_TPFLAGS_HEAPTYPE; > } > > so, wouldn't it make more sense that the Python source always checks > Py_TPFLAGS_HEAPTYPE before tp_is_gc? No. Most GC objects do not have the HEAPTYPE flag (they actually aren't even type objects). Regards, Martin From martin@v.loewis.de Sat Feb 23 00:45:58 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 23 Feb 2002 01:45:58 +0100 Subject: [Python-Dev] A little GC confusion In-Reply-To: <150601c1bba1$e22467d0$0500a8c0@boostconsulting.com> References: <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com> <150601c1bba1$e22467d0$0500a8c0@boostconsulting.com> Message-ID: "David Abrahams" writes: > Nice try, but no cigar I'm afraid: copying the tp_is_gc slot from > PyType_Type into my metatype before PyType_Ready() doesn't prevent the > crash. > > Does anyone really understand what happens here? Understand why your code crashes? Because there is a bug in it... To understand what the bug is, one would have to study your code first. Regards, Martin From fdrake@acm.org Sat Feb 23 02:09:33 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 22 Feb 2002 21:09:33 -0500 Subject: [Python-Dev] What's blocking 2.2.1? In-Reply-To: References: <2mwux5fvzx.fsf@starship.python.net> Message-ID: <15478.63965.177519.255626@grendel.zope.com> Tim Peters writes: > I just reassigned that one to me; Fred is off today, and it's shallow (the > docstring got updated to a correct state when I implemented 2.2 dir() Thanks, Tim! -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From David Abrahams" <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com><150601c1bba1$e22467d0$0500a8c0@boostconsulting.com> Message-ID: <013901c1bc14$ba52c940$0500a8c0@boostconsulting.com> ----- Original Message ----- From: "Martin v. Loewis" > "David Abrahams" writes: > > > Nice try, but no cigar I'm afraid: copying the tp_is_gc slot from > > PyType_Type into my metatype before PyType_Ready() doesn't prevent the > > crash. > > > > Does anyone really understand what happens here? > > Understand why your code crashes? I'm not asking that. I'm asking if anyone really understands how the flags and tp_xxx slots are supposed to interact. > Because there is a bug in it... I /guess/ there's a bug in my code if you measure it against the standard that says "if it doesn't work with the current Python source code, it's buggy". I'd consider that standard a bit more legitimate if I could find, for example, a mention of Py_TPFLAGS_HEAPTYPE *anywhere* in the Python docs. As it stands, your position seems a bit more unhelpful than neccessary. I can live with incomplete documentation if there's someone around who can explain how the software is supposed to be used; I just want to fill in the holes so that I know I'm not making important errors. I thought I was doing everything right until a few days ago when someone tried something new with my code and uncovered the GC crash. One can only cover so many cases with tests. Even if I repair this problem, how can I be sure I've got the rest of the formula right? Better docs would fix that problem, and give us an objective standard against which to judge which code has bugs. In lieu of that, I would hope that my questions would be answered in good faith. [In the meantime, GC remains turned off for my types and metatypes] > To understand what the bug is, one would have to study your code first. I posted the code yesterday. Did you miss it? 
I'm sure you could figure out how to apply the simple modification described at the top of this message. -Dave From jeremy@alum.mit.edu Sat Feb 23 03:51:28 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 22 Feb 2002 22:51:28 -0500 Subject: [Python-Dev] A little GC confusion In-Reply-To: <013901c1bc14$ba52c940$0500a8c0@boostconsulting.com> Message-ID: [David Abrahams] > I /guess/ there's a bug in my code if you measure it against the standard > that says "if it doesn't work with the current Python source code, it's > buggy". I'd consider that standard a bit more legitimate if I could find, > for example, a mention of Py_TPFLAGS_HEAPTYPE *anywhere* in the > Python docs. I've been struggling with the meaning of the various TPFLAGS myself. I don't think it's documented anywhere, and I don't think anyone except Guido really understands what all the flags mean. One property of types that do not have define HEAPTYPE is that their __module__ attribute is always __builtin__. This makes them mighty hard to pickle. It further suggests that every type that isn't a builtin type should define HEAPTYPE. There are lots of other cases affected by HEAPTYPE. I imagine you've done the same grep that I did. Jeremy From jeremy@alum.mit.edu Sat Feb 23 03:59:58 2002 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 22 Feb 2002 22:59:58 -0500 Subject: [Python-Dev] A little GC confusion In-Reply-To: Message-ID: [I wrote:] > One property of types that do not have define HEAPTYPE is that their > __module__ attribute is always __builtin__. This makes them > mighty hard to > pickle. It further suggests that every type that isn't a builtin type > should define HEAPTYPE. I don't think I made much sense above. I meant to say: When my C types didn't define HEAPTYPE, it was impossible to pickle them. When I added the HEAPTYPE and defined __safe_for_unpickling__ as a data member, it became possible to pickle instances of those types. 
It was far from obvious, though, that I needed to do those two things to make pickling work. Jeremy From tim.one@comcast.net Sat Feb 23 05:10:42 2002 From: tim.one@comcast.net (Tim Peters) Date: Sat, 23 Feb 2002 00:10:42 -0500 Subject: [Python-Dev] A little GC confusion In-Reply-To: Message-ID: [Jeremy Hylton] > I've been struggling with the meaning of the various TPFLAGS myself. I > don't think it's documented anywhere, and I don't think anyone > except Guido really understands what all the flags mean. I agree that at this point Guido is the only one who fully understands what they were all *intended* to mean, but I don't believe even Guido can tell you (without the same kinds of study and experimentation and hair-pulling you're doing) what the flags actually do today in all circumstances and combinations. A consequence is that neither can he (or anyone else) always predict what you need to do to get a desired result. What shipped in 2.2 was solid to the extent that it supported everything used by the Python core. You and David are pushing it in other directions, and while it was intended to support them, this stuff was never really *tried* at the C level beyond the demo xxsubtype.c module and some ExtensionClass fiddling. Most "weird experiments" were tried at the Python level instead, just because it's so much more time-efficient to try stuff in Python, and time was in short supply. So you're pioneers, and you've got to draw your own maps of the new territory. Luckily, God isn't resting yet, so He can still create new lifeforms if needed . > One property of types that do not have define HEAPTYPE is that their > __module__ attribute is always __builtin__. This makes them > mighty hard to pickle. It further suggests that every type that isn't > a builtin type should define HEAPTYPE. Yup, all kinds of questions get answered by "does it have HEAPTYPE?" that don't have any obvious connection to heaps. 
One of my favorites is this seemingly straightforward branch in type_repr(): if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) kind = "class"; else kind = "type"; The philosophical questions that raises could go on for pages . From adam@isdn.net.il Sat Feb 23 08:28:49 2002 From: adam@isdn.net.il (adam) Date: Sat, 23 Feb 2002 10:28:49 +0200 Subject: [Python-Dev] remove Message-ID: <002001c1bc44$1f0e1860$0101c80a@LocalHost> This is a multi-part message in MIME format. ------=_NextPart_000_001D_01C1BC54.E1FFD880 Content-Type: text/plain; charset="x-user-defined" Content-Transfer-Encoding: quoted-printable pls remove us from yr mail list !! >>>>> "adam" =3D=3D writes: adam> warning before a legal claim adam> Remove us from your announcements list !! You are not on any "announcements list". You are, however on python-dev@python.org which is a technical mailing list for developers of the free Python programming language. If you want to be removed from that, please let me know. Please do not mail administrivia requests to the whole mailing list. -Barry ------=_NextPart_000_001D_01C1BC54.E1FFD880 Content-Type: text/html; charset="x-user-defined" Content-Transfer-Encoding: quoted-printable

------=_NextPart_000_001D_01C1BC54.E1FFD880-- From mwh@python.net Sat Feb 23 08:46:43 2002 From: mwh@python.net (Michael Hudson) Date: 23 Feb 2002 08:46:43 +0000 Subject: [Python-Dev] What's blocking 2.2.1? In-Reply-To: Jack Jansen's message of "Fri, 22 Feb 2002 16:07:10 +0100" References: Message-ID: <2m1yfcmvpo.fsf@starship.python.net> Jack Jansen writes: > On Friday, February 22, 2002, at 03:10 , Michael Hudson wrote: > > > I think I'm caught up on porting fixes from the trunk to the > > release22-maint branch. Now would be a good time to shout if you > > think I've missed something (although I might not read my email before > > Monday). > > > > (I've entirely ignored the Mac subtree here. Jack, that's your > > problem, I'm afraid). > > I'll do a quick check to see whether there's anything that is vital for > Mac OS X unix Python that has to go in, I'll let you know. Fine. I've done the first one. > I think MacPython will have to be done after the unix/win distribution > has been made, but that depends on your timeframe (i.e. if you can hand > the Mac/ portion of the tree over to me real soon now I can try and > squeeze the time in). ? Sorry, you've lost me. I'm afraid. The Mac/ portion of the tree is yours. Cheers, M. -- That's why the smartest companies use Common Lisp, but lie about it so all their competitors think Lisp is slow and C++ is fast. (This rumor has, however, gotten a little out of hand. :) -- Erik Naggum, comp.lang.lisp From mwh@python.net Sat Feb 23 08:58:10 2002 From: mwh@python.net (Michael Hudson) Date: 23 Feb 2002 08:58:10 +0000 Subject: [Python-Dev] What's blocking 2.2.1? In-Reply-To: Tim Peters's message of "Fri, 22 Feb 2002 18:55:05 -0500" References: Message-ID: <2mbsegtw0t.fsf@starship.python.net> Tim Peters writes: > [Michael Hudson] > > I think I'm caught up on porting fixes from the trunk to the > > release22-maint branch. > > Voluminous thanks for the work, Michael! It's now *really* easy for me to get checkins across to the branch. 
Does anyone else use gnus to read their -checkins mail? I could probably put the little bundle of scripts I use under Tools/scripts/. Cheers, M. -- In short, just business as usual in the wacky world of floating point. -- Tim Peters, comp.lang.python From martin@v.loewis.de Sat Feb 23 10:51:57 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 23 Feb 2002 11:51:57 +0100 Subject: [Python-Dev] A little GC confusion In-Reply-To: <150601c1bba1$e22467d0$0500a8c0@boostconsulting.com> References: <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com> <150601c1bba1$e22467d0$0500a8c0@boostconsulting.com> Message-ID: "David Abrahams" writes: > Nice try, but no cigar I'm afraid: copying the tp_is_gc slot from > PyType_Type into my metatype before PyType_Ready() doesn't prevent the > crash. > > Does anyone really understand what happens here? After studying your code in a debugger, it turns out that the code now crashes for a different reason: The "AA" class created in make_class is traversed in subtract_refs. To do so, its type's traverse function is invoked, i.e. classtype_meta_object.tp_traverse. This is a null pointer, hence the crash. If you want objects (in your case, classes) to participate in GC, the type of the objects (in your case, the metaclass) needs to implement the GC API. IOW, don't set Py_TPFLAGS_HAVE_GC in a type unless you also set tp_clear and tp_traverse in the same type, see http://www.python.org/doc/current/api/supporting-cycle-detection.html for details. This likely has been the problem all the time; if I remove tp_is_gc, but implement tp_traverse, your test case (import AA,pdb) does not crash anymore. BTW, gcc rejects the code you've posted, as you cannot use PyType_Type.tp_basicsize in an initializer of a global object (it's not a constant). HTH, Martin
From Jack.Jansen@oratrix.com Mon Feb 25 15:23:09 2002 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Mon, 25 Feb 2002 16:23:09 +0100 Subject: [Python-Dev] Pthreads wizard needed to look at bug #522393 Message-ID: <9314C5E0-2A03-11D6-8301-0030655234CE@oratrix.com> Folks, in the process of testing 2.2.1 I ran into a bug on SGI that has been there since at least 2.2: if you build with threads you get an undefined error on pthread_detach() while linking the interpreter. I think the solution is to change the autoconf test that decides which libraries to add for pthread (make it refer not only the thread_create() but also thread_detach()), but as pthread_detach apparently isn't available in all pthread implementations (if I understand the small forest of ifdefs inside thread_pthread.h correctly) I'm not sure this won't break anything else. Could some pthread guru please have a look at bug #522393? -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From skip@pobox.com Mon Feb 25 15:53:34 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 25 Feb 2002 09:53:34 -0600 Subject: [Python-Dev] PEP 282: A Logging System -- comments please In-Reply-To: <20020215164333.A31903@ActiveState.com> References: <20020215164333.A31903@ActiveState.com> Message-ID: <15482.24062.951258.55849@12-248-41-177.client.attbi.com> Trent> More standard Handlers may be implemented if deemed desirable and Trent> feasible. Other interesting candidates: ... Trent> - SyslogHandler: Akin to log4j's SyslogAppender. I'd implement at least this one, both as a proof of concept and to provide a standard mapping between the levels used in your logger and those syslog provides.
Skip From skip@pobox.com Mon Feb 25 16:45:17 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 25 Feb 2002 10:45:17 -0600 Subject: [Python-Dev] Rattlesnake progress In-Reply-To: References: <20020218145806.A26111@glacier.arctrix.com> Message-ID: <15482.27165.395216.650499@12-248-41-177.client.attbi.com> Tim> Excellent advice that almost nobody follows <0.5 wink>: choose a Tim> flexible intermediate representation, then structure all your Tim> transformations as independent passes, such that the output of Tim> every pass is acceptable as the input to every pass. I did this with my peephole optimizer. It worked great. Each peephole optimization is a very simple subclass of an OptimizeFilter base class. The IR is essentially the bytecode split into basic blocks, with each basic block a list of (opcode argument) tuples. Jump targets are represented as simple indexes into the block list. (In fact, my Rattlesnake converter was just a "peephole optimizer" named InstructionSetConverter.) As Tim mentioned about KISS, this means you sometimes have to run particular optimizations or groups of optimizations multiple times. I want to get it checked into the sandbox where others can play with it, but time has shifted its foundation a tad and a couple optimizations don't work any longer. Skip From pedronis@bluewin.ch Mon Feb 25 17:01:32 2002 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 25 Feb 2002 18:01:32 +0100 Subject: [Python-Dev] Re: Jython bugs or features? Message-ID: <00cb01c1be1e$159b9a60$6d94fea9@newmexico> Dinu Gherman wrote in message gherman-035CF3.16342425022002@news.t-online.com... > Hi, > > I'm trying to get a little more familiar with Jython, but after having > just installed the 2.1 final release on OS X I find the following > rather surprising differences between Jython and CPython (2.1 and 2.2 > respectively, but that doesn't matter here): > > 1. cStringIO.StringIO.reset() is missing in Jython > 2. 
list() doesn't perform as expected in Jython [namely it does not do a copy :( ] > > Are these bugs and if so: how many people are actually using Jython, > then, especially as books start to come out for it from O'Reilly and > New Riders...? > Yes, they are bugs. Please report them. Since the 2.1 release (two months ago) there have been approximately 10000 downloads. How are such bugs possible? 1. the typical idiom for list(l) is l[:] (mildly irrelevant I know) 2. if you check the test suite for CPython 2.1 and Jython there is no test for list(l) behavior 3. the new reset method is also not tested by the test suite and is not reported in the NEWS file (These are the result of some grepping, maybe I'm wrong but given the bugs probably I'm right). We try to follow the big picture and the PEPs and check the NEWS file but for the rest the test suites are our best hope vs. delusional friend. If a feature is old, under-used, or through usage the bug does not show through, and un-tested things become tricky. It is open source: Jython is as conforming as its community and CPython community ACTIVELY want it to be. regards, Samuele Pedroni. From barry@zope.com Mon Feb 25 17:41:26 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 12:41:26 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? Message-ID: <15482.30534.640750.875602@anthem.wooz.org> I'm still woefully behind on my email since returning from vacation, but I thought I'd rehash a bit on PEP 215, string interpolation, given some recent hacking and thinking about stuff we talked about at IPC10. Background: PEP 215 has some interesting ideas, but IMHO is more than I'm comfortable with. At IPC10, Guido described his rules for string interpolation as they would be if his time machine were more powerful. These follow some discussions we've had during various Zope sprints about making the rules simpler for non-programmers to understand.
I've also been struggling with how error prone %(var)s substitutions can be in the thru-the-web Mailman strings where this is supported. Here's what I've come up with. Guido's rules for $-substitutions are really simple: 1. $$ substitutes to just a single $ 2. $identifier followed by non-identifier characters gets interpolated with the value of the 'identifier' key in the substitution dictionary. 3. For handling cases where the identifier is followed by identifier characters that aren't part of the key, ${identifier} is equivalent to $identifier. And that's it. For the sake of discussion, forget about where the dictionary for string interpolation comes from. I've hacked together 4 functions which I'm experimentally using to provide these rules in thru-the-web string editing, and also for sanity checking the strings as they're submitted. I think there's a fairly straightforward conversion between traditional %-strings and these newfangled $-strings, and so two of the functions do the conversions back and forth. The second two functions attempt to return a list of all the substitution variables found in either a %-string or a $-string. I match this against the list of known legal substitution variables, and bark loudly if there's some mismatch. The one interesting thing about %-to-$ conversion is that the regexp I use leaves the trailing `s' in %(var)s as optional, so I can auto-correct for those that are missing. I think this was an idea that Paul Dubois came up with during the lunch discussion. Seems to work well, and I can do a %-to-$-to-% roundtrip; if the strings at the ends are the same then there weren't any missing `s's, otherwise the conversion auto-corrected and I can issue a warning. This is all really proto-stuff, but I've done some limited testing and it seems to work pretty well.
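As a rough illustration of the three rules listed above, here is a minimal sketch that applies them directly in one regular-expression pass, rather than converting to %-strings first. The names `interpolate` and `_DOLLAR` are hypothetical and do not come from the posted module; the sketch also assumes that a lone `$` matching none of the rules is simply left in place.

```python
import re

# Hypothetical helper, not part of the posted code: one alternation per rule.
_DOLLAR = re.compile(r'\$(?:(\$)|([_a-z]\w*)|\{([_a-z]\w*)\})', re.IGNORECASE)

def interpolate(template, mapping):
    """Apply the $$, $identifier and ${identifier} rules to template."""
    def replace(match):
        escaped, plain, braced = match.groups()
        if escaped:
            return '$'                        # rule 1: $$ -> $
        return str(mapping[plain or braced])  # rules 2 and 3: look up the key
    return _DOLLAR.sub(replace, template)

print(interpolate('$$5 to $name, ${who}ever', {'name': 'Joe', 'who': 'what'}))
# prints: $5 to Joe, whatever
```

The braces in `${who}ever` are what keep the key `who` separate from the trailing identifier characters, which is the case rule 3 exists for.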
So without changing the language we can play with $-strings using Guido's rules to see if we like them or not, by simply converting them to traditional %-strings manually, and then doing the mod-operator substitutions. Hopefully I've extracted the right bits of code from my modules for you to get the idea. There may be bugs. -Barry

-------------------- snip snip --------------------
import re
from string import digits

try:
    # Python 2.2
    from string import ascii_letters
except ImportError:
    # Older Pythons
    _lower = 'abcdefghijklmnopqrstuvwxyz'
    ascii_letters = _lower + _lower.upper()

# Search for %(identifier)s strings, except that the trailing s is optional,
# since that's a common mistake
cre = re.compile(r'%\(([_a-z]\w*?)\)s?', re.IGNORECASE)
# Search for $$, $identifier, or ${identifier}
dre = re.compile(r'(\${2})|\$([_a-z]\w*)|\${([_a-z]\w*)}', re.IGNORECASE)

IDENTCHARS = ascii_letters + digits + '_'
EMPTYSTRING = ''

# Utilities to convert from simplified $identifier substitutions to/from
# standard Python %(identifier)s substitutions.  The "Guido rules" for the
# former are:
#    $$ -> $
#    $identifier -> %(identifier)s
#    ${identifier} -> %(identifier)s

def to_dollar(s):
    """Convert from %-strings to $-strings."""
    s = s.replace('$', '$$')
    parts = cre.split(s)
    # Odd indexes hold the captured identifiers; even indexes hold the text.
    for i in range(1, len(parts), 2):
        if parts[i+1] and parts[i+1][0] in IDENTCHARS:
            parts[i] = '${' + parts[i] + '}'
        else:
            parts[i] = '$' + parts[i]
    return EMPTYSTRING.join(parts)

def to_percent(s):
    """Convert from $-strings to %-strings."""
    s = s.replace('%', '%%')
    parts = dre.split(s)
    # Each match contributes three groups, then the following text.
    for i in range(1, len(parts), 4):
        if parts[i] is not None:
            parts[i] = '$'
        elif parts[i+1] is not None:
            parts[i+1] = '%(' + parts[i+1] + ')s'
        else:
            parts[i+2] = '%(' + parts[i+2] + ')s'
    return EMPTYSTRING.join(filter(None, parts))

def dollar_identifiers(s):
    """Return the set (dictionary) of identifiers found in a $-string."""
    d = {}
    for name in filter(None, [b or c or None for a, b, c in dre.findall(s)]):
        d[name] = 1
    return d

def percent_identifiers(s):
    """Return the set (dictionary) of identifiers found in a %-string."""
    d = {}
    for name in cre.findall(s):
        d[name] = 1
    return d
-------------------- snip snip --------------------
Python 2.2 (#1, Dec 24 2001, 15:39:01)
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dollar
>>> dollar.to_dollar('%(one)s %(two)three %(four)seven')
'$one ${two}three ${four}even'
>>> dollar.to_percent(dollar.to_dollar('%(one)s %(two)three %(four)seven'))
'%(one)s %(two)sthree %(four)seven'
>>> dollar.percent_identifiers('%(one)s %(two)three %(four)seven')
{'four': 1, 'two': 1, 'one': 1}
>>> dollar.dollar_identifiers(dollar.to_dollar('%(one)s %(two)three %(four)seven'))
{'four': 1, 'two': 1, 'one': 1}

From mal@lemburg.com Mon Feb 25 18:08:48 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 25 Feb 2002 19:08:48 +0100 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> Message-ID: <3C7A7DB0.A8A3416E@lemburg.com> "Barry A. Warsaw" wrote: > > Background: PEP 215 has some interesting ideas, but IMHO is more than > I'm comfortable with. At IPC10, Guido described his rules for string > interpolation as they would be if his time machine were more powerful. > These follow some discussions we've had during various Zope sprints > about making the rules simpler for non-programmers to understand. > I've also been struggling with how error prone %(var)s substitutions > can be in the thru-the-web Mailman strings where this is supported. > Here's what I've come up with. > > Guido's rules for $-substitutions are really simple: > > 1. $$ substitutes to just a single $ > > 2. $identifier followed by non-identifier characters gets interpolated > with the value of the 'identifier' key in the substitution > dictionary. > > 3. For handling cases where the identifier is followed by identifier > characters that aren't part of the key, ${identfier} is equivalent > to $identifier. > > And that's it. For the sake of discussion, forget about where the > dictionary for string interpolation comes from. Wouldn't it be a lot simpler and more inline with what we already have, if we'd use '%' as escape characters ? 1. %% becomes % 2. %ident maps to %(ident)s as we have it now 3. %{ident} maps to %(ident)s 4. %(ident)s continues to have the same semantics as before -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Mon Feb 25 18:33:32 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 25 Feb 2002 13:33:32 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: Your message of "Mon, 25 Feb 2002 19:08:48 +0100." 
<3C7A7DB0.A8A3416E@lemburg.com> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> Message-ID: <200202251833.g1PIXW912610@pcp742651pcs.reston01.va.comcast.net> [Barry] > > Guido's rules for $-substitutions are really simple: > > > > 1. $$ substitutes to just a single $ > > > > 2. $identifier followed by non-identifier characters gets interpolated > > with the value of the 'identifier' key in the substitution > > dictionary. > > > > 3. For handling cases where the identifier is followed by identifier > > characters that aren't part of the key, ${identfier} is equivalent > > to $identifier. > > > > And that's it. For the sake of discussion, forget about where the > > dictionary for string interpolation comes from. [MAL] > Wouldn't it be a lot simpler and more inline with what we > already have, if we'd use '%' as escape characters ? > > 1. %% becomes % > > 2. %ident maps to %(ident)s as we have it now > > 3. %{ident} maps to %(ident)s > > 4. %(ident)s continues to have the same semantics as > before That's not simpler, it's more complicated. Any tool dealing with these will have to understand all the rules. The point of switching to $ is twofold: (1) it avoids confusion with the old %-based syntax (which can continue to exist for different purposes), (2) it is familiar to people who have seen substitution in other languages. $ is nearly universal (Perl, Tcl, Ruby, shell, etc.) --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Mon Feb 25 18:48:32 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 13:48:32 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> Message-ID: <15482.34560.688685.262327@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> 1. %% becomes % MAL> 2. %ident maps to %(ident)s as we have it now MAL> 3. %{ident} maps to %(ident)s MAL> 4. 
%(ident)s continues to have the same semantics as MAL> before What happens to %dogfood or %sickpuppy? If you're trying to maintain backwards compatibility with existing syntax, you can't use %ident strings. -Barry From mal@lemburg.com Mon Feb 25 19:25:59 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 25 Feb 2002 20:25:59 +0100 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> Message-ID: <3C7A8FC7.9CB321EE@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "MAL" == M writes: > > MAL> 1. %% becomes % > > MAL> 2. %ident maps to %(ident)s as we have it now > > MAL> 3. %{ident} maps to %(ident)s > > MAL> 4. %(ident)s continues to have the same semantics as > MAL> before > > What happens to %dogfood or %sickpuppy? If you're trying to maintain > backwards compatibility with existing syntax, you can't use %ident > strings. That's what I was trying to achieve. The only gripe I sometimes have with '%(ident)s' is that users forget the 's' behind '%(ident)'; I'd be ok with dropping 2. and only adding 3. Whatever you do, just please don't mix the old and new semantics... 'Joe has $ %(a)5.2f in his pocket.' % locals() is perfectly valid now and should continue to be valid. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From barry@zope.com Mon Feb 25 19:28:13 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 14:28:13 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> Message-ID: <15482.36941.605165.133988@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> Whatever you do, just please don't mix the old and new MAL> semantics... MAL> 'Joe has $ %(a)5.2f in his pocket.' % locals() MAL> is perfectly valid now and should continue to be valid. I agree completely; it ought to be one or the other. In the code I emailed, you actually had to do a conversion step from $-strings to %-strings to use the build-in string-mod operator. In practice, if $-strings were to be added to the language, I suspect some new prefix would have to designate a new type of string object, e.g. $'' strings. Or perhaps a different binary operator could be used. -Barry From mal@lemburg.com Mon Feb 25 19:44:48 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 25 Feb 2002 20:44:48 +0100 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> Message-ID: <3C7A9430.1E1077F8@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "MAL" == M writes: > > MAL> Whatever you do, just please don't mix the old and new > MAL> semantics... > > MAL> 'Joe has $ %(a)5.2f in his pocket.' % locals() > > MAL> is perfectly valid now and should continue to be valid. > > I agree completely; it ought to be one or the other. In the code I > emailed, you actually had to do a conversion step from $-strings to > %-strings to use the build-in string-mod operator. In practice, if > $-strings were to be added to the language, I suspect some new prefix > would have to designate a new type of string object, e.g. $'' > strings. Or perhaps a different binary operator could be used. Good. 
Since the strings themselves don't really change and to avoid confusing string modifiers... ur$'my $format \$tring' I'd suggest to use a new operator, e.g. 'Joe has $$ $a in his pocket.' $ locals() -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From barry@zope.com Mon Feb 25 19:55:29 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 14:55:29 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> Message-ID: <15482.38577.933015.221824@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> 'Joe has $$ $a in his pocket.' $ locals() I'd prefer to hijack an existing operator -- one that's unsupported by the string object. Perhaps / or - or & or | ? -Barry From mal@lemburg.com Mon Feb 25 20:07:02 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 25 Feb 2002 21:07:02 +0100 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> Message-ID: <3C7A9966.CE6C7CCB@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "MAL" == M writes: > > MAL> 'Joe has $$ $a in his pocket.' $ locals() > > I'd prefer to hijack an existing operator -- one that's unsupported by > the string object. Perhaps / or - or & or | '/' looks nice and has this "interpret under" sort of meaning: 'Joe has $$ $a in his pocket.' 
/ locals() If you are more into algebra, then '*' would probably also appeal to the eye: 'Joe has $$ $a in his pocket.' * locals() -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From fdrake@acm.org Mon Feb 25 20:08:49 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 25 Feb 2002 15:08:49 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <3C7A9966.CE6C7CCB@lemburg.com> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9966.CE6C7CCB@lemburg.com> Message-ID: <15482.39377.989664.700535@grendel.zope.com> M.-A. Lemburg writes: > '/' looks nice and has this "interpret under" sort of meaning: > > 'Joe has $$ $a in his pocket.' / locals() I'd read that more as "mapped over" rather than "interpret under". ;) > If you are more into algebra, then '*' would probably also appeal > to the eye: > > 'Joe has $$ $a in his pocket.' * locals() But * is already meaningful for strings, so not a good choice. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@zope.com Mon Feb 25 20:10:59 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 15:10:59 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9966.CE6C7CCB@lemburg.com> Message-ID: <15482.39507.311309.96141@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> '/' looks nice and has this "interpret under" sort of MAL> meaning: MAL> 'Joe has $$ $a in his pocket.' / locals() I agree, I like that one. MAL> If you are more into algebra, then '*' would probably also MAL> appeal to the eye: MAL> 'Joe has $$ $a in his pocket.' * locals() I avoid it because then you'd have to add another type test to operator-*. Ping, if you're around and care to comment, perhaps we can try to update PEP 215 and maybe add a reference implementation? -Barry From mal@lemburg.com Mon Feb 25 20:23:04 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 25 Feb 2002 21:23:04 +0100 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9966.CE6C7CCB@lemburg.com> <15482.39507.311309.96141@anthem.wooz.org> Message-ID: <3C7A9D28.DE993EEC@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "MAL" == M writes: > > MAL> '/' looks nice and has this "interpret under" sort of > MAL> meaning: > > MAL> 'Joe has $$ $a in his pocket.' / locals() > > I agree, I like that one. Fine with me. 
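To make the '/' spelling concrete, here is a small sketch of how it might behave, written for a present-day Python 3 rather than the interpreter under discussion. It assumes the standard library's `string.Template`, which implements the same `$$`, `$identifier` and `${identifier}` rules; the class name `DollarString` is hypothetical and is not something proposed in this thread.

```python
from string import Template

class DollarString(str):
    """Hypothetical str subclass: '/' interpolates $-placeholders."""

    def __truediv__(self, mapping):
        # Delegate the $-rules to string.Template; a missing key raises KeyError.
        return Template(self).substitute(mapping)

a = 5
print(DollarString('Joe has $$ $a in his pocket.') / {'a': a})
# prints: Joe has $ 5 in his pocket.
```

An explicit dictionary is passed here instead of `locals()` only to keep the example self-contained.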
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From pedronis@bluewin.ch Mon Feb 25 20:18:36 2002 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 25 Feb 2002 21:18:36 +0100 Subject: [Python-Dev] Re: Jython bugs or features? Message-ID: <021a01c1be39$9cfb1b00$6d94fea9@newmexico> [Dinu Gherman] > > I'm pretty surprised! I knew the Python test suite is *very* far > from complete, but list() is an *extremely* crucial function, > isn't it? Given how things concretely are and not considering some abstract point of view, I would say YMMV. I have skimmed over CPython 2.1 std lib, honestly there are not many places relevant to Jython where list for copying lists is used and where the bug can show through. Consider things like: a += list(b) OR b = ... + list(b) and all the places where l = list(sq) and sq is not a list, those works in Jython. Btw here simply Jython was never up to speed (that means always implemented the wrong semantics) and nobody noticed this or to be precise ever reported this. OTOH we try to be responsive to bug reports. Yes the test suites are *far* from optimal and CPython sometimes regresses too. > > My impression is that Jython claims to be an implementation of > Python in Java. Yes but your PS (see below) shows that this can mean a lot of things, Jython is more about writing new programs and Java integration (which is a big and non-easy part and of its codebase) than for example supporting completely CPython os and shell-like programming and allowing effortless porting of all kinds of CPython programs (clearly list bug is another kind of issue given that the right behaviour is well documented). Java integration has a higher priority than those thing. Jython is more about embracing the Java platform (philosophy) than work-arounding it. 
For example anything that would require writing JNI glue and native C code is simply discarded. There are things for which CPython is simply better suited than Java/Jython and the other way around. They are not fully equivalent substitutes. > Everybody understands the existence of bugs, but > if functions are missing I'm not sure there is sufficient quality > control, leave alone a useful test suite. > > My bold guess is that > it should be very easy to check automatically for each module in > the std.lib. at least if the same classes and methods do exist > in CPython and JPython. > > Honestly, I don't think it makes much sense to maintain two > code bases without some degree of automatic testing... JPython is not born with such a test, and Jython until now has never grown one. But I have taken note of this adding a feature request for such a test but is more a matter of resources and priorities than easiness. Btw after some checking of the CPython CVS http://groups.google.com/groups?q=g:thl2649249624d&hl=en&selm=mailman.100742 5027.30019.python-list%40python.org (look at the change in the module-level __doc__ ) it seems that reset is vestigial and one should use seek(0) instead. reset is not supported by StringIO or by files and not documented (not after 1.5 for sure). You made me think that it was added in 2.1, so at least IMO it is an option for Jython to have decided not to support it. > > When seeing such bugs, my immediate reaction (like that of most > others) is to think that not many people can be using this serious- > ly. > See e.g.: http://groups.google.com/groups?q=g:thl2649249624d&hl=en&selm=mailman.100742 5027.30019.python-list%40python.org and thread. > PS: BTW, how about this one: > > [localhost:~] dinu% jython > Jython 2.1 on java1.3.1 (JIT: null) > Type "copyright", "credits" or "license" for more information. > >>> import os > >>> os.chdir('..') > Traceback (innermost last): > File "", line 1, in ? 
> File ".../Jython-2.1/Lib/javaos.py", line 56, in chdir
> OSError: [Errno 0] chdir not supported in Java: ..

We don't support this one, sorry.

From paul@prescod.net Mon Feb 25 20:31:55 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 12:31:55 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org>
Message-ID: <3C7A9F3B.B42265DC@prescod.net>

"Barry A. Warsaw" wrote:
>
> >>>>> "MAL" == M writes:
>
>     MAL> 'Joe has $$ $a in his pocket.' $ locals()
>
> I'd prefer to hijack an existing operator -- one that's unsupported by
> the string object. Perhaps / or - or & or |

Yuck!

String interpolation should be a *compile time* action, not an
operator. One of the goals, in my mind, is to allow people to string
interpolate without knowing what the locals() function does. After all,
that function is otherwise useless for most Python programmers (and
should probably be moved to an introspection module).

Your strategy requires the naive user to learn a) the $ syntax, b) the
magic operator syntax and c) the meaning of the locals() function. Plus
you've thrown away the idea that interpolation works as it does in the
shell or in Perl/Awk/Ruby etc.

At that point, in my mind, we're back where we started and should just
use %. We'll have reinvented it with a few small tweaks.

Plus, operator-based evaluation has some security implications that
compile time evaluation does not. In particular, if the left-hand thing
can be any string then you have the potential of accidentally allowing
the user to supply a string that allows him/her to introspect your local
variables. That can't happen if the interpolation is done at compile
time.
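
[Editor's sketch, not part of the original message.] The security concern raised in the last paragraph can be illustrated with the existing % operator, written here in present-day Python 3 syntax rather than the Python 2 of the thread. The function name show_banner and the sample values are hypothetical; the only assumption is that the template string arrives from somewhere other than the source code.

```python
def show_banner(template, name, secret):
    # Runtime interpolation: 'template' is ordinary data, so whoever
    # controls it can name any key in the mapping handed to %.
    # locals() here holds template, name and secret.
    return template % locals()

# The template the programmer intended.
intended = show_banner("Hello, %(name)s", "Joe", "hunter2")

# A template supplied by someone else reads a variable that was
# never meant to be displayed.
leaked = show_banner("Hello, %(secret)s", "Joe", "hunter2")

print(intended)
print(leaked)
```

With a compile-time form the template would have to be a literal in the source, so the second call could not be constructed from outside input.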
Paul Prescod

From fredrik@pythonware.com Mon Feb 25 20:44:13 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 25 Feb 2002 21:44:13 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net>
Message-ID: <07fe01c1be3d$32699fb0$ced241d5@hagrid>

paul wrote:
> Your strategy requires the naive user to learn a) the $ syntax, b) the
> magic operator syntax and c) the meaning of the locals() function. Plus
> you've thrown away the idea that interpolation works as it does in the
> shell or in Perl/Awk/Ruby etc.
>
> At that point, in my mind, we're back where we started and should just
> use %.

    # interpolate!
    s = I('Joe has $ ', a, ' in his pocket.')

or perhaps

    # print-like interpolation
    s = P('Joe has $', a, 'in his pocket.')

works pretty well too. in all versions of python, with all
existing syntax-aware tools. and if written in C, it's probably
as fast as any other solution...

(implementing I/P is left as an exercise etc etc)

From nas@python.ca Mon Feb 25 20:50:39 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 25 Feb 2002 12:50:39 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <3C7A9F3B.B42265DC@prescod.net>; from paul@prescod.net on Mon, Feb 25, 2002 at 12:31:55PM -0800 References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> Message-ID: <20020225125039.A22519@glacier.arctrix.com> Paul Prescod wrote: > At that point, in my mind, we're back where we started and should just > use %. I agree. Neil From fdrake@acm.org Mon Feb 25 20:55:15 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 25 Feb 2002 15:55:15 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <3C7A9F3B.B42265DC@prescod.net> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> Message-ID: <15482.42163.449374.118807@grendel.zope.com> Paul Prescod writes: > String interopolation should be a *compile time* action, not an > operator. One of the goals, in my mind, is to allow people to string This doesn't work as soon as the string is not a constant. Many of the discussions at PythonLabs did not involve text included as part of an application's source, and the conversion operation would not be driven by application code but by library/service code. Even if it were a constant, needing to add in message catalog support changes things as well. So auto-magical interpolation doesn't seem like a good idea. > interpolate without knowing what the locals() function does. After all, > that function is otherwise useless for most Python programmers (and > should probably be moved to an introspection module). 
You'd still only need to use locals() if that's your source of variables. > Your strategy requires the naive user to learn a) the $ syntax, b) the > magic operator syntax and c) the meaning of the locals() function. Plus > you've thrown away the idea that interpolation works as it does in the > shell or in Perl/Awk/Ruby etc. a) The $ syntax is easier than the % syntax, and already more familiar to most new users. b) What's a magic operator? string % mapping is already pretty magical as far as the modulus operation is concerned. c) And you still don't have to use locals() if you don't want to. And the string syntax matches a common subset of what's used elsewhere. We just have the added control over the source of substitution values (a good thing). > At that point, in my mind, we're back where we started and should just > use %. Well have reinvented it with a few small tweaks. And we've made it a lot easier for strings that are not part of Python source code, and for people who produce that data but never know Python. > Plus, operator-based evaluation has some security implications that > compile time evaluation does not. In particular, if the left-hand thing > can be any string then you have the potential of accidentally allowing > the user to supply a string that allows him/her to introspect your local > variables. That can't happen if the interpolation is done at compile > time. I'm not sure I understand this. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From jepler@unpythonic.dhs.org Mon Feb 25 21:01:06 2002 From: jepler@unpythonic.dhs.org (Jeff Epler) Date: Mon, 25 Feb 2002 15:01:06 -0600 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
In-Reply-To: <3C7A9F3B.B42265DC@prescod.net> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> Message-ID: <20020225150106.B22803@unpythonic.dhs.org> On Mon, Feb 25, 2002 at 12:31:55PM -0800, Paul Prescod wrote: > Yuck! > > String interopolation should be a *compile time* action, not an > operator. One of the goals, in my mind, is to allow people to string > interpolate without knowing what the locals() function does. After all, > that function is otherwise useless for most Python programmers (and > should probably be moved to an introspection module). > > Your strategy requires the naive user to learn a) the $ syntax, b) the > magic operator syntax and c) the meaning of the locals() function. Plus > you've thrown away the idea that interpolation works as it does in the > shell or in Perl/Awk/Ruby etc. > > At that point, in my mind, we're back where we started and should just > use %. Well have reinvented it with a few small tweaks. > > Plus, operator-based evaluation has some security implications that > compile time evaluation does not. In particular, if the left-hand thing > can be any string then you have the potential of accidentally allowing > the user to supply a string that allows him/her to introspect your local > variables. That can't happen if the interpolation is done at compile > time. But how do you internationalize your program once you use $-subs? The great strength of %-formats, and the *printf functions that inspired them, are that the interpretation of the format takes place at runtime. 
(printf has added positional specifiers, spelled like "%1$s", to permit
reordering of items in the format, while Python has added
key-specifiers, spelled like "%(id)s", but they're about equally
powerful)

With %-subs, we can write

    def gettext(s):
        """ Return the localized version of s from the message catalog """
        return s

    def print_chance(who, chance):
        print gettext("%(who)s has a %(percent).2f%% chance of surviving") % {
            'who': who, 'percent': chance * 100}

    print_chance("Jeff", 1./3)

I'm not interested in any proposal that turns code that's easy to
internationalize (just add calls to gettext(), commonly spelled _(),
around each string that needs translating, then fix up the places where
the programmer was too clever) into code that's impossible to
internationalize by design.

Jeff

From akuchlin@mems-exchange.org Mon Feb 25 21:04:23 2002
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 25 Feb 2002 16:04:23 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.42163.449374.118807@grendel.zope.com>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com>
Message-ID: <20020225210423.GA2398@crystal.mems-exchange.org>

On Mon, Feb 25, 2002 at 03:55:15PM -0500, Fred L. Drake, Jr. wrote:
>And we've made it a lot easier for strings that are not part of Python
>source code, and for people who produce that data but never know
>Python.

But for applications where people don't edit Python, this could just
be a library module and doesn't need a new operator in the Python
code. I agree with Paul; there's no actual gain in clarity from the
new syntax.
>
> > Plus, operator-based evaluation has some security implications that
> > compile time evaluation does not. In particular, if the left-hand thing
>
>I'm not sure I understand this.

Presumably Paul is thinking of something like:

    mlist = load_list('listname')
    # Lists have .title, .password, ...
    form_value = cgi.form['text']   # User puts $password into text
    print text \ vars(mlist)

--amk                                  (www.amk.ca)
The most merciful thing in the world, I think, is the inability of the
human mind to correlate all its contents.
    -- H.P. Lovecraft, "The Call of Cthulhu"

From guido@python.org Mon Feb 25 21:06:15 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 25 Feb 2002 16:06:15 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: Your message of "Mon, 25 Feb 2002 12:31:55 PST." <3C7A9F3B.B42265DC@prescod.net>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net>
Message-ID: <200202252106.g1PL6FY13393@pcp742651pcs.reston01.va.comcast.net>

> String interpolation should be a *compile time* action, not an
> operator. One of the goals, in my mind, is to allow people to string
> interpolate without knowing what the locals() function does. After all,
> that function is otherwise useless for most Python programmers (and
> should probably be moved to an introspection module).
>
> Your strategy requires the naive user to learn a) the $ syntax, b) the
> magic operator syntax and c) the meaning of the locals() function. Plus
> you've thrown away the idea that interpolation works as it does in the
> shell or in Perl/Awk/Ruby etc.
>
> At that point, in my mind, we're back where we started and should just
> use %. We'll have reinvented it with a few small tweaks.
> > Plus, operator-based evaluation has some security implications that > compile time evaluation does not. In particular, if the left-hand thing > can be any string then you have the potential of accidentally allowing > the user to supply a string that allows him/her to introspect your local > variables. That can't happen if the interpolation is done at compile > time. All right, but there *also* needs to be a way to invoke interpolation explicitly -- just like eval(). This has applicability e.g. in i18n. --Guido van Rossum (home page: http://www.python.org/~guido/) From nas@python.ca Mon Feb 25 21:14:36 2002 From: nas@python.ca (Neil Schemenauer) Date: Mon, 25 Feb 2002 13:14:36 -0800 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <15482.42163.449374.118807@grendel.zope.com>; from fdrake@acm.org on Mon, Feb 25, 2002 at 03:55:15PM -0500 References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> Message-ID: <20020225131436.B22519@glacier.arctrix.com> Fred L. Drake, Jr. wrote: > This doesn't work as soon as the string is not a constant. Many of > the discussions at PythonLabs did not involve text included as part of > an application's source, and the conversion operation would not be > driven by application code but by library/service code. Write a function or use %. This is not a good reason to add a string interpolation operator to the language. Note that this does not mean I'm against PEP 215. PEP 215 proposes to solve a different problem and should not be hijacked, IMHO. 
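
[Editor's sketch, not part of the original message.] The "write a function" route needs no new syntax. As an illustration only: later Python releases (2.4 onward) ship string.Template in the standard library, which performs $-substitution from an explicit mapping at run time. The helper name interpolate below is hypothetical, and the code is Python 3.

```python
from string import Template

def interpolate(text, mapping):
    # Explicit runtime $-substitution: the caller chooses the mapping,
    # so nothing is pulled implicitly from locals().
    return Template(text).substitute(mapping)

# "$$" is an escaped dollar sign; "$amount" is a placeholder.
message = interpolate("$who has $$$amount in his pocket.",
                      {"who": "Joe", "amount": 5})
print(message)

# safe_substitute() leaves unknown placeholders in place instead of
# raising KeyError, which suits templates edited by non-programmers.
partial = Template("$who owes $other").safe_substitute({"who": "Joe"})
print(partial)
```

Because the mapping is passed explicitly, the same function serves both source-code strings and templates loaded from a message catalog.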
Neil From skip@pobox.com Mon Feb 25 21:16:19 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 25 Feb 2002 15:16:19 -0600 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <15482.36941.605165.133988@anthem.wooz.org> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> Message-ID: <15482.43427.691923.976178@12-248-41-177.client.attbi.com> BAW> ... I suspect some new prefix would have to designate a new type of BAW> string object, e.g. $'' strings. Or perhaps a different binary BAW> operator could be used. I'm still not at all fond of the $-string idea, but in the interests of completeness, perhaps using '$' as a binary operator (by analogy with '%' as a binary operator having nothing to do with modulo when the left arg is a string) would be appropriate. Skip From nas@python.ca Mon Feb 25 21:18:57 2002 From: nas@python.ca (Neil Schemenauer) Date: Mon, 25 Feb 2002 13:18:57 -0800 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <20020225150106.B22803@unpythonic.dhs.org>; from jepler@unpythonic.dhs.org on Mon, Feb 25, 2002 at 03:01:06PM -0600 References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> Message-ID: <20020225131857.A22769@glacier.arctrix.com> Jeff Epler wrote: > But how do you internationalize your program once you use $-subs? So don't use them. What's the problem? 
Neil

From jepler@unpythonic.dhs.org Mon Feb 25 21:20:36 2002
From: jepler@unpythonic.dhs.org (Jeff Epler)
Date: Mon, 25 Feb 2002 15:20:36 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.42163.449374.118807@grendel.zope.com>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com>
Message-ID: <20020225152035.C22803@unpythonic.dhs.org>

On Mon, Feb 25, 2002 at 03:55:15PM -0500, Fred L. Drake, Jr. wrote:
> Paul Prescod writes:
> > Plus, operator-based evaluation has some security implications that
> > compile time evaluation does not. In particular, if the left-hand thing
> > can be any string then you have the potential of accidentally allowing
> > the user to supply a string that allows him/her to introspect your local
> > variables. That can't happen if the interpolation is done at compile
> > time.
>
> I'm not sure I understand this.

Imagine that you have:

    def print_crypted_passwd(name, plaintext, salt="Xx"):
        crypted = crypt.crypt(plaintext, salt)
        print _("""%(name)s, your crypted password is %(crypted)s.""") % locals()

and that some crafty devil translates this as

    msgstr "%(name)s, your plaintext password is %(plaintext). HA HA HA"

i.e., the translator (or other person who can influence the format
string) can access other information in the dict you pass in, even if
you didn't intend it.

Personally, I tend to view this as showing that using % locals() is
unsanitary. But that means that the problem is in using the locals()
dictionary, a problem made worse by making the use of locals() implicit.

(And under $-substitution, if locals() is implicit, how do I substitute
with a dictionary other than locals()?
    def print_crypted_passwd(accountinfo):
        print "%(name)s, your crypted password is %(crypted)s." \
            % accountinfo.__dict__
vs
    def print_crypted_passwd(accountinfo):
        def really_subst(name, crypted):
            return $"$name, your crypted password is $crypted"
        print really_subst(accountinfo.name, accountinfo.crypted)
or
    def print_crypted_passwd(accountinfo):
        name = accountinfo.name
        crypted = accountinfo.crypted
        print $"$name, your crypted password is $crypted"
???)

Jeff

From jepler@unpythonic.dhs.org Mon Feb 25 21:26:01 2002
From: jepler@unpythonic.dhs.org (Jeff Epler)
Date: Mon, 25 Feb 2002 15:26:01 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <20020225131857.A22769@glacier.arctrix.com>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com>
Message-ID: <20020225152600.E22803@unpythonic.dhs.org>

On Mon, Feb 25, 2002 at 01:18:57PM -0800, Neil Schemenauer wrote:
> Jeff Epler wrote:
> > But how do you internationalize your program once you use $-subs?
>
> So don't use them. What's the problem?

The problem is when I have to internationalize a program some schmuck
wrote using $-subs throughout.

Jeff

From paul@prescod.net Mon Feb 25 21:37:26 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 13:37:26 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org>
Message-ID: <3C7AAE96.BA4D19FE@prescod.net>

Jeff Epler wrote:
>
> ...
>
> Imagine that you have:
>     def print_crypted_passwd(name, plaintext, salt="Xx"):
>         crypted = crypt.crypt(plaintext, salt)
>         print _("""%(name)s, your crypted password is %(crypted)s.""") % locals()
>
> and that some crafty devil translates this as
>     msgstr "%(name)s, your plaintext password is %(plaintext). HA HA HA"
>
> i.e., the translator (or other person who can influence the format
> string) can access other information in the dict you pass in, even if
> you didn't intend it.

Right. I don't claim that this is a killer problem. I'm actually much
more concerned about the usability aspects. But if we can improve
security at the same time, then let's.

> Personally, I tend to view this as showing that using % locals() is
> unsanitary. But that means that the problem is in using the locals()
> dictionary, a problem made worse by making the use of locals() implicit.

If it is done at compile time then the crafty devil couldn't get in the
alternate string! On the other hand, if you're doing runtime
translation stuff then of course you need to use a runtime function,
like "%" or maybe a new "interpol". I am not against the existence of
such a thing. I'm against it being the default way to do interpolation.
It's like "eval": a compile-time tool that sophisticated users have
access to at runtime.

> (And under $-substitution, if locals() is implicit, how do I substitute
> with a dictionary other than locals()?
Well I don't think you should have to, because you could use the
"interpol" function (maybe from the "interpol" module). But anyhow,
your question has a factual answer and you already gave it!

>     def print_crypted_passwd(accountinfo):
>         def really_subst(name, crypted):
>             return $"$name, your crypted password is $crypted"
>         print really_subst(accountinfo.name, accountinfo.crypted)
> or
>     def print_crypted_passwd(accountinfo):
>         name = accountinfo.name
>         crypted = accountinfo.crypted
>         print $"$name, your crypted password is $crypted"

This last one looks very clear and simple to me! What's the problem
with it?

Still, I don't argue against the need for something at runtime -- as a
power tool. Either we could just keep "%" or make a function.

Okay, so my proposal for $ doesn't do everything that % does. It was
never spec'd to do everything "%" does. For instance it doesn't do
float formatting tricks.

Paul Prescod

From barry@zope.com Mon Feb 25 21:44:02 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 16:44:02 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
Message-ID: <15482.45090.3848.616817@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro writes:

    SM> I'm still not at all fond of the $-string idea, but in the
    SM> interests of completeness, perhaps using '$' as a binary
    SM> operator (by analogy with '%' as a binary operator having
    SM> nothing to do with modulo when the left arg is a string) would
    SM> be appropriate.

I can't say whether it's a good thing to add this to the language or
not. I tend to think that %(var)s is just fine from a Python
programmer's point of view, and in the interest of TOOWTDI, we don't
need anything else.
From a /non-programmer's/ point of view, %(var)s is way too error
prone, and $-strings are an attempt at implementing a simple to
explain, hard to get wrong, rule for thru-the-web supplied template
strings.

There's been no usability testing yet to know whether $-strings
actually will be easier to use, but I've got plenty of
anecdotal evidence that %-strings suck badly for usability by
non-Python programmers.

Still, if $-strings are better for non-programmers, maybe they're
better for programmers too. There's certainly evidence that
translators get them wrong too.

-Barry

From paul@prescod.net Mon Feb 25 21:46:46 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 13:46:46 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org>
Message-ID: <3C7AB0C6.8C3BF369@prescod.net>

Jeff Epler wrote:
>
> On Mon, Feb 25, 2002 at 01:18:57PM -0800, Neil Schemenauer wrote:
> > Jeff Epler wrote:
> > > But how do you internationalize your program once you use $-subs?
> >
> > So don't use them. What's the problem?
>
> The problem is when I have to internationalize a program some schmuck
> wrote using $-subs throughout.

I think you go through and remove the "$" signs (probably at the same
time you are removing "_") and use a runtime function to do the
translation (probably the same function doing the interpolation). Then
you take on the responsibility yourself for making sure that the
original string is a constant (not a user-supplied variable) and that
the replacement strings come from somewhere secure.
So:

    a = $"Hello there $name"

becomes:

    a = _("Hello there $name")

I think Barry's gettext already does that or something, doesn't it?

Paul Prescod

From barry@zope.com Mon Feb 25 21:51:00 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 16:51:00 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org>
Message-ID: <15482.45508.332144.622180@anthem.wooz.org>

>>>>> "JE" == Jeff Epler writes:

    JE> Imagine that you have:

    >> def print_crypted_passwd(name, plaintext, salt="Xx"): crypted =
    >> crypt.crypt(plaintext, salt) print _("""%(name)s, your crypted
    >> password is %(crypted)s.""") % locals()

    JE> and that some crafty devil translates this as msgstr
    JE> "%(name)s, your plaintext password is %(plaintext). HA HA HA"

    JE> i.e., the translator (or other person who can influence the
    JE> format string) can access other information in the dict you
    JE> pass in, even if you didn't intend it.

That's a very interesting vulnerability you bring up!

In my own implementation, _() uses sys._getframe(1) to gather up the
caller's locals and globals into the interpolation dictionary,
i.e. you don't need to specify it explicitly. Damn convenient, but
also vulnerable to this exploit. In that case, I'd be very careful to
make sure that print_crypted_passwd() was written such that the
plaintext wasn't available via a variable in the caller's frame.

    JE> Personally, I tend to view this as showing that using %
    JE> locals() is unsanitary.

Nope, but you have to watch out not to mix cooked and raw food on the
same plate (to stretch an unsavory analogy).
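
[Editor's sketch, not part of the original message.] The frame-inspection approach described above can be approximated as follows. This is an assumption-laden illustration in Python 3, not the actual implementation being discussed: the catalog lookup is omitted, %(name)s placeholders are used, and the functions greet and show_password are hypothetical. sys._getframe is a CPython implementation detail.

```python
import sys

def _(message):
    # Hypothetical stand-in for a translating _(): a real version would
    # first look 'message' up in a message catalog.
    frame = sys._getframe(1)           # the caller's frame
    namespace = dict(frame.f_globals)
    namespace.update(frame.f_locals)   # locals shadow globals
    return message % namespace

def greet(name):
    # No explicit mapping: 'name' is found in this function's locals.
    return _("Hello there %(name)s")

def show_password(name, plaintext):
    # Pretend this string came back from a hostile translation. It can
    # reach 'plaintext' because that is a local of the calling frame.
    return _("%(name)s, your plaintext password is %(plaintext)s")

print(greet("Guido"))
print(show_password("Jeff", "s3cret"))
```

The second function shows why the convenience and the exposure come together: any name visible in the caller's frame is available to whoever supplies the template.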
    JE> But that means that the problem is in using the locals()
    JE> dictionary, a problem made worse by making the use of locals()
    JE> implicit.

    JE> (And under $-substitution, if locals() is implicit, how do I
    JE> substitute with a dictionary other than locals()?

    def print_crypted_passwd(name, crypted):
        print $"$name, your crypted password is $crypted"

    print_crypted_passwd(yername, crypt.crypt(plaintext, salt))

-Barry

From barry@zope.com Mon Feb 25 21:53:12 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 16:53:12 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org>
Message-ID: <15482.45640.614214.271184@anthem.wooz.org>

>>>>> "JE" == Jeff Epler writes:

    JE> I'm not interested in any proposal that turns code that's easy
    JE> to internationalize (just add calls to gettext(), commonly
    JE> spelled _(), around each string that needs translating, then
    JE> fix up the places where the programmer was too clever) into
    JE> code that's impossible to internationalize by design.

I'm with you there, Jeff.

-Barry

From barry@zope.com Mon Feb 25 21:55:48 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 16:55:48 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org> <3C7AAE96.BA4D19FE@prescod.net>
Message-ID: <15482.45796.817626.14965@anthem.wooz.org>

>>>>> "PP" == Paul Prescod writes:

    PP> Okay, so my proposal for $ doesn't do everything that %
    PP> does. It was never spec'd to do everything "%" does. For
    PP> instance it doesn't do float formatting tricks.

Does anybody ever even use something other than `s' for %() strings?

    >>> '%(float)f' % {'float': 3.9}
    '3.900000'

I never have.

-Barry

From barry@zope.com Mon Feb 25 21:57:27 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 16:57:27 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net>
Message-ID: <15482.45895.373283.698600@anthem.wooz.org>

>>>>> "PP" == Paul Prescod writes:

    PP> I think Barry's gettext already does that or something,
    PP> doesn't it?

Yes, I have a function that does that.

-Barry

From paul@prescod.net Mon Feb 25 21:56:14 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 13:56:14 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org> <3C7AAE96.BA4D19FE@prescod.net> <15482.45796.817626.14965@anthem.wooz.org> Message-ID: <3C7AB2FE.D1FFE397@prescod.net> "Barry A. Warsaw" wrote: > >... > > Does anybody ever even use something other than `s' for %() strings? > > >>> '%(float)f' % {'float': 3.9} > '3.900000' Presumably numerical analysts do....and David Ascher once told me he uses %d as a sanity type-check. I don't bother. Paul Prescod From fredrik@pythonware.com Mon Feb 25 21:59:14 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 25 Feb 2002 22:59:14 +0100 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org><3C7A7DB0.A8A3416E@lemburg.com><15482.34560.688685.262327@anthem.wooz.org><3C7A8FC7.9CB321EE@lemburg.com><15482.36941.605165.133988@anthem.wooz.org><15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> Message-ID: <0a1801c1be47$b9b4bdb0$ced241d5@hagrid> barry wrote: > From a /non-programmer's/ point of view, %(var)s is way too error > prone, and $-strings are an attempt at implementing a simple to > explain, hard to get wrong, rule for thru-the-web supplied template > strings. how about making that "s" optional? 1. %% substitutes to just a single % 2. %(identifier) followed by non-identifier characters gets interpolated with the value of the 'identifier' key in the sub- stitution dictionary. 3. 
For handling cases where the identifier is followed by identifier characters that aren't part of the key, %(identifier)s is equivalent to %(identifier) From skip@pobox.com Mon Feb 25 22:01:53 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 25 Feb 2002 16:01:53 -0600 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <15482.45090.3848.616817@anthem.wooz.org> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> Message-ID: <15482.46161.285288.743587@12-248-41-177.client.attbi.com> BAW> There's been no usability testing yet to know whether $-strings BAW> actually will be easier to use , but I've got plenty of BAW> anecdotal evidence that %-strings suck badly for useability by BAW> non-Python programmers. I presume your anecdotal evidence comes from Mailman. If you have a pair of functions that implement the %-to-$-to-% transformation and can catch the missing 's' problem automatically (is that the biggest problem non-programmers have?), then why not just use this in Mailman and be done with the problem? In fact, why not just document Mailman so that "%(var)" is the correct form and silently add the "missing" 's' in your transformation step? That %-strings suck for Mailman administrators does not mean they necessarily suck for programmers. The two populations obviously overlap somewhat, but not tremendously. I have never had a problem with %-strings, certainly not with omitting the trailing 's'. Past experience with printf() doesn't obviously pollute the sample population too much either, since the %(var)s type of format is not supported by printf(). BAW> Still, if $-strings are better for non-programmers, maybe they're BAW> better for programmers too. 
There's certainly evidence that BAW> translators get them wrong too. What do you mean by "translators"? Skip From fdrake@acm.org Mon Feb 25 21:59:51 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 25 Feb 2002 16:59:51 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <20020225210423.GA2398@crystal.mems-exchange.org> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225210423.GA2398@crystal.mems-exchange.org> Message-ID: <15482.46039.769808.389448@grendel.zope.com> Andrew Kuchling writes: > But for applications where people don't edit Python, this could just > be a library module and doesn't need a new operator in the Python > code. I agree with Paul; there's no actual gain in clarity from the > new syntax. I'm happy with that as well. > Presumably Paul is thinking of something like: > mlist = load_list('listname') > # Lists have .title, .password, ... > form_value = cgi.form['text'] # User puts $password into text > print text \ vars(mlist) Yes, but I'm not convinced this has any more security implications than using a library function to perform the transformation. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@zope.com Mon Feb 25 22:01:25 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 17:01:25 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org> <3C7AAE96.BA4D19FE@prescod.net> <15482.45796.817626.14965@anthem.wooz.org> <3C7AB2FE.D1FFE397@prescod.net> Message-ID: <15482.46133.509028.780744@anthem.wooz.org> >>>>> "PP" == Paul Prescod writes: PP> "Barry A. Warsaw" wrote: >> >> ... Does anybody ever even use something other than `s' for >> %() strings? >>> '%(float)f' % {'float': 3.9} '3.900000' PP> Presumably numerical analysts do....and David Ascher once told PP> me he uses %d as a sanity type-check. I don't bother. %d I sometimes use, but I don't think I've ever (purposely) used %(var)d. -Barry From guido@python.org Mon Feb 25 22:04:04 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 25 Feb 2002 17:04:04 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: Your message of "Mon, 25 Feb 2002 16:44:02 EST." <15482.45090.3848.616817@anthem.wooz.org> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> Message-ID: <200202252204.g1PM44I13781@pcp742651pcs.reston01.va.comcast.net> There are two entirely different potential uses for interpolation. One is for the Python programmer; call this literal interpolation. It's cute to be able to write a = 12 b = 15 c = a*b print $"A rectangle of $a x $b has an area of $c." This is arguably better than print "A rectangle of", a, "x", b, "has an area of", c, "." 
(and to get rid of the space between the value of c and the '.' a totally different paradigm would have to be used). A totally *different* use of interpolation is for templates, where both the template (any data containing the appropriate $ syntax) and the set of variables to be substituted (any mapping) should be under full control of the program. This is what Mailman needs. Literal interpolation has no security issues, if done properly. In the latter use, the security issues can be taken care of by carefully deciding what data is available in the set of variables to be interpolated. The interpolation syntax I've proposed is intentionally very simple, so that this is relatively easy. I recall seeing slides at the conference of a templating system (maybe Twisted's?) that allowed expressions like $foo.bar[key] which would be much harder to secure. I18n of templates is easy -- just look up the template string in the translation database. I18n of apps using literal interpolation is more of a can of worms, and I have no clear solution. I agree that a solution is needed -- otherwise literal interpolation would be *worse* than what we have now! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Feb 25 22:05:59 2002 From: guido@python.org (Guido van Rossum) Date: Mon, 25 Feb 2002 17:05:59 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: Your message of "Mon, 25 Feb 2002 13:56:14 PST." 
<3C7AB2FE.D1FFE397@prescod.net> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org> <3C7AAE96.BA4D19FE@prescod.net> <15482.45796.817626.14965@anthem.wooz.org> <3C7AB2FE.D1FFE397@prescod.net> Message-ID: <200202252205.g1PM5xD13825@pcp742651pcs.reston01.va.comcast.net> > > Does anybody ever even use something other than `s' for %() strings? > > > > >>> '%(float)f' % {'float': 3.9} > > '3.900000' I never use this in combination with named variables, but I often write timing programs that format times using "%6.3f" to get millisecond precision. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Mon Feb 25 22:16:18 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 17:16:18 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> <15482.46161.285288.743587@12-248-41-177.client.attbi.com> Message-ID: <15482.47026.213046.548051@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: BAW> There's been no usability testing yet to know whether BAW> $-strings actually will be easier to use , but I've got BAW> plenty of anecdotal evidence that %-strings suck badly for BAW> useability by non-Python programmers. SM> I presume your anecdotal evidence comes from Mailman. Correct. 
SM> If you have a pair of functions that implement the %-to-$-to-% SM> transformation and can catch the missing 's' problem SM> automatically (is that the biggest problem non-programmers SM> have?), The biggest, yes, but not necessarily the only one. SM> then why not just use this in Mailman and be done with the SM> problem? That's what I plan on doing for MM2.1, except I won't force it down people's throats yet. It'll be optional (but it'll be an either-or option). I won't use it in Python code yet though (too disruptive), just the thru-the-web template defining text-boxes. SM> In fact, why not just document Mailman so that "%(var)" is the SM> correct form and silently add the "missing" 's' in your SM> transformation step? SM> That %-strings suck for Mailman administrators does not mean SM> they necessarily suck for programmers. True, but who knows? I wouldn't necessarily classify python-dev as a representative sample of users. SM> The two populations obviously overlap somewhat, but not SM> tremendously. I have never had a problem with %-strings, SM> certainly not with omitting the trailing 's'. Past experience SM> with printf() doesn't obviously pollute the sample population SM> too much either, since the %(var)s type of format is not SM> supported by printf(). BAW> Still, if $-strings are better for non-programmers, maybe BAW> they're better for programmers too. There's certainly BAW> evidence that translators get them wrong too. SM> What do you mean by "translators"? Someone who is fluent in a natural language other than English, and translates a catalog of English source strings to a target non-English natural language. E.g. "No such list: %(listname)s" -> "Non esiste la lista: %(listname)s" -Barry From fdrake@acm.org Mon Feb 25 22:16:48 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 25 Feb 2002 17:16:48 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
In-Reply-To: <15482.45090.3848.616817@anthem.wooz.org> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> Message-ID: <15482.47056.544681.741629@grendel.zope.com> Barry A. Warsaw writes: > I can't say whether it's a good thing to add this to the language or > not. I tend to think that %(var)s is just fine from a Python > programmer's point of view, and in the interest of TOOWTDI, we don't We're definitely seeing a lot of reasonable concern over adding another formatting operator, and my own interest in the proposal has nothing to do with having an operator to do this. I probably shouldn't have said anything about the topic (I don't recall even noting a preference, myself, just that I'd read one alternative differently than Marc-Andre and that another already had a meaning). > From a /non-programmer's/ point of view, %(var)s is way too error > prone, and $-strings are an attempt at implementing a simple to > explain, hard to get wrong, rule for thru-the-web supplied template How the string was obtained is irrelevant, only that it is not part of the source code and the author may not be a programmer. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@zope.com Mon Feb 25 22:19:41 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 17:19:41 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> <200202252204.g1PM44I13781@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15482.47229.674597.577768@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: GvR> There are two entirely different potential uses for GvR> interpolation. Ah yes Guido, thanks for the clarity! -Barry From barry@zope.com Mon Feb 25 22:23:02 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 17:23:02 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> <15482.47056.544681.741629@grendel.zope.com> Message-ID: <15482.47430.913924.157520@anthem.wooz.org> >>>>> "Fred" == Fred L Drake, Jr writes: >> From a /non-programmer's/ point of view, %(var)s is way too >> error prone, and $-strings are an attempt at implementing a >> simple to explain, hard to get wrong, rule for thru-the-web >> supplied template Fred> How the string was obtained is irrelevant, only that it is Fred> not part of the source code and the author may not be a Fred> programmer. Correct. -Barry From martin@v.loewis.de Mon Feb 25 22:27:49 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 25 Feb 2002 23:27:49 +0100 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
In-Reply-To: <3C7AB0C6.8C3BF369@prescod.net> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> Message-ID: Paul Prescod writes: > I think you go through and remove the "$" signs (probably at the same > time you are removing "_") and use a runtime function to do the > translation (probably the same function doing the interpolation). I could not accept any solution that cannot offer anything but this. This kind of interpolation is plain broken. Regards, Martin From martin@v.loewis.de Mon Feb 25 22:25:48 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 25 Feb 2002 23:25:48 +0100 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <15482.45508.332144.622180@anthem.wooz.org> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org> <15482.45508.332144.622180@anthem.wooz.org> Message-ID: barry@zope.com (Barry A. Warsaw) writes: > JE> i.e., the translator (or other person who can influence the > JE> format string) can access other information in the dict you > JE> pass in, even if you didn't intend it. > > That's a very interesting vulnerability you bring up! That's not a vulnerability. It assumes that the translator is an attacker, or that the attacker can change the catalogs. 
If he is or can, you could not trust them, anyway, as they could cause arbitrary other failures, as well. Regards, Martin From jepler@unpythonic.dhs.org Mon Feb 25 22:34:49 2002 From: jepler@unpythonic.dhs.org (Jeff Epler) Date: Mon, 25 Feb 2002 16:34:49 -0600 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: References: <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> Message-ID: <20020225163447.G22803@unpythonic.dhs.org> On Mon, Feb 25, 2002 at 11:27:49PM +0100, Martin v. Loewis wrote: > Paul Prescod writes: > > > I think you go through and remove the "$" signs (probably at the same > > time you are removing "_") and use a runtime function to do the > > translation (probably the same function doing the interpolation). > > I could not accept any solution that cannot offer anything but this. > This kind of interpolation is plain broken. Exactly. Why spend all this time and effort complicating the Python parser and compiler, only to find that all real-world programs just instead implement the feature inside a function call? Jeff From jepler@unpythonic.dhs.org Mon Feb 25 22:45:33 2002 From: jepler@unpythonic.dhs.org (Jeff Epler) Date: Mon, 25 Feb 2002 16:45:33 -0600 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
In-Reply-To: References: <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org> <15482.45508.332144.622180@anthem.wooz.org> Message-ID: <20020225164532.H22803@unpythonic.dhs.org> On Mon, Feb 25, 2002 at 11:25:48PM +0100, Martin v. Loewis wrote: > That's not a vulnerability. It assumes that the translator is an > attacker, or that the attacker can change the catalogs. If he is or > can, you could not trust them, anyway, as they could cause arbitrary > other failures, as well. It means that you must audit not only your source code, but also your message catalogs, to determine whether information that is supposed to remain internal to a program is not formatted into a string. Of course, it is fairly easy to do this audit by showing that the translated string doesn't contain substitution on any identifiers that the original string did not. I don't think it's impossible that someone supplying catalogs could be an "attacker", even if a plausible scenario doesn't come directly to mind. Jeff From barry@zope.com Mon Feb 25 23:04:46 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 18:04:46 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
References: <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org> <15482.45508.332144.622180@anthem.wooz.org> <20020225164532.H22803@unpythonic.dhs.org> Message-ID: <15482.49934.923783.45301@anthem.wooz.org> >>>>> "JE" == Jeff Epler writes: JE> On Mon, Feb 25, 2002 at 11:25:48PM +0100, Martin v. Loewis JE> wrote: >> That's not a vulnerability. It assumes that the translator is >> an attacker, or that the attacker can change the catalogs. If >> he is or can, you could not trust them, anyway, as they could >> cause arbitrary other failures, as well. JE> It means that you must audit not only your source code, but JE> also your message catalogs, to determine whether information JE> that is supposed to remain internal to a program is not JE> formatted into a string. Of course, it is fairly easy to do JE> this audit by showing that the translated string doesn't JE> contain substitution on any identifiers that the original JE> string did not. From what I've been told, newer versions (possibly not yet released) of the GNU gettext tools will do exactly that, and understand Python syntax too (hmm, an argument for keeping the current crop of %-string rules?). Alternatively, or in conjunction, you should be auditing your translation sites to make sure that maliciously translated strings can't access sensitive information. -Barry From paul@prescod.net Mon Feb 25 23:04:33 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 25 Feb 2002 15:04:33 -0800 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
References: <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> <20020225163447.G22803@unpythonic.dhs.org> Message-ID: <3C7AC301.20660EBA@prescod.net> Jeff Epler wrote: > >... > > Exactly. Why spend all this time and effort complicating the Python > parser and compiler, only to find that all real-world programs just > instead implement the feature inside a function call? Nobody said to reimplement it. I've said on several occasions that there should be a runtime version. Paul Prescod From paul@prescod.net Mon Feb 25 23:05:11 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 25 Feb 2002 15:05:11 -0800 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> Message-ID: <3C7AC327.4164C382@prescod.net> "Martin v. Loewis" wrote: > > Paul Prescod writes: > > > I think you go through and remove the "$" signs (probably at the same > > time you are removing "_") and use a runtime function to do the > > translation (probably the same function doing the interpolation). > > I could not accept any solution that cannot offer anything but this. > This kind of interpolation is plain broken. How so? I need more info to go on. 
Paul Prescod From skip@pobox.com Mon Feb 25 23:16:24 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 25 Feb 2002 17:16:24 -0600 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <15482.47026.213046.548051@anthem.wooz.org> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> <15482.46161.285288.743587@12-248-41-177.client.attbi.com> <15482.47026.213046.548051@anthem.wooz.org> Message-ID: <15482.50632.874371.694865@12-248-41-177.client.attbi.com> SM> What do you mean by "translators"? BAW> Someone who is fluent in a natural language other than English, and BAW> translates a catalog of English source strings to a target BAW> non-English natural language. E.g. BAW> "No such list: %(listname)s" -> "Non esiste la lista: %(listname)s" So translators aren't programmers either. Just tell them anything between %(...) and the first alphabetic character after that is off-limits. Again, it doesn't look to me like a programmer problem. Just to play the devil's advocate (and ignoring the bit about $-strings not being i18n-friendly), I suspect non-programming translators would have just as much trouble with something like $"Please confirm your choice of color ($color)..." "$color" will look like a word to be translated. You would have to tell them "don't translate anything immediately following a dollar sign up to, but not including the next character that can't be part of a Python identifier." Seems either a bit error-prone or confusing to me if I pretend I'm not a programmer. Skip From paul@prescod.net Mon Feb 25 23:12:31 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 25 Feb 2002 15:12:31 -0800 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225210423.GA2398@crystal.mems-exchange.org> <15482.46039.769808.389448@grendel.zope.com> Message-ID: <3C7AC4DF.61A7D7AB@prescod.net> "Fred L. Drake, Jr." wrote: > >... > > Yes, but I'm not convinced this has any more security implications > than using a library function to perform the > transformation. The point is that the simplest mechanism, that we teach to newbies, has non-obvious security "concerns". If we have literal interpolation, then a library function would be used by people who WANT to do it at runtime because they have a REASON for doing it at runtime and thus have a pretty clear concept of the distinction between runtime and compile time. But as I've said, the major reason for this is not security. I don't know that a Python program has been hacked through "%" so it doesn't make sense to lose sleep over it. The major reason for doing it at compile time (for me) is that you can have a nice syntax that doesn't involve modulus-ing (or dividing) an otherwise useless vars() or locals() dictionary. Paul Prescod From barry@zope.com Mon Feb 25 23:19:24 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 18:19:24 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> <15482.46161.285288.743587@12-248-41-177.client.attbi.com> <15482.47026.213046.548051@anthem.wooz.org> <15482.50632.874371.694865@12-248-41-177.client.attbi.com> Message-ID: <15482.50812.53834.442570@anthem.wooz.org> >>>>> "SM" == Skip Montanaro writes: SM> So translators aren't programmers either. Well, they may not be /Python/ programmers. ;) SM> Just tell them anything between %(...) and the first SM> alphabetic character after that is off-limits. Again, it SM> doesn't look to me like a programmer problem. SM> Just to play the devil's advocate (and ignoring the bit about SM> $-strings not being i18n-friendly), I suspect non-programming SM> translators would have just as much trouble with something SM> like SM> $"Please confirm your choice of color ($color)..." SM> "$color" will look like a word to be translated. You would SM> have to tell them "don't translate anything immediately SM> following a dollar sign up to, but not including the next SM> character that can't be part of a Python identifier." Seems SM> either a bit error-prone or confusing to me if I pretend I'm SM> not a programmer. To be clear, I think the ideal interface would be a graphical one, with drag-n-drop icons for the textual placeholders. This would allow them to re-arrange the order of the placeholder, and it would be obvious what is variable in your templates, but it wouldn't allow them to change, remove, or add placeholders. Then it wouldn't matter what syntax you actually used. I'm holding my breath... ready... go! 
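A minimal sketch of the constraint Barry describes (translators may reorder placeholders but not change, remove, or add them), checked after the fact rather than enforced by a GUI. It assumes %(name)-style placeholders and modern Python 3; the helper names `placeholder_names` and `check_translation` are hypothetical and not part of Mailman or gettext, and escaped `%%` sequences are not handled.

```python
import re

# Assumption: placeholders look like %(name), with or without a trailing 's'.
PLACEHOLDER = re.compile(r'%\((\w+)\)')


def placeholder_names(text):
    """Return the set of names used in %(name) placeholders."""
    return set(PLACEHOLDER.findall(text))


def check_translation(source, translated):
    """Return (missing, added) placeholder names, each as a sorted list."""
    src = placeholder_names(source)
    dst = placeholder_names(translated)
    return sorted(src - dst), sorted(dst - src)


missing, added = check_translation(
    "No such list: %(listname)s",
    "Non esiste la lista: %(listname)s",
)
print(missing, added)
```

A translation that passes has both lists empty; a non-empty `added` list is the case Jeff Epler raised earlier in the thread, where a translated string pulls in a name the original never used.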
-Barry From paul@prescod.net Mon Feb 25 23:19:06 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 25 Feb 2002 15:19:06 -0800 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> <200202252204.g1PM44I13781@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7AC66A.5B7EF75C@prescod.net> Guido van Rossum wrote: > > There are two entirely different potential uses for interpolation. > One is for the Python programmer; call this literal interpolation. True! >... > A totally *different* use of interpolation is for templates, where > both the template (any data containing the appropriate $ syntax) and > the set of variables to be substituted (any mapping) should be under > full control of the program. This is what Mailman needs. True! But we've already got a solution for this. Is there something wrong with it? I guess I don't know what problem we're trying to solve. My only interest in interpolation was to make the common, simple case easier. > Literal interpolation has no security issues, if done properly. In > the latter use, the security issues can be taken care of by carefully > deciding what data is available in the set of variables to be > interpolated. The interpolation syntax I've proposed is intentionally > very simple, so that this is relatively easy. I recall seeing slides > at the conference of a templating system (maybe Twisted's?) that > allowed expressions like $foo.bar[key] which would be much harder to > secure. I'm not attached enough to fight for these but I'll re-emphasize your implicit point that these are entirely secure if used in literal interpolation. 
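A minimal sketch of the template case Guido describes, where the program decides which names the template may see. It assumes only `$name` and `$$` are recognized, which is simpler than any syntax proposed in the thread; the `interp` function, the `MailingList` class, and its values are hypothetical illustrations written for Python 3.

```python
import re

# Assumption: a token is either "$$" or "$" followed by a Python identifier.
_TOKEN = re.compile(r'\$(\$|[A-Za-z_][A-Za-z0-9_]*)')


def interp(template, mapping):
    """Substitute $name from mapping; $$ yields a literal dollar sign."""
    def repl(match):
        token = match.group(1)
        if token == '$':
            return '$'
        # Names outside the mapping raise KeyError instead of leaking data.
        return str(mapping[token])
    return _TOKEN.sub(repl, template)


class MailingList:
    # Hypothetical stand-in for a list object with a sensitive attribute.
    title = 'python-dev'
    password = 'secret'


mlist = MailingList()
exposed = {'title': mlist.title}  # only the names the template may use
print(interp('Welcome to $title ($$5 a month)', exposed))
```

Because the mapping is built by hand rather than taken from `vars(mlist)`, a user-supplied template containing `$password` fails with a `KeyError` rather than printing the password.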
> I18n of templates is easy -- just look up the template string in the > translation database. > > I18n of apps using literal interpolation is more of a can of worms, > and I have no clear solution. I agree that a solution is needed -- > otherwise literal interpolation would be *worse* than what we have now! You translate them from compile time interpolation to runtime by removing a $ and replacing it by a function call. a = $"My name is $name" becomes: a = interp(_("My name is $name")) But of course it is trivial to make the last line of '_' return interp(rc) so that the client doesn't have to do it. Paul Prescod From tim.one@comcast.net Mon Feb 25 23:17:11 2002 From: tim.one@comcast.net (Tim Peters) Date: Mon, 25 Feb 2002 18:17:11 -0500 Subject: [Python-Dev] Tkinter versus Windows In-Reply-To: Message-ID: FYI, someone just added a new entry to the ancient Python "Programs using Tkinter sometimes can't shut down (Windows)" bug report that everyone gave up on. John Popplewell claims to have found the (a?) cause, in part: """ Managed to track it down to a problem inside Tcl. For the Tcl8.3.4 source distribution the problem is in the file win/tclWinNotify.c """ Beats me, but a claim so specific is probably worth checking out: From skip@pobox.com Mon Feb 25 23:27:18 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 25 Feb 2002 17:27:18 -0600 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
In-Reply-To: <3C7AC327.4164C382@prescod.net> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> <3C7AC327.4164C382@prescod.net> Message-ID: <15482.51286.256028.67275@12-248-41-177.client.attbi.com> >>>>> "Paul" == Paul Prescod writes: Paul> "Martin v. Loewis" wrote: >> I could not accept any solution that cannot offer anything but this. >> This kind of interpolation is plain broken. Paul> How so? I need more info to go on. I have no direct experience with text translation, but in this internet day and age, it seems to me that a change to the language shouldn't make internationalization more difficult than it already is. (I doubt anyone will claim that it's truly easy, even with gettext.) Guido mentioned a number of other languages that already use $-interpolation, Perl, the shells, awk and Ruby I think. Of those, all but Ruby were around before the explosion of the internet in general and the web and Unicode in particular, so internationalization wasn't a prime consideration when those languages' $-interpolation facilities were implemented. Skip From skip@pobox.com Mon Feb 25 23:28:27 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 25 Feb 2002 17:28:27 -0600 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
In-Reply-To: <15482.50812.53834.442570@anthem.wooz.org> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> <15482.46161.285288.743587@12-248-41-177.client.attbi.com> <15482.47026.213046.548051@anthem.wooz.org> <15482.50632.874371.694865@12-248-41-177.client.attbi.com> <15482.50812.53834.442570@anthem.wooz.org> Message-ID: <15482.51355.567748.233425@12-248-41-177.client.attbi.com> From skip@pobox.com Mon Feb 25 23:29:21 2002 From: skip@pobox.com (Skip Montanaro) Date: Mon, 25 Feb 2002 17:29:21 -0600 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <15482.50812.53834.442570@anthem.wooz.org> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> <15482.46161.285288.743587@12-248-41-177.client.attbi.com> <15482.47026.213046.548051@anthem.wooz.org> <15482.50632.874371.694865@12-248-41-177.client.attbi.com> <15482.50812.53834.442570@anthem.wooz.org> Message-ID: <15482.51409.945722.29088@12-248-41-177.client.attbi.com> Sorry about the first, fumbled reply... BAW> To be clear, I think the ideal interface would be a graphical one, BAW> with drag-n-drop icons for the textual placeholders. This would BAW> allow them to re-arrange the order of the placeholder, and it would BAW> be obvious what is variable in your templates, but it wouldn't BAW> allow them to change, remove, or add placeholders. This places the onus back on the application programmer, not the language designer. 
Skip From paul@prescod.net Mon Feb 25 23:29:39 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 25 Feb 2002 15:29:39 -0800 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> <3C7AC327.4164C382@prescod.net> <15482.51286.256028.67275@12-248-41-177.client.attbi.com> Message-ID: <3C7AC8E3.1175D99D@prescod.net> Skip Montanaro wrote: > > >>>>> "Paul" == Paul Prescod writes: > > Paul> "Martin v. Loewis" wrote: > >> I could not accept any solution that cannot offer anything but this. > >> This kind of interpolation is plain broken. > > Paul> How so? I need more info to go on. > > I have no direct experience with text translation, but in this internet day > and age, it seems to me that a change to the language shouldn't make > internationalization more difficult than it already is. I've proposed that whereas today you add a "_( )" in the future you would add "_( )" and remove "$" if it happens to occur at the start of the string. If the string didn't start with a "$" you might also have to scan to see if it contains one. In that case you double it up. This doesn't make internationalization more difficult. As proof I present mailman, which *already* does the interpolation I ask for as a feature of its implementation of "_()". All I'm asking is that mailman's interpolation feature ALSO be available under a simplified syntax at compile time. Paul Prescod From martin@v.loewis.de Mon Feb 25 23:34:26 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 26 Feb 2002 00:34:26 +0100 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <20020225164532.H22803@unpythonic.dhs.org> References: <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org> <15482.45508.332144.622180@anthem.wooz.org> <20020225164532.H22803@unpythonic.dhs.org> Message-ID: Jeff Epler writes: > It means that you must audit not only your source code, but also your > message catalogs, to determine whether information that is supposed to > remain internal to a program is not formatted into a string. Of course, > it is fairly easy to do this audit by showing that the translated string > doesn't contain substitution on any identifiers that the original string > did not. That specific test could be done automatically. In fact, GNU msgfmt already performs the test for c-format strings; msgfmt.py should probably learn about the common notations for string interpolation. Regards, Martin From martin@v.loewis.de Mon Feb 25 23:38:58 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 26 Feb 2002 00:38:58 +0100 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
In-Reply-To: <3C7AC327.4164C382@prescod.net>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> <3C7AC327.4164C382@prescod.net>
Message-ID:

Paul Prescod writes:

> > > I think you go through and remove the "$" signs (probably at the same
> > > time you are removing "_") and use a runtime function to do the
> > > translation (probably the same function doing the interpolation).
> >
> > I could not accept any solution that cannot offer anything but this.
> > This kind of interpolation is plain broken.
>
> How so? I need more info to go on.

In the applications that I have in mind, interpolated strings are
typically presented to the user, so there must be a way to localize
them. An extension to the language that does not support localization
is useless if I have to find some other means for l10n.

If there will be a standard library function that does the
interpolation anyway, I'd prefer not to have a language extension that
achieves the same thing but is more limited. If anything, the language
extension should be more powerful, not more limited, in applicability.

Regards,
Martin

From JeffH@ActiveState.com Mon Feb 25 23:38:35 2002
From: JeffH@ActiveState.com (Jeff Hobbs)
Date: Mon, 25 Feb 2002 15:38:35 -0800
Subject: [Python-Dev] RE: Tkinter versus Windows
In-Reply-To:
Message-ID: <008401c1be55$8b5d7520$ba03a8c0@activestate.ca>

I added a note to the bug report, which was correct. The fix was
already made in the Tcl head, but I back-ported it to the 8.3-branch
of Tcl for those who want to be able to grab that and work against it.
Jeff > -----Original Message----- > From: Tim Peters [mailto:tim.one@comcast.net] > > FYI, someone just added a new entry to the ancient Python "Programs using > Tkinter sometimes can't shut down (Windows)" bug report that everyone gave > up on. John Popplewell claims to have found the (a?) cause, in part: > > """ > Managed to track it down to a problem inside Tcl. > For the Tcl8.3.4 source distribution the problem is in > the file win/tclWinNotify.c > """ > > Beats me, but a claim so specific is probably worth checking out: > > From martin@v.loewis.de Mon Feb 25 23:42:41 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 26 Feb 2002 00:42:41 +0100 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <15482.50632.874371.694865@12-248-41-177.client.attbi.com> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org> <15482.46161.285288.743587@12-248-41-177.client.attbi.com> <15482.47026.213046.548051@anthem.wooz.org> <15482.50632.874371.694865@12-248-41-177.client.attbi.com> Message-ID: Skip Montanaro writes: > Just to play the devil's advocate (and ignoring the bit about $-strings not > being i18n-friendly), I suspect non-programming translators would have just > as much trouble with something like > > $"Please confirm your choice of color ($color)..." > > "$color" will look like a word to be translated. You would have to tell > them "don't translate anything immediately following a dollar sign up to, > but not inluding the next character that can't be part of a Python > identifier." Seems either a bit error-prone or confusing to me if I pretend > I'm not a programmer. Indeed. 
Therefore, the only true solution is to have an automatic check
that verifies that the translated string has the same inserts as the
original. Such a check could instruct users to follow any
interpolation scheme; even if translators don't know the programming
language of the application, they still are typically capable of
understanding the error messages from msgfmt.

Regards,
Martin

From paul@prescod.net Mon Feb 25 23:46:31 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 15:46:31 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> <3C7AC327.4164C382@prescod.net>
Message-ID: <3C7ACCD7.9219AAF5@prescod.net>

"Martin v. Loewis" wrote:
>
>...
>
> In the applications that I have in mind, interpolated strings are
> typically presented to the user, so there must be a way to localize
> them. An extension to the language that does not support localization
> is useless if I have to find some other means for l10n.

You will use another invocation syntax, but probably the same string
interpolation syntax.

> If there will be a standard library function that does the
> interpolation anyway, I'd prefer not to have a language extension that
> achieves the same thing but is more limited. If anything, the language
> extension should be more powerful, not more limited, in applicability.

The language extension should be syntactically simpler because it is
what is used for simpler cases. Simpler constructs are also less likely
to open up security issues.
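
The automatic check Martin describes above (a translated string must
carry the same inserts as the original) can be sketched in a few lines.
This is only an illustration, not what msgfmt or Mailman actually do. It
assumes $name / ${name} placeholders and leans on string.Template from
later Python releases; the helper names placeholders, check_translation
and translate_and_interpolate, and the sample catalog, are made up for
the example. Translation is applied before interpolation.

```python
from string import Template


def placeholders(text):
    """Return the set of $name / ${name} placeholders found in text."""
    names = set()
    # Template.pattern is the compiled regex string.Template itself uses;
    # "$$" matches its 'escaped' group and is therefore ignored here.
    for match in Template.pattern.finditer(text):
        name = match.group("named") or match.group("braced")
        if name:
            names.add(name)
    return names


def check_translation(original, translated):
    """Raise ValueError if the translation adds or drops placeholders."""
    want, got = placeholders(original), placeholders(translated)
    if want != got:
        raise ValueError(
            "placeholder mismatch: missing %s, unexpected %s"
            % (sorted(want - got), sorted(got - want)))


def translate_and_interpolate(original, catalog, values):
    """Look up the translation, verify it, and only then interpolate."""
    translated = catalog.get(original, original)
    check_translation(original, translated)
    return Template(translated).substitute(values)


# Hypothetical one-entry message catalog standing in for gettext.
catalog = {"My name is $name": "Ich heisse $name"}
print(translate_and_interpolate("My name is $name", catalog,
                                {"name": "Guido"}))
```

A catalog entry that tried to pull in an extra identifier, say
"Hallo $name $password" for "Hello $name", would be rejected by
check_translation before any value is substituted.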
Paul Prescod From barry@zope.com Tue Feb 26 00:20:49 2002 From: barry@zope.com (Barry A. Warsaw) Date: Mon, 25 Feb 2002 19:20:49 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> <3C7AC327.4164C382@prescod.net> <15482.51286.256028.67275@12-248-41-177.client.attbi.com> <3C7AC8E3.1175D99D@prescod.net> Message-ID: <15482.54497.412834.587484@anthem.wooz.org> >>>>> "PP" == Paul Prescod writes: PP> This doesn't make internationalization more difficult. As PP> proof I present mailman, which *already* does the PP> interpolation I ask for as a feature of its implementation of PP> "_()". All I'm asking is that mailman's interpolation feature PP> ALSO be available under a simplified syntax at compile time. Except that remember the interpolation step must happen /after/ the translation step, otherwise it's worse than useless. -Barry From paul@prescod.net Tue Feb 26 00:39:21 2002 From: paul@prescod.net (Paul Prescod) Date: Mon, 25 Feb 2002 16:39:21 -0800 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <200202252106.g1PL6FY13393@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7AD939.FCFE163D@prescod.net>

[meant to send this before]

Guido van Rossum wrote:
>
>...
>
> All right, but there *also* needs to be a way to invoke interpolation
> explicitly -- just like eval(). This has applicability e.g. in i18n.

Agree 100%. The last time we discussed this I proposed there should be a
function to do this. But the naive integrated syntax could be compile
time.

My complaints with the current interpolation are:

 1. they require too many magical incantations to invoke (especially %
    vars())

 2. they require too much thinking about types and conversions in the
    syntax

 3. special behaviour with dictionaries and tuples and singleton tuples
    etc.

 4. operator abuse

I would only be in favour of a replacement if for *simple cases* it
cleared up all of these issues so that it is roughly as easy as in
Perl/Ruby/sh/tcl:

a = $"My name is $name"

If there is any more syntax than that then personally I think that the
cost/benefit ratio falls down. So I don't see this as a big win:

a = "My name is $name" \ locals()

It solves two of my four problems. Maybe other people have different
goals than I do and that's why they see the above as a "win".

 Paul Prescod

From paul@prescod.net Tue Feb 26 00:49:19 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 16:49:19 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> <3C7AC327.4164C382@prescod.net> <15482.51286.256028.67275@12-248-41-177.client.attbi.com> <3C7AC8E3.1175D99D@prescod.net> <15482.54497.412834.587484@anthem.wooz.org> Message-ID: <3C7ADB8F.7A7304BC@prescod.net> "Barry A. Warsaw" wrote: > > >>>>> "PP" == Paul Prescod writes: > > PP> This doesn't make internationalization more difficult. As > PP> proof I present mailman, which *already* does the > PP> interpolation I ask for as a feature of its implementation of > PP> "_()". All I'm asking is that mailman's interpolation feature > PP> ALSO be available under a simplified syntax at compile time. > > Except that remember the interpolation step must happen /after/ the > translation step, otherwise it's worse than useless. Right, that's why you *for localized software* you should do it at runtime. And insofar as the process of localization *already* consists of touching every string, it takes no extra effort to change a compile-time interpolation to a runtime one while you are at it. But the newbie to Python should not be saddled with a syntax optimized towards advanced users, and even as a person often hacking single-language software I shouldn't be saddled with dynamic interpolation until I need it either! "Saddled with" means "required to use a verbose, non-intuitive syntax with a bunch of special cases for a simple and common operation." Paul Prescod From fdrake@acm.org Tue Feb 26 02:00:54 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Mon, 25 Feb 2002 21:00:54 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <3C7AC4DF.61A7D7AB@prescod.net> References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225210423.GA2398@crystal.mems-exchange.org> <15482.46039.769808.389448@grendel.zope.com> <3C7AC4DF.61A7D7AB@prescod.net> Message-ID: <15482.60502.420424.995597@grendel.zope.com> Paul Prescod writes: > The major reason for doing it at > compile time (for me) is that you can have a nice syntax that doesn't > evolve modulus-ing (or dividing) an otherwise useless vars() or locals() > dictionary. Which has everything to do with your usage. I almost never use % with locals() or vars(), so I don't share that motivation. I'm much more likely to build a dict specifically for the purpose, which includes computed values, or have something already created which includes this usage as part of the larger picture. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From greg@cosc.canterbury.ac.nz Tue Feb 26 02:14:10 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 26 Feb 2002 15:14:10 +1300 (NZDT) Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <3C7AD939.FCFE163D@prescod.net> Message-ID: <200202260214.PAA23584@s454.cosc.canterbury.ac.nz> I'm not sure I like the idea of using $ as the character for prefixing interpolated strings. Somehow a = $"My name is $name" looks confusing. I think it has something to do with the fact that $ is appearing both inside and outside the quotes, making my visual parser worry that the quotes are misplaced. 
Also, it uses up one of the three precious not-yet-used
characters, and I think we should keep those for some
future time when we really need them. We don't need one
for this -- there are plenty of operators available
that haven't yet been used on strings.

I suggest '^', since it does a nice job of suggesting
"inject stuff into this string". We can have both a
prefix form for compile-time interpolation:

   a = ^ "My name is $name"

and an infix form for run-time interpolation:

   a = "My name is $name" ^ dict

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From barry@zope.com Tue Feb 26 03:12:22 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 22:12:22 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <3C7AD939.FCFE163D@prescod.net> <200202260214.PAA23584@s454.cosc.canterbury.ac.nz>
Message-ID: <15482.64790.812638.147714@anthem.wooz.org>

>>>>> "GE" == Greg Ewing writes:

    GE> Also, it uses up one of the three precious not-yet-used
    GE> characters, and I think we should keep those for some
    GE> future time when we really need them. We don't need one
    GE> for this -- there are plenty of operators available
    GE> that haven't yet been used on strings.

    GE> I suggest '^', since it does a nice job of suggesting
    GE> "inject stuff into this string". We can have both a
    GE> prefix form for compile-time interpolation:

    GE> a = ^ "My name is $name"

    GE> and an infix form for run-time interpolation:

    GE> a = "My name is $name" ^ dict

I think I suggested using ~ for this at IPC10:

    a = ~'my name is $name'

for the compile-time interpolation. I don't think it matters much
which operator is chosen (let Guido decide).
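
The infix, run-time form Greg proposes can be approximated today with a
small wrapper class, which may help in judging how the spelling reads.
This is a rough sketch under stated assumptions: plain str objects do
not support ^ this way, the Interp class is invented for the example,
and string.Template comes from later Python releases. The prefix,
compile-time forms (^ or ~ in front of a literal) need compiler support
to see the caller's variables and are not emulated here.

```python
from string import Template


class Interp:
    """Hypothetical wrapper giving a template string an infix ^ operator."""

    def __init__(self, text):
        self.text = text

    def __xor__(self, mapping):
        # Run-time interpolation: substitute $name from the given mapping.
        return Template(self.text).substitute(mapping)


a = Interp("My name is $name") ^ {"name": "Greg"}
print(a)
```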
-Barry

From tim.one@comcast.net Tue Feb 26 03:24:03 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 25 Feb 2002 22:24:03 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.45796.817626.14965@anthem.wooz.org>
Message-ID:

[Barry A. Warsaw]
> Does anybody ever even use something other than `s' for %() strings?
>
> >>> '%(float)f' % {'float': 3.9}
> '3.900000'
>
> I never have.

Then again, you've never used a floating-point number either . I've
certainly used %(x)f/g/e with float formats.

Not quite speaking of which, if Python grows a new $ operator, let's get the
precedence right. This kind of thing is too common today:

>>> amount = 3.50
>>> n = 3
>>> print "Total: $%.2f." % amount*n
Total: $3.50.Total: $3.50.Total: $3.50.
>>>

From tim.one@comcast.net Tue Feb 26 03:29:05 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 25 Feb 2002 22:29:05 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <200202252205.g1PM5xD13825@pcp742651pcs.reston01.va.comcast.net>
Message-ID:

[Guido]
> I never use this in combination with named variables, but I often
> write timing programs that format times using "%6.3f" to get
> millisecond precision.

Note that you also use %(name)s with width, precision and justification
modifiers. For example, this line is yours:

    s = "%(name)-20.20s %(sts)-10s %(uptime)6s %(idletime)6s" % locals()

From DavidA@ActiveState.com Tue Feb 26 04:25:59 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Mon, 25 Feb 2002 20:25:59 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org> <3C7AAE96.BA4D19FE@prescod.net> <15482.45796.817626.14965@anthem.wooz.org> <3C7AB2FE.D1FFE397@prescod.net> Message-ID: <3C7B0E57.E0FC3960@ActiveState.com> Paul Prescod wrote: > > "Barry A. Warsaw" wrote: > > > >... > > > > Does anybody ever even use something other than `s' for %() strings? > > > > >>> '%(float)f' % {'float': 3.9} > > '3.900000' > > Presumably numerical analysts do....and David Ascher once told me he > uses %d as a sanity type-check. I don't bother. Paul's starting to turn into my brother -- quoting things I said twenty years ago and that I have no way of disproving. As Bill said, "I don't recall". These days, I rarely think in FP, even if I use FP, so I typically use %s. Back then I probably cared about mantissa and her friends. --da From mal@lemburg.com Tue Feb 26 10:27:46 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 11:27:46 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> Message-ID: <3C7B6322.440D21E7@lemburg.com> I consider the above PEP ready for review by the developers. Please comment. http://python.sourceforge.net/peps/pep-0263.html After approval, the next step would be to implement phase 1 for 2.3. Step two would then be on the plate for 2.4. 
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From jim@zope.com Tue Feb 26 14:04:27 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 09:04:27 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <029a01c1b0cb$195cb400$ced241d5@hagrid> <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7B95EB.952037FE@zope.com>

Guido van Rossum wrote:
>
> > http://effbot.org/ideas/time-type.htm
> >
> > I can produce PEP and patch if necessary.
>
> Yes, a PEP, please! Jim Fulton has been asking for this for a long
> time too. His main requirement is that timestamp objects are small,
> both in memory and as pickles, because Zope keeps a lot of these
> around.

I also need time-zone support.

> They are currently represented either as long ints (with a
> little under 64 bits) or as 8-byte strings.

ZODB has a TimeStamp type that uses a 32-bit unsigned integer
to store year, month, day, hour, and minute in a way that
makes it dirt simple to extract a component. It uses a 32-bit
integer to store seconds in units of 60/2**32 seconds.

This type isn't appropriate for general use because it
only allows dates later than Dec 31, 1899.

> A dedicated timestamp
> object could be smaller than that.

A type that only needed minute precision could easily be expressed with
32-bits. Of course, the two-word object overhead makes the difference
between 32-bits and 64-bits rather unexciting.

> Your idea of a base type (which presumably standardizes at least one
> form of representation) sounds like a breakthrough that can help
> satisfy different other needs.

I agree.

Jim

--
Jim Fulton mailto:jim@zope.com Python Powered!
CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim@zope.com Tue Feb 26 14:05:11 2002 From: jim@zope.com (Jim Fulton) Date: Tue, 26 Feb 2002 09:05:11 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7B9617.D22E449F@zope.com> Guido van Rossum wrote: > > [Tim] > > Are you sure Jim is looking to replace the TimeStamp object? All the > > complaints I've seen aren't about the relatively tiny TimeStamp object, but > > about Zope's relatively huge DateTime class (note that you won't have source > > for that if you're looking at a StandaloneZODB checkout -- DateTime is used > > at higher Zope levels), which is a Python class with a couple dozen(!) > > instance attributes. See, e.g., > > > > http://dev.zope.org/Wikis/DevSite/Proposals/ReplacingDateTime > > > > It seems clear from the source code that TimeStamp is exactly what Jim > > intended it to be . > > I'm notoriously bad at channeling Jim. Nevertheless, I do recall him > saying he wanted a lightweight time object. I think the mistake of > DateTime is that it stores the broken-out info, rather than computing > it on request. I don't think the mistake was so much to store broken-out info, but to store too much data in general. It stores redundant data, which makes it's implementation a bit difficult to understand and maintain. Note that scalability was not a goal of Zope's DateTime type. We meant to replace it with something much tighter a long time ago, but never got around to it. > > > Your idea of a base type (which presumably standarizes at least one > > > form of representation) sounds like a breakthrough that can help > > > satisfy different other needs. > > > > Best I can make out, /F is only proposing what Jim would call an Interface: > > the existence of two methods, timetuple() and utctimetuple(). 
In a comment > > on his page, /F calls it an "abstract" base class, which is more C++-ish > > terminology, and the sample implementation makes clear it's a "pure" > > abstract base class, so same thing as a Jim Interface in the end. Yup. > I'll show the PEP to Jim when it appears. Thanks. ;) BTW, has there been any progress on this? Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim@zope.com Tue Feb 26 14:05:44 2002 From: jim@zope.com (Jim Fulton) Date: Tue, 26 Feb 2002 09:05:44 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <15459.9239.83647.334632@gondolin.digicool.com> <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net> <15460.15864.266226.241495@grendel.zope.com> <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net> <15460.17309.905113.103005@grendel.zope.com> <3C644E7B.F69AF73C@lemburg.com> Message-ID: <3C7B9638.C59EEC35@zope.com> "M.-A. Lemburg" wrote: > > "Fred L. Drake, Jr." wrote: > > > > Guido van Rossum writes: > > > Is comparison the same what Tim mentioned as range searches? I guess > > > a representation like current Zope timestamps or what time.time() > > > returns is fine for that -- it is monononous even if not necessarily > > > continuous. I guess a broken-out time tuple is much harder to compare. > > > > Yes; as long as ordering is easy to check, we're fine with a long int > > or some such thing. The range search is indeed the specific > > application Jim has in mind. > > Uhm... I think this thread is heading in the wrong direction. > > Fredrik wasn't proposing a solution to Jim's particular > problem (whatever it was ;-), but instead opting for a solution > of a large number of Python users out there. Right. 
;) > While mxDateTime probably works for most of them (and is used by > pretty much all major database modules out there), some may feel > that they don't want to rely on external libs for their software > to run on. I have no problem relying on external libraries. > I would be willing to make the mxDateTime types subtypes of > whatever Fredrik comes up with. The only requirement I have is > that the binary footprint of the types needs to match todays > layout of mxDateTime types since I need to maintain binary > compatibility. The binary footprint of your types, not the standard base class, right? I don't see a problem with that, > The other possibility would be adding a set of new types > to mxDateTime which focus on low memory requirements rather > than data roundtrip safety and speed. What is data roundtrip safety? I rarely do date-time arithmetic, but I often do date-time-part extraction. I think that mxDateTime is optimized for arithmetic, whereas, I'd prefer a type more focussed on extraction efficiency, and memory usage, and that effciently supports time zones. This is, of course, no knock on mxDateTime. I also want fast comparisons, which I presume mxDateTime provides. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim@zope.com Tue Feb 26 14:05:32 2002 From: jim@zope.com (Jim Fulton) Date: Tue, 26 Feb 2002 09:05:32 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <15459.9239.83647.334632@gondolin.digicool.com> Message-ID: <3C7B962C.147AAA15@zope.com> Jeremy Hylton wrote: > > >>>>> "TP" == Tim Peters writes: > ... > Also, it may not be necessary to have a TimeStamp object in ZODB 4. > There are three uses for the timestamp: tracking how recently an > object was used for cache evication, Time-stamp isn't used for this. 
> providing a last modified time to > users, and as a simple version number. This is certainly a hack. > In ZODB 4, the cache eviction may be done quite differently. Yup, or, with Toby Dickenson's patches, in ZODB 3. :) > The > version number may be a simple int. The last mod time will not be > provided for each object; instead, users will need to define this > themselves if they care about it. If they define it themselves, > they'd probably use a DateTime object, but we'd care much less about > how small it is. The TimeStamp type will still be useful for storage implementations that want compact time strings. I could imagine al alternate implementation that conformed to the new interface and retained the compact representation. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From guido@python.org Tue Feb 26 14:20:47 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 09:20:47 -0500 Subject: [Python-Dev] OS/2 EMX changes to dynload_shlib.c In-Reply-To: Your message of "Tue, 26 Feb 2002 03:41:36 PST." References: Message-ID: <200202261420.g1QEKld15953@pcp742651pcs.reston01.va.comcast.net> Given the number of OS/2 EMX specific changes to dynload_shlib.c, wouldn't it be better to create a separate dynload_os2.c? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Feb 26 14:22:58 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 09:22:58 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: Your message of "Tue, 26 Feb 2002 11:27:46 +0100." <3C7B6322.440D21E7@lemburg.com> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> Message-ID: <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> > I consider the above PEP ready for review by the developers. > Please comment. 
> > http://python.sourceforge.net/peps/pep-0263.html > > After approval, the next step would be to implement phase 1 > for 2.3. Step two would then be on the plate for 2.4. That looks OK to me. I the Emacs-style comment in fact compatible with Emacs? --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue Feb 26 14:31:52 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 15:31:52 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7B9C58.5D9367E0@lemburg.com> Guido van Rossum wrote: > > > I consider the above PEP ready for review by the developers. > > Please comment. > > > > http://python.sourceforge.net/peps/pep-0263.html > > > > After approval, the next step would be to implement phase 1 > > for 2.3. Step two would then be on the plate for 2.4. > > That looks OK to me. I the Emacs-style comment in fact compatible > with Emacs? According to Martin, it is compatible. If it's not we'll make it so :-) Barry ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Tue Feb 26 14:33:05 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Tue, 26 Feb 2002 15:33:05 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <15459.9239.83647.334632@gondolin.digicool.com> <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net> <15460.15864.266226.241495@grendel.zope.com> <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net> <15460.17309.905113.103005@grendel.zope.com> <3C644E7B.F69AF73C@lemburg.com> <3C7B9529.309D9902@zope.com> Message-ID: <3C7B9CA1.B7798317@lemburg.com> Jim Fulton wrote: > > > I would be willing to make the mxDateTime types subtypes of > > whatever Fredrik comes up with. The only requirement I have is > > that the binary footprint of the types needs to match todays > > layout of mxDateTime types since I need to maintain binary > > compatibility. > > The binary footprint of your types, not the standard base class, > right? I don't see a problem with that, Fredrik's solution only provides an abstract base type with no additional parameters in the type object (only an interface definition on top of it) -- this would work nicely. > > The other possibility would be adding a set of new types > > to mxDateTime which focus on low memory requirements rather > > than data roundtrip safety and speed. > > What is data roundtrip safety? Roundtrip safety means that e.g. if you take a COM date value from a ADO and create a DateTime object with it, you can be sure to get back the exact same value via the COMDate() method. The same is true for broken down values and, of course, the internal values .absdate and .abstime. This may not be too important for most applications, but it certainly is for database related ones, since rounding can cause e.g. 14:00:00.00 to become 13:59:59.99 and that's not what you want if you transfer data from one database to another. > I rarely do date-time arithmetic, but I often do date-time-part > extraction. 
I think that mxDateTime is optimized for arithmetic, > whereas, I'd prefer a type more focussed on extraction efficiency, > and memory usage, and that efficiently supports time zones. > This is, of course, no knock on mxDateTime. I also want > fast comparisons, which I presume mxDateTime provides. DateTime objects use .abstime and .absdate for doing arithmetic since these provide the best accuracy. The most important broken down values are calculated once at creation time; a few others are done on-the-fly. I suppose that I could easily make a few calculations lazy to enhance speed; memory footprint would not change though. It's currently at 56 bytes per DateTime object and 36 bytes per DateTimeDelta object. To get similar accuracy in Python, you'd need a float and an integer per object, that's 16 bytes + 12 bytes == 28 bytes + malloc() overhead for the two and the wrapping instance which gives another 32 bytes (provided you store the two objects in slots)... >60 bytes per Python based date time object. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From barry@zope.com Tue Feb 26 15:57:49 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 26 Feb 2002 10:57:49 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <3C7B9C58.5D9367E0@lemburg.com> Message-ID: <15483.45181.422942.816096@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> According to Martin, it is compatible. If it's not we'll make MAL> it so :-) MAL> Barry ? I believe so, although I haven't ever used this trick to specify the file's encoding.
In some quick tests, at least XEmacs doesn't bomb out on it (if I stick a real encoding in for ). -Barry From martin@v.loewis.de Tue Feb 26 18:00:22 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 26 Feb 2002 19:00:22 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > That looks OK to me. Is the Emacs-style comment in fact compatible > with Emacs? It is. I expect many people want to put "utf-8" as the encoding name, and you need Emacs 21 for that (or Emacs with Mule-UCS, or some such). In GNU Emacs, you see the effect of the coding: directive in the Emacs status line. Just try the attached file, it will indicate "R" for KOI8-R. Not sure about XEmacs. Regards, Martin [Attachment: foo.py (application/octet-stream, quoted-printable)] #! /usr/bin/python # -*- coding: koi8-r -*- print "=ED=C1=D2=D4=C9=CE" [End of attachment] From guido@python.org Tue Feb 26 18:17:54 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 13:17:54 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: Your message of "26 Feb 2002 19:00:22 +0100." References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> Message-ID: <200202261817.g1QIHsO19019@pcp742651pcs.reston01.va.comcast.net> > In GNU Emacs, you see the effect of the coding: directive in the Emacs > status line. Just try the attached file, it will indicate "R" for > KOI8-R. Not sure about XEmacs. Cool! It worked for me in Emacs, but not in XEmacs. Oh well.
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Tue Feb 26 18:19:33 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 26 Feb 2002 12:19:33 -0600 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15483.53685.311901.785106@12-248-41-177.client.attbi.com> >> That looks OK to me. I the Emacs-style comment in fact compatible >> with Emacs? Martin> It is. I expect many people want to put "utf-8" as the encoding Martin> name, and you need Emacs 21 for that (or Emacs with Mule-UCS, or Martin> some such). Martin> In GNU Emacs, you see the effect of the coding: directive in the Martin> Emacs status line. Just try the attached file, it will indicate Martin> "R" for KOI8-R. Not sure about XEmacs. I use XEmacs 21.4.5 (non-MULE). I see nothing particularly interesting when visiting that file. Apropos doesn't indicate there is a variable named "coding" either. I see ":encoding", an undocumented variable. Everything else containing "coding" is more complex and seems package-specific (tramp, vm, ediff, etc). Perhaps using MULE would make a difference. Skip From paul@prescod.net Tue Feb 26 18:21:08 2002 From: paul@prescod.net (Paul Prescod) Date: Tue, 26 Feb 2002 10:21:08 -0800 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? 
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225210423.GA2398@crystal.mems-exchange.org> <15482.46039.769808.389448@grendel.zope.com> <3C7AC4DF.61A7D7AB@prescod.net> <15482.60502.420424.995597@grendel.zope.com> Message-ID: <3C7BD214.5065100B@prescod.net> "Fred L. Drake, Jr." wrote: > > Paul Prescod writes: > > The major reason for doing it at > > compile time (for me) is that you can have a nice syntax that doesn't > > evolve modulus-ing (or dividing) an otherwise useless vars() or locals() > > dictionary. > > Which has everything to do with your usage. I almost never use % with > locals() or vars(), so I don't share that motivation. Even so you have to modulus a tuple or a variable. That doesn't make any more sense for a newbie and is just as inconvenient for the script kiddie (which is often me!), compared to languages like Perl, Ruby, Tcl, sh etc. Python's interpolation syntax is: more verbose, more complicated, less secure and also more powerful. I have no problem with keeping the power but I'd like something less verbose and less complicated alongside it. > I'm much more > likely to build a dict specifically for the purpose, which includes > computed values, or have something already created which includes this > usage as part of the larger picture. I don't believe that this feature should be taken away from you. But I don't see how it relates to the PEP because what you want to do is already doable. PEP 215 is about making things *easier for simple cases*. If you have new, high-end needs for runtime string interpolation then PEP 215 probably won't address them. 
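For readers following the PEP 215 discussion, the three styles of % interpolation being contrasted here (a tuple, locals(), and a dict built for the purpose) look roughly like this. This is only a sketch in present-day Python; the function and variable names are made up for illustration, and none of it is the syntax PEP 215 itself proposes.

```python
def greet(name, count):
    # Positional interpolation: the right-hand side is a tuple.
    by_tuple = "%s has %d new messages" % (name, count)
    # Named interpolation against the local namespace, the usage Paul describes.
    by_locals = "%(name)s has %(count)d new messages" % locals()
    # Named interpolation against a purpose-built dict with computed values,
    # the usage Fred describes.
    values = {"name": name.title(), "count": count * 2}
    by_dict = "%(name)s has %(count)d new messages" % values
    return by_tuple, by_locals, by_dict


results = greet("barry", 3)
for line in results:
    print(line)
```

The dict form is the only one of the three that can carry computed values without first binding them to local names, which seems to be the distinction the two posters are drawing.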
Paul Prescod From barry@zope.com Tue Feb 26 18:24:41 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 26 Feb 2002 13:24:41 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15483.53993.852170.135298@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> Guido van Rossum writes: >> That looks OK to me. Is the Emacs-style comment in fact >> compatible with Emacs? MvL> It is. I expect many people want to put "utf-8" as the MvL> encoding name, and you need Emacs 21 for that (or Emacs with MvL> Mule-UCS, or some such). MvL> In GNU Emacs, you see the effect of the coding: directive in MvL> the Emacs status line. Just try the attached file, it will MvL> indicate "R" for KOI8-R. Not sure about XEmacs. I don't think it works for XEmacs. I've got a MULE-aware XEmacs 21.4.6 and while it asks if I want to set the local variables in the -*- line, I still see "Raw" in the modeline, and I see the following letters in the print string (with funny little lines above the characters): iAOOEI. See attached capture. That doesn't seem right, does it?
-Barry [Attachment: capture.gif (image/gif, inline screenshot; base64 data omitted)]
From martin@v.loewis.de Tue Feb 26 18:49:38 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 26 Feb 2002 19:49:38 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <15483.53993.852170.135298@anthem.wooz.org> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <15483.53993.852170.135298@anthem.wooz.org> Message-ID: barry@zope.com (Barry A. Warsaw) writes: > I don't think it works for XEmacs. I've got a MULE-aware XEmacs > 21.4.6 and while it asks if I want to set the local variables in the > -*- line, I still see "Raw" in the modeline, and I see the following > letters in print string (with funny little lines above the > characters): iAOOEI. See attached capture. That doesn't seem right, > does it? Indeed not: It interprets it as latin-1. I hope XEmacs will eventually follow the GNU Emacs conventions here, since I think they are useful.
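What Barry saw is consistent with Martin's diagnosis, and it can be checked outside an editor. The sketch below uses present-day Python rather than the 2.2 of this thread. It takes the six bytes from the foo.py attachment posted earlier (quoted-printable =ED=C1=D2=D4=C9=CE) and decodes them once as KOI8-R and once as Latin-1.

```python
import quopri

# The string literal from foo.py as it travelled in the mail (quoted-printable).
raw = quopri.decodestring(b"=ED=C1=D2=D4=C9=CE")

as_koi8r = raw.decode("koi8-r")    # what the coding: line asks for
as_latin1 = raw.decode("latin-1")  # what an editor that ignores the line shows

print(as_koi8r)
print(as_latin1)
```

Under KOI8-R the bytes spell a six-letter Cyrillic word; under Latin-1 they come out as accented Latin capitals, which would explain the "iAOOEI" with marks above each letter.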
Regards, Martin From fredrik@pythonware.com Tue Feb 26 19:08:03 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 26 Feb 2002 20:08:03 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> Message-ID: <065201c1bef8$f3e03ad0$ced241d5@hagrid> jim wrote: > BTW, has there been any progress on this? the current pepoid proposal is here: http://effbot.org/ideas/time-type.htm the major issues right now are 1. should the type know about timezones (probably not), and 2. should it support basic arithmetic (probably yes). (and 3. find time to write an html-to-pep converter ;-) cheers /F From bckfnn@worldonline.dk Tue Feb 26 19:44:39 2002 From: bckfnn@worldonline.dk (Finn Bock) Date: Tue, 26 Feb 2002 19:44:39 GMT Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <3C7B6322.440D21E7@lemburg.com> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> Message-ID: <3c7bbf00.17218508@mail.wanadoo.dk> [MAL] >I consider the above PEP ready for review by the developers. >Please comment. The pep seems to dictate that the source by default must be read as latin-1: """ Python will default to Latin-1 as standard encoding if no other encoding hints are given. """ Jython already reads the python source with the default java encoding which usually depends on the PC's locale. If a small loophole could be added to that requirement, then the pep has my full support. regards, finn From mal@lemburg.com Tue Feb 26 19:50:35 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Tue, 26 Feb 2002 20:50:35 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <15483.53993.852170.135298@anthem.wooz.org> Message-ID: <3C7BE70B.74713ED2@lemburg.com> "Martin v. Loewis" wrote: > > barry@zope.com (Barry A. Warsaw) writes: > > > I don't think it works for XEmacs. I've got a MULE-aware XEmacs > > 21.4.6 and while it asks if I want to set the local variables in the > > -*- line, I still see "Raw" in the modeline, and I see the following > > letters in print string (with funny little lines above the > > characters): iAOOEI. See attached capture. That doesn't seem right, > > does it? > > Indeed not: It interprets it as latin-1. I hope XEmacs will eventually > follow the GNU Emacs conventions here, since I think they are useful. After reading some of the Emacs docs, I think we should allow a more flexible coding line: -*- ... coding: (\w+) ... -*- because you will sometimes want to add more variables to that Emacs init line than just the encoding declaration. Does anybody know where XEmacs is moving w/r to this ? (and for that matter what about vi, vim, etc. ?) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jim@zope.com Tue Feb 26 19:53:49 2002 From: jim@zope.com (Jim Fulton) Date: Tue, 26 Feb 2002 14:53:49 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> Message-ID: <3C7BE7CD.EAA8BA24@zope.com> Fredrik Lundh wrote: > > jim wrote: > > > BTW, has there been any progress on this? 
> > the current pepoid proposal is here: > > http://effbot.org/ideas/time-type.htm > > major issues right now is 1. should the type know about > timezones (probably not), :( -1 Doesn't the proposal sort of imply time-zone awareness of some kind? Or does it simply imply UT storage? > and 2. should it support basic > arithmetics (probably yes). Does this imply leap second hell, or will we simply be vague about expectations? I'd also like to see simple access methods for year, month, day, hours, minutes, and seconds, with date parts being one based and time parts being zero based. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From guido@python.org Tue Feb 26 19:58:54 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 14:58:54 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: Your message of "Tue, 26 Feb 2002 19:44:39 GMT." <3c7bbf00.17218508@mail.wanadoo.dk> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> Message-ID: <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> > """ > Python will default to Latin-1 as standard encoding if no other > encoding hints are given. > """ I missed this. Why not default to ASCII like any decent programming language does in the absence of an explicit encoding? --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Tue Feb 26 19:59:45 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Tue, 26 Feb 2002 14:59:45 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <15483.53993.852170.135298@anthem.wooz.org> <3C7BE70B.74713ED2@lemburg.com> Message-ID: <15483.59697.784765.121045@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> Does anybody know where XEmacs is moving w/r to this ? (and MAL> for that matter what about vi, vim, etc. ?) I'll ask around in the XEmacs community. -Barry From mal@lemburg.com Tue Feb 26 20:11:47 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 21:11:47 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> Message-ID: <3C7BEC03.E8225CFD@lemburg.com> Jim Fulton wrote: > > > major issues right now is 1. should the type know about > > timezones (probably not), > > :( > > -1 > > Doesn't the proposal sort of imply time-zone > awareness of some kind? Or does it simply imply > UT storage? I tried that in early version of mxDateTime -- it fails badly. I switched to the local time assumption very early in the development. Note that Fredrik's type is an abstract type; it doesn't even store anything -- that's up to subtypes which of course can implement timezones at their liking. > > and 2. should it support basic > > arithmetics (probably yes). > > Does this imply leap second hell, or will we > simply be vague about expectations? The type will store a fixed point in time, so why worry about leap seconds (most system's don't support these anyway and if they do, the support is usually switched off per default) ? 
> I'd also like to see simple access methods for year, > month, day, hours, minutes, and seconds, with date parts > being one based and time parts being zero based. In the abstract base type ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Tue Feb 26 20:15:40 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 21:15:40 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7BECEC.E1550553@lemburg.com> Guido van Rossum wrote: > > > """ > > Python will default to Latin-1 as standard encoding if no other > > encoding hints are given. > > """ > > I missed this. Why not default to ASCII like any decent programming > language does in the absence of an explicit encoding? Jack had the same question. The simple answer is: we need this in order to maintain backward compatibility when we move to phase two of the implementation. Here's the longer one: ASCII is the standard encoding for Python keywords and identifiers. There is no standard source code encoding for string literals. Unicode literals are interpreted using 'unicode-escape' which is an enhanced Latin-1 with escape semantics. This makes Latin-1 the right choice: * Unicode literals already use it today * As soon as we get to phase two of the implementation, 8-bit string literals will have to make the round trip raw binary -> Unicode -> raw binary and this only works if you make Latin-1 the default.
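The round-trip argument can be illustrated with a short sketch. It uses present-day Python's bytes/str split, not the 2.x string types under discussion, and it assumes that "raw binary" means arbitrary 8-bit values in a string literal. Latin-1 maps byte N to code point N, so every byte survives the trip, whereas ASCII has nothing to map the upper half to.

```python
# Every possible byte value, standing in for an arbitrary 8-bit string literal.
raw = bytes(range(256))

# Latin-1 maps byte N to code point N, so raw -> Unicode -> raw is lossless.
via_latin1 = raw.decode("latin-1").encode("latin-1")

# ASCII only covers 0..127, so the same trip fails on the first high byte.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(via_latin1 == raw, ascii_ok)
```

This shows only why Latin-1 is a safe default for existing 8-bit literals; it says nothing about whether ASCII plus an explicit declaration would be the better policy, which is the point Guido raises.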
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From fredrik@pythonware.com Tue Feb 26 20:17:23 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 26 Feb 2002 21:17:23 +0100 Subject: [Python-Dev] proposal: add basic money type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> Message-ID: <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> and while I'm at it: I propose adding an "abstract" money base type to the standard library, to be subclassed by real money/decimal implementations. if isinstance(v, basemoney): # yay! it's money print float(money) # let's hope it's not too much The goal is not to standardize any behaviour beyond this; anything else should be provided by subtypes. More details here: http://effbot.org/ideas/money-type.htm I can produce PEP and patch if necessary. From mal@lemburg.com Tue Feb 26 20:20:56 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 21:20:56 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <15459.9239.83647.334632@gondolin.digicool.com> <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net> <15460.15864.266226.241495@grendel.zope.com> <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net> <15460.17309.905113.103005@grendel.zope.com> <3C644E7B.F69AF73C@lemburg.com> <3C7B9529.309D9902@zope.com> <3C7B9C08.EAE1A30C@lemburg.com> <3C7BA3D8.7C417182@zope.com> Message-ID: <3C7BEE28.9030E4EA@lemburg.com> Jim Fulton wrote: > > "M.-A. Lemburg" wrote: > > > > Jim Fulton wrote: > ... > > > > What is data roundtrip safety? > > > > Roundtrip safety means that e.g. 
if you take a COM date value > > from an ADO and create a DateTime object with it, you can > > be sure to get back the exact same value via the COMDate() > > method. > > Since I don't know what COMDate is, this doesn't mean > anything to me. :) Then you're lucky -- COMDates are just about the strangest beast I've ever seen as date/time encoding. > ... > > I suppose that I could easily make a few calculations > > lazy to enhance speed; memory footprint would not change > > though. It's currently at 56 bytes per DateTime object > > and 36 bytes per DateTimeDelta object. > > Does that include the two words of Python object overhead? I suppose so -- the values I quoted are the tp_size values of the types. The instance will probably also require a dictionary and the weak ref list on top of those figures. > > To get similar accuracy in Python, > > I assume you mean precision. Eh, yes. > > you'd need a float and > > an integer per object, > > It depends on the desired precision. To get minute > precision, an int will do. Two ints can get you about > a hundredth of a microsecond precision, which is more than > most people need. I was just trying to compare apples to apples :-) mxDateTime offers the same precision as a float (for daytime) and an integer (for the day) can give. > > that's 16 bytes + 12 bytes == 28 bytes > > + malloc() overhead for the two and the wrapping instance > > which gives another 32 bytes (provided you store the two > > objects in slots)... >60 bytes per Python based date time > > object. > > A Python-based date-time object isn't very interesting to me.
You should have mentioned that earlier ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From fredrik@pythonware.com Tue Feb 26 20:22:59 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 26 Feb 2002 21:22:59 +0100 Subject: [Python-Dev] proposal: add basic money type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> Message-ID: <071b01c1bf03$645eb520$ced241d5@hagrid> oops. two lines of code, and one bug. should be: > if isinstance(v, basemoney): > # yay! it's money > print float(v) # let's hope it's not too much From guido@python.org Tue Feb 26 20:28:24 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 15:28:24 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 14:53:49 EST." <3C7BE7CD.EAA8BA24@zope.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> Message-ID: <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> > Does this imply leap second hell, or will we > simply be vague about expectations? IMO, leap seconds should be ignored. Time stands still during a leap second. Consider this a BDFL pronouncement if you wish. :-) > I'd also like to see simple access methods for year, > month, day, hours, minutes, and seconds, The timetuple() method provides access to all of these simultaneously. Isn't that enough? t.year() could be spelled as t.timetuple()[0]. 
I expect that usually you'd request several of these together anyway, in order to do some fancy formatting, so the timetuple() approach makes sense. > with date parts > being one based and time parts being zero based. I'm not sure what you mean here. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Feb 26 20:29:59 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 15:29:59 -0500 Subject: [Python-Dev] proposal: add basic money type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 21:17:23 +0100." <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> Message-ID: <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> > I propose adding an "abstract" money base type to the standard > library, to be subclassed by real money/decimal implementations. Why do we need this? I guess that would be Question #1... --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue Feb 26 20:31:52 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 21:31:52 +0100 Subject: [Python-Dev] proposal: add basic money type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> Message-ID: <3C7BF0B8.84C89070@lemburg.com> Fredrik Lundh wrote: > > and while I'm at it: > > I propose adding an "abstract" money base type to the standard > library, to be subclassed by real money/decimal implementations. > > if isinstance(v, basemoney): > # yay! it's money > print float(money) # let's hope it's not too much > > The goal is not to standardize any behaviour beyond this; anything > else should be provided by subtypes. 
> > More details here: > > http://effbot.org/ideas/money-type.htm > > I can produce PEP and patch if necessary. Sounds like a plan. One thing though: the RE "[+|-]?\d+(.\d+)?" should be extended to allow for currency symbols and names in front or after the monetary value. Currency for money is a bit like timezones for datetime, so it's a good idea, not to add it to the base type interface. However, the interface should be extendable to include currency information. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Tue Feb 26 20:33:27 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 21:33:27 +0100 Subject: [Python-Dev] proposal: add basic money type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7BF117.D7417F7D@lemburg.com> Guido van Rossum wrote: > > > I propose adding an "abstract" money base type to the standard > > library, to be subclassed by real money/decimal implementations. > > Why do we need this? I guess that would be Question #1... For databases ?! The DB API has long had a monetary or at least decimal type on its plate... 
never got around to implementing one, though :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Tue Feb 26 20:37:05 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 15:37:05 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: Your message of "Tue, 26 Feb 2002 21:15:40 +0100." <3C7BECEC.E1550553@lemburg.com> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> Message-ID: <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> > > I missed this. Why not default to ASCII like any decent programming > > language does in the absence of an explicit encoding? > > Jack had the same question. The simple answer is: we need this > in order to maintain backward compatibility when we move to > phase two of the implementation. > > Here's the longer one: > > ASCII is the standard encoding for Python keywords and identifiers. > There is no standard source code encoding for string literals. > Unicode literals are interpreted using 'unicode-escape' which > is an enhanced Latin-1 with escape semantics. > > This makes Latin-1 the right choice: > > * Unicode literals already use it today But they shouldn't, IMO. We should require an explicit encoding when more than ASCII is used, and I'd like to enforce this. > * As soon as we get to phase two of the implementation, > 8-bit string literals will be have to make the round trip > raw binary -> Unicode -> raw binary and this only works > if you make Latin-1 the default. Sorry, I don't understand what you're trying to say here. Can you explain this with an example? 
Why can't we require any program encoded in more than pure ASCII to have an encoding magic comment? I guess I don't understand why you mean by "raw binary". Once you've explained it to me, the PEP should address this issue. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Tue Feb 26 20:42:02 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 26 Feb 2002 21:42:02 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> Message-ID: <075f01c1bf06$1383d380$ced241d5@hagrid> jim wrote: > Doesn't the proposal sort of imply time-zone > awareness of some kind? Or does it simply imply > UT storage? as written, it still implies time-zone awareness. the question is whether to remove that constraint (and the utc* methods). *all* early reviewers argued that time zones are a representation thingie, and doesn't belong in the abstract type. I'm tempted to agree, but I'm not sure I can explain why... > > and 2. should it support basic > > arithmetics (probably yes). > > Does this imply leap second hell, or will we > simply be vague about expectations? vague. > I'd also like to see simple access methods for year, > month, day, hours, minutes, and seconds, with date parts > being one based and time parts being zero based. use timetuple(). (I rather not add too much stuff to the abstract interface; the goal is to let MAL turn mxDateTime into a basetime sub- type without breaking any application code...) 
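The idea of a small abstract time type that concrete subtypes fill in can be sketched roughly as follows. This is only an illustration in present-day Python: the class names `basetime` and `EpochTime`, the helper `describe`, and the choice of storing UTC seconds are assumptions made for the example, not part of the proposal.

```python
import time


class basetime:
    """Hypothetical abstract time type: subtypes must provide timetuple()."""

    def timetuple(self):
        raise NotImplementedError("subtypes decide how the time is stored")


class EpochTime(basetime):
    """Hypothetical subtype storing seconds since the epoch, read as UTC."""

    def __init__(self, seconds):
        self.seconds = seconds

    def timetuple(self):
        # time.gmtime() returns a 9-field struct_time; expose it as a plain tuple.
        return tuple(time.gmtime(self.seconds))


def describe(value):
    # Application code relies only on the abstract interface.
    if not isinstance(value, basetime):
        raise TypeError("not a time object")
    year, month, day, hour, minute, second = value.timetuple()[:6]
    return "%04d-%02d-%02d %02d:%02d:%02d" % (year, month, day, hour, minute, second)


stamp = EpochTime(1014756122)  # 2002-02-26 20:42:02 UTC
print(describe(stamp))
print(stamp.timetuple()[0])  # the year, spelled via timetuple()
```

Any other subtype that implements timetuple() would pass through describe() unchanged, which is the kind of compatibility the abstract type is meant to allow.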
From jim@zope.com Tue Feb 26 20:39:22 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 15:39:22 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <3C7BEC03.E8225CFD@lemburg.com>
Message-ID: <3C7BF27A.BD588B4B@zope.com>

"M.-A. Lemburg" wrote:
>
> Jim Fulton wrote:
> >
> > > major issues right now is 1. should the type know about
> > > timezones (probably not),
> >
> > :(
> >
> > -1
> >
> > Doesn't the proposal sort of imply time-zone
> > awareness of some kind? Or does it simply imply
> > UT storage?
>
> I tried that in early version of mxDateTime -- it fails
> badly. I switched to the local time assumption very
> early in the development.

That's a really bad assumption if times are used in different
time zones.

> Note that Fredrik's type is an abstract type; it doesn't
> even store anything -- that's up to subtypes which of
> course can implement timezones at their liking.

No, it does store something, it just doesn't say how.
There are methods for returning localtime and gmtime, so
there is (abstract) storage.

> > > and 2. should it support basic
> > > arithmetics (probably yes).
> >
> > Does this imply leap second hell, or will we
> > simply be vague about expectations?
>
> The type will store a fixed point in time, so why
> worry about leap seconds (most system's don't support these
> anyway and if they do, the support is usually switched off per
> default) ?

There are a lot of semantic issues with date-time math.
Leap seconds are an example. If you store local time, are
date-time subtractions affected by daylight-savings time?
Do the calculations depend on the calendar? Do you take into
account the lost days in the switch from the Julian to the
Gregorian calendar?
I'm not really opposed to doing math, but we need to at least recognize the fuzzyness of the semantics. > > I'd also like to see simple access methods for year, > > month, day, hours, minutes, and seconds, with date parts > > being one based and time parts being zero based. > > In the abstract base type ? Yes. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From guido@python.org Tue Feb 26 20:49:04 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 15:49:04 -0500 Subject: [Python-Dev] proposal: add basic money type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 21:33:27 +0100." <3C7BF117.D7417F7D@lemburg.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> <3C7BF117.D7417F7D@lemburg.com> Message-ID: <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net> > > > I propose adding an "abstract" money base type to the standard > > > library, to be subclassed by real money/decimal implementations. > > > > Why do we need this? I guess that would be Question #1... > > For databases ?! The DB API has long had a monetary or at least > decimal type on its plate... never got around to implementing > one, though :-) I can only find one reference to money or decimal in the DB API PEP, and that's as a future task. I guess that's what you mean by "on its plate". Since I'm not a database expert, maybe you can explain the use of this in more detail? And why would we need a monetary type rather than a fixed-point decimal type? --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue Feb 26 20:48:57 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Tue, 26 Feb 2002 21:48:57 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> Message-ID: <3C7BF4B9.472C9207@lemburg.com> Finn Bock wrote: > > [MAL] > > >I consider the above PEP ready for review by the developers. > >Please comment. > > The pep seems to dictate that the source by default must be read as > latin-1: > > """ > Python will default to Latin-1 as standard encoding if no other > encoding hints are given. > """ > > Jython already reads the python source with the default java encoding > which usually depends on the PCs locale. > > If a small loophole could be added to that requirement, then the pep > have my full support. Hmm, in phase two we will need to decode the source code file using some encoding into Unicode and then reencode the 8-bit string parts using that same encoding. The only requirement we have for that is round-trip safety, so that string literals turn out as the same value you see in the source file. Now, Unicode literals are explicit about this: unicode-escape is a latin-1 codec with some escaping knowledge. I'm not sure how to get this in line with the "any round-trip safe encoding" strategy... OTOH, if Jython users write source code which depends on the PC's locale then they are bound to write non-portable code, so fixing one encoding would certainly help here. What I don't understand is why you read the file using the PC's locale. Wouldn't it be possible to set the file encoding prior to reading from it ? 
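The round-trip requirement described above (raw bytes decoded to Unicode, then re-encoded to the same bytes) can be checked with a short sketch. It is written in present-day Python, where `bytes` and `str` are separate types; the helper name `round_trips` is made up for the illustration.

```python
def round_trips(raw, encoding):
    """Return True if raw bytes survive bytes -> str -> bytes unchanged."""
    try:
        return raw.decode(encoding).encode(encoding) == raw
    except UnicodeError:
        # The bytes are not valid in this encoding, so no round trip exists.
        return False


all_bytes = bytes(range(256))  # every possible 8-bit value

print(round_trips(all_bytes, "latin-1"))  # every byte maps to one code point
print(round_trips(all_bytes, "ascii"))    # fails on bytes >= 0x80
print(round_trips(b"caf\xe9", "utf-8"))   # a lone 0xE9 is not valid UTF-8
```

Latin-1 maps each of the 256 byte values to exactly one code point, which is why arbitrary 8-bit string literals survive the trip under it.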
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From fredrik@pythonware.com Tue Feb 26 20:48:47 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 26 Feb 2002 21:48:47 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <3C7BEC03.E8225CFD@lemburg.com>
Message-ID: <079f01c1bf07$11963ee0$ced241d5@hagrid>

mal wrote:
> > Doesn't the proposal sort of imply time-zone
> > awareness of some kind? Or does it simply imply
> > UT storage?
>
> I tried that in early version of mxDateTime -- it fails
> badly.

can you elaborate?

> > Does this imply leap second hell, or will we
> > simply be vague about expectations?
>
> The type will store a fixed point in time, so why
> worry about leap seconds (most system's don't support these
> anyway and if they do, the support is usually switched off per
> default) ?

the updated proposal adds __hash__ and __cmp__, and the
following (optional?) operations:

    deltaobject = timeobject - timeobject
    floatobject = float(deltaobject) # fractional seconds
    timeobject = timeobject + integerobject
    timeobject = timeobject + floatobject
    timeobject = timeobject + deltaobject

note that "deltaobject" can be anything; the abstract type
only says that if you manage to subtract one time object from
another one of the same type, you get some object that you
can 1) convert to a float, and 2) add to another time object.

vague, but pretty useful.

> > I'd also like to see simple access methods for year,
> > month, day, hours, minutes, and seconds, with date parts
> > being one based and time parts being zero based.
>
> In the abstract base type ?

Q.
does mxDateTime provide separate accessors for individual members? From fredrik@pythonware.com Tue Feb 26 20:52:48 2002 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 26 Feb 2002 21:52:48 +0100 Subject: [Python-Dev] proposal: add basic money type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <3C7BF0B8.84C89070@lemburg.com> Message-ID: <07c101c1bf07$8fbe1540$ced241d5@hagrid> mal wrote: > > I propose adding an "abstract" money base type to the standard > > library, to be subclassed by real money/decimal implementations. > > > > if isinstance(v, basemoney): > > # yay! it's money > > print float(money) # let's hope it's not too much > > > > The goal is not to standardize any behaviour beyond this; anything > > else should be provided by subtypes. > > > > More details here: > > > > http://effbot.org/ideas/money-type.htm > > > > I can produce PEP and patch if necessary. > > Sounds like a plan. > > One thing though: the RE "[+|-]?\d+(.\d+)?" should be extended > to allow for currency symbols and names in front or after the > monetary value. isn't this better handled by a separate method/attribute? (otherwise, I fear that we'll end up adding all possible currency notations to the abstract type. but maybe there is a standard for this, somewhere?) From guido@python.org Tue Feb 26 20:54:30 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 15:54:30 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 21:42:02 +0100." 
<075f01c1bf06$1383d380$ced241d5@hagrid>
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <075f01c1bf06$1383d380$ced241d5@hagrid>
Message-ID: <200202262054.g1QKsUk19953@pcp742651pcs.reston01.va.comcast.net>

> *all* early reviewers argued that time zones are a representation
> thingie, and doesn't belong in the abstract type.

Then maybe the reviewers didn't have a sufficiently wide range of
applications in mind? If I were to create a database of email
messages, I'd be seriously annoyed if it normalized the timezone info
away. A message sent at 10pm EST has a different feel to it than one
sent at 4am MET. It should *sort* on UTC, but it should use the
original timezone to display the dates.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From jim@zope.com Tue Feb 26 20:53:01 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 15:53:01 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7BF5AD.45BACF05@zope.com>

Guido van Rossum wrote:
>
> > Does this imply leap second hell, or will we
> > simply be vague about expectations?
>
> IMO, leap seconds should be ignored. Time stands still during a leap
> second. Consider this a BDFL pronouncement if you wish. :-)
>
> > I'd also like to see simple access methods for year,
> > month, day, hours, minutes, and seconds,
>
> The timetuple() method provides access to all of these
> simultaneously. Isn't that enough?

From a minimalist point of view, yes, but from a usability point
of view, no.

> t.year() could be spelled as
> t.timetuple()[0].

Yes, but t.year() is a lot more readable.
> I expect that usually you'd request several of > these together anyway, in order to do some fancy formatting, so the > timetuple() approach makes sense. I find the time tuples to be really inconvenient. I *always* have to slice off the parts I don't want, which I find annoying. Hm, now that I mention the extra parts, it seems kind of silly to make implementors of the type come up with weekday, julian day, and a daylight-savings flag. This time format is really biased by the C time library, which is fine for a module that wraps the C library but seems a bit silly for a standard date-time interface. > > with date parts > > being one based and time parts being zero based. > > I'm not sure what you mean here. Years, months, and days should start from 1. Hours, minutes and seconds should start from 0. One confusion I often have with time tuples is that I know too much about C's time struct, which numbers months from 0 and which has years since 1900. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From guido@python.org Tue Feb 26 20:58:57 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 15:58:57 -0500 Subject: [Python-Dev] proposal: add basic money type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 21:31:52 +0100." <3C7BF0B8.84C89070@lemburg.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <3C7BF0B8.84C89070@lemburg.com> Message-ID: <200202262058.g1QKwvA19990@pcp742651pcs.reston01.va.comcast.net> > Currency for money is a bit like timezones for datetime, > so it's a good idea, not to add it to the base type > interface. However, the interface should be extendable > to include currency information. 
Currency is much worse than timezones -- once you are interested in exchange rates, you need to know *when* to calculate the exchange rate (as well as other parameters such as which exchange rate). So please let's keep the currency out of the money type; it's utterly application dependent what to do with that information. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Tue Feb 26 21:00:40 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 26 Feb 2002 15:00:40 -0600 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15483.63352.850857.833197@12-248-41-177.client.attbi.com> Guido> The timetuple() method provides access to all of these Guido> simultaneously. Isn't that enough? t.year() could be spelled as Guido> t.timetuple()[0]. Since we're discussing an abstract type it probably doesn't apply directly, but perhaps timetuple() could be specified to return a super-tuple like os.stat() does... Skip From jepler@unpythonic.dhs.org Tue Feb 26 21:01:43 2002 From: jepler@unpythonic.dhs.org (jepler@unpythonic.dhs.org) Date: Tue, 26 Feb 2002 15:01:43 -0600 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <3C7BE70B.74713ED2@lemburg.com> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <15483.53993.852170.135298@anthem.wooz.org> <3C7BE70B.74713ED2@lemburg.com> Message-ID: <20020226150143.C16980@unpythonic.dhs.org> On Tue, Feb 26, 2002 at 08:50:35PM +0100, M.-A. 
Lemburg wrote:
> Does anybody know where XEmacs is moving w/r to this ? (and
> for that matter what about vi, vim, etc. ?)

I'm working with Vim 6.0, 2001 Sep 14.

VIM lets you set variables with text similar to

    vim:KEY=VALUE:KEY=VALUE:....:

Apparently you would use

    vim:fileencoding=sjis:

to select shift-jis encoding. In the vim style, it seems most common
to place this at the bottom of a file, but it can be placed at the top
too. The variable "modelines" controls how many lines at each end of
the file are inspected, with the default being 5.

It's documented that the form

    vi:set KEY=VALUE:

may be compatible with "some versions of Vi" but does not say which.
(I can't get this to work)

You can set a list of encodings to attempt when a file is loaded, which
defaults to "ucs-bom,utf-8,latin1". A user who wanted to treat
non-unicode files as shift-jis by default would

    :set fileencodings=ucs-bom,utf-8,sjis

You can also load a particular file with the ++enc parameter:

    :edit ++enc=koi8-r russian.txt

(I can get this to work, but I have to do it manually to load anything
in an odd character set)

The emacs line is harmless in vim, but doesn't do anything. It's
possible that using :autocmd someone could make vim use the emacs line
to set encoding, but I'm not sure -- setting fileencoding after a file
is loaded seems to perform a translation from the old character set to
the new.

Jeff

From fdrake@acm.org Tue Feb 26 21:01:12 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 26 Feb 2002 16:01:12 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <200202262054.g1QKsUk19953@pcp742651pcs.reston01.va.comcast.net> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <075f01c1bf06$1383d380$ced241d5@hagrid> <200202262054.g1QKsUk19953@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15483.63384.640568.56380@grendel.zope.com> Guido van Rossum writes: > 10pm EST has a different feel to it than one sent at 4am MET. It > should *sort* on UTC, but it should use the original timezone to > display the dates. Sounds like a user preference, not a universal truth. Is it important that the timezone is part of the date/time type, though? Is it important that it be part of the abstract base date/time? Specific implementations should certainly be able to add support for timezones, and perhaps some hypothetical default date/time type should include it for convenience, but that doesn't tell me it's fundamental. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From tim@zope.com Tue Feb 26 21:05:10 2002 From: tim@zope.com (Tim Peters) Date: Tue, 26 Feb 2002 16:05:10 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <3C7B9CA1.B7798317@lemburg.com> Message-ID: [M.-A. Lemburg] >>> The other possibility would be adding a set of new types >>> to mxDateTime which focus on low memory requirements rather >>> than data roundtrip safety and speed. [Jim Fulton] >> What is data roundtrip safety? [MAL] > Roundtrip safety means that e.g. if you take a COM date value > from a ADO and create a DateTime object with it, you can > be sure to get back the exact same value via the COMDate() > method. > > The same is true for broken down values and, of course, > the internal values .absdate and .abstime. 
I'm unsure whether Jim is aware of this, but if not he should be: the
non-trivial time I spent over the last week repairing test failures in
Zope's current DateTime.py was all spent finding & repairing basic
roundtrip failures. These came in two flavors:

1.
    dt = DateTime()
    dt2 = DateTime(
        dt.year(), dt.month(), dt.day(),
        dt.hour(), dt.minute(), dt.second())
    assert dt == dt2

   could fail, depending on the exact time you tried it. This was the
   root cause of many test failures (they were usually reported as
   failures after doing some sort of arithmetic first, but the
   arithmetic was actually irrelevant: when these failed, the base
   objects didn't match from the start).

2. Failure of roundtrip conversion between DateTime objects D and
   repr(D), then back again, to reproduce the original DateTime or
   string.

"Floating-point" got the generic blame for these things under the
covers, but it was really people's spectacular inability to deal with
*binary* floating-point that caused all the problems. People just can't
help but see, e.g., "50.327" as an exact decimal value, so just can't
help writing code that assumes false correlates (such as, e.g., that
int((50.327 - 50) * 1000) will return 327; but it doesn't; it returns
326). If we were using decimal floating-point instead, the numerically
naive code here would have worked fine.

> ...
> DateTime objects use .abstime and .absdate for doing
> arithmetic since these provides the best accuracy. The most
> important broken down values are calculated once at creation
> time; a few others are done on-the-fly.

Note that 2.2 properties allow natural support of calculated attributes,
and that a computed attribute can easily arrange to cache its value.

> I suppose that I could easily make a few calculation
> lazy to enhance speed; memory footprint would not change
> though. It's currently at 56 bytes per DateTime object
> and 36 bytes per DateTimeDelta object.
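The binary floating-point pitfall mentioned earlier in this message is easy to reproduce. The sketch below assumes a present-day Python, using the `decimal` module (which did not exist when this thread was written) to stand in for decimal floating-point.

```python
from decimal import Decimal

# Binary floating-point: 50.327 is stored as the nearest double, which lies
# slightly below 50.327, so the scaled difference falls just short of 327.
binary_result = int((50.327 - 50) * 1000)

# Decimal floating-point: "50.327" is represented exactly.
decimal_result = int((Decimal("50.327") - 50) * 1000)

print(binary_result)   # 326
print(decimal_result)  # 327
```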
I'm assuming that counts Python object header overhead, but does not count hidden malloc overhead. Switching to pymalloc would slash the latter. > To get similar accuracy in Python, you'd need a float and > an integer per object, that's 16 bytes + 12 bytes == 28 bytes > + malloc() overhead for the two and the wrapping instance > which gives another 32 bytes (provided you store the two > objects in slots)... >60 bytes per Python based date time > object. Just noting that a Zope DateTime instance is huge, with a dozen named attributes, one of which holds a Python long (unbounded integer). From mal@lemburg.com Tue Feb 26 21:06:15 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 22:06:15 +0100 Subject: [Python-Dev] proposal: add basic money type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> <3C7BF117.D7417F7D@lemburg.com> <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7BF8C7.DECB7D38@lemburg.com> Guido van Rossum wrote: > > > > > I propose adding an "abstract" money base type to the standard > > > > library, to be subclassed by real money/decimal implementations. > > > > > > Why do we need this? I guess that would be Question #1... > > > > For databases ?! The DB API has long had a monetary or at least > > decimal type on its plate... never got around to implementing > > one, though :-) > > I can only find one reference to money or decimal in the DB API PEP, > and that's as a future task. I guess that's what you mean by "on its > plate". Exactly :-) > Since I'm not a database expert, maybe you can explain the use of this > in more detail? And why would we need a monetary type rather than a > fixed-point decimal type? 
A decimal would do as well, I suppose, at least in terms of storing the raw value. The reason for trying to come up with a monetary type is to make operations between monetary values having two different currencies illegal. Coercion between two of those would always have to be made explicit (for obvious reasons). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Tue Feb 26 21:07:48 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 16:07:48 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 15:53:01 EST." <3C7BF5AD.45BACF05@zope.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> Message-ID: <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> > > The timetuple() method provides access to all of these > > simultaneously. Isn't that enough? > > From a minimalist point of view, yet, but from a usability point > of view, no. > > > t.year() could be spelled as > > t.timetuple()[0]. > > Yes, but t.year() is a lot more readable. When do you ever use this in isolation? I'd expect in 99% of the cases you hand it off to a formatting routine, and guess what -- strftime() takes a time tuple. I worry about the time wasted by calling all of t.year(), t.month(), t.day() (etc.) -- given that they do so little, the call overhead is probably near 100%. I wonder how often this is needed. The only occurrences of year() in the entire Zope source that I found are in various test routines. 
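The two spellings being compared here, handing the whole time tuple to strftime() versus picking out a single field by index, look roughly like this in present-day Python. The tuple values are made up for the illustration (Tuesday 26 Feb 2002, day 57 of the year).

```python
import time

# (year, month, day, hour, minute, second, weekday, yearday, isdst)
fields = (2002, 2, 26, 16, 7, 48, 1, 57, 0)

# Formatting takes the whole tuple, so no per-field accessor calls are needed.
print(time.strftime("%Y-%m-%d %H:%M:%S", fields))

# A single field is an index into the same tuple.
print(fields[0])
```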
> > I expect that usually you'd request several of > > these together anyway, in order to do some fancy formatting, so the > > timetuple() approach makes sense. > > I find the time tuples to be really inconvenient. I *always* > have to slice off the parts I don't want, which I find annoying. Serious question: what do you tend to do with time values? I imagine that once we change strftime() to accept an abstract time object, you'll never need to call either timetuple() or year() -- strftime() will do it for you. > Hm, now that I mention the extra parts, it seems kind of silly > to make implementors of the type come up with weekday, julian day, and > a daylight-savings flag. This time format is really biased by > the C time library, which is fine for a module that wraps the C library > but seems a bit silly for a standard date-time interface. That's why /F's pre-PEP allows the implementation to leaves these three set to -1. > > > with date parts > > > being one based and time parts being zero based. > > > > I'm not sure what you mean here. > > Years, months, and days should start from 1. > Hours, minutes and seconds should start from 0. > > One confusion I often have with time tuples is that I know > too much about C's time struct, which numbers months from 0 > and which has years since 1900. I guess that confusion is yours alone. In Python, of course month and day start from 1. Whether years start from 1 is a theological question. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Feb 26 21:11:34 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 16:11:34 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 16:01:12 EST." 
<15483.63384.640568.56380@grendel.zope.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <075f01c1bf06$1383d380$ced241d5@hagrid> <200202262054.g1QKsUk19953@pcp742651pcs.reston01.va.comcast.net> <15483.63384.640568.56380@grendel.zope.com> Message-ID: <200202262111.g1QLBZA20104@pcp742651pcs.reston01.va.comcast.net> > Guido van Rossum writes: > > 10pm EST has a different feel to it than one sent at 4am MET. It > > should *sort* on UTC, but it should use the original timezone to > > display the dates. [Fred] > Sounds like a user preference, not a universal truth. Fair enough. Even some of my well-traveled friends cannot do timezone arithmetic in their head... :-) > Is it important that the timezone is part of the date/time type, > though? Is it important that it be part of the abstract base > date/time? > > Specific implementations should certainly be able to add support for > timezones, and perhaps some hypothetical default date/time type should > include it for convenience, but that doesn't tell me it's fundamental. I guess I want it to be possible to have an implementation that keeps track of the timezone as entered. It's true that time deltas are a nightmare when dealing with different timezones. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Feb 26 21:16:31 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 16:16:31 -0500 Subject: [Python-Dev] proposal: add basic money type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 22:06:15 +0100." 
<3C7BF8C7.DECB7D38@lemburg.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> <3C7BF117.D7417F7D@lemburg.com> <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net> <3C7BF8C7.DECB7D38@lemburg.com> Message-ID: <200202262116.g1QLGVN20144@pcp742651pcs.reston01.va.comcast.net> > A decimal would do as well, I suppose, at least in > terms of storing the raw value. The reason for trying > to come up with a monetary type is to make operations between > monetary values having two different currencies illegal. > Coercion between two of those would always have to be > made explicit (for obvious reasons). Are you sure you're trying to solve a real problem here? There are lots of operations on monetary values that make no sense (try multiplying two amounts of money). --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue Feb 26 21:26:47 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 22:26:47 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7BFD97.B69DDDFD@lemburg.com> Guido van Rossum wrote: > > > > The timetuple() method provides access to all of these > > > simultaneously. Isn't that enough? > > > > From a minimalist point of view, yes, but from a usability point > > of view, no. > > > > > t.year() could be spelled as > > > t.timetuple()[0]. > > > > Yes, but t.year() is a lot more readable. > > When do you ever use this in isolation?
I'd expect in 99% of the > cases you hand it off to a formatting routine, and guess what -- > strftime() takes a time tuple. FWIW, mxDateTime exposes these values as attributes -- there is no call overhead. > I worry about the time wasted by calling all of t.year(), t.month(), > t.day() (etc.) -- given that they do so little, the call overhead is > probably near 100%. > > I wonder how often this is needed. The only occurrences of year() in > the entire Zope source that I found are in various test routines. I actually use the attributes quite a bit in my stuff and hardly ever use .strftime(). mxDateTime is different though; e.g. I sometimes get questions about how to make strftime() output fractions of a second (doesn't work, because strftime() doesn't support it). > > > I expect that usually you'd request several of > > > these together anyway, in order to do some fancy formatting, so the > > > timetuple() approach makes sense. > > > > I find the time tuples to be really inconvenient. I *always* > > have to slice off the parts I don't want, which I find annoying. > > Serious question: what do you tend to do with time values? I imagine > that once we change strftime() to accept an abstract time object, > you'll never need to call either timetuple() or year() -- strftime() > will do it for you. Depends on the application space. Database applications will call .timetuple() very often and use strftime() hardly ever. > > > > with date parts > > > > being one based and time parts being zero based. > > > > > > I'm not sure what you mean here. > > > > Years, months, and days should start from 1. > > Hours, minutes and seconds should start from 0. > > > > One confusion I often have with time tuples is that I know > > too much about C's time struct, which numbers months from 0 > > and which has years since 1900. > > I guess that confusion is yours alone. In Python, of course month and > day start from 1. Whether years start from 1 is a theological > question.
:-) It's not really a question: the year 0 simply does not exist in reality! (Christians didn't have a 0 available at the time ;-) Still, historic dates are usually referenced by making year 0 == 1 b.c., -1 == 2 b.c., etc. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jim@zope.com Tue Feb 26 21:23:42 2002 From: jim@zope.com (Jim Fulton) Date: Tue, 26 Feb 2002 16:23:42 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7BFCDE.F920078F@zope.com> Guido van Rossum wrote: > > > > The timetuple() method provides access to all of these > > > simultaneously. Isn't that enough? > > > > From a minimalist point of view, yes, but from a usability point > > of view, no. > > > > > t.year() could be spelled as > > > t.timetuple()[0]. > > > > Yes, but t.year() is a lot more readable. > > When do you ever use this in isolation? I'd expect in 99% of the > cases you hand it off to a formatting routine, and guess what -- > strftime() takes a time tuple. > > I worry about the time wasted by calling all of t.year(), t.month(), > t.day() (etc.) -- given that they do so little, the call overhead is > probably near 100%. > > I wonder how often this is needed. The only occurrences of year() in > the entire Zope source that I found are in various test routines. These methods and others are used a lot in presentation code, which tends to be expressed in DTML or ZPT. It's not uncommon to select/categorize things by year or month.
I think most people would find individual date-part methods a lot more natural than tuples. > > > I expect that usually you'd request several of > > > these together anyway, in order to do some fancy formatting, so the > > > timetuple() approach makes sense. > > > > I find the time tuples to be really inconvenient. I *always* > > have to slice off the parts I don't want, which I find annoying. > > Serious question: what do you tend to do with time values? I format them in various ways and I sort them. > I imagine > that once we change strftime() to accept an abstract time object, > you'll never need to call either timetuple() or year() -- strftime() > will do it for you. Maybe, if I use strftime, but I don't use strftime all that much. I can certainly think of cases, even formatting cases (e.g. internationalized dates), where it's not adequate. > > Hm, now that I mention the extra parts, it seems kind of silly > > to make implementors of the type come up with weekday, julian day, and > > a daylight-savings flag. This time format is really biased by > > the C time library, which is fine for a module that wraps the C library > > but seems a bit silly for a standard date-time interface. > > That's why /F's pre-PEP allows the implementation to leave these > three set to -1. I missed these. Still, providing -1s seems, uh, vestigial. > > > > with date parts > > > > being one based and time parts being zero based. > > > > > > I'm not sure what you mean here. > > > > Years, months, and days should start from 1. > > Hours, minutes and seconds should start from 0. > > > > One confusion I often have with time tuples is that I know > > too much about C's time struct, which numbers months from 0 > > and which has years since 1900. > > I guess that confusion is yours alone. In Python, of course month and > day start from 1. Whether years start from 1 is a theological > question. :-) I doubt the confusion is mine alone, but I'll take your word for it.
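To make the accessor-versus-tuple question above concrete, here is a minimal sketch of a time type that offers both. The class name `SimpleTime` and its constructor are hypothetical and are not taken from /F's pre-PEP; the only details borrowed from the thread are one-based date parts, zero-based time parts, and a timetuple() that leaves weekday, julian day and the DST flag at -1.

```python
class SimpleTime:
    """Hypothetical time type, for illustration only."""

    def __init__(self, year, month, day, hour=0, minute=0, second=0):
        # Date parts are one-based, time parts are zero-based.
        if not (1 <= month <= 12 and 1 <= day <= 31):
            raise ValueError("month and day start from 1")
        if not (0 <= hour <= 23 and 0 <= minute <= 59 and 0 <= second <= 59):
            raise ValueError("hour, minute and second start from 0")
        self._parts = (year, month, day, hour, minute, second)

    # Computed attributes: no call syntax at the point of use.
    year = property(lambda self: self._parts[0])
    month = property(lambda self: self._parts[1])
    day = property(lambda self: self._parts[2])
    hour = property(lambda self: self._parts[3])
    minute = property(lambda self: self._parts[4])
    second = property(lambda self: self._parts[5])

    def timetuple(self):
        # Weekday, julian day and DST flag are not computed; -1 marks them unknown.
        return self._parts + (-1, -1, -1)


# Sort chronologically via the tuple, then categorize by year via the attribute.
stamps = [SimpleTime(2002, 2, 26, 16, 23, 42), SimpleTime(2001, 12, 31)]
stamps.sort(key=SimpleTime.timetuple)
by_year = {}
for t in stamps:
    by_year.setdefault(t.year, []).append(t)
```

Because the tuple starts with year, month and day, sorting on timetuple() gives chronological order, while presentation code can read t.year or t.month without slicing anything off.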
Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From mal@lemburg.com Tue Feb 26 21:32:31 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 22:32:31 +0100 Subject: [Python-Dev] proposal: add basic money type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> <3C7BF117.D7417F7D@lemburg.com> <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net> <3C7BF8C7.DECB7D38@lemburg.com> <200202262116.g1QLGVN20144@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7BFEEF.16BDC177@lemburg.com> Guido van Rossum wrote: > > > A decimal would do as well, I suppose, at least in > > terms of storing the raw value. The reason for trying > > to come up with a monetary type is to make operations between > > monetary values having two different currencies illegal. > > Coercion between two of those would always have to be > > made explicit (for obvious reasons). > > Are you sure you're trying to solve a real problem here? There are > lots of operations on monetary values that make no sense (try > multiplying two amounts of money). Indeed, monetary types solve different problems than decimal types. Financial applications do have a need for this kind of implicit error check.
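The kind of check being discussed can be sketched as a small custom class. Everything here is an assumption made for illustration: the name `Money`, the choice of TypeError, and the use of the decimal module from later Python versions to hold the raw value. It only shows the rule described above, that mixing currencies is an error unless the conversion is explicit, plus the point that multiplying two amounts of money makes no sense.

```python
from decimal import Decimal


class Money:
    """Hypothetical monetary value: a Decimal amount tagged with a currency."""

    def __init__(self, amount, currency):
        self.amount = Decimal(amount)
        self.currency = currency

    def _check(self, other):
        # Refuse implicit coercion between currencies.
        if not isinstance(other, Money):
            raise TypeError("expected Money, got %s" % type(other).__name__)
        if other.currency != self.currency:
            raise TypeError("cannot mix %s and %s without an explicit conversion"
                            % (self.currency, other.currency))

    def __add__(self, other):
        self._check(other)
        return Money(self.amount + other.amount, self.currency)

    def __sub__(self, other):
        self._check(other)
        return Money(self.amount - other.amount, self.currency)

    def __mul__(self, factor):
        # Scaling by a plain number is fine; money times money is not.
        if isinstance(factor, Money):
            raise TypeError("multiplying two amounts of money is meaningless")
        return Money(self.amount * Decimal(factor), self.currency)

    def convert(self, rate, currency):
        # The only way to change currency: the caller supplies the rate.
        return Money(self.amount * Decimal(rate), currency)

    def __eq__(self, other):
        return (isinstance(other, Money)
                and (self.amount, self.currency) == (other.amount, other.currency))

    def __repr__(self):
        return "Money(%r, %r)" % (str(self.amount), self.currency)
```

A specific application would likely want different constraints (rounding rules, which operations are allowed), which is the argument for keeping this in application code rather than in the language.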
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Tue Feb 26 21:37:31 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 16:37:31 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 16:23:42 EST." <3C7BFCDE.F920078F@zope.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFCDE.F920078F@zope.com> Message-ID: <200202262137.g1QLbWq20272@pcp742651pcs.reston01.va.comcast.net> [me] > > I wonder how often this is needed. The only occurrences of year() in > > the entire Zope source that I found are in various test routines. [Jim] > These methods and others are used a lot in presentation code, > which tends to be expressed in DTML or ZPT. > > It's not uncommon to select/categorize things by year or month. > > I think most people would find individual date-part methods > a lot more natural than tuples. OK, that explains a lot. For this context I agree, although I think they should probably appear as (computed) attributes rather than methods. Properties seem perfect. > > I imagine > > that once we change strftime() to accept an abstract time object, > > you'll never need to call either timetuple() or year() -- strftime() > > will do it for you. > > Maybe, if I use strftime, but I don't use strftime all that much. Maybe you should. :-) > I can certainly think of cases, even formatting cases (e.g. internationalized > dates), where it's not adequate.
Then a super-strftime() should be invented that *is* enough, rather than fumbling with hand-coded solutions. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Tue Feb 26 21:36:58 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 26 Feb 2002 22:36:58 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > > This makes Latin-1 the right choice: > > > > * Unicode literals already use it today > > But they shouldn't, IMO. I agree. I recommend deprecating this feature and raising a DeprecationWarning if a Unicode literal contains non-ASCII characters but no encoding has been declared. > Sorry, I don't understand what you're trying to say here. Can you > explain this with an example? Why can't we require any program > encoded in more than pure ASCII to have an encoding magic comment? I > guess I don't understand what you mean by "raw binary". With the proposed implementation, the encoding declaration is only used for Unicode literals. In all other places where non-ASCII characters can occur (comments, string literals), those characters are treated as "bytes", i.e. it is not verified that these bytes are meaningful under the declared encoding. Marc's original proposal was to apply the declared encoding to the complete source code, but I objected, claiming that it would make the tokenizer changes more complex, and the resulting tokenizer likely significantly slower (at least if you use the codecs API to perform the decoding). In phase 2, the encoding will apply to all strings.
So it will not be possible to put arbitrary byte sequences in a string literal, at least if the encoding disallows certain byte sequences (like UTF-8, or ASCII). Since this is currently possible, we have a backwards compatibility problem. Regards, Martin From guido@python.org Tue Feb 26 21:40:42 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 16:40:42 -0500 Subject: [Python-Dev] proposal: add basic money type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 22:32:31 +0100." <3C7BFEEF.16BDC177@lemburg.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> <3C7BF117.D7417F7D@lemburg.com> <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net> <3C7BF8C7.DECB7D38@lemburg.com> <200202262116.g1QLGVN20144@pcp742651pcs.reston01.va.comcast.net> <3C7BFEEF.16BDC177@lemburg.com> Message-ID: <200202262140.g1QLehH20305@pcp742651pcs.reston01.va.comcast.net> > Indeed, monetary types solve different problems than decimal > types. Financial applications do have a need for this kind > of implicit error check. But this is easily done by creating a custom class -- which has the advantage that the set of constraints can be specialized to the needs of a specific application. When we add a monetary type to the language we'll never get it right for all apps. OTOH, I think we could get a fixed point type right. How many other languages have a monetary type? What support for money does SQL have? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Feb 26 21:53:55 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 16:53:55 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: Your message of "26 Feb 2002 22:36:58 +0100."
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> Message-ID: <200202262153.g1QLrt220437@pcp742651pcs.reston01.va.comcast.net> > In phase 2, the encoding will apply to all strings. So it will not be > possible to put arbitrary byte sequences in a string literal, at least > if the encoding disallows certain byte sequences (like UTF-8, or > ASCII). Since this is currently possible, we have a backwards > compatibility problem. I would say that any program that currently uses non-ASCII in string literals (whether Unicode or 8-bit literals) is, strictly speaking, undefined. For cases where a specific encoding is used, the solution is easy: add an explicit encoding. Other cases are simply garbage and should use \xDD escapes instead. Maybe an implementation phase 1a should be introduced that warns about the occurrence of non-ASCII characters anywhere in the source code when no encoding is specified.
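The phase 1a check suggested above could look roughly like the following sketch. It is not the PEP 263 implementation: the function names are made up for this example, and the regular expression for the magic comment is an assumption modelled on the `coding: <name>` convention, searched for only in comment lines among the first two lines of the file.

```python
import re

# Assumed shape of the magic comment, e.g. "# -*- coding: latin-1 -*-".
CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def declared_encoding(source):
    """Return the encoding named in the first two lines, or None."""
    for line in source.splitlines()[:2]:
        if line.lstrip().startswith(b"#"):
            match = CODING_RE.search(line)
            if match:
                return match.group(1).decode("ascii")
    return None


def undeclared_non_ascii_lines(source):
    """Return 1-based line numbers that would deserve a warning.

    A line is reported when it contains a byte above 127 and the source
    carries no encoding declaration at all.
    """
    if declared_encoding(source) is not None:
        return []
    return [number
            for number, line in enumerate(source.splitlines(), 1)
            if any(byte > 127 for byte in line)]
```

The check works on raw bytes on purpose: without a declaration there is no encoding under which the file could be decoded reliably, which is the situation the warning is meant to flag.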
--Guido van Rossum (home page: http://www.python.org/~guido/) From jim@zope.com Tue Feb 26 21:50:21 2002 From: jim@zope.com (Jim Fulton) Date: Tue, 26 Feb 2002 16:50:21 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
<3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFCDE.F920078F@zope.com> <200202262137.g1QLbWq20272@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7C031D.4376D05A@zope.com> Guido van Rossum wrote: > > [me] > > > I wonder how often this is needed. The only occurrences of year() in > > > the entire Zope source that I found are in various test routines. > > [Jim] > > These methods and others are used a lot in presentation code, > > which tends to be expressed in DTML or ZPT. > > > > It's not uncommon to select/categorize things by year or month. > > > > I think most people would find individual date-part methods > > a lot more natural than tuples. > > OK, that explains a lot. For this context I agree, although I think > they should probably appear as (computed) attributes rather than > methods. Properties seem perfect. That's fine with me. > > > I imagine > > > that once we change strftime() to accept an abstract time object, > > > you'll never need to call either timetuple() or year() -- strftime() > > > will do it for you. > > > > Maybe, if I use strftime, but I don't use strftime all that much. > > Maybe you should. :-) I do when I can. But it often doesn't meet my needs. > > I can certainly think of even formatting cases (e.g. internationalized > > dates) where it's not adequate. > > Then a super-strftime() should be invented that *is* enough, rather > than fumbling with hand-coded solutions. I think we don't need a one-size-fits-all all-powerful date-time formatting solution. ;) Jim -- Jim Fulton mailto:jim@zope.com Python Powered! 
CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From guido@python.org Tue Feb 26 22:01:59 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 17:01:59 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 16:50:21 EST." <3C7C031D.4376D05A@zope.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFCDE.F920078F@zope.com> <200202262137.g1QLbWq20272@pcp742651pcs.reston01.va.comcast.net> <3C7C031D.4376D05A@zope.com> Message-ID: <200202262201.g1QM1x620487@pcp742651pcs.reston01.va.comcast.net> > I think we don't need a one-size-fits-all all-powerful date-time > formating solution. ;) It's probably impossible to create one, but I think there's also no reason to require people to invent the wheel over and over. I've seen enough broken code attempting to do date/time formatting that I strongly prefer the creation of a few standard solutions that will work for most people, rather than only giving people the low-level bits to work with. Another thing to consider is that for most apps, the choice of the date/time format should be taken out of the hands of the programmer and placed into the hands of the user, through some kind of preference setting. I18n and L10n also strongly suggests to take this route. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From Jack.Jansen@oratrix.com Tue Feb 26 22:04:16 2002 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Tue, 26 Feb 2002 23:04:16 +0100 Subject: [Python-Dev] Adding lots of basic types to Python Message-ID: I can think of a lot more basic types that we could add to Python that would make as much sense as currencies: pixels, points, geometric figures, (audio) samples, images, ... But: aren't we really trying to standardise interfaces? The main benefit of a basic currency type would be that it defines the set of operations allowed on it (add, subtract are fine, divide gives a normal number, multiply isn't allowed) much more than code sharing. Note that I do think standardised interfaces would be a great thing, and if there was a common Python "pixel" interface that would free up quite a lot of my brain cells that are now used for remembering the 4 or 5 different pixel interfaces that I use regularly, but I'm not sure that a standard Python pixel type is the solution. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From guido@python.org Tue Feb 26 22:06:14 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 17:06:14 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 22:26:47 +0100." <3C7BFD97.B69DDDFD@lemburg.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFD97.B69DDDFD@lemburg.com> Message-ID: <200202262206.g1QM6En20549@pcp742651pcs.reston01.va.comcast.net> > FWIW, mxDateTime exposes these values as attributes -- there > is no call overhead. 
Good, I think this is the way to go. (Of course there will be some C-level call overhead if we make these properties.) > > Serious question: what do you tend to do with time values? I imagine > > that once we change strftime() to accept an abstract time object, > > you'll never need to call either timetuple() or year() -- strftime() > > will do it for you. > > Depends on the application space. Database applications > will call .timetuple() very often and use strftime() hardly > ever. What does a database app do with the resulting tuple? --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue Feb 26 22:15:22 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 23:15:22 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <3C7BEC03.E8225CFD@lemburg.com> <079f01c1bf07$11963ee0$ced241d5@hagrid> Message-ID: <3C7C08FA.C6375FD7@lemburg.com> Fredrik Lundh wrote: > > mal wrote: > > > > Doesn't the proposal sort of imply time-zone > > > awareness of some kind? Or does it simply imply > > > UT storage? > > > > I tried that in early version of mxDateTime -- it fails > > badly. > > can you elaborate? First of all, the C lib only supports UTC and local time, so you don't really have a chance of correctly converting a non-local time using a different time zone in either local time or UTC: there simply are no C APIs you could use and the problems which DST and leap seconds introduce are no fun at all (they are fun to read though: figuring out the various DST switch times is an adventure -- just have a look at the C lib's DST files). The next problem is that the C lib only provides APIs for conversion from local time to UTC, but not UTC to local time. There is an API called timegm() for this on some platforms, but it's non-standard. 
As a result, making UTC the default won't allow you to safely represent the datetime value in local time. A third obstacle is typical user assumptions: users simply assume local time and it's hard to tell them otherwise (people have very personal feelings about date and time for some reason...). > > > Does this imply leap second hell, or will we > > > simply be vague about expectations? > > > > The type will store a fixed point in time, so why > > worry about leap seconds (most systems don't support these > > anyway and if they do, the support is usually switched off per > > default) ? > > the updated proposal adds __hash__ and __cmp__, and > the following (optional?) operations: > > deltaobject = timeobject - timeobject > floatobject = float(deltaobject) # fractional seconds > timeobject = timeobject + integerobject > timeobject = timeobject + floatobject > timeobject = timeobject + deltaobject > > note that "deltaobject" can be anything; the abstract type > only says that if you manage to subtract one time object from > another one of the same type, you get some object that you > can 1) convert to a float, and 2) add to another time object. > > vague, but pretty useful. Indeed :-) > > > I'd also like to see simple access methods for year, > > > month, day, hours, minutes, and seconds, with date parts > > > being one based and time parts being zero based. > > > > In the abstract base type ? > > Q. does mxDateTime provide separate accessors for individual > members? Yes, it provides access to these in the form of attributes. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Tue Feb 26 22:18:22 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Tue, 26 Feb 2002 23:18:22 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFD97.B69DDDFD@lemburg.com> <200202262206.g1QM6En20549@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7C09AE.54D16E2D@lemburg.com> Guido van Rossum wrote: > > > FWIW, mxDateTime exposes these values as attributes -- there > > is no call overhead. > > Good, I think this is the way to go. (Of course there will be some > C-level call overhead if we make these properties.) Right. > > > Serious question: what do you tend to do with time values? I imagine > > > that once we change strftime() to accept an abstract time object, > > > you'll never need to call either timetuple() or year() -- strftime() > > > will do it for you. > > > > Depends on the application space. Database applications > > will call .timetuple() very often and use strftime() hardly > > ever. > > What does a database app with the resulting tuple? It puts the values into struct fields for year, month, day, etc. (Databases usually avoid using Unix ticks since these cause the known problems with dates prior to 1970) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim@zope.com Tue Feb 26 22:23:59 2002 From: tim@zope.com (Tim Peters) Date: Tue, 26 Feb 2002 17:23:59 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library Message-ID: [Guido] > ... 
> Another thing to consider is that for most apps, the choice of the > date/time format should be taken out of the hands of the programmer > and placed into the hands of the user, through some kind of preference > setting. I18n and L10n also strongly suggests to take this route. I'm sure nobody wants to admit this , but in sheer numbers, nobody has more experience with this stuff than Microsoft. If you sit at your Windows box and go to Start -> Settings -> Control Panel -> Regional Settings, you'll get a tabbed dialog for specifying the format of number, currency, time, and date displays. A Windows app that ignores the settings here is considered to be broken (and rightly so). Idiosyncratic formats for user-visible number/currency/date/time info is going to become an increasingly Bad Idea on other OSes too. From guido@python.org Tue Feb 26 22:26:19 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 17:26:19 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 23:18:22 +0100." <3C7C09AE.54D16E2D@lemburg.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFD97.B69DDDFD@lemburg.com> <200202262206.g1QM6En20549@pcp742651pcs.reston01.va.comcast.net> <3C7C09AE.54D16E2D@lemburg.com> Message-ID: <200202262226.g1QMQKG20726@pcp742651pcs.reston01.va.comcast.net> > > What does a database app with the resulting tuple? > > It puts the values into struct fields for year, month, day, etc. > (Databases usually avoid using Unix ticks since these cause > the known problems with dates prior to 1970) Hm, I thought that databases have their own date/time types? Aren't these used? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Feb 26 22:29:36 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 17:29:36 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 17:23:59 EST." References: Message-ID: <200202262229.g1QMTat20771@pcp742651pcs.reston01.va.comcast.net> > Idiosyncratic formats for user-visible number/currency/date/time > info is going to become an increasingly Bad Idea on other OSes too. Oh, so it'll be at least another 10 years before the same wisdom reaches the typical web application... :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue Feb 26 22:36:01 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 23:36:01 +0100 Subject: [Python-Dev] proposal: add basic money type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> <3C7BF117.D7417F7D@lemburg.com> <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net> <3C7BF8C7.DECB7D38@lemburg.com> <200202262116.g1QLGVN20144@pcp742651pcs.reston01.va.comcast.net> <3C7BFEEF.16BDC177@lemburg.com> <200202262140.g1QLehH20305@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7C0DD1.A71FDEB2@lemburg.com> Guido van Rossum wrote: > > > Indeed, monetary types solve different problems than decimal > > types. Financial applications do have a need for these kind > > of implicit error checks. > > But this is easily done by creating a custom class -- which has the > advantage that the set of constraints can be specialized to the needs > of a specific application. When we add a monetary type to the > language we'll never get it right for all apps. 
OTOH, I think we > could get a fixed point type right. True. A real implementation of a good working decimal type with adjustable rounding rules would certainly go a long way and the money type could be built on top of it. > What support for money does SQL have? SQL-92 doesn't have support for it, but some modern database engines do, e.g. MS SQL Server, PostgreSQL (even though it's deprecated there, now), MS Access. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Tue Feb 26 22:38:49 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 26 Feb 2002 23:38:49 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFD97.B69DDDFD@lemburg.com> <200202262206.g1QM6En20549@pcp742651pcs.reston01.va.comcast.net> <3C7C09AE.54D16E2D@lemburg.com> <200202262226.g1QMQKG20726@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7C0E79.AB97C994@lemburg.com> Guido van Rossum wrote: > > > > What does a database app with the resulting tuple? > > > > It puts the values into struct fields for year, month, day, etc. > > (Databases usually avoid using Unix ticks since these cause > > the known problems with dates prior to 1970) > > Hm, I thought that databases have their own date/time types? Aren't > these used? At C level, interfacing is usually done using structs (ISO SQL/CLI defines these). 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From barry@zope.com Tue Feb 26 22:38:28 2002 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 26 Feb 2002 17:38:28 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library References: Message-ID: <15484.3684.503252.531036@anthem.wooz.org> >>>>> "TP" == Tim Peters writes: TP> [Guido] >> ... Another thing to consider is that for most apps, the >> choice of the date/time format should be taken out of the hands >> of the programmer and placed into the hands of the user, >> through some kind of preference setting. I18n and L10n also >> strongly suggests to take this route. TP> I'm sure nobody wants to admit this , but in sheer TP> numbers, nobody has more experience with this stuff than TP> Microsoft. If you sit at your Windows box and go to Start -> TP> Settings -> Control Panel -> Regional Settings, you'll get a TP> tabbed dialog for specifying the format of number, currency, TP> time, and date displays. A Windows app that ignores the TP> settings here is considered to be broken (and rightly so). TP> Idiosyncratic formats for user-visible TP> number/currency/date/time info is going to become an TP> increasingly Bad Idea on other OSes too. I've not been following this thread at all, so apologies if this has been brought up already. The localization context should not (always) be taken from the user environment. In systems like web-based services, the context will instead be relative to the person/entity making the remote request, so we have to be able to explicitly specify the localization context, or at least query, modify, and restore some global context. 
-Barry From tim@zope.com Tue Feb 26 22:39:59 2002 From: tim@zope.com (Tim Peters) Date: Tue, 26 Feb 2002 17:39:59 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: <200202262229.g1QMTat20771@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Tim] > Idiosyncratic formats for user-visible number/currency/date/time > info is going to become an increasingly Bad Idea on other OSes too. [Guido] > Oh, so it'll be at least another 10 years before the same wisdom > reaches the typical web application... :-) Optimist. From guido@python.org Tue Feb 26 22:45:30 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 17:45:30 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 17:38:28 EST." <15484.3684.503252.531036@anthem.wooz.org> References: <15484.3684.503252.531036@anthem.wooz.org> Message-ID: <200202262245.g1QMjUf23970@pcp742651pcs.reston01.va.comcast.net> > I've not been following this thread at all, so apologies if this has > been brought up already. No, but unclear if it's relevant. > The localization context should not (always) be taken from the user > environment. In systems like web-based services, the context will > instead be relative to the person/entity making the remote request, so > we have to be able to explicitly specify the localization context, or > at least query, modify, and restore some global context. Sure. So the interface may be different. The main argument (that you shouldn't be using t.year() to format dates) remains the same. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Tue Feb 26 22:48:24 2002 From: barry@zope.com (Barry A. 
Warsaw) Date: Tue, 26 Feb 2002 17:48:24 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library References: <15484.3684.503252.531036@anthem.wooz.org> <200202262245.g1QMjUf23970@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15484.4280.89844.484487@anthem.wooz.org> >>>>> "GvR" == Guido van Rossum writes: >> The localization context should not (always) be taken from the >> user environment. In systems like web-based services, the >> context will instead be relative to the person/entity making >> the remote request, so we have to be able to explicitly specify >> the localization context, or at least query, modify, and >> restore some global context. GvR> Sure. So the interface may be different. The main argument GvR> (that you shouldn't be using t.year() to format dates) GvR> remains the same. Doesn't Java have separate formatting objects? You decide which format object you need based on the localization context, then you pass in the timestamp/date/money/whatever thingie and the format object knows how to render that data representation in the appropriate localization. makes-sense-to-me-ly y'rs, -Barry From pedroni@inf.ethz.ch Tue Feb 26 22:36:28 2002 From: pedroni@inf.ethz.ch (Samuele Pedroni) Date: Tue, 26 Feb 2002 23:36:28 +0100 Subject: [Python-Dev] Mmh, unrelated "silly" test suite patch Message-ID: <081501c1bf16$0841cbc0$6d94fea9@newmexico> Hi, mmh, kind of a break: strange as it may seem (CVS homework results), in 1994 Guido added some tests for the tuple built-in to the test suite, but the suite has never grown explicit tests for the basic behavior of list(SEQ). 
Maybe there is some esoteric reason for this, but anyway I have just posted a patch to SF https://sourceforge.net/tracker/index.php?func=detail&aid=523169&group_id=5470& atid=305470 with tests for list() and an amended test for tuple(), in particular they try to check: list2 = list(LIST) list2 is not LIST and list2 == LIST tuple(TUPLE) is TUPLE also the documented behavior. Yup, there is no hurry to check this in, I have written this because yesterday the lacking test has burned "badly" your Jython brothers [we don't have a time machine on our side :(] and for the benefit of the future generations of Python re-implementers. The new tests pass Python 2.2. hoping-that-doing-this-so-late-in-game-will-not-cause-some-kind-of-karmic- unbalance-also-because-I-have-just-bumped-the-#-of-patches-from-sacred-128- to-129-ly y'rs - Samuele. From guido@python.org Tue Feb 26 22:53:07 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 17:53:07 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 23:38:49 +0100." <3C7C0E79.AB97C994@lemburg.com> References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFD97.B69DDDFD@lemburg.com> <200202262206.g1QM6En20549@pcp742651pcs.reston01.va.comcast.net> <3C7C09AE.54D16E2D@lemburg.com> <200202262226.g1QMQKG20726@pcp742651pcs.reston01.va.comcast.net> <3C7C0E79.AB97C994@lemburg.com> Message-ID: <200202262253.g1QMr7J24039@pcp742651pcs.reston01.va.comcast.net> > > Hm, I thought that databases have their own date/time types? Aren't > > these used? > > At C level, interfacing is usually done using structs (ISO SQL/CLI > defines these). 
Ah, there's another requirement that we (or at least I) almost forgot. There should be an efficient C-level interface for the abstract date/time type. This stuff is hairier than it seems! I think the main tension is between improving upon the Unix time_t type, and improving upon the Unix "struct tm" type. Improving upon time_t could mean to extend the range beyond 1970-2038, and/or extend the precision to milliseconds or microseconds. Improving upon struct tm is hard (it has all the necessary fields and no others), unless you want to add operations (just add methods) or make the representation more compact (several of the fields can be packed in 4-6 bits each). A third dimension might be to provide better date/time arithmetic, but I'm not sure if there's much of a market for that, given all the fuzzy semantics (leap seconds, differences across DST changes, timezones). --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Tue Feb 26 23:00:09 2002 From: skip@pobox.com (Skip Montanaro) Date: Tue, 26 Feb 2002 17:00:09 -0600 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: References: Message-ID: <15484.4985.471940.886541@12-248-41-177.client.attbi.com> Tim> .... If you sit at your Windows box and go to Start -> Settings -> Tim> Control Panel -> Regional Settings, you'll get a tabbed dialog for Tim> specifying the format of number, currency, time, and date displays. I'm gonna go ever so slightly out on a limb here and make a wild-ass guess here that Apple probably had this functionality before Microsoft and that like on Windows, all well-behaved Mac applications had to use the user's settings. Maybe this abstract time object's strftime method (or time.strftime) should grow format specifiers for the user-specified date and time... Tim> Idiosyncratic formats for user-visible number/currency/date/time Tim> info is going to become an increasingly Bad Idea on other OSes too. 
Of course, neither Apple's nor Microsoft's efforts in this area will help the poor person trying to emit a dynamic web page containing "correctly" formatted dates. You still have to guess or just fall back to something most everyone can deduce. can-we-squeeze-it-into-http-2.0?-ly, y'rs, Skip From tim@zope.com Tue Feb 26 22:59:41 2002 From: tim@zope.com (Tim Peters) Date: Tue, 26 Feb 2002 17:59:41 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: <15484.3684.503252.531036@anthem.wooz.org> Message-ID: [Barry] > I've not been following this thread at all, so apologies if this has > been brought up already. > > The localization context should not (always) be taken from the user > environment. In systems like web-based services, the context will > instead be relative to the person/entity making the remote request, so > we have to be able to explicitly specify the localization context, or > at least query, modify, and restore some global context. Like I said, Microsoft has more experience with this stuff than anyone. Check out http://www.trigeminal.com/ Provided you're using IE, it should come up in the right language for you. Try viewing it in different languages, and note, e.g., how date formats change automatically. Here's an article on how they do it: http://msdn.microsoft.com/msdnmag/issues/0700/localize/localize.asp From guido@python.org Tue Feb 26 23:04:23 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 26 Feb 2002 18:04:23 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: Your message of "Tue, 26 Feb 2002 17:48:24 EST." <15484.4280.89844.484487@anthem.wooz.org> References: <15484.3684.503252.531036@anthem.wooz.org> <200202262245.g1QMjUf23970@pcp742651pcs.reston01.va.comcast.net> <15484.4280.89844.484487@anthem.wooz.org> Message-ID: <200202262304.g1QN4Of24123@pcp742651pcs.reston01.va.comcast.net> > Doesn't Java have separate formatting objects? 
You decide which > format object you need based on the localization context, then you > pass in the timestamp/date/money/whatever thingie and the format > object knows how to render that data representation in the > appropriate localization. Yes, that's probably a good way to do it in general. There may be global functions that are initialized based on getenv(), and Zope may provide a different set of global functions that are initialized based on the preferences of the client. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim@zope.com Tue Feb 26 23:08:10 2002 From: tim@zope.com (Tim Peters) Date: Tue, 26 Feb 2002 18:08:10 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: <15484.4985.471940.886541@12-248-41-177.client.attbi.com> Message-ID: [Skip Montanaro] > ... > Of course, neither Apple's nor Microsoft's efforts in this area will help > the poor person trying to emit a dynamic web page containing "correctly" > formatted dates. You still have to guess or just fall back to something > most everyone can deduce. No, MS languages support APIs for high-level things like FormatCurrency(), and some have dedicated Currency, Time and Date types. You don't *get* low-level control under these things, and the high-level APIs automatically respect user preferences. So, for example, if you're generating dynamic content via a VBScript program, it's easy provided you stick to VBScript's high-level date format functions when you pump out a date: you can't not respect the user's date format preferences then. See the links I posted just before this to see how a server can suck down the user's preferences, at least to a first approximation (the default formats for the user's primary language). 
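The "separate formatting objects" idea discussed in this part of the thread (pick a format object for the localization context, then hand it the time value) can be sketched roughly as below. This is only an illustration under stated assumptions: DateFormatter, FORMATTERS, formatter_for and the context names are all hypothetical, the patterns are hard-coded rather than taken from any user's settings, and nothing here is an API proposed in the thread. Only time.strftime() is a real library call.

```python
import time

class DateFormatter:
    """Hypothetical format object: carries the date pattern of one
    localization context and knows how to render a time tuple with it."""

    def __init__(self, pattern):
        self.pattern = pattern

    def format(self, timetuple):
        # time.strftime() accepts a 9-item time tuple (or struct_time).
        return time.strftime(self.pattern, timetuple)

# Hypothetical table of contexts. A real application would fill this from
# user preferences or from data about the remote client, not hard-code it.
FORMATTERS = {
    "en_US": DateFormatter("%m/%d/%Y"),
    "de_DE": DateFormatter("%d.%m.%Y"),
    "iso": DateFormatter("%Y-%m-%d"),
}

def formatter_for(context, default="iso"):
    """Choose the format object for an explicitly specified context."""
    return FORMATTERS.get(context, FORMATTERS[default])

# 26 Feb 2002, 18:08:10, a Tuesday (weekday 1), day 57 of the year.
stamp = (2002, 2, 26, 18, 8, 10, 1, 57, 0)

for context in ("en_US", "de_DE", "unknown"):
    print(context, formatter_for(context).format(stamp))
```

The point of the design is that the calling code never assembles a date string from year/month/day parts itself; it only chooses a context, so the same code path serves a local user and a remote client with different preferences.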
From DavidA@ActiveState.com Tue Feb 26 23:37:37 2002 From: DavidA@ActiveState.com (David Ascher) Date: Tue, 26 Feb 2002 15:37:37 -0800 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library References: Message-ID: <3C7C1C41.AB594D72@activestate.com> Tim Peters wrote: > Like I said, Microsoft has more experience with this stuff than > anyone. Check out > > http://www.trigeminal.com/ > > Provided you're using IE, it should come up in the right language for you. > Try viewing it in different languages, and note, e.g., how date formats > change automatically. Except that: Dernière mise-à-jour: 02/08/02 05:33 AM is not correct. The AM/PM distinction is one which no right-thinking folks bother with. Then again, it looks like that particular line is not using their fancy logic, since the date is in the future if you follow the pattern set by other dates. Also note: Des problemes avec ce site? SVP, contacter le webmaster avec vos commentaires, questions ou suggestions (si possible, en anglais). which is pretty funny =). --david From tim@zope.com Wed Feb 27 00:06:52 2002 From: tim@zope.com (Tim Peters) Date: Tue, 26 Feb 2002 19:06:52 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: <3C7C1C41.AB594D72@activestate.com> Message-ID: [David Ascher] > Except that: > > Dernière mise-à-jour: 02/08/02 05:33 AM > > is not correct. Well, if you're not viewing the page in English like God intended, you should be grateful it talked back to you at all. > ... > Also note: > > Des problemes avec ce site? SVP, contacter le webmaster > avec vos commentaires, questions ou suggestions (si possible, en > anglais). > > which is pretty funny =). Maybe to you. Me, I don't know French at all, so I'm delighted to see them produce French that I can understand easily. 
Hell, I even recognize SVP as a suffix of the good old English RSVP -- it figures the French would drop the first and most important letter. the-kind-of-i18n-a-sensitive-american-can-support-ly y'rs - tim From Anthony Baxter Wed Feb 27 00:21:22 2002 From: Anthony Baxter (Anthony Baxter) Date: Wed, 27 Feb 2002 11:21:22 +1100 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Message from "M.-A. Lemburg" of "Tue, 26 Feb 2002 21:11:47 BST." <3C7BEC03.E8225CFD@lemburg.com> Message-ID: <200202270021.g1R0LMr15633@burswood.off.ekorp.com> >>> "M.-A. Lemburg" wrote > I tried that in early version of mxDateTime -- it fails > badly. I switched to the local time assumption very > early in the development. If you must store stuff without timezones, _please_ don't use localtime. localtime is a variable thing (think what happens when daylight savings goes on and off). Storing UTC, or else the local time and enough timezone data to get to UTC reliably, is the only thing that will lead to hugs and puppies. Anthony, who deals with stupid date/times all the time, in billing systems, and in different time zones. -- Anthony Baxter It's never too late to have a happy childhood. From ping@lfw.org Wed Feb 27 00:35:05 2002 From: ping@lfw.org (Ka-Ping Yee) Date: Tue, 26 Feb 2002 18:35:05 -0600 (CST) Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: Message-ID: David Ascher wrote: > Also note: > > Des problemes avec ce site? SVP, contacter le webmaster > avec vos commentaires, questions ou suggestions (si possible, en > anglais). > > which is pretty funny =). Tim Peters wrote: > Maybe to you. Me, I don't know French at all, so I'm delighted to see them > produce French that I can understand easily. In case the humour wasn't clear, here's the translation: Problems with this site? Please contact the webmaster with your comments, questions, or suggestions (if possible, in English). 
-- ?!ng From tim@zope.com Wed Feb 27 00:42:28 2002 From: tim@zope.com (Tim Peters) Date: Tue, 26 Feb 2002 19:42:28 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <3C7B95EB.952037FE@zope.com> Message-ID: [Jim Fulton] > ZODB has a TimeStamp type that uses a 32-bit unsigned integer > to store year, month,, day, hour, and minute in a way that makes it dirt > simple to extract a component. You really think so? It's a mixed-radix scheme: v=((((y-1900)*12+mo-1)*31+d-1)*24+h)*60+m; so requires lots of expensive integer division and remainder operations to pick apart again (the trend in CPUs is to make these relatively more expensive, not less, and e.g. Itanium doesn't even have an integer division instruction). If we had this to do over again, I'd strongly suggest assigning 12 bits to the year, 4 to the month, 5 each to day and hour, and 6 to the minute. The components would then be truly dirt simple and dirt cheap to extract, and we wouldn't even have to bother switching between 0-based and 1-based for the months and days (let 'em stay 1-based). They would still sort and compare correctly in packed format. The only downside I can see is that not pursuing every last drop of potential compression would shrink the dynamic range from 8000+ years to 4000+ years, but we're likely to have much worse problems in Zope by the year 5900 anyway . From tim@zope.com Wed Feb 27 00:51:17 2002 From: tim@zope.com (Tim Peters) Date: Tue, 26 Feb 2002 19:51:17 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: Message-ID: [Ping] > In case the humour wasn't clear, here's the translation: > > Problems with this site? Please contact the webmaster > with your comments, questions, or suggestions (if possible, > in English). Yes, it was clear. If the French could get over their speech impediment of using "avec" when they mean "with", and stuck to webspeak in this style, they *would* be speaking English. 
I think that's even funnier . From nhodgson@bigpond.net.au Wed Feb 27 01:23:28 2002 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Wed, 27 Feb 2002 12:23:28 +1100 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library References: Message-ID: <076601c1bf2d$5c389490$0acc8490@neil> Tim Peters: > Like I said , Microsoft has more experience with this stuff than > anyone. Check out > > http://www.trigeminal.com/ > > Provided you're using IE, it should up come in the right language for you. > Try viewing it in different languages, and note, e.g., how date formats > change automatically. Well, it crushes us Aussies under the jackboot of US cultural imperialism, unless month 31 was recently added to the calendar. This is despite sending a nice Accept-Language: en-au It does work if I change to German. Neil From gmcm@hypernet.com Wed Feb 27 01:30:29 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Tue, 26 Feb 2002 20:30:29 -0500 Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: <15484.4280.89844.484487@anthem.wooz.org> Message-ID: <3C7BF065.20061.3F14F1B8@localhost> On 26 Feb 2002 at 17:48, Barry A. Warsaw wrote: > Doesn't Java have separate formatting objects? You > decide which format object you need based on the > localication context, then you pass in the > timestamp/date/money/whatever thingie and the > format object knowws how to render that data > representation in > the appropriate localization. Yeah. In an optimization gig I had a couple years ago, I had them take out their use of the fancy format objects. It took 7,000 calls to print a date time according to the trace. 
-- Gordon http://www.mcmillan-inc.com/ From greg@cosc.canterbury.ac.nz Wed Feb 27 02:58:12 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 27 Feb 2002 15:58:12 +1300 (NZDT) Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: <3C7C1C41.AB594D72@activestate.com> Message-ID: <200202270258.PAA23722@s454.cosc.canterbury.ac.nz> David Ascher : > Des problemes avec ce site? SVP, contacter le webmaster > avec vos commentaires, questions ou suggestions (si possible, en > anglais). > > which is pretty funny =). Obviously they haven't yet implemented the natural-language- understanding webmaster-bot that adapts to the user's language settings. They could just pipe your query through Babelfish, though... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Wed Feb 27 03:10:05 2002 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 27 Feb 2002 16:10:05 +1300 (NZDT) Subject: [Python-Dev] Re: proposal: add basic time type to the standard library In-Reply-To: Message-ID: <200202270310.QAA23726@s454.cosc.canterbury.ac.nz> Tim Peters : > Hell, I even recognize SVP as a suffix of the good old English RSVP -- > it figures the French would drop the first and most important letter > . RSVP = repondez s'il vous plait = reply, please SVP = s'il vous plait = please (I know, this has nothing to do with timezones. Sorry. I'll stop now.) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From andymac@bullseye.apana.org.au Tue Feb 26 21:44:36 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Wed, 27 Feb 2002 08:44:36 +1100 (EDT) Subject: [Python-Dev] Re: OS/2 EMX changes to dynload_shlib.c In-Reply-To: Message-ID: On Tue, 26 Feb 2002, Guido Van Rossum wrote: > Given the number of OS/2 EMX specific changes to > dynload_shlib.c, wouldn't it be better to create a > separate dynload_os2.c? Its just occurred to me that you might in fact be referring to the changes to import.c, which were in the same commit (& thus the same checkin message) and were extensive. I did try to make it clear in the checkin message that changes to multiple files were committed. It seems to be de rigeur to commit a single file at a time; something I hadn't appreciated and don't remember being advised about. I will follow this practive from now on. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From andymac@bullseye.apana.org.au Tue Feb 26 21:06:47 2002 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Wed, 27 Feb 2002 08:06:47 +1100 (EDT) Subject: [Python-Dev] Re: OS/2 EMX changes to dynload_shlib.c In-Reply-To: Message-ID: On Tue, 26 Feb 2002, Guido Van Rossum wrote: > [I sent this to python-dev, but my copy to you > bounced; I'm not sure if you're on python-dev yet.] Have been for some time. > Given the number of OS/2 EMX specific changes to > dynload_shlib.c, wouldn't it be better to create a > separate dynload_os2.c? There is already a dynload_os2.c for the VACPP port, which implements a module loader via direct OS/2 API calls. Because the EMX stuff that I picked up (a 1.5.2 port by Andrew Zobolotny) used an emulation of dlopen() (PC/os2emx/dlfcn.c) I carried that over. 
I hadn't considered the changes to dynload_shlib.c that significant. however I'll look into whether I can adapt the EMX port to use the VACPP module loader in dynload_os2.c. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From tim_one@email.msn.com Wed Feb 27 06:06:28 2002 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 27 Feb 2002 01:06:28 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <200202262253.g1QMr7J24039@pcp742651pcs.reston01.va.comcast.net> Message-ID: [Guido] > ... > This stuff is hairier than it seems! You're just getting your toes wet: it's impossible to make any two of {astronomers, businessfolk, Jim} happy at the same time, even if they're all American and live in the same city. > I think the main tension is between improving upon the Unix time_t type, > and improving upon the Unix "struct tm" type. Improving upon time_t > could mean to extend the range beyond 1970-2038, and/or extend the > precision to milliseconds or microseconds. > > Improving upon struct tm is hard (it has all the necessary fields and no > others), unless you want to add operations (just add methods) or make > the representation more compact (several of the fields can be packed in > 4-6 bits each). I'm suprised you say "all the necessary fields", because a tm contains no info about timezone. C99 introduces a struct tmx that does. The initial segment of a struct tmx must be identical to a struct tm, but the meaning of tmx.tm_isdst differs from tm.tm_isdst. tmx.tm_isdst is the positive number of minutes of offset if Daylight Saving Time is in effect, zero if Daylight Saving Time is not in effect, and -1 if the information is not available. 
Then it adds some fields not present in a struct tm: int tm_version; // version number int tm_zone; // time zone offset in minutes from UTC [-1439, +1439] int tm_leapsecs; // number of leap seconds applied void *tm_ext; // extension block size_t tm_extlen;// size of the extension block The existence of tm_version, tm_ext and tm_extlen can be fairly viewed as a committee's inability to say "no" . > A third dimension might be to provide better date/time arithmetic, but > I'm not sure if there's much of a market for that, given all the fuzzy > semantics (leap seconds, differences across DST changes, timezones). I don't think we can get off that easy. Time computation is critical for businesses and astronomers, and leap seconds etc are a PITA independent of time computations. Time computations seem to me to be the easiest of all, provided we've already "done something" intelligible about the rest: any calculation boils down to factoring away leap seconds etc in conversion to a canonical form, doing the computing there, then injecting leap seconds etc back in to the result when converting out of canonical form again. The ECMAScript std (nee Javascript) has, I think, a good example of a usable facility that refused to get mired down in impossible details; e.g., it flat-out refuses to recognize leap seconds. mxDateTime is similarly sane, but MAL keeps threatening to flirt with insanity . BTW, I doubt there'd be any discussion of leap seconds in the C std if some astronomers hadn't been early Unix users. It's never a net win in the end to try to make a scientist happy <0.9 wink>. From mal@lemburg.com Wed Feb 27 09:16:18 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Wed, 27 Feb 2002 10:16:18 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7CA3E2.C3705289@lemburg.com> "Martin v. Loewis" wrote: > > Guido van Rossum writes: > > > > This makes Latin-1 the right choice: > > > > > > * Unicode literals already use it today > > > > But they shouldn't, IMO. > > I agree. I recommend to deprecate this feature, and raise a > DeprecationWarning if a Unicode literal contains non-ASCII characters > but no encoding has been declared. > > > Sorry, I don't understand what you're trying to say here. Can you > > explain this with an example? Why can't we require any program > > encoded in more than pure ASCII to have an encoding magic comment? I > > guess I don't understand why you mean by "raw binary". > > With the proposed implementation, the encoding declaration is only > used for Unicode literals. In all other places where non-ASCII > characters can occur (comments, string literals), those characters are > treated as "bytes", i.e. it is not verified that these bytes are > meaningful under the declared encoding. > > Marc's original proposal was to apply the declared encoding to the > complete source code, but I objected claiming that it would make the > tokenizer changes more complex, and the resulting tokenizer likely > significantly slower (atleast if you use the codecs API to perform the > decoding). I don't think that the codecs will significantly slow down overall compilation -- the compiler is not fast to begin with. 
However, changing the base type in the tokenizer and compiler from char* to Py_UNICODE* will be a significant effort and that's why I added two phases to the implementation. The first phase will only touch Unicode literals as proposed by Martin. > In phase 2, the encoding will apply to all strings. So it will not be > possible to put arbitrary byte sequences in a string literal, at least > if the encoding disallows certain byte sequences (like UTF-8, or > ASCII). Since this is currently possible, we have a backwards > compatibility problem. Right, and I believe that a lot of people in European countries write string literals with a Latin-1 encoding in mind. We cannot simply break all that code. The other problem is with comments found in Python source code. In phase 2 these will break as well. So how about this: In phase 1, the tokenizer checks the *complete file* for non-ASCII characters and outputs single warning per file if it doesn't find a coding declaration at the top. Unicode literals continue to use [raw-]unicode-escape as codec. In phase 2, we enforce ASCII as default encoding, i.e. the warning will turn into an error. The [raw-]unicode-escape codec will be extended to also support converting Unicode to Unicode, that is, only handle escape sequences in this case. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Feb 27 09:20:57 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Wed, 27 Feb 2002 10:20:57 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <15483.53993.852170.135298@anthem.wooz.org> <3C7BE70B.74713ED2@lemburg.com> <20020226150143.C16980@unpythonic.dhs.org> Message-ID: <3C7CA4F9.86626985@lemburg.com> jepler@unpythonic.dhs.org wrote: > > On Tue, Feb 26, 2002 at 08:50:35PM +0100, M.-A. Lemburg wrote: > > Does anybody know where XEmacs is moving w/r to this ? (and > > for that matter what about vi, vim, etc. ?) > > I'm working with Vim 6.0, 20001 Sep 14. > > VIM lets you set variables with text similar to > vim:KEY=VALUE:KEY=VALUE:....: > Apparently you would use > vim:fileencoding=sjis: > to select shift-jis encoding. In the vim style, it seems most common to > place this at the bottom of a file, but it can be placed at the top too. > The variable "modelines" controls how many lines at each end of the file is > inspected, with the default being 5. It's documented that the form > vi:set KEY=VALUE: > may be compatible with "some versions of Vi" but does not say which. (I > can't get this to work) > > You can set a list of encodings to attempt when a file is loaded, which > defaults to "ucs-bom,utf-8,latin1". A user who wanted to treate > non-unicode files as shift-jis by default would > :set fileencodings=ucs-bom,utf-8,sjis > You can also load a particular file with the ++enc parameter: > :edit ++enc=koi8-r russian.txt > (I can get this to work, but I have to do it manually to load anything in > an odd character set) > > The emacs line is harmless in vim, but doesn't do anything. 
It's possible > that using :autocmd someone could make vim use the emacs line to set > encoding, but I'm not sure -- setting fileencoding after a file is loaded > seems to perform a translation from the old characterset to the new. So if we use the RE "coding[=:]\s*([\w-]+)" on the first line, we should be able to reach out for the encoding, right ? This RE would then cover both vim and emacs. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Wed Feb 27 09:26:15 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 27 Feb 2002 10:26:15 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <3C7CA3E2.C3705289@lemburg.com> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > In phase 1, the tokenizer checks the *complete file* for > non-ASCII characters and outputs single warning > per file if it doesn't find a coding declaration at > the top. Unicode literals continue to use [raw-]unicode-escape > as codec. Do you suggest that in this phase, the declared encoding is not used for anything except to complain? -1. I think people need to gain something from declaring the encoding; what they gain is that Unicode literals work right (i.e. that they really denote the strings that people see on their screen - given the appropriate editor). Regards, Martin From mal@lemburg.com Wed Feb 27 09:36:05 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Wed, 27 Feb 2002 10:36:05 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202270021.g1R0LMr15633@burswood.off.ekorp.com> Message-ID: <3C7CA885.1E4F1ED8@lemburg.com> Anthony Baxter wrote: > > >>> "M.-A. Lemburg" wrote > > I tried that in early version of mxDateTime -- it fails > > badly. I switched to the local time assumption very > > early in the development. > > If you must store stuff without timezones, _please_ don't use > localtime. localtime is a variable thing (think what happens > when daylight savings goes on and off). You probably didn't notice the "assumption" -- mxDateTime has a few APIs which make assumptions about the value stored in DateTime objects; however, you can just as well store UTC in them. In that case, the APIs making the local time assumption will produce wrong data of course. >>> from mx.DateTime import * >>> now() # local time >>> now().gmtime() # in UTC In the end, it's better to leave the decision what to store in a DateTime object to the programmer. Timezones, DST and leap seconds sometimes have their application and sometimes just cause plain confusion. IMHO, the application should decide what to do about them and manage the data storage aspects of its decision. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From Jack.Jansen@oratrix.com Wed Feb 27 09:38:28 2002 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Wed, 27 Feb 2002 10:38:28 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <3C7CA3E2.C3705289@lemburg.com> Message-ID: On Wednesday, February 27, 2002, at 10:16 , M.-A. Lemburg wrote: > In phase 1, the tokenizer checks the *complete file* for > non-ASCII characters and outputs single warning > per file if it doesn't find a coding declaration at > the top. 
Unicode literals continue to use [raw-]unicode-escape > as codec. > > In phase 2, we enforce ASCII as default encoding, i.e. > the warning will turn into an error. The [raw-]unicode-escape > codec will be extended to also support converting Unicode > to Unicode, that is, only handle escape sequences in this > case. +1 -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From mal@lemburg.com Wed Feb 27 09:48:25 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 10:48:25 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> Message-ID: <3C7CAB69.FF421A6E@lemburg.com> "Martin v. Loewis" wrote: > > "M.-A. Lemburg" writes: > > > In phase 1, the tokenizer checks the *complete file* for > > non-ASCII characters and outputs single warning > > per file if it doesn't find a coding declaration at > > the top. Unicode literals continue to use [raw-]unicode-escape > > as codec. > > Do you suggest that in this phase, the declared encoding is not used > for anything except to complain? No. This is just an extra step on top of what is proposed in the PEP to make people aware of the problem in phase 1. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Feb 27 09:56:45 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Wed, 27 Feb 2002 10:56:45 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> Message-ID: <3C7CAD5D.6692F44@lemburg.com> I just got a private response about the proposal from Atsuo Ishimoto, Japan. They use two different encoding in day-to-day life (one for windows, one for unix) and have their complete tool chain setup to auto-convert all files between the two environments. Recognizing the magic comment would pose a problem for them, since their tools assume conversion to the PC's locale setting. He proposed to make the interpreters default encoding the default for source files which don't specify an encoding. That is ASCII on all standard Python installations and different encodings on tweaked installations. He also told me that they put raw Shift-JIS and EUC-JP into Python literal strings -- just like Europeans do with Latin-1. Wouldn't his suggestion be a good compromise for phase 2 ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Feb 27 10:07:15 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 11:07:15 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: Message-ID: <3C7CAFD3.60B32168@lemburg.com> Tim Peters wrote: > > [M.-A. Lemburg] > > Jack had the same question. The simple answer is: we need this > > in order to maintain backward compatibility when we move to > > phase two of the implementation. 
> > > > Here's the longer one: > > > > ASCII is the standard encoding for Python keywords and identifiers. > > There is no standard source code encoding for string literals. > > But there is: > > Python uses the 7-bit ASCII character set for program text and > string literals. 8-bit characters may be used in string literals > and comments but their interpretation is platform dependent; the > proper way to insert 8-bit characters in string literals is by > using octal or hexadecimal escape sequences. > > The Ref Man has said "7-bit ASCII" for both "program text and string > literals" for a long time. The formal grammar in the Ref Man agrees with > this (including the formal grammar for Unicode literals). It's an > historical accident that the tokenizer happened to use C isalpha() to > "enforce" this for identifiers, and that C isalpha() happened to grow > locale-dependence while Guido was too drunk with power to notice . It's a fact of life that users don't read reference manuals, but simply write programs and feel good if they happen to work :-) As a result, programs have used string literals in many different encodings for a long time. Changing this situation will take time. The proposal aims at clarifying the situation and to make the transition less painful. > > Unicode literals are interpreted using 'unicode-escape' which > > is an enhanced Latin-1 with escape semantics. > > I'm sure they *do* "act like" Latin-1 on your box, and that identifiers also > act like Latin-1 was in effect on your box. But the Ref Man explicitly says > all that is platform dependent; there's no "backward compatibility" to > preserve here beyond 7-bit ASCII unless you want to preserve that Python > always rely on what C isalpha() says. You tell that to the Russians, Japanese or the Europeans writing Python programs -- it just happens that comments and literals are bound to end up using local encodings. 
Anyway, with the PEP implemented we'll no longer have to restrict ourselves to 7-bit US-ASCII, so all these problems will go away. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Feb 27 11:16:25 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 12:16:25 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <3C7CAFD3.60B32168@lemburg.com> Message-ID: <3C7CC009.F33C4AEE@lemburg.com> I've updated the PEP with the new requirements. http://python.sourceforge.net/peps/pep-0263.html The new scheme for the default encoding now maps the standard procedure for all other conversions in Python which go from strings to Unicode: use the sys.getdefaultencoding(). This happens to be ASCII in all standard installations, but sys admins may change it at their own risk and liking. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@comcast.net Wed Feb 27 08:25:02 2002 From: tim.one@comcast.net (Tim Peters) Date: Wed, 27 Feb 2002 03:25:02 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <3C7BECEC.E1550553@lemburg.com> Message-ID: [M.-A. Lemburg] > Jack had the same question. The simple answer is: we need this > in order to maintain backward compatibility when we move to > phase two of the implementation. > > Here's the longer one: > > ASCII is the standard encoding for Python keywords and identifiers. > There is no standard source code encoding for string literals. But there is: Python uses the 7-bit ASCII character set for program text and string literals. 
8-bit characters may be used in string literals and comments but their interpretation is platform dependent; the proper way to insert 8-bit characters in string literals is by using octal or hexadecimal escape sequences. The Ref Man has said "7-bit ASCII" for both "program text and string literals" for a long time. The formal grammar in the Ref Man agrees with this (including the formal grammar for Unicode literals). It's an historical accident that the tokenizer happened to use C isalpha() to "enforce" this for identifiers, and that C isalpha() happened to grow locale-dependence while Guido was too drunk with power to notice . > Unicode literals are interpreted using 'unicode-escape' which > is an enhanced Latin-1 with escape semantics. I'm sure they *do* "act like" Latin-1 on your box, and that identifiers also act like Latin-1 was in effect on your box. But the Ref Man explicitly says all that is platform dependent; there's no "backward compatibility" to preserve here beyond 7-bit ASCII unless you want to preserve that Python always rely on what C isalpha() says. From mal@lemburg.com Wed Feb 27 11:38:39 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 12:38:39 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: Message-ID: <3C7CC53F.DF13D8D7@lemburg.com> Tim Peters wrote: > > The ECMAScript std (nee Javascript) has, I think, a good example of a usable > facility that refused to get mired down in impossible details; e.g., it > flat-out refuses to recognize leap seconds. mxDateTime is similarly sane, > but MAL keeps threatening to flirt with insanity . FYI, mxDateTime test the C lib for leap second support; if leap seconds are used, then it has to support these too in conversions from and to Unix ticks. > BTW, I doubt there'd be any discussion of leap seconds in the C std if some > astronomers hadn't been early Unix users. It's never a net win in the end > to try to make a scientist happy <0.9 wink>. 
What's strange about leap seconds is that they don't fit well with the idea of counting seconds since some fixed point in history. They are only useful for conversions from this count to a broken down date and time representation.... time simply doesn't leap. From a comment in mxDateTime: /* This function checks whether the system uses the POSIX time_t rules (which do not support leap seconds) or a time package with leap second support enabled. Return 1 if it uses POSIX time_t values, 0 otherwise. POSIX: 1986-12-31 23:59:59 UTC == 536457599 With leap seconds: == 536457612 (since there were 13 leap seconds in the years 1972-1985 according to the tz package available from ftp://elsie.nci.nih.gov/pub/) */ -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From andy@reportlab.com Wed Feb 27 12:08:22 2002 From: andy@reportlab.com (Andy Robinson) Date: Wed, 27 Feb 2002 12:08:22 -0000 Subject: [Python-Dev] RE: Python-Dev digest, Vol 1 #1917 - 14 msgs In-Reply-To: Message-ID: > > I propose adding an "abstract" money base type to the standard > > library, to be subclassed by real money/decimal implementations. > > Why do we need this? I guess that would be Question #1... > > --Guido van Rossum (home page: http://www.python.org/~guido/) I can think of 3 reasons; I've seen all these occur in real life. Reason 1: Currency safety. Having a special type can rule out subtle programming errors. Imagine this: >> x = Money(3, "USD") >> y = Money(4.5, "NLG") >> z = x + y TypeError: can't add different currencies Likewise, if you add or subtract money you get money; if you divide money in the same currency you get a float; and just about any other operation might be an error. IMHO a basic type should just rule out operations; subclasses could do clever conversions etc.
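The currency-safety rules described above (money plus or minus money gives money, money divided by money in the same currency gives a plain number, mixed currencies are an error) could look roughly like the following. This is only a minimal sketch: the Money class, its attribute names and its error messages are hypothetical and not taken from any existing library, and amounts are plain Python numbers rather than fixed decimals.

```python
class Money:
    """Hypothetical currency-safe amount (illustration only, not a real library type)."""

    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def _check(self, other):
        # Rule out anything that is not Money in the same currency.
        if not isinstance(other, Money):
            raise TypeError("can only combine Money with Money")
        if other.currency != self.currency:
            raise TypeError("can't combine different currencies")

    def __add__(self, other):
        self._check(other)
        return Money(self.amount + other.amount, self.currency)

    def __sub__(self, other):
        self._check(other)
        return Money(self.amount - other.amount, self.currency)

    def __truediv__(self, other):
        # Money / Money in the same currency is a dimensionless ratio.
        self._check(other)
        return self.amount / other.amount

    def __repr__(self):
        return "Money(%r, %r)" % (self.amount, self.currency)
```

With this sketch, Money(3, "USD") + Money(4.5, "NLG") raises TypeError, while any operation the class does not define (multiplying two amounts, for instance) fails as well, which is the "basic type just rules out operations" behaviour; conversions would be left to subclasses.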
(Does anyone need Euro triagulation rules in the Python standard library?) Reason 2: fixed decimals SQL databases and AS400s have fixed decimal data types and can do math on thousands or millions of numeric fields at C-like speeds. There would be a (very small) market for a type that could do this. Reason (3): speed If I went for a Python "money class" with smart behaviour, I'd get a sizable speed hit compared to floats. Let's say I want to average a time series of 1000 bond prices; it will be faster on floats than on Python classes. IMHO all these are best served by an extension package not in the core language - but having a common base for them to inherit from would get a thumbs-up from me. Best Regards, Andy Robinson From guido@python.org Wed Feb 27 12:40:01 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 27 Feb 2002 07:40:01 -0500 Subject: [Python-Dev] Re: OS/2 EMX changes to dynload_shlib.c In-Reply-To: Your message of "Wed, 27 Feb 2002 08:44:36 +1100." References: Message-ID: <200202271240.g1RCe1x25304@pcp742651pcs.reston01.va.comcast.net> > On Tue, 26 Feb 2002, Guido Van Rossum wrote: > > > Given the number of OS/2 EMX specific changes to > > dynload_shlib.c, wouldn't it be better to create a > > separate dynload_os2.c? > > Its just occurred to me that you might in fact be referring to the changes > to import.c, which were in the same commit (& thus the same checkin > message) and were extensive. I did try to make it clear in the checkin > message that changes to multiple files were committed. > > It seems to be de rigeur to commit a single file at a time; something I > hadn't appreciated and don't remember being advised about. I will follow > this practive from now on. No need -- you're right, I mistook the diff, but that's purely my fault. Multi-file commits are quite common when they're related. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Feb 27 12:52:42 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 27 Feb 2002 07:52:42 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: Your message of "Wed, 27 Feb 2002 10:16:18 +0100." <3C7CA3E2.C3705289@lemburg.com> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> Message-ID: <200202271252.g1RCqgG25420@pcp742651pcs.reston01.va.comcast.net> > So how about this: > > In phase 1, the tokenizer checks the *complete file* for > non-ASCII characters and outputs single warning > per file if it doesn't find a coding declaration at > the top. Unicode literals continue to use [raw-]unicode-escape > as codec. > > In phase 2, we enforce ASCII as default encoding, i.e. > the warning will turn into an error. The [raw-]unicode-escape > codec will be extended to also support converting Unicode > to Unicode, that is, only handle escape sequences in this > case. +1. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Feb 27 12:57:11 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 27 Feb 2002 07:57:11 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: Your message of "Wed, 27 Feb 2002 10:56:45 +0100." 
<3C7CAD5D.6692F44@lemburg.com> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> Message-ID: <200202271257.g1RCvB525447@pcp742651pcs.reston01.va.comcast.net> > I just got a private response about the proposal from Atsuo Ishimoto, > Japan. They use two different encoding in day-to-day life (one for > windows, one for unix) and have their complete tool chain setup > to auto-convert all files between the two environments. > > Recognizing the magic comment would pose a problem for them, > since their tools assume conversion to the PC's locale setting. > > He proposed to make the interpreters default encoding the default > for source files which don't specify an encoding. That is > ASCII on all standard Python installations and different > encodings on tweaked installations. > > He also told me that they put raw Shift-JIS and EUC-JP > into Python literal strings -- just like Europeans do > with Latin-1. > > Wouldn't his suggestion be a good compromise for phase 2 ? I'm OK with a way to change the default to something locale-specific, as long as there's also a way to make the default strict ASCII (for export). Maybe python -A could force the default encoding to be ASCII even if the locale specifies something different. (I'd still *prefer* it the other way around, where you have to specify an explicit option to make the default equal to the locale rather than ASCII, but I can see the other side. Sigh.) 
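
The fallback rule being discussed here (use the declared encoding if the
file has one, otherwise fall back to a default that is strict ASCII unless
someone deliberately changed it) could be sketched roughly as follows. This
is only an illustration: the function names are hypothetical, the pattern is
the "coding[=:]" regular expression floated elsewhere in this thread, and
nothing here is the PEP's actual implementation.

```python
import re

# Declaration pattern suggested elsewhere in this thread (an assumption
# here, not the final PEP 263 wording).
CODING_RE = re.compile(rb"coding[=:]\s*([\w-]+)")


def pick_encoding(source, default="ascii"):
    """Return the encoding name to use for `source` (a bytes object).

    Hypothetical helper: looks for a declaration in the first two lines
    and otherwise returns `default`.
    """
    for line in source.splitlines()[:2]:
        match = CODING_RE.search(line)
        if match:
            return match.group(1).decode("ascii")
    return default


def decode_source(source, default="ascii"):
    """Decode `source`; with the ASCII default, 8-bit bytes are an error."""
    return source.decode(pick_encoding(source, default))
```

With the default left at "ascii", a file containing raw Latin-1 or
Shift-JIS bytes and no declaration fails to decode, which is the strict
behaviour wanted for export; passing a different default models a tweaked
installation.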
--Guido van Rossum (home page: http://www.python.org/~guido/) From jim@zope.com Wed Feb 27 13:01:43 2002 From: jim@zope.com (Jim Fulton) Date: Wed, 27 Feb 2002 08:01:43 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library References: Message-ID: <3C7CD8B7.3E9A89A3@zope.com> Tim Peters wrote: > > [Jim Fulton] > > ZODB has a TimeStamp type that uses a 32-bit unsigned integer > > to store year, month,, day, hour, and minute in a way that makes it dirt > > simple to extract a component. > > You really think so? It's a mixed-radix scheme: > > v=((((y-1900)*12+mo-1)*31+d-1)*24+h)*60+m; > > so requires lots of expensive integer division and remainder operations to > pick apart again (the trend in CPUs is to make these relatively more > expensive, not less, and e.g. Itanium doesn't even have an integer division > instruction). Compared to storing date-times as offsets from an epoch, this is much simpler and cheaper. > If we had this to do over again, I'd strongly suggest assigning 12 bits to > the year, 4 to the month, 5 each to day and hour, and 6 to the minute. The > components would then be truly dirt simple and dirt cheap to extract, and we > wouldn't even have to bother switching between 0-based and 1-based for the > months and days (let 'em stay 1-based). They would still sort and compare > correctly in packed format. The only downside I can see is that not > pursuing every last drop of potential compression would shrink the dynamic > range from 8000+ years to 4000+ years, but we're likely to have much worse > problems in Zope by the year 5900 anyway . Sounds good to me. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! 
CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From guido@python.org Wed Feb 27 13:07:15 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 27 Feb 2002 08:07:15 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Wed, 27 Feb 2002 12:38:39 +0100." <3C7CC53F.DF13D8D7@lemburg.com> References: <3C7CC53F.DF13D8D7@lemburg.com> Message-ID: <200202271307.g1RD7FC25501@pcp742651pcs.reston01.va.comcast.net> > FYI, mxDateTime test the C lib for leap second support; if leap > seconds are used, then it has to support these too in conversions > from and to Unix ticks. Since AFAIK POSIX doesn't admit the existence of leap seconds, how do you ask the C library for leap seconds? > > BTW, I doubt there'd be any discussion of leap seconds in the C > > std if some astronomers hadn't been early Unix users. It's never > > a net win in the end to try to make a scientist happy <0.9 wink>. Yeah, we learned that the hard way by adding complex numbers. :-) > What strange about leap seconds is that they don't fit well with > the idea of counting seconds since some fixed point in history. > They are only useful for conversions from this count to a broken > down date and time representation.... time simply doesn't > leap. > > >From a comment in mxDateTime: > /* This function checks whether the system uses the POSIX time_t rules > (which do not support leap seconds) or a time package with leap > second support enabled. Return 1 if it uses POSIX time_t values, 0 > otherwise. > > POSIX: 1986-12-31 23:59:59 UTC == 536457599 > > With leap seconds: == 536457612 > > (since there were 13 leap seconds in the years 1972-1985 according > to the tz package available from ftp://elsie.nci.nih.gov/pub/) > > */ I think an important (but so far unvoiced) requirement is that datetime objects can be stored in a database. 
Since the database may be read by systems that may or may not support leap seconds, we should be independent of the leap second support in the C library. As I've said before, we should ignore leap seconds. Even if we end up expressing times deltas as a number of seconds, that should be understood to be calendar seconds and not astronomical seconds. Let the astronomers deal with leap seconds themselves -- they should know how to. BTW, this means that we can't use the C calls mktime(), timegm(), localtime(), and gmtime(), or their Python wrappers in the time module! That's fine by me. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Feb 27 13:27:10 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 27 Feb 2002 08:27:10 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Wed, 27 Feb 2002 01:06:28 EST." References: Message-ID: <200202271327.g1RDRBl25604@pcp742651pcs.reston01.va.comcast.net> [me] > > Improving upon struct tm is hard (it has all the necessary fields and no > > others), unless you want to add operations (just add methods) or make > > the representation more compact (several of the fields can be packed in > > 4-6 bits each). [Tim] > I'm suprised you say "all the necessary fields", because a tm contains no > info about timezone. Oops. My mistake. I thought it had timezone. [tmx details snipped] > > A third dimension might be to provide better date/time arithmetic, but > > I'm not sure if there's much of a market for that, given all the fuzzy > > semantics (leap seconds, differences across DST changes, timezones). > > I don't think we can get off that easy. Time computation is critical for > businesses and astronomers, and leap seconds etc are a PITA independent of > time computations. 
> Time computations seem to me to be the easiest of all,
> provided we've already "done something" intelligible about the rest: any
> calculation boils down to factoring away leap seconds etc in conversion to a
> canonical form, doing the computing there, then injecting leap seconds etc
> back in to the result when converting out of canonical form again.
>
> The ECMAScript std (nee Javascript) has, I think, a good example of a usable
> facility that refused to get mired down in impossible details; e.g., it
> flat-out refuses to recognize leap seconds. mxDateTime is similarly sane,
> but MAL keeps threatening to flirt with insanity .
>
> BTW, I doubt there'd be any discussion of leap seconds in the C std if some
> astronomers hadn't been early Unix users. It's never a net win in the end
> to try to make a scientist happy <0.9 wink>.

I'd be happy to support time computations, provided we keep the leap
seconds out.

I propose a representation that resembles a compressed struct tm (or
tmx), with appropriately-sized bit fields for year, month, day, hour,
minute, second, millisecond, and microsecond, and timezone and DST
info. Since the most likely situation is extraction in local time,
these should be stored as local time with an explicit timezone. (I
don't want to store these things in a database without an explicit
timezone, even if it costs another 12 bit field.) An app extracting
the local time without checking the timezone could be fooled by a time
stored with a different timezone. Do we care?

Time computations are only slightly complex because they have to be
calendar-aware, but at least they don't have to be DST-aware -- they
can just take the timezone offset in minutes and apply it.

The DST info should probably be two bits: one telling whether DST is
in effect at the given time, one telling whether DST is honored in the
given timezone. Maybe it should also allow "missing info" for either.
Details, details.
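
To make the bit-field idea concrete, here is a rough sketch that packs only
year, month, day, hour and minute into 32 bits, using the 12/4/5/5/6 widths
suggested earlier in this thread. The base year of 1900 is borrowed from the
ZODB TimeStamp formula quoted there; the names pack, unpack and FIELDS are
made up for illustration, and seconds, timezone and DST bits are left out.

```python
BASE_YEAR = 1900  # assumption: same base year as the quoted ZODB formula

# (name, bit width, lowest legal value, highest legal value),
# most significant field first so packed values sort chronologically.
FIELDS = (
    ("year", 12, BASE_YEAR, BASE_YEAR + 4095),
    ("month", 4, 1, 12),
    ("day", 5, 1, 31),
    ("hour", 5, 0, 23),
    ("minute", 6, 0, 59),
)


def pack(year, month, day, hour, minute):
    """Pack the five components into one integer below 2**32."""
    packed = 0
    for (name, width, low, high), value in zip(
            FIELDS, (year, month, day, hour, minute)):
        if not low <= value <= high:
            raise ValueError("%s out of range: %r" % (name, value))
        # Only the year is stored as an offset; months and days stay 1-based.
        stored = value - BASE_YEAR if name == "year" else value
        packed = (packed << width) | stored
    return packed


def unpack(packed):
    """Recover (year, month, day, hour, minute) with shifts and masks only."""
    values = []
    for name, width, low, high in reversed(FIELDS):
        stored = packed & ((1 << width) - 1)
        packed >>= width
        values.append(stored + BASE_YEAR if name == "year" else stored)
    return tuple(reversed(values))
```

Extraction needs no division or remainder, and comparing two packed values
as plain integers gives the same order as comparing the dates they encode.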
--Guido van Rossum (home page: http://www.python.org/~guido/) From mwh@python.net Wed Feb 27 13:35:07 2002 From: mwh@python.net (Michael Hudson) Date: 27 Feb 2002 13:35:07 +0000 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include patchlevel.h,2.60.2.1,2.60.2.1.2.1 In-Reply-To: Guido van Rossum's message of "Wed, 27 Feb 2002 08:11:57 -0500" References: <200202271311.g1RDBvW25563@pcp742651pcs.reston01.va.comcast.net> Message-ID: <2madtvkpys.fsf@starship.python.net> Guido van Rossum writes: > > I *think* this is the only place I need to do this. > > I think so too. Good. > > There are also some "(c) 2001"s that should probably be turned into > > "(c) 2001, 2002"s -- should this be done on the trunk too? > > Yes -- can you take care of it? I'm such a sucker. -- > so python will fork if activestate starts polluting it? I find it more relevant to speculate on whether Python would fork if the merpeople start invading our cities riding on the backs of giant king crabs. -- Brian Quinlan, comp.lang.python From gward@python.net Wed Feb 27 13:56:16 2002 From: gward@python.net (Greg Ward) Date: Wed, 27 Feb 2002 08:56:16 -0500 Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus? In-Reply-To: <15482.64790.812638.147714@anthem.wooz.org> References: <3C7AD939.FCFE163D@prescod.net> <200202260214.PAA23584@s454.cosc.canterbury.ac.nz> <15482.64790.812638.147714@anthem.wooz.org> Message-ID: <20020227135616.GA8928@gerg.ca> [Greg Ewing] > I suggest '^', since it does a nice job of suggesting > "inject stuff into this string". We can have both a > prefix form for compile-time interpolation: > > a = ^ "My name is $name" > > and an infix form for run-time interpolation: > > a = "My name is $name" ^ dict [Barry] > I think I suggested using ~ for this at IPC10: > > a = ~'my name is $name' > > for the compile-time interpolation. I don't think it matters much > which operator is chosen (let Guido decide). -1 on all line-noise string modifiers. 
(I just looked at Barry's example and part of my reptilian hindbrain thought it was a regex match. Don't do that to Perl and awk refugees, please!) All existing string modifiers are letters; how about "i" for "interpolation": a = i"my name is $name" Assuming of course that we really do need yet another flavour of strings... Greg -- Greg Ward - programmer-at-large gward@python.net http://starship.python.net/~gward/ Time flies like an arrow; fruit flies like a banana. From mal@lemburg.com Wed Feb 27 14:12:25 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 15:12:25 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <200202271257.g1RCvB525447@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7CE949.8893873D@lemburg.com> Guido van Rossum wrote: > > > I just got a private response about the proposal from Atsuo Ishimoto, > > Japan. They use two different encoding in day-to-day life (one for > > windows, one for unix) and have their complete tool chain setup > > to auto-convert all files between the two environments. > > > > Recognizing the magic comment would pose a problem for them, > > since their tools assume conversion to the PC's locale setting. > > > > He proposed to make the interpreters default encoding the default > > for source files which don't specify an encoding. That is > > ASCII on all standard Python installations and different > > encodings on tweaked installations. > > > > He also told me that they put raw Shift-JIS and EUC-JP > > into Python literal strings -- just like Europeans do > > with Latin-1. 
> > > > Wouldn't his suggestion be a good compromise for phase 2 ? > > I'm OK with a way to change the default to something locale-specific, > as long as there's also a way to make the default strict ASCII (for > export). Maybe python -A could force the default encoding to be ASCII > even if the locale specifies something different. > > (I'd still *prefer* it the other way around, where you have to specify > an explicit option to make the default equal to the locale rather than > ASCII, but I can see the other side. Sigh.) Let's put it this way: the interpreter's default encoding has to be changed explicitly by the sys admin (in sitecustomize.py), so the decision to take e.g. a locale specific default encoding is one which the admin maintaining the installation has to make (with all the consequences that go with it). Per default, the default encoding is ASCII, so I don't think we really need an extra option. Hmm, could be that python -S already implies this, BTW... checking this reveils that even sys.setdefaultencoding() remains available if -S is used. Perhaps we should remove the API with -S too ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Feb 27 14:21:54 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 15:21:54 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <200202271327.g1RDRBl25604@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7CEB82.3035BE20@lemburg.com> The discussion is going astray again: Fredrik proposed an abstract base type, i.e. a type providing only the name and an interface which is defined as convention. I am all for adding such an abstract base type (and others as well, e.g. for numbers, sequences, money, decimal, etc.) 
with minimal interfaces, but not for fixing a complex interface on top of these. What you are currently discussing is heading in the direction of imlementing one or more time subclasses. That's two steps ahead of what Fredrik was proposing. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Feb 27 14:33:00 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 15:33:00 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <3C7CC53F.DF13D8D7@lemburg.com> <200202271307.g1RD7FC25501@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7CEE1C.D6EF5C4B@lemburg.com> Guido van Rossum wrote: > > > FYI, mxDateTime test the C lib for leap second support; if leap > > seconds are used, then it has to support these too in conversions > > from and to Unix ticks. > > Since AFAIK POSIX doesn't admit the existence of leap seconds, how do > you ask the C library for leap seconds? See below (the quoted C comment). > > > BTW, I doubt there'd be any discussion of leap seconds in the C > > > std if some astronomers hadn't been early Unix users. It's never > > > a net win in the end to try to make a scientist happy <0.9 wink>. > > Yeah, we learned that the hard way by adding complex numbers. :-) > > > What strange about leap seconds is that they don't fit well with > > the idea of counting seconds since some fixed point in history. > > They are only useful for conversions from this count to a broken > > down date and time representation.... time simply doesn't > > leap. > > > > >From a comment in mxDateTime: > > /* This function checks whether the system uses the POSIX time_t rules > > (which do not support leap seconds) or a time package with leap > > second support enabled. Return 1 if it uses POSIX time_t values, 0 > > otherwise. 
> > > > POSIX: 1986-12-31 23:59:59 UTC == 536457599 > > > > With leap seconds: == 536457612 > > > > (since there were 13 leap seconds in the years 1972-1985 according > > to the tz package available from ftp://elsie.nci.nih.gov/pub/) > > > > */ > > I think an important (but so far unvoiced) requirement is that > datetime objects can be stored in a database. Since the database may > be read by systems that may or may not support leap seconds, ... SQL databases don't deal with leap seconds. They store the broken down value (in some way) without time zone information and that's it, fortunately :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From Jack.Jansen@oratrix.com Wed Feb 27 14:40:43 2002 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Wed, 27 Feb 2002 15:40:43 +0100 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <3C7CEB82.3035BE20@lemburg.com> Message-ID: On Wednesday, February 27, 2002, at 03:21 , M.-A. Lemburg wrote: > The discussion is going astray again: Fredrik proposed an abstract > base type, i.e. a type providing only the name and an interface > which is defined as convention. > > I am all for adding such an abstract base type (and others > as well, e.g. for numbers, sequences, money, decimal, etc.) > with minimal interfaces, but not for fixing a complex interface > on top of these. Oops, I had missed that bit as well, that adding an *abstract* base type was the intention. I'm all for that as well. 
-- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From guido@python.org Wed Feb 27 14:42:09 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 27 Feb 2002 09:42:09 -0500 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: Your message of "Wed, 27 Feb 2002 15:12:25 +0100." <3C7CE949.8893873D@lemburg.com> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <200202271257.g1RCvB525447@pcp742651pcs.reston01.va.comcast.net> <3C7CE949.8893873D@lemburg.com> Message-ID: <200202271442.g1REg9D25882@pcp742651pcs.reston01.va.comcast.net> > > (I'd still *prefer* it the other way around, where you have to specify > > an explicit option to make the default equal to the locale rather than > > ASCII, but I can see the other side. Sigh.) > > Let's put it this way: the interpreter's default encoding has > to be changed explicitly by the sys admin (in sitecustomize.py), > so the decision to take e.g. a locale specific default encoding > is one which the admin maintaining the installation has > to make (with all the consequences that go with it). OK. I missed that part -- I thought that it would look in the locale by default. > Per default, the default encoding is ASCII, so I don't > think we really need an extra option. Agreed. > Hmm, could be that python -S already implies this, BTW... :-) > checking this reveils that even sys.setdefaultencoding() > remains available if -S is used. Perhaps we should remove > the API with -S too ?! I don't think so. It should be left in, caveat emptor. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Feb 27 14:43:37 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 27 Feb 2002 09:43:37 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Wed, 27 Feb 2002 15:21:54 +0100." <3C7CEB82.3035BE20@lemburg.com> References: <200202271327.g1RDRBl25604@pcp742651pcs.reston01.va.comcast.net> <3C7CEB82.3035BE20@lemburg.com> Message-ID: <200202271443.g1REhbQ25904@pcp742651pcs.reston01.va.comcast.net> > The discussion is going astray again: Fredrik proposed an abstract > base type, i.e. a type providing only the name and an interface > which is defined as convention. > > I am all for adding such an abstract base type (and others > as well, e.g. for numbers, sequences, money, decimal, etc.) > with minimal interfaces, but not for fixing a complex interface > on top of these. > > What you are currently discussing is heading in the direction > of imlementing one or more time subclasses. That's two steps > ahead of what Fredrik was proposing. Good point. The two discussions are both useful, but should be separated. --Guido van Rossum (home page: http://www.python.org/~guido/) From jacobs@penguin.theopalgroup.com Wed Feb 27 15:11:56 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Wed, 27 Feb 2002 10:11:56 -0500 (EST) Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <3C7CEE1C.D6EF5C4B@lemburg.com> Message-ID: On Wed, 27 Feb 2002, M.-A. Lemburg wrote: > SQL databases don't deal with leap seconds. They store > the broken down value (in some way) without time zone information > and that's it, fortunately :-) Er... SQL99 (and I believe SQL92) have native support for time with and without time zones, and neither say nothing about how databases are to "store" those values. I don't have a copy in front of me, so I can't tell you what they say about leap-seconds. 
Of course, few implementations support this yet, though it worth being forward-looking. For my own uses, I have a base time class that encapsulates either mxDateTime objects or unix time-since-epoch, and implements the basic time and date accessors and simple arithmetic. A subclass of that type then adds awareness of timezones and daylight savings time. My first effort at trying to do all of those things in one big monolithic class was a nightmare. This layering does result in some (relative) inefficiency, but correctness and maintainability is vastly more important to me. -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From jepler@unpythonic.dhs.org Wed Feb 27 17:10:59 2002 From: jepler@unpythonic.dhs.org (Jeff Epler) Date: Wed, 27 Feb 2002 11:10:59 -0600 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <3C7CA4F9.86626985@lemburg.com> References: <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <15483.53993.852170.135298@anthem.wooz.org> <3C7BE70B.74713ED2@lemburg.com> <20020226150143.C16980@unpythonic.dhs.org> <3C7CA4F9.86626985@lemburg.com> Message-ID: <20020227111058.B30863@unpythonic.dhs.org> On Wed, Feb 27, 2002 at 10:20:57AM +0100, M.-A. Lemburg wrote: > So if we use the RE "coding[=:]\s*([\w-]+)" on the first line, > we should be able to reach out for the encoding, right ? > > This RE would then cover both vim and emacs. I've been informed on a #vim irc channel that "vim:fillencoding=blah:" does not work. Unfortunate. I overlooked the part of the documentation which states To read a file in a certain encoding it won't work by setting 'fileencoding', use the |++enc| argument. 
However, there's a "charset plugin" for vim: http://vim.sourceforge.net/scripts/script.php?script_id=199 which could be adapted to follow whatever convention is chosen for Python. However, this plugin is not standard in any version of vim. It's not clear what license it's under, but referencing it from the PEP and documenting that something like au BufReadPost *.py ReloadWhenCharset(1, "coding[:=]\s([\w-]+)") au BufReadPost *.py ReloadWhenCharset(2, "coding[:=]\s([\w-]+)") (search the first two lines for the emacs coding special marker) would cause it to detect the charset of a Python file would certainly be possible. The plugin functions by executing a reload of the file with ++enc when ReloadWhenCharset matches its pattern. Jeff From skip@pobox.com Wed Feb 27 17:20:51 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 27 Feb 2002 11:20:51 -0600 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <3C7CAFD3.60B32168@lemburg.com> References: <3C7CAFD3.60B32168@lemburg.com> Message-ID: <15485.5491.709403.99698@beluga.mojam.com> >> Python uses the 7-bit ASCII character set for program text and string >> literals. 8-bit characters may be used in string literals and >> comments but their interpretation is platform dependent; the proper >> way to insert 8-bit characters in string literals is by using octal >> or hexadecimal escape sequences. mal> It's a fact of life that users don't read reference manuals, but mal> simply write programs and feel good if they happen to work :-) Perhaps a warning should be emitted by the compiler if a plain string literal is found that contains 8-bit characters. Better yet, perhaps Neal can add this to PyChecker if he hasn't already... Skip From martin@v.loewis.de Wed Feb 27 17:26:54 2002 From: martin@v.loewis.de (Martin v. 
Loewis) Date: 27 Feb 2002 18:26:54 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <3C7CAD5D.6692F44@lemburg.com> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > He also told me that they put raw Shift-JIS and EUC-JP > into Python literal strings -- just like Europeans do > with Latin-1. I expected that much; chosing Latin-1 as the default encoding is certainly Euro-centric. At the moment, declaring either eucJP or or Shift-JIS wouldn't work with the proposed implementation, anyway, since those encodings are not supported in the standard Python installation. > Wouldn't his suggestion be a good compromise for phase 2 ? This raises the question what exactly should be deprecated. AFAIK, both eucJP and Shift-JIS use non-ASCII bytes to denote Japanese characters, so they'd get a DeprecationWarning on every file. However, they could not put an encoding declaration into the file, as Python would not recognize the encoding. I don't see the convention to convert as too much of a stumbling block; to my knowledge, many editors can display text in both encodings correctly these days (but I may be wrong with that assumption). Regards, Martin From mal@lemburg.com Wed Feb 27 17:31:13 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 18:31:13 +0100 Subject: [Python-Dev] proposal: add basic time type to the standardlibrary References: Message-ID: <3C7D17E1.326DB5AF@lemburg.com> Kevin Jacobs wrote: > > On Wed, 27 Feb 2002, M.-A. Lemburg wrote: > > SQL databases don't deal with leap seconds. 
> > They store
> > the broken down value (in some way) without time zone information
> > and that's it, fortunately :-)
>
> Er... SQL99 (and I believe SQL92) have native support for time with and
> without time zones, and neither say nothing about how databases are to
> "store" those values. I don't have a copy in front of me, so I can't tell
> you what they say about leap-seconds. Of course, few implementations
> support this yet, though it worth being forward-looking.

True, SQL-92 defines data types "TIME WITH TIME ZONE" and
"TIMESTAMP WITH TIME ZONE". The standard is only available as
a book, but here's a draft which has all the details:

http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt

Still, only Oracle and PostgreSQL seem to actually implement
these and ODBC (SQL/CLI), the de facto standard for database
interfacing, doesn't even provide interfaces to query or store
time zone information (you can put the information directly
in the SQL string, but not use it in bound variables).

Basically, you should not store local time in databases, but
instead use UTC. If you need the original time zone information
for reference, you'd keep this in separate DB columns (e.g.
as strings).
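
A small sketch of that advice, written against today's datetime module
(which did not exist when this thread took place). The helper name
to_db_row and the two-column layout are assumptions made for illustration:
the timestamp is normalised to UTC and the original zone travels alongside
it as a plain string.

```python
from datetime import datetime, timedelta, timezone


def to_db_row(local_naive, utc_offset_minutes, tz_label):
    """Return (utc_timestamp_string, tz_label) for two separate DB columns.

    `local_naive` is a naive datetime in local time, `utc_offset_minutes`
    is the zone's offset from UTC in minutes, and `tz_label` is kept
    verbatim for reference only.
    """
    local_zone = timezone(timedelta(minutes=utc_offset_minutes))
    as_utc = local_naive.replace(tzinfo=local_zone).astimezone(timezone.utc)
    # Drop the tzinfo again: the column is defined to hold UTC.
    return (as_utc.replace(tzinfo=None).isoformat(sep=" "), tz_label)
```

For example, 18:31:13 local time at an offset of +60 minutes is stored as
17:31:13 together with whatever label the application chose for the zone.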
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jepler@unpythonic.dhs.org Wed Feb 27 17:32:50 2002 From: jepler@unpythonic.dhs.org (Jeff Epler) Date: Wed, 27 Feb 2002 11:32:50 -0600 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <20020227111058.B30863@unpythonic.dhs.org> References: <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <15483.53993.852170.135298@anthem.wooz.org> <3C7BE70B.74713ED2@lemburg.com> <20020226150143.C16980@unpythonic.dhs.org> <3C7CA4F9.86626985@lemburg.com> <20020227111058.B30863@unpythonic.dhs.org> Message-ID: <20020227113249.C30863@unpythonic.dhs.org> This actually works in vim with "charset plugin": let s:pep263='coding[:=]\s*\([-A-Za-z0-9_]\+\)' au BufReadPost *.py call ReloadWhenCharsetSet(1, s:pep263) au BufReadPost *.py call ReloadWhenCharsetSet(2, s:pep263) It searches for a RE compatible with PEP263 in the first and second lines. You could change the pattern from *.py to * if you want to recognize the emacs-style coding in all files. Jeff From mal@lemburg.com Wed Feb 27 17:36:39 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 18:36:39 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <3C7CAFD3.60B32168@lemburg.com> <15485.5491.709403.99698@beluga.mojam.com> Message-ID: <3C7D1927.3414607E@lemburg.com> Skip Montanaro wrote: > > >> Python uses the 7-bit ASCII character set for program text and string > >> literals. 8-bit characters may be used in string literals and > >> comments but their interpretation is platform dependent; the proper > >> way to insert 8-bit characters in string literals is by using octal > >> or hexadecimal escape sequences. 
> > mal> It's a fact of life that users don't read reference manuals, but > mal> simply write programs and feel good if they happen to work :-) > > Perhaps a warning should be emitted by the compiler if a plain string > literal is found that contains 8-bit characters. Better yet, perhaps Neal > can add this to PyChecker if he hasn't already... See the PEP: this is what phase 1 will do; phase 2 won't accept such a file without an explicit encoding declaration. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From steve@cat-box.net Wed Feb 27 17:44:28 2002 From: steve@cat-box.net (Steve Alexander) Date: Wed, 27 Feb 2002 17:44:28 +0000 Subject: [Python-Dev] Supporting precision in a DateTime type Message-ID: <3C7D1AFC.4050809@cat-box.net> Hi folks, On #zope3-dev, we were discussing how best to implement a DateTime type in Python. Leaving aside arguments of whether to store it as a packed C tuple, or as ms since an epoch, I'd like to think about the concept of precision as it relates to dates and times. As a writer of software applications that people use in people settings, I like to use types that reflect the elements of reality that people find important. One aspect of time that is important is its precision. Here's an example How long is it between 1992 and March 15, 1993 ? There isn't a sensible answer. Or, rather, there are many answers, some more sensible than others. The correct answer might be "1 year", a date range, or an error (perhaps a ValueError). In any case, the correct answer depends on the nature of the application. Thus, if I'm only interested in using dates, such as in an application where I'm interested in birthdays, I want to be able to describe a date without reference to a particular time. It isn't just a default time, it is a "no time specified". 
So, I won't get caught later on if I compare that datetime instance with another that has a different precision. It is often possible to resolve differing precisions in an application-specific way. Another way of thinking about precision is as a constraint on possible more precise values. So, I can play an April fool prank any time in the morning of April 1, in my local time-zone. The actual exact time of my pranks will fall within the less precise constraint. This makes dates with precision similar to durations. Common precisions in applications include years, months, iso weeks of a year, days. Any finer precision doesn't really matter; the max precision of time in C is ok for most human purposes. Although you could catch some cases by having distinct types for dates and times, this only captures the precision of days. It doesn't help for other precisions. Here's a paper I found via google, that discusses these issues: http://www.martinfowler.com/ap2/timePoint.html ps. I'm not a regular reader of python-dev. Guido suggested I post this here for further discussion. I'll catch up via the web eventually, but please cc me into any relevant replies. -- Steve Alexander From mal@lemburg.com Wed Feb 27 17:43:10 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 18:43:10 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> Message-ID: <3C7D1AAE.234A01F7@lemburg.com> "Martin v. Loewis" wrote: > > "M.-A. Lemburg" writes: > > > He also told me that they put raw Shift-JIS and EUC-JP > > into Python literal strings -- just like Europeans do > > with Latin-1. 
> > I expected that much; chosing Latin-1 as the default encoding is > certainly Euro-centric. > > At the moment, declaring either eucJP or or Shift-JIS wouldn't work > with the proposed implementation, anyway, since those encodings are > not supported in the standard Python installation. But they will be using Tamito's Japanese codecs... and, of course, they do work now in string literals, since there is no enforcement of any encoding in the compiler. > > Wouldn't his suggestion be a good compromise for phase 2 ? > > This raises the question what exactly should be deprecated. AFAIK, > both eucJP and Shift-JIS use non-ASCII bytes to denote Japanese > characters, so they'd get a DeprecationWarning on every file. However, > they could not put an encoding declaration into the file, as Python > would not recognize the encoding. With Tamito's codecs installed, this wouldn't be a problem. Putting the encoding comment in the files will turn the compiler quiet in phase 1 and in phase 2 assure that their editors do in fact use the defined encoding. FYI, I've updated the PEP to use the interpreter's default encoding as basis for the source file encoding too. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Feb 27 17:44:09 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Wed, 27 Feb 2002 18:44:09 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <15483.53993.852170.135298@anthem.wooz.org> <3C7BE70B.74713ED2@lemburg.com> <20020226150143.C16980@unpythonic.dhs.org> <3C7CA4F9.86626985@lemburg.com> <20020227111058.B30863@unpythonic.dhs.org> <20020227113249.C30863@unpythonic.dhs.org> Message-ID: <3C7D1AE9.11C2740F@lemburg.com> Jeff Epler wrote: > > This actually works in vim with "charset plugin": > let s:pep263='coding[:=]\s*\([-A-Za-z0-9_]\+\)' > au BufReadPost *.py call ReloadWhenCharsetSet(1, s:pep263) > au BufReadPost *.py call ReloadWhenCharsetSet(2, s:pep263) > It searches for a RE compatible with PEP263 in the first and second lines. > > You could change the pattern from *.py to * if you want to recognize the > emacs-style coding in all files. Great ! So we can say that the RE fits vim and emacs, right ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jacobs@penguin.theopalgroup.com Wed Feb 27 17:45:17 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Wed, 27 Feb 2002 12:45:17 -0500 (EST) Subject: [Python-Dev] proposal: add basic time type to the standardlibrary In-Reply-To: <3C7D17E1.326DB5AF@lemburg.com> Message-ID: On Wed, 27 Feb 2002, M.-A. Lemburg wrote: > True, SQL-92 defines data types "TIME WITH TIME ZONE" and > "TIMESTAMP WITH TIME ZONE". 
The standard is only available > as a book, but here's a draft which has all the details: > > http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt > > Still, only Oracle and PostgreSQL seem to actually implement these > and ODBC (SQL/CLI), the de facto standard for database interfacing, > doesn't even provide interfaces to query or store time zone > information (you can put the information directly in the SQL > string, but not use it in bound variables). Strangely enough I use TIMESTAMP WITH TIME ZONE quite a bit on both Oracle and PostgreSQL using native drivers. I'm also fairly sure that Sybase and MS-SQL store timestamps with time zone somehow, though my memory of the project that did so is a little fuzzy. > Basically, you should not store local time in databases, > but instead use UTC. If you need the original time zone > information for reference, you'd keep this in separate > DB columns (e.g. as strings). Why not minute offset from UTC like C99? Anyhow, everyone knows that time zones and daylight saving time are a pain to deal with. However, let's work toward a sane implementation that can relieve the end-user from having to smack their head against this particular brick wall every time (even if it means smacking our collective heads against the brick wall until we're happy, or reduced to unintelligible ranting, or possibly both). 
Regards, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From skip@pobox.com Wed Feb 27 18:30:55 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 27 Feb 2002 12:30:55 -0600 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <3C7D1927.3414607E@lemburg.com> References: <3C7CAFD3.60B32168@lemburg.com> <15485.5491.709403.99698@beluga.mojam.com> <3C7D1927.3414607E@lemburg.com> Message-ID: <15485.9695.126411.600632@beluga.mojam.com> >> Perhaps a warning should be emitted by the compiler if a plain string >> literal is found that contains 8-bit characters. Better yet, perhaps >> Neal can add this to PyChecker if he hasn't already... mal> See the PEP: this is what phase 1 will do; phase 2 won't accept mal> such a file without an explicit encoding declaration. That wasn't what I was getting at. The quoted part of the reference manual seemed to suggest that programmers should be using hex escapes in string literals instead of 8-bit characters. This doesn't seem to me to be related to what encoding the file is in. Skip From David Abrahams" Message-ID: <133401c1bfbf$46e3fd90$0500a8c0@boostconsulting.com> ----- Original Message ----- From: "Steve Alexander" > Here's a paper I found via google, that discusses these issues: > > http://www.martinfowler.com/ap2/timePoint.html > > > ps. I'm not a regular reader of python-dev. Guido suggested I post this > here for further discussion. You might also be interested in what's happening in this area in the C++ world. 
AFAIK the most advanced C++ date/time library development is centered here: http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?GDTL with a nice paper from OOPSLA available here: http://www.oonumerics.org/tmpw01/garland.pdf or-you-might-run-away-screaming-ly y'rs, Dave From Barrett@stsci.edu Wed Feb 27 19:04:40 2002 From: Barrett@stsci.edu (Paul Barrett) Date: Wed, 27 Feb 2002 14:04:40 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library References: <3C7CC53F.DF13D8D7@lemburg.com> <200202271307.g1RD7FC25501@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7D2DC8.90802@STScI.Edu> Guido van Rossum wrote: > > I think an important (but so far unvoiced) requirement is that > datetime objects can be stored in a database. Since the database may > be read by systems that may or may not support leap seconds, we should > be independent of the leap second support in the C library. As I've > said before, we should ignore leap seconds. Even if we end up > expressing time deltas as a number of seconds, that should be > understood to be calendar seconds and not astronomical seconds. Let > the astronomers deal with leap seconds themselves -- they should know > how to. As for us astronomers, we're supposed to represent time in Julian days and fractions thereof since the beginning of time (about 6714 years ago). Today is day 2452346. In practice we use whole days and represent the fractional part in seconds, because floating point numbers don't have a sufficient number of bits to represent Julian days to nanosecond precision. A typical day contains 86400 seconds. In essence we use Julian days as our reference point and seconds of a day as our delta time. From these two values you can theoretically calculate any time past, present, or future with or without leap seconds (if known). Just thought you might like to know, if you didn't already. 
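[Editorial sketch, not part of the original message.] The representation described above (a whole Julian day plus seconds into that day) might be sketched as follows in present-day Python. The function names are hypothetical, input is assumed to be a naive UTC datetime, sub-second precision is dropped, and leap seconds are ignored, so every day is taken to have 86400 seconds. The last helper estimates why a single double is too coarse: near current Julian dates adjacent doubles are roughly 40 microseconds apart.

```python
import math
from datetime import datetime, timedelta

SECONDS_PER_DAY = 86400   # calendar seconds; leap seconds are ignored
ORDINAL_TO_JDN = 1721425  # date.toordinal() + this = Julian day number


def to_julian(utc_dt):
    """Return (jdn, seconds) for a naive UTC datetime.

    Julian days start at noon, so the clock is shifted back 12 hours
    before splitting into a whole day and whole seconds into that day.
    """
    shifted = utc_dt - timedelta(hours=12)
    jdn = shifted.toordinal() + ORDINAL_TO_JDN
    seconds = shifted.hour * 3600 + shifted.minute * 60 + shifted.second
    return jdn, seconds


def elapsed_seconds(start, end):
    """Exact integer difference between two (jdn, seconds) pairs."""
    return (end[0] - start[0]) * SECONDS_PER_DAY + (end[1] - start[1])


def float_jd_resolution(jd):
    """Spacing, in seconds, between adjacent doubles near Julian date jd."""
    _, exponent = math.frexp(jd)
    return 2.0 ** (exponent - 53) * SECONDS_PER_DAY
```

Keeping the day and the seconds as separate integers makes differences exact, while float_jd_resolution(2451545.0) shows that a double-valued Julian date cannot resolve nanoseconds.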
-- Paul -- Paul Barrett, PhD Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From guido@python.org Wed Feb 27 19:55:40 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 27 Feb 2002 14:55:40 -0500 Subject: [Python-Dev] Supporting precision in a DateTime type In-Reply-To: Your message of "Wed, 27 Feb 2002 17:44:28 GMT." <3C7D1AFC.4050809@cat-box.net> References: <3C7D1AFC.4050809@cat-box.net> Message-ID: <200202271955.g1RJteW26624@pcp742651pcs.reston01.va.comcast.net> Thanks Steve, for posting this summary. I'm going to take a different route for now but will keep your remarks in mind. Martin Fowler's note on time points was really helpful! Also thanks to David Abraham for the pointer to Boost GDTL. Python's standard date/time type will relate to GDTL like Python's iterator concept relates to C++ iterators. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From David Abrahams" A quick grep-find through the Python-2.2 sources reveals the following: Include/dictobject.h:49: long aligner; Include/objimpl.h:275: double dummy; /* force worst-case alignment */ Modules/addrinfo.h:162: LONG_LONG __ss_align; /* force desired structure storage alignment */ Modules/addrinfo.h:164: double __ss_align; /* force desired structure storage alignment */ At first glance, there appear to be different assumptions at work here about what constitutes maximal alignment on any given platform. I've been using a little C++ metaprogram to find a type which will properly align any other given type. Because of limitations of one compiler, I had to disable the computation and instead used the objimpl.h assumption that double was maximally aligned, but also added a compile-time assertion to check that the alignment is always greater than or equal to that of the target type. 
Well, it failed today on Tru64 Unix with the latest compaq CXX 6.5 prerelease compiler; it appears that the alignment of long double is greater than that of double on that platform. I thought someone might want to know, Dave +---------------------------------------------------------------+ David Abrahams C++ Booster (http://www.boost.org) O__ == Pythonista (http://www.python.org) c/ /'_ == resume: http://users.rcn.com/abrahams/resume.html (*) \(*) == email: david.abrahams@rcn.com +---------------------------------------------------------------+ From barry@zope.com Wed Feb 27 20:09:43 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 27 Feb 2002 15:09:43 -0500 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> Message-ID: <15485.15623.543255.443894@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: MvL> At the moment, declaring either eucJP or or Shift-JIS MvL> wouldn't work with the proposed implementation, anyway, since MvL> those encodings are not supported in the standard Python MvL> installation. Which actually touches on something I wanted to bring up. Why don't we include the Japanese codecs with Python? Is it just a size issue? The gzip'd tarball of the JapaneseCodecs-1.4.3 is 258k, unpacked it's 3.2M. Okay, so that's nontrivial, but I can think of 2 approaches: - Have a second, sumo (no pun intended) release that inclues the codecs - Include the gzip'd tarball and do a distutils install at Python install time I bet we'd win some Ruby converts if we did this . 
For reference, I'm thinking about including the Japanese and Chinese codecs with MM2.1 because it makes little sense to claim support for those languages without them. -Barry From mal@lemburg.com Wed Feb 27 20:33:58 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 21:33:58 +0100 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> Message-ID: <3C7D42B6.A88568CD@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "MvL" == Martin v Loewis writes: > > MvL> At the moment, declaring either eucJP or or Shift-JIS > MvL> wouldn't work with the proposed implementation, anyway, since > MvL> those encodings are not supported in the standard Python > MvL> installation. > > Which actually touches on something I wanted to bring up. Why don't > we include the Japanese codecs with Python? Is it just a size issue? > > The gzip'd tarball of the JapaneseCodecs-1.4.3 is 258k, unpacked it's > 3.2M. Okay, so that's nontrivial, but I can think of 2 approaches: > > - Have a second, sumo (no pun intended) release that inclues the > codecs > > - Include the gzip'd tarball and do a distutils install at Python > install time Why not simply make the installation a configure option ? We could easily extend setup.py to grab the tarball from the web in case it is needed. > I bet we'd win some Ruby converts if we did this . For > reference, I'm thinking about including the Japanese and Chinese > codecs with MM2.1 because it makes little sense to claim support for > those languages without them. Agreed. 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jepler@unpythonic.dhs.org Wed Feb 27 20:43:39 2002 From: jepler@unpythonic.dhs.org (Jeff Epler) Date: Wed, 27 Feb 2002 14:43:39 -0600 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <3C7D1AE9.11C2740F@lemburg.com> References: <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <15483.53993.852170.135298@anthem.wooz.org> <3C7BE70B.74713ED2@lemburg.com> <20020226150143.C16980@unpythonic.dhs.org> <3C7CA4F9.86626985@lemburg.com> <20020227111058.B30863@unpythonic.dhs.org> <20020227113249.C30863@unpythonic.dhs.org> <3C7D1AE9.11C2740F@lemburg.com> Message-ID: <20020227144337.D30863@unpythonic.dhs.org> On Wed, Feb 27, 2002 at 06:44:09PM +0100, M.-A. Lemburg wrote: > Great ! So we can say that the RE fits vim and emacs, right ? Fits vim 6.0 with additional configuration of that editor. Jeff From martin@v.loewis.de Wed Feb 27 21:01:25 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 27 Feb 2002 22:01:25 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <15485.9695.126411.600632@beluga.mojam.com> References: <3C7CAFD3.60B32168@lemburg.com> <15485.5491.709403.99698@beluga.mojam.com> <3C7D1927.3414607E@lemburg.com> <15485.9695.126411.600632@beluga.mojam.com> Message-ID: Skip Montanaro writes: > >> Perhaps a warning should be emitted by the compiler if a plain string > >> literal is found that contains 8-bit characters. Better yet, perhaps > >> Neal can add this to PyChecker if he hasn't already... > > mal> See the PEP: this is what phase 1 will do; phase 2 won't accept > mal> such a file without an explicit encoding declaration. > > That wasn't what I was getting at. 
The quoted part of the reference manual > seemed to suggest that programmers should be using hex escapes in string > literals instead of 8-bit characters. This doesn't seem to me to be related > to what encoding the file is in. PEP 263 says "the tokenizer must check the complete source file for compliance with the default encoding". The part of the reference manual will become incorrect: the meaning of 8-bit characters (rather: bytes) will be well-defined if you have an encoding declaration. If the default encoding is ASCII, and you have a 8-bit character, the compiler will emit a warning if it is enhanced to follow PEP 263. So what were you getting at? Regards, Martin From skip@pobox.com Wed Feb 27 21:41:53 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 27 Feb 2002 15:41:53 -0600 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: References: <3C7CAFD3.60B32168@lemburg.com> <15485.5491.709403.99698@beluga.mojam.com> <3C7D1927.3414607E@lemburg.com> <15485.9695.126411.600632@beluga.mojam.com> Message-ID: <15485.21153.951244.102021@beluga.mojam.com> Martin> If the default encoding is ASCII, and you have a 8-bit Martin> character, the compiler will emit a warning if it is enhanced to Martin> follow PEP 263. So what were you getting at? I was thinking about strings used as byte containers for non-character data. Skip From martin@v.loewis.de Wed Feb 27 22:03:51 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 27 Feb 2002 23:03:51 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <15485.21153.951244.102021@beluga.mojam.com> References: <3C7CAFD3.60B32168@lemburg.com> <15485.5491.709403.99698@beluga.mojam.com> <3C7D1927.3414607E@lemburg.com> <15485.9695.126411.600632@beluga.mojam.com> <15485.21153.951244.102021@beluga.mojam.com> Message-ID: Skip Montanaro writes: > I was thinking about strings used as byte containers for > non-character data. 
Ok, but then you also said that you would want to produce a warning for those? How can you tell them apart from "proper" character strings if the encoding allows arbitrary byte sequences (like Latin-1)? Regards, Martin From martin@v.loewis.de Wed Feb 27 22:02:03 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 27 Feb 2002 23:02:03 +0100 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) In-Reply-To: <15485.15623.543255.443894@anthem.wooz.org> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> Message-ID: barry@zope.com (Barry A. Warsaw) writes: > Which actually touches on something I wanted to bring up. Why don't > we include the Japanese codecs with Python? Is it just a size issue? I think Guido's original concern was about the size (apart from the fact that they were not available before). My concern is also correctness and efficiency. Most current systems provide high-performance well-tested codecs, since they need those frequently. It is a waste of resources not to make use of these codecs. The counter-argument, of course, is that you cannot always rely on these codecs being available (apart from the fact that you need wrappers around the platform API). > I bet we'd win some Ruby converts if we did this . For > reference, I'm thinking about including the Japanese and Chinese > codecs with MM2.1 because it makes little sense to claim support for > those languages without them. That is certainly the right thing to do. 
If correctness could be verified independently, I'd be in favour of including them with Python - even though they will likely never get the efficiency that wrappers around the platform's codecs would have. Regards, Martin From barry@zope.com Wed Feb 27 22:53:02 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 27 Feb 2002 17:53:02 -0500 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> Message-ID: <15485.25422.524082.109890@anthem.wooz.org> >>>>> "MvL" == Martin v Loewis writes: >> I bet we'd win some Ruby converts if we did this. For >> reference, I'm thinking about including the Japanese and >> Chinese codecs with MM2.1 because it makes little sense to >> claim support for those languages without them. MvL> That is certainly the right thing to do. If correctness could MvL> be verified independently, I'd be in favour of including them MvL> with Python - even though they will likely never get the MvL> efficiency that wrappers around the platform's codecs would MvL> have. I'm obviously not qualified to verify them independently, but I have had some initial positive feedback from a few Japanese users of the MM2.1 alphas. My second-hand information indicates that the Japanese codecs are pretty good, the Chinese are okay, and the Korean ones need a lot of work. Also, it's a bit of a catch-22, in that the more official exposure these codecs get, the better they will eventually become, hopefully. I'd be +1 on including them in Python 2.3. 
-Barry From barry@zope.com Wed Feb 27 22:53:55 2002 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 27 Feb 2002 17:53:55 -0500 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> <3C7D42B6.A88568CD@lemburg.com> Message-ID: <15485.25475.913116.826208@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> Why not simply make the installation a configure option ? MAL> We could easily extend setup.py to grab the tarball from MAL> the web in case it is needed. That's another option. Certainly stuff like that is becoming fairly common for installers these days. >> I bet we'd win some Ruby converts if we did this . For >> reference, I'm thinking about including the Japanese and >> Chinese codecs with MM2.1 because it makes little sense to >> claim support for those languages without them. MAL> Agreed. -Barry From gsw@agere.com Wed Feb 27 22:54:32 2002 From: gsw@agere.com (Gerald S. Williams) Date: Wed, 27 Feb 2002 17:54:32 -0500 Subject: [Python-Dev] POSIX thread code Message-ID: This is a multi-part message in MIME format. ------=_NextPart_000_0023_01C1BFB7.CF9678F0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit I recently came up with a fix for thread support in Python under Cygwin. Jason Tishler and Norman Vine are looking it over, but I'm pretty sure something similar should be used for the Cygwin Python port. This is easily done--simply add a few lines to thread.c and create a new thread_cygwin.h (context diff and new file both provided). 
But there is a larger issue: The thread interface code in thread_pthread.h uses mutexes and condition variables to emulate semaphores, which are then used to provide Python "lock" and "sema" services. I know this is a common practice since those two thread synchronization primitives are defined in "pthread.h". But it comes with quite a bit of overhead. (And in the case of Cygwin it causes race conditions, but that's another matter.) POSIX does define semaphores, though. (In fact, they're in the standard just before Mutexes and Condition Variables.) According to POSIX, they are found in <semaphore.h> and _POSIX_SEMAPHORES should be defined if they work as POSIX expects. If they are available, it seems like providing direct semaphore services would be preferable to emulating them using condition variables and mutexes. thread_posix.h.diff-c is a context diff that can be used to convert thread_pthread.h into a more general POSIX version that will use semaphores if available. thread_cygwin.h would no longer be needed then, since all it does is use POSIX semaphores directly rather than mutexes/condition vars. Changing the interface to POSIX threads should bring a performance improvement to any POSIX platform that supports semaphores directly. Does this sound like a good idea? Should I create a more thorough set of patch files and submit them? (I haven't been accepted to the python-dev list yet, so please CC me. Thanks.) -Jerry -O Gerald S. 
Williams, 22Y-103GA : mailto:gsw@agere.com O- -O AGERE SYSTEMS, 555 UNION BLVD : office:610-712-8661 O- -O ALLENTOWN, PA, USA 18109-3286 : mobile:908-672-7592 O- ------=_NextPart_000_0023_01C1BFB7.CF9678F0 Content-Type: application/octet-stream; name="thread.c.diff-c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="thread.c.diff-c" *** thread.c Tue Oct 16 17:13:49 2001 --- thread.c.new Tue Feb 26 07:49:13 2002 *************** *** 113,118 **** --- 113,123 ---- #include "thread_pth.h" #endif + #ifdef __CYGWIN__ + #include "thread_cygwin.h" + #undef _POSIX_THREADS + #endif + #ifdef _POSIX_THREADS #include "thread_pthread.h" #endif ------=_NextPart_000_0023_01C1BFB7.CF9678F0 Content-Type: application/octet-stream; name="thread_cygwin.h" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="thread_cygwin.h" /* Posix threads interface */ /* * Modified to avoid condition variables, which cause race conditions in Cygwin. * Gerald Williams, gsw@agere.com * $Id: thread_cygwin.h,v 1.6 2002/02/27 19:34:08 gsw Exp $ */ #include #include #include #include #include /* try to determine what version of the Pthread Standard is installed. * this is important, since all sorts of parameter types changed from * draft to draft and there are several (incompatible) drafts in * common use. these macros are a start, at least. * 12 May 1997 -- david arnold */ #if defined(__ultrix) && defined(__mips) && defined(_DECTHREADS_) /* _DECTHREADS_ is defined in cma.h which is included by pthread.h */ # define PY_PTHREAD_D4 #elif defined(__osf__) && defined (__alpha) /* _DECTHREADS_ is defined in cma.h which is included by pthread.h */ # if !defined(_PTHREAD_ENV_ALPHA) || defined(_PTHREAD_USE_D4) || defined(PTHREAD_USE_D4) # define PY_PTHREAD_D4 # else # define PY_PTHREAD_STD # endif #elif defined(_AIX) /* SCHED_BG_NP is defined if using AIX DCE pthreads * but it is unsupported by AIX 4 pthreads. Default * attributes for AIX 4 pthreads equal to NULL. 
For * AIX DCE pthreads they should be left unchanged. */ # if !defined(SCHED_BG_NP) # define PY_PTHREAD_STD # else # define PY_PTHREAD_D7 # endif #elif defined(__DGUX) # define PY_PTHREAD_D6 #elif defined(__hpux) && defined(_DECTHREADS_) # define PY_PTHREAD_D4 #else /* Default case */ # define PY_PTHREAD_STD #endif #ifdef USE_GUSI /* The Macintosh GUSI I/O library sets the stackspace to ** 20KB, much too low. We up it to 64K. */ #define THREAD_STACK_SIZE 0x10000 #endif /* set default attribute object for different versions */ #if defined(PY_PTHREAD_D4) || defined(PY_PTHREAD_D7) # define pthread_attr_default pthread_attr_default # define pthread_mutexattr_default pthread_mutexattr_default #elif defined(PY_PTHREAD_STD) || defined(PY_PTHREAD_D6) # define pthread_attr_default ((pthread_attr_t *)NULL) # define pthread_mutexattr_default ((pthread_mutexattr_t *)NULL) #endif /* On platforms that don't use standard POSIX threads pthread_sigmask() * isn't present. DEC threads uses sigprocmask() instead as do most * other UNIX International compliant systems that don't have the full * pthread implementation. */ #ifdef HAVE_PTHREAD_SIGMASK # define SET_THREAD_SIGMASK pthread_sigmask #else # define SET_THREAD_SIGMASK sigprocmask #endif #define CHECK_STATUS(name) if (status != 0) { perror(name); error = 1; } /* * Initialization. */ #ifdef _HAVE_BSDI static void _noop(void) { } static void PyThread__init_thread(void) { /* DO AN INIT BY STARTING THE THREAD */ static int dummy = 0; pthread_t thread1; pthread_create(&thread1, NULL, (void *) _noop, &dummy); pthread_join(thread1, NULL); } #else /* !_HAVE_BSDI */ static void PyThread__init_thread(void) { #if defined(_AIX) && defined(__GNUC__) pthread_init(); #endif } #endif /* !_HAVE_BSDI */ /* * Thread support. 
*/ long PyThread_start_new_thread(void (*func)(void *), void *arg) { pthread_t th; int success; sigset_t oldmask, newmask; #if defined(THREAD_STACK_SIZE) || defined(PTHREAD_SYSTEM_SCHED_SUPPORTED) pthread_attr_t attrs; #endif dprintf(("PyThread_start_new_thread called\n")); if (!initialized) PyThread_init_thread(); #if defined(THREAD_STACK_SIZE) || defined(PTHREAD_SYSTEM_SCHED_SUPPORTED) pthread_attr_init(&attrs); #endif #ifdef THREAD_STACK_SIZE pthread_attr_setstacksize(&attrs, THREAD_STACK_SIZE); #endif #ifdef PTHREAD_SYSTEM_SCHED_SUPPORTED pthread_attr_setscope(&attrs, PTHREAD_SCOPE_SYSTEM); #endif /* Mask all signals in the current thread before creating the new * thread. This causes the new thread to start with all signals * blocked. */ sigfillset(&newmask); SET_THREAD_SIGMASK(SIG_BLOCK, &newmask, &oldmask); success = pthread_create(&th, #if defined(PY_PTHREAD_D4) pthread_attr_default, (pthread_startroutine_t)func, (pthread_addr_t)arg #elif defined(PY_PTHREAD_D6) pthread_attr_default, (void* (*)(void *))func, arg #elif defined(PY_PTHREAD_D7) pthread_attr_default, func, arg #elif defined(PY_PTHREAD_STD) #if defined(THREAD_STACK_SIZE) || defined(PTHREAD_SYSTEM_SCHED_SUPPORTED) &attrs, #else (pthread_attr_t*)NULL, #endif (void* (*)(void *))func, (void *)arg #endif ); /* Restore signal mask for original thread */ SET_THREAD_SIGMASK(SIG_SETMASK, &oldmask, NULL); #if defined(THREAD_STACK_SIZE) || defined(PTHREAD_SYSTEM_SCHED_SUPPORTED) pthread_attr_destroy(&attrs); #endif if (success == 0) { #if defined(PY_PTHREAD_D4) || defined(PY_PTHREAD_D6) || defined(PY_PTHREAD_D7) pthread_detach(&th); #elif defined(PY_PTHREAD_STD) pthread_detach(th); #endif } #if SIZEOF_PTHREAD_T <= SIZEOF_LONG return (long) th; #else return (long) *(long *) &th; #endif } /* XXX This implementation is considered (to quote Tim Peters) "inherently hosed" because: - It does not guarantee the promise that a non-zero integer is returned. - The cast to long is inherently unsafe.
- It is not clear that the 'volatile' (for AIX?) and ugly casting in the latter return statement (for Alpha OSF/1) are any longer necessary. */ long PyThread_get_thread_ident(void) { volatile pthread_t threadid; if (!initialized) PyThread_init_thread(); /* Jump through some hoops for Alpha OSF/1 */ threadid = pthread_self(); #if SIZEOF_PTHREAD_T <= SIZEOF_LONG return (long) threadid; #else return (long) *(long *) &threadid; #endif } static void do_PyThread_exit_thread(int no_cleanup) { dprintf(("PyThread_exit_thread called\n")); if (!initialized) { if (no_cleanup) _exit(0); else exit(0); } } void PyThread_exit_thread(void) { do_PyThread_exit_thread(0); } void PyThread__exit_thread(void) { do_PyThread_exit_thread(1); } #ifndef NO_EXIT_PROG static void do_PyThread_exit_prog(int status, int no_cleanup) { dprintf(("PyThread_exit_prog(%d) called\n", status)); if (!initialized) if (no_cleanup) _exit(status); else exit(status); } void PyThread_exit_prog(int status) { do_PyThread_exit_prog(status, 0); } void PyThread__exit_prog(int status) { do_PyThread_exit_prog(status, 1); } #endif /* NO_EXIT_PROG */ /* * Lock support. 
*/ PyThread_type_lock PyThread_allocate_lock(void) { sem_t *lock; int status, error = 0; dprintf(("PyThread_allocate_lock called\n")); if (!initialized) PyThread_init_thread(); lock = (sem_t *)malloc(sizeof(sem_t)); if (lock) { status = sem_init(lock,0,1); CHECK_STATUS("sem_init"); if (error) { free((void *)lock); lock = NULL; } } dprintf(("PyThread_allocate_lock() -> %p\n", lock)); return (PyThread_type_lock)lock; } void PyThread_free_lock(PyThread_type_lock lock) { sem_t *thelock = (sem_t *)lock; int status, error = 0; dprintf(("PyThread_free_lock(%p) called\n", lock)); if (!thelock) return; status = sem_destroy(thelock); CHECK_STATUS("sem_destroy"); free((void *)thelock); } int PyThread_acquire_lock(PyThread_type_lock lock, int waitflag) { int success; sem_t *thelock = (sem_t *)lock; int status, error = 0; dprintf(("PyThread_acquire_lock(%p, %d) called\n", lock, waitflag)); if (waitflag) { status = sem_wait(thelock); CHECK_STATUS("sem_wait"); } else { status = sem_trywait(thelock); } success = (status == 0) ? 1 : 0; dprintf(("PyThread_acquire_lock(%p, %d) -> %d\n", lock, waitflag, success)); return success; } void PyThread_release_lock(PyThread_type_lock lock) { sem_t *thelock = (sem_t *)lock; int status, error = 0; dprintf(("PyThread_release_lock(%p) called\n", lock)); status = sem_post(thelock); CHECK_STATUS("sem_post"); } /* * Semaphore support. 
*/ PyThread_type_sema PyThread_allocate_sema(int value) { sem_t *sema; int status, error = 0; dprintf(("PyThread_allocate_sema called\n")); if (!initialized) PyThread_init_thread(); sema = (sem_t *)malloc(sizeof(sem_t)); if (sema) { status = sem_init(sema,0,value); CHECK_STATUS("sem_init"); if (error) { free((void *)sema); sema = NULL; } } dprintf(("PyThread_allocate_sema() -> %p\n", sema)); return (PyThread_type_sema)sema; } void PyThread_free_sema(PyThread_type_sema sema) { int status, error = 0; sem_t *thesema = (sem_t *)sema; dprintf(("PyThread_free_sema(%p) called\n", sema)); if (!thesema) return; status = sem_destroy(thesema); CHECK_STATUS("sem_destroy"); free((void *) thesema); } int PyThread_down_sema(PyThread_type_sema sema, int waitflag) { int status, error = 0, success; sem_t *thesema = (sem_t *)sema; dprintf(("PyThread_down_sema(%p, %d) called\n", sema, waitflag)); if (waitflag) { status = sem_wait(thesema); CHECK_STATUS("sem_wait"); } else { status = sem_trywait(thesema); } success = (status == 0) ? 1 : 0; dprintf(("PyThread_down_sema(%p) return\n", sema)); return success; } void PyThread_up_sema(PyThread_type_sema sema) { int status, error = 0; sem_t *thesema = (sem_t *)sema; dprintf(("PyThread_up_sema(%p)\n", sema)); status = sem_post(thesema); CHECK_STATUS("sem_post"); } ------=_NextPart_000_0023_01C1BFB7.CF9678F0 Content-Type: application/octet-stream; name="thread_posix.diff-c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="thread_posix.diff-c" *** thread_pthread.h Wed Feb 27 17:35:11 2002 --- thread_posix.h Wed Feb 27 17:39:30 2002 *************** *** 5,10 **** --- 5,13 ---- #include #include #include + #ifdef _POSIX_SEMAPHORES + #include + #endif /* try to determine what version of the Pthread Standard is installed. *************** *** 288,293 **** --- 291,457 ---- } #endif /* NO_EXIT_PROG */ + #ifdef _POSIX_SEMAPHORES + /* + * Lock support. 
+ */ + + PyThread_type_lock + PyThread_allocate_lock(void) + { + sem_t *lock; + int status, error = 0; + + dprintf(("PyThread_allocate_lock called\n")); + if (!initialized) + PyThread_init_thread(); + + lock = (sem_t *)malloc(sizeof(sem_t)); + + if (lock) { + status = sem_init(lock,0,1); + CHECK_STATUS("sem_init"); + + if (error) { + free((void *)lock); + lock = NULL; + } + } + + dprintf(("PyThread_allocate_lock() -> %p\n", lock)); + return (PyThread_type_lock)lock; + } + + void + PyThread_free_lock(PyThread_type_lock lock) + { + sem_t *thelock = (sem_t *)lock; + int status, error = 0; + + dprintf(("PyThread_free_lock(%p) called\n", lock)); + + if (!thelock) + return; + + status = sem_destroy(thelock); + CHECK_STATUS("sem_destroy"); + + free((void *)thelock); + } + + int + PyThread_acquire_lock(PyThread_type_lock lock, int waitflag) + { + int success; + sem_t *thelock = (sem_t *)lock; + int status, error = 0; + + dprintf(("PyThread_acquire_lock(%p, %d) called\n", lock, waitflag)); + + if (waitflag) { + status = sem_wait(thelock); + CHECK_STATUS("sem_wait"); + } else { + status = sem_trywait(thelock); + } + + success = (status == 0) ? 1 : 0; + + dprintf(("PyThread_acquire_lock(%p, %d) -> %d\n", lock, waitflag, success)); + return success; + } + + void + PyThread_release_lock(PyThread_type_lock lock) + { + sem_t *thelock = (sem_t *)lock; + int status, error = 0; + + dprintf(("PyThread_release_lock(%p) called\n", lock)); + + status = sem_post(thelock); + CHECK_STATUS("sem_post"); + } + + /* + * Semaphore support. 
+ */ + + PyThread_type_sema + PyThread_allocate_sema(int value) + { + sem_t *sema; + int status, error = 0; + + dprintf(("PyThread_allocate_sema called\n")); + if (!initialized) + PyThread_init_thread(); + + sema = (sem_t *)malloc(sizeof(sem_t)); + + if (sema) { + status = sem_init(sema,0,value); + CHECK_STATUS("sem_init"); + + if (error) { + free((void *)sema); + sema = NULL; + } + } + dprintf(("PyThread_allocate_sema() -> %p\n", sema)); + return (PyThread_type_sema)sema; + } + + void + PyThread_free_sema(PyThread_type_sema sema) + { + int status, error = 0; + sem_t *thesema = (sem_t *)sema; + + dprintf(("PyThread_free_sema(%p) called\n", sema)); + + if (!thesema) + return; + + status = sem_destroy(thesema); + CHECK_STATUS("sem_destroy"); + + free((void *) thesema); + } + + int + PyThread_down_sema(PyThread_type_sema sema, int waitflag) + { + int status, error = 0, success; + sem_t *thesema = (sem_t *)sema; + + dprintf(("PyThread_down_sema(%p, %d) called\n", sema, waitflag)); + + if (waitflag) { + status = sem_wait(thesema); + CHECK_STATUS("sem_wait"); + } else { + status = sem_trywait(thesema); + } + + success = (status == 0) ? 1 : 0; + + dprintf(("PyThread_down_sema(%p) return\n", sema)); + return success; + } + + void + PyThread_up_sema(PyThread_type_sema sema) + { + int status, error = 0; + sem_t *thesema = (sem_t *)sema; + + dprintf(("PyThread_up_sema(%p)\n", sema)); + + status = sem_post(thesema); + CHECK_STATUS("sem_post"); + } + #else /* _POSIX_SEMAPHORES */ /* * Lock support. 
*/ *************** *** 497,499 **** --- 661,664 ---- status = pthread_mutex_unlock(&thesema->mutex); CHECK_STATUS("pthread_mutex_unlock"); } + #endif /* _POSIX_SEMAPHORES */ ------=_NextPart_000_0023_01C1BFB7.CF9678F0-- From skip@pobox.com Wed Feb 27 22:17:15 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 27 Feb 2002 16:17:15 -0600 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: References: <3C7CAFD3.60B32168@lemburg.com> <15485.5491.709403.99698@beluga.mojam.com> <3C7D1927.3414607E@lemburg.com> <15485.9695.126411.600632@beluga.mojam.com> <15485.21153.951244.102021@beluga.mojam.com> Message-ID: <15485.23275.603452.414165@beluga.mojam.com> >> I was thinking about strings used as byte containers for >> non-character data. Martin> Ok, but then you also said that you would want to produce a Martin> warning for those? Never mind. I'm probably just confused. Skip From mal@lemburg.com Wed Feb 27 21:59:34 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 27 Feb 2002 22:59:34 +0100 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding References: <3C7CAFD3.60B32168@lemburg.com> <15485.5491.709403.99698@beluga.mojam.com> <3C7D1927.3414607E@lemburg.com> <15485.9695.126411.600632@beluga.mojam.com> <15485.21153.951244.102021@beluga.mojam.com> Message-ID: <3C7D56C6.E9BAA5E4@lemburg.com> Skip Montanaro wrote: > > Martin> If the default encoding is ASCII, and you have a 8-bit > Martin> character, the compiler will emit a warning if it is enhanced to > Martin> follow PEP 263. So what were you getting at? > > I was thinking about strings used as byte containers for non-character data. In string literals ? I think it is common to encode this sort of data as hex or using octal escapes. Since these encodings are plain 7-bit ASCII I don't see a problem. Your hint about the manual is correct though: we'll have to adapt that to the new reading as well. 
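A minimal sketch of the point about escapes, written in present-day Python 3 (an assumption; the thread itself concerns Python 2.2 string literals). The names `LITERAL_SOURCE` and `is_seven_bit` are hypothetical and serve only to illustrate that binary data written with hex or octal escapes leaves the source text itself in 7-bit ASCII.

```python
import ast

# The source text of a literal holding non-character (binary) data.
# Hex (\x89) and octal (\032) escapes keep the text itself 7-bit ASCII.
LITERAL_SOURCE = r'b"\x89PNG\r\n\032\n"'


def is_seven_bit(text):
    """Return True if every character in `text` is plain ASCII."""
    return all(ord(ch) < 128 for ch in text)


# Evaluating the literal yields the bytes, one of which is above 0x7F.
payload = ast.literal_eval(LITERAL_SOURCE)

print(is_seven_bit(LITERAL_SOURCE))  # the source text is ASCII-only
print(max(payload) > 0x7F)           # the data it encodes is not
```

A literal that contained the raw byte directly, rather than its escape, is the "naked" case that an encoding check on the source file would flag.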
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From skip@pobox.com Thu Feb 28 01:26:57 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 27 Feb 2002 19:26:57 -0600 Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding In-Reply-To: <3C7D56C6.E9BAA5E4@lemburg.com> References: <3C7CAFD3.60B32168@lemburg.com> <15485.5491.709403.99698@beluga.mojam.com> <3C7D1927.3414607E@lemburg.com> <15485.9695.126411.600632@beluga.mojam.com> <15485.21153.951244.102021@beluga.mojam.com> <3C7D56C6.E9BAA5E4@lemburg.com> Message-ID: <15485.34657.168922.781138@12-248-41-177.client.attbi.com> >> I was thinking about strings used as byte containers for >> non-character data. mal> In string literals ? I think it is common to encode this sort of mal> data as hex or using octal escapes. Since these encodings are plain mal> 7-bit ASCII I don't see a problem. Precisely. I was thinking about situations where they aren't encoded, but sitting there naked, so to speak. Skip From guido@python.org Thu Feb 28 02:11:08 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 27 Feb 2002 21:11:08 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library Message-ID: <200202280211.g1S2B8U27062@pcp742651pcs.reston01.va.comcast.net> We had a brief jam on date/time objects at Zope Corp. HQ today. I won't get to writing up the full proposal that came out of this, but I'd like to give at least a summary. (Those who were there: my thoughts have advanced a bit since this afternoon.) My plan is to create a standard timestamp object in C that can be subclassed. The internal representation will favor extraction of broken-out time fields (year etc.) in local time.
It will support comparison, basic time computations, and effbot's minimal API, as well as conversions to and from the two currently most popular time representations used by the time module: posix timestamps in UTC and 9-tuples in local time. There will be a C API.

Proposal for internal representation (also the basis for an efficient pickle format):

    year      2 bytes, big-endian, unsigned (0 .. 65535)
    month     1 byte
    day       1 byte
    hour      1 byte
    minute    1 byte
    second    1 byte
    usecond   3 bytes, big-endian
    tzoffset  2 bytes, big-endian, signed (in minutes, -1439 .. 1439)
    total     12 bytes

Things this will not address (but which you may address through subclassing):

- leap seconds
- alternate calendars
- years far in the future or BC
- precision of timepoints (e.g. a separate Date type)
- DST flags (DST is accounted for by the tzoffset field)

Mini-FAQ

- Why store a broken-out local time rather than seconds (or microseconds) relative to an epoch in UTC?
  There are two kinds of operations on times: accessing the broken-out fields (probably in local time), and time computations. The chosen representation favors accessing broken-out fields, which I expect to be more common than time computations.

- Why a big-endian internal representation?
  So that comparison can be done using a single memcmp() call as long as the tzoffset fields are the same.

- Why not pack the fields closer to save a few bytes?
  To make the pack and unpack operations more efficient; the object footprint isn't going to make much of a difference.

- Why is the year unsigned?
  So memcmp() will do the right thing for comparing dates (in the same timezone).

- What's the magic number 1439?
  One less than 24 * 60. Timezone offsets may be up to 24 hours. (The C99 standard does it this way.)

I'll try to turn this into a proper PEP ASAP. (Stephan: do I need to CC you or are you reading python-dev?)
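A minimal sketch of the 12-byte layout proposed above, in Python rather than the C the proposal calls for. Only the field order and sizes come from the proposal; the helper names `pack_timestamp` and `unpack_timestamp` are hypothetical, and the 0 .. 999999 range check on usecond is an assumption (the proposal gives only its width). The byte-string comparison at the end plays the role of the memcmp() call mentioned in the mini-FAQ.

```python
import struct

# Hypothetical helpers; only the field layout is taken from the proposal.
_HEAD = struct.Struct(">HBBBBB")  # year, month, day, hour, minute, second
_TAIL = struct.Struct(">h")       # tzoffset in minutes, signed


def pack_timestamp(year, month, day, hour, minute, second, usecond, tzoffset):
    """Pack the broken-out fields into the 12-byte big-endian layout."""
    if not 0 <= usecond < 1_000_000:      # assumed range for microseconds
        raise ValueError("usecond out of range")
    if not -1439 <= tzoffset <= 1439:     # range stated in the proposal
        raise ValueError("tzoffset out of range")
    head = _HEAD.pack(year, month, day, hour, minute, second)
    return head + usecond.to_bytes(3, "big") + _TAIL.pack(tzoffset)


def unpack_timestamp(blob):
    """Inverse of pack_timestamp: return the eight fields as a tuple."""
    if len(blob) != 12:
        raise ValueError("expected exactly 12 bytes")
    fields = _HEAD.unpack(blob[:7])
    usecond = int.from_bytes(blob[7:10], "big")
    (tzoffset,) = _TAIL.unpack(blob[10:])
    return fields + (usecond, tzoffset)


# Two local times with the same tzoffset (-300 minutes).
earlier = pack_timestamp(2002, 2, 27, 21, 11, 8, 0, -300)
later = pack_timestamp(2002, 2, 27, 21, 39, 10, 0, -300)

print(len(earlier))               # size of one packed record
print(earlier < later)            # bytes compare lexicographically, like memcmp()
print(unpack_timestamp(earlier))  # round trip back to the fields
```

Because every field is stored most-significant byte first and the year is unsigned, an ordinary byte-wise comparison orders two records chronologically whenever their tzoffset fields match; records with different offsets would need to be normalized first.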
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Thu Feb 28 02:33:52 2002 From: skip@pobox.com (Skip Montanaro) Date: Wed, 27 Feb 2002 20:33:52 -0600 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: <200202280211.g1S2B8U27062@pcp742651pcs.reston01.va.comcast.net> References: <200202280211.g1S2B8U27062@pcp742651pcs.reston01.va.comcast.net> Message-ID: <15485.38672.466928.755447@12-248-41-177.client.attbi.com> Guido> Proposal for internal representation (also the basis for an Guido> efficient pickle format): Guido> year 2 bytes, big-endian, unsigned (0 .. 65535) ... Guido> - Why is the year unsigned? So memcmp() will do the right thing Guido> for comparing dates (in the same timezone). So the earliest year it can represent is 1BC (or does year == 0 represent some other base year)? One of MAL's desires were that he could use the abstract interface /F defined and remain binary compatible with the current mxDateTime layout. Will your layout work for him? Skip From guido@python.org Thu Feb 28 02:39:10 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 27 Feb 2002 21:39:10 -0500 Subject: [Python-Dev] proposal: add basic time type to the standard library In-Reply-To: Your message of "Wed, 27 Feb 2002 20:33:52 CST." <15485.38672.466928.755447@12-248-41-177.client.attbi.com> References: <200202280211.g1S2B8U27062@pcp742651pcs.reston01.va.comcast.net> <15485.38672.466928.755447@12-248-41-177.client.attbi.com> Message-ID: <200202280239.g1S2dA927244@pcp742651pcs.reston01.va.comcast.net> > Guido> Proposal for internal representation (also the basis for an > Guido> efficient pickle format): > > Guido> year 2 bytes, big-endian, unsigned (0 .. 65535) > ... > Guido> - Why is the year unsigned? So memcmp() will do the right thing > Guido> for comparing dates (in the same timezone). > > So the earliest year it can represent is 1BC (or does year == 0 represent > some other base year)? Correct. 
> One of MAL's desires were that he could use the abstract interface /F > defined and remain binary compatible with the current mxDateTime layout. > Will your layout work for him? My layout is incompatible with that of mxDateTime, but this is not supposed to be /F's abstract interface -- this is supposed to be one implementation of it, mxDateTime can be another. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Feb 28 03:40:08 2002 From: guido@python.org (Guido van Rossum) Date: Wed, 27 Feb 2002 22:40:08 -0500 Subject: [Python-Dev] Manning Seeking Python book authors Message-ID: <200202280340.g1S3e8j27431@pcp742651pcs.reston01.va.comcast.net> Is anybody interested in writing any of the titles below, or can you recommend someone? --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Forwarded Message Date: Wed, 27 Feb 2002 15:02:17 -0500 From: Susan Capparelle To: Guido van Rossum Subject: Seeking Python book authors Hi Guido, I hope all is well? As someone who has done valuable reviewing for us before in the Python arena, I thought you might be one of the right people to contact. We're currently seeking authors for a number of Python related books. A couple of titles or topics would be; 'Enterprise system development with Python,' 'Practical Python' and 'Effective Python.' Can you recommend anyone with the necessary experience and skills to undertake any of these books? Looking forward to your response and thanks in advance for your help. Sincerely, ======================================= Susan W. Capparelle Assistant Publisher Manning Publications Co. 209 Bruce Park Avenue, Greenwich, CT 06830 suca@manning.com tel. 203.629.2211 www.manning.com fax. 
203.629.2084 ======================================= ------- End of Forwarded Message From tim@zope.com Thu Feb 28 04:24:15 2002 From: tim@zope.com (Tim Peters) Date: Wed, 27 Feb 2002 23:24:15 -0500 Subject: [Python-Dev] POSIX thread code In-Reply-To: Message-ID: [Gerald S. Williams] > I recently came up with a fix for thread support in Python > under Cygwin. Jason Tishler and Norman Vine are looking it > over, but I'm pretty sure something similar should be used > for the Cygwin Python port. > > This is easily done--simply add a few lines to thread.c > and create a new thread_cygwin.h (context diff and new file > both provided). > > But there is a larger issue: > > The thread interface code in thread_pthread.h uses mutexes > and condition variables to emulate semaphores, which are > then used to provide Python "lock" and "sema" services. Please use current CVS Python for patches. For example, all the "sema" code no longer exists (it was undocumented and unused). > I know this is a common practice since those two thread > synchronization primitives are defined in "pthread.h". But > it comes with quite a bit of overhead. (And in the case of > Cygwin causes race conditions, but that's another matter.) > > POSIX does define semaphores, though. (In fact, it's in > the standard just before Mutexes and Condition Variables.) Semaphores weren't defined by POSIX at the time this code was written; IIRC, they were first introduced in the later and then-rarely implemented POSIX realtime extensions. How stable are they? Some quick googling didn't inspire a lot of confidence, but maybe I was just bumping into early bug reports. > According to POSIX, they are found in and > _POSIX_SEMAPHORES should be defined if they work as POSIX > expects. This may be a nightmare; for example, I don't see anything in the Single UNIX Specification about this symbol, and as far as I'm concerned POSIX as a distinct standard is a DSW (dead standard walking ). 
That's one for the Unixish geeks to address. > If they are available, it seems like providing direct > semaphore services would be preferable to emulating them > using condition variables and mutexes. They could be hugely better on Linux, but I don't know: there's anecdotal evidence that Linux scheduling of threads competing for a mutex can get itself into a vastly unfair state. Provided Linux implements semaphores properly, semaphore contention can be tweaked (and Python should do so), as befits a realtime gimmick, to guarantee fairness (SCHED_FIFO and SCHED_RR). > thread_posix.h.diff-c is a context diff that can be used > to convert thread_pthread.h into a more general POSIX > version that will use semaphores if available. I believe your PyThread_acquire_lock() code has two holes: 1. sem_trywait() is not checked for an error return. 2. sem_wait() and sem_trywait() can be interrupted by signal, and that's not an error condition. So these calls should be stuck in a loop: do { ... call the right one ... } while (status < 0 && errno == EINTR); if (status < 0) { /* an unexpected exceptional return */ ... } > ... > Does this sound like a good idea? Yes, provided it works . > Should I create a more thorough set of patch files and submit them? I'd like that, but please don't email patches -- they'll just be forgotten. Upload patches to the Python patch manager instead: http://sf.net/tracker/?group_id=5470&atid=305470 Discussion about the patches remains appropriate on Python-Dev. From martin@v.loewis.de Thu Feb 28 07:57:32 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 28 Feb 2002 08:57:32 +0100 Subject: [Python-Dev] POSIX thread code In-Reply-To: References: Message-ID: "Tim Peters" writes: > Semaphores weren't defined by POSIX at the time this code was written; IIRC, > they were first introduced in the later and then-rarely implemented POSIX > realtime extensions. How stable are they?
They are in Single UNIX V2 (1997), so anybody claiming conformance to Single UNIX has implemented them: - AIX 4.3.1 and later - Tru64 UNIX V5.1A and later - Solaris 7 and later [from the list of certified Unix98 systems] In addition, the following implementations document support for sem_init: - LinuxThreads since glibc 2.0 (1996) - IRIX at least since 6.5 (a patch for 6.2 is available since 1996) > > According to POSIX, they are found in and > > _POSIX_SEMAPHORES should be defined if they work as POSIX > > expects. > > This may be a nightmare; for example, I don't see anything in the Single > UNIX Specification about this symbol, and as far as I'm concerned POSIX as a > distinct standard is a DSW (dead standard walking ). That's one for > the Unixish geeks to address. You didn't ask google for _POSIX_SEMAPHORES, right? The first hit brings you to http://www.opengroup.org/onlinepubs/7908799/xsh/feature.html _POSIX_SEMAPHORES Implementation supports the Semaphores option. A quick check shows that both Solaris 8 and glibc 2.2 do indeed define the symbol. > They could be hugely better on Linux, but I don't know: there's anecdotal > evidence that Linux scheduling of threads competing for a mutex can get > itself into a vastly unfair state. For glibc 2.1, semaphores have been reimplemented; they now provide FIFO wakeup (sorted by thread priority). Same for mutexes: the highest-priority oldest-waiting thread will be resumed. > do { > ... call the right one ... > } while (status < 0 && errno == EINTR); Shouldn't EINTR check for KeyboardInterrupt? Regards, Martin From mal@lemburg.com Thu Feb 28 08:14:20 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Thu, 28 Feb 2002 09:14:20 +0100 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> <15485.25422.524082.109890@anthem.wooz.org> Message-ID: <3C7DE6DC.893E594B@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "MvL" == Martin v Loewis writes: > > >> I bet we'd win some Ruby converts if we did this . For > >> reference, I'm thinking about including the Japanese and > >> Chinese codecs with MM2.1 because it makes little sense to > >> claim support for those languages without them. > > MvL> That is certainly the right thing to do. If correctness could > MvL> be verified independently, I'd be in favour of including them > MvL> with Python - even though they will likely never get the > MvL> efficiency that wrappers around the platform's codecs would > MvL> have. > > I'm obviously not qualified to verify them independently, but I have > had some initial positive feedback from a few Japanese users of the > MM2.1 alphas. My second hand information indicates that the Japanese > codecs are pretty good, the Chinese are okay, and the Korean ones need > a lot of work. > > Also, it's a bit of a catch 22, in that the more official exposure > these codecs get, the better they will eventually become, hopefully. > I'd be +1 on including them in Python 2.3. You could (and probably should) add Tamito's codecs in Python, but the others have licensing problems :-/ It shouldn't be hard though for native speakers and programmers to build upon the work of Tamito and get those codecs done as well.
Alternatively, the PSF or some company interested in having these codecs available could fund the development. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Thu Feb 28 08:45:53 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 28 Feb 2002 09:45:53 +0100 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> <3C7D42B6.A88568CD@lemburg.com> <15485.25475.913116.826208@anthem.wooz.org> Message-ID: <3C7DEE41.F31FAEA@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "MAL" == M writes: > > MAL> Why not simply make the installation a configure option ? > > MAL> We could easily extend setup.py to grab the tarball from > MAL> the web in case it is needed. > > That's another option. Certainly stuff like that is becoming fairly > common for installers these days. Hmm, make that ZIP-ball (we have no .tar support in the standard lib, only ZIP-file support). Also, the setup.py will have to check whether it has to grab a level 0 compression ZIP file or a level 9 one. Nothing which cannot be done, of course... net installers are quite common these days (see e.g. Mozilla, IE and others), so people are probably quite used to them already. And we can always provide a full install download as well. 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Thu Feb 28 08:34:32 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 28 Feb 2002 09:34:32 +0100 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) In-Reply-To: <3C7DE6DC.893E594B@lemburg.com> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> <15485.25422.524082.109890@anthem.wooz.org> <3C7DE6DC.893E594B@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > You could (and probably should) add Tamito's codecs in Python, > but the others have licensing problems :-/ I would not recommend to incorporate any of this into Python without asking the author(s). When doing so, it would be appropriate, IMO, to ask them whether they would fill out the contributor agreement. Then, the presumed licensing problems would be gone. Regards, Martin From mal@lemburg.com Thu Feb 28 09:08:17 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 28 Feb 2002 10:08:17 +0100 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> Message-ID: <3C7DF381.C2E1335A@lemburg.com> "Martin v. Loewis" wrote: > > barry@zope.com (Barry A. Warsaw) writes: > > > Which actually touches on something I wanted to bring up. Why don't > > we include the Japanese codecs with Python? Is it just a size issue? > > I think Guido's original concern was about the size (apart from the > fact that they were not available before). > > My concern is also correctness and efficiency. Most current systems > provide high-performance well-tested codecs, since they need those > frequently. It is a waste of resources not to make use of these > codecs. The counter-argument, of course, is that you cannot always > rely on these codecs being available (apart from the fact that you > need wrappers around the platform API). Which wrapper APIs do we currently have which could actually be made part of the Python core ? Aside: while it's true that we could use those, the Unicode implementation has shown that rolling our own has worked out quite well too. 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jacobs@penguin.theopalgroup.com Thu Feb 28 11:46:24 2002 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Thu, 28 Feb 2002 06:46:24 -0500 (EST) Subject: [Python-Dev] Manning Seeking Python book authors In-Reply-To: <200202280340.g1S3e8j27431@pcp742651pcs.reston01.va.comcast.net> Message-ID: On Wed, 27 Feb 2002, Guido van Rossum wrote: > Is anybody interested in writing any of the titles below, or can > you recommend someone? If I were to find one or two motivated co-authors, I would strongly consider tackling 'Enterprise system development with Python'. I'm in the final stretches of my upcoming book on data manipulation and statistical analysis in Python for programmers and graduate students, and have been looking around for new ideas. -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From mal@lemburg.com Thu Feb 28 12:11:37 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 28 Feb 2002 13:11:37 +0100 Subject: [Python-Dev] PEP 0275 -- Switching on Multiple Values Message-ID: <3C7E1E79.751AF37A@lemburg.com> I consider the PEP 0275 ready for review by the developers. Comments please.
http://python.sourceforge.net/peps/pep-0275.html Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim.one@comcast.net Thu Feb 28 06:57:36 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 28 Feb 2002 01:57:36 -0500 Subject: [Python-Dev] Alignment assumptions In-Reply-To: <13b201c1bfc9$c94d1b90$0500a8c0@boostconsulting.com> Message-ID: [David Abrahams] > A quick grep-find through the Python-2.2 sources reveals the following: > > Include/dictobject.h:49: long aligner; This is in #ifdef USE_CACHE_ALIGNED long aligner; #endif and AFAIK nobody ever defines the symbol. It's a cache-line optimization gimmick, but is effectively a nop (except to waste memory) on "almost all" machines. IIRC, the author never measured any improvement by using it (not surprising, since I believe almost all mallocs at least 8-byte align now). I vote we delete it. > Include/objimpl.h:275: double dummy; /* force worst-case alignment */ One branch of a union, forces enough padding in the gc header so that whatever follows the gc header is "aligned enough". This is sufficient for all core gc types, but may not be sufficient for user-defined gc types. I'm happy enough to view it as a restriction on what user-defined gc'able types can contain. > Modules/addrinfo.h:162: LONG_LONG __ss_align; /* force desired structure > storage alignment */ > Modules/addrinfo.h:164: double __ss_align; /* force desired structure > storage alignment */ This isn't our code (it's imported from the WIDE project), and I have no idea what it thinks it's trying to accomplish (neither the mystery padding, nor really much of anything else in the WIDE code!). > At first glance, there appear to be different assumptions at work > here about what constitutes maximal alignment on any given platform. 
Only the objimpl.h trick might benefit from maximal alignment. > I've been using a little C++ metaprogram to find a type which will > properly align any other given type. Because of limitations of one > compiler, I had to disable the computation and instead used the > objimpl.h assumption that double was maximally aligned, but also > added a compile-time assertion to check that the alignment is always > greater than or equal to that of the target type. Well, it failed today > on Tru64 Unix with the latest compaq CXX 6.5 prerelease compiler; it > appears that the alignment of long double is greater than that > of double on that platform. > > I thought someone might want to know, If you ever compile on a KSR machine, you'll discover there's no std C type that captures maximal alignment. You'd have to guess it's an extension type named "_subpage". I'm not sure that even C++ template metaprogramming could manage that bit of channeling (FYI, _subpage required 128-byte alignment). Stupid trick: If you can compute this at run time, do malloc(1) a few times, count the number of trailing 0 bits in the returned addresses, and take the minimum. Since malloc has to return memory "suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects", you'd soon discover you always got at least 7 trailing zero bits back from KSR malloc(), and presumably at least 4 under Tru64. there's-the-standard-and-then-there's-real-life-ly y'rs - tim From thomas.heller@ion-tof.com Thu Feb 28 13:19:20 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 28 Feb 2002 14:19:20 +0100 Subject: [Python-Dev] PEP 273 - Import from Zip Archives Message-ID: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook> [Jeremy on python-checkins list, PEP 283: Python 2.3 release schedule] > Planned features for 2.3 > Here are a few PEPs that I know to be under consideration. [...] 
> S 273 Import Modules from Zip Archives Ahlstrom I haven't participated in the discussion of PEP 273, IIRC it was mostly about implementation details... Wouldn't it be the right time now, instead of complicating the builtin import mechanism further, to simplify the builtin import code, and use it as the foundation of a Python coded implementation - imputil, or better Gordon's iu.py, or whatever? Thomas From David Abrahams" Message-ID: <172601c1c05e$0c0ea630$0500a8c0@boostconsulting.com> ----- Original Message ----- From: "Tim Peters" > > Include/objimpl.h:275: double dummy; /* force worst-case alignment */ > > One branch of a union, forces enough padding in the gc header so that > whatever follows the gc header is "aligned enough". This is sufficient for > all core gc types, but may not be sufficient for user-defined gc types. I'm > happy enough to view it as a restriction on what user-defined gc'able types > can contain. As I read the code, it affects all types (doesn't this header begin every object, regardless of its GC flags?) and I think that's a very unhappy circumstance for your numeric community. Remember, the type that raised the alarm here was just a long double. > > At first glance, there appear to be different assumptions at work > > here about what constitutes maximal alignment on any given platform. > > Only the objimpl.h trick might benefit from maximal alignment. I'm not actually after maximal alignment; I look for a minimally-sized/aligned type whose alignment is a multiple of the target type's alignment. In any case, I was just using the assumption that double was maximally aligned since I was linking with Python code and the EDG front-end was too slow to handle the metaprogram -- I figured that if the assumption was good enough for Python and my clients were depending on it anyway, it was good enough for my code (not!). > If you ever compile on a KSR machine, you'll discover there's no std C type > that captures maximal alignment. 
I was aware that this was a theoretical possibility, but not that it was a practical one. What's KSR? > You'd have to guess it's an extension type > named "_subpage". I'm not sure that even C++ template metaprogramming could > manage that bit of channeling Nope; we can only look through a list of likely candidates to try to find a match. We're hoping to address this for the next standard -- I'm pushing for allowing non-POD types in unions, leaving construction/destruction up to the user. > (FYI, _subpage required 128-byte > alignment). I guess that strictly speaking, requiring maximal alignment wouldn't be appropriate for objimpl ;-) > Stupid trick: If you can compute this at run time, do malloc(1) a few > times, count the number of trailing 0 bits in the returned addresses, and > take the minimum. Since malloc has to return memory "suitably aligned so > that it may be assigned to a pointer to any type of object and then used to > access such an object or an array of such objects", you'd soon discover you > always got at least 7 trailing zero bits back from KSR malloc(), and > presumably at least 4 under Tru64. Sounds like a good candidate for your autoconf script. Seriously, though, I think it would be reasonable to stick to aligning the standard builtin types, in which case you can do the test without calling malloc, FWIW. > there's-the-standard-and-then-there's-real-life-ly y'rs - tim in-theory-theory-and-practice-are-the-same-and-to-hell-with-what-happens-in-practice-ly y'rs -Dave From mal@lemburg.com Thu Feb 28 13:44:52 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 28 Feb 2002 14:44:52 +0100 Subject: [Python-Dev] PEP 273 - Import from Zip Archives References: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook> Message-ID: <3C7E3454.B6690A7@lemburg.com> Thomas Heller wrote: > > [Jeremy on python-checkins list, PEP 283: Python 2.3 release schedule] > > Planned features for 2.3 > > Here are a few PEPs that I know to be under consideration. > [...]
> > S 273 Import Modules from Zip Archives Ahlstrom > > I haven't participated in the discussion of PEP 273, > IIRC it was mostly about implementation details... > > Wouldn't it be the right time now, instead of complicating > the builtin import mechanism further, to simplify the builtin > import code, and use it as the foundation of a Python coded > implementation - imputil, or better Gordon's iu.py, or whatever? This would be nice to have, but how do you bootstrap the importer if it's written in Python ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From gsw@agere.com Thu Feb 28 13:47:42 2002 From: gsw@agere.com (Gerald S. Williams) Date: Thu, 28 Feb 2002 08:47:42 -0500 Subject: [Python-Dev] POSIX thread code In-Reply-To: Message-ID: Tim Peters wrote: > Please use current CVS Python for patches. For example, all the "sema" code > no longer exists (it was undocumented and unused). DOH! Sorry, I thought of that after pressing SEND. I had been using a specific Cygwin version to relay and test the proposed changes. DOH again! I just realized that a thread_nt.h patch that I submitted to the patch manager has the same problem! I'd better go get the latest CVS sources before commenting any further about the code... You and Martin have good points about the implementation, some of which I had intended to address once I knew which implementation to target. It sounds like I'll be targetting the general POSIX thread version of Python's thread interface code. I'd definitely at least check for _POSIX_SEMAPHORES before changing the behavior, though. One question left is whether to continue calling the file thread_pthread.h or to rename it thread_posix.h. > /* Thread package. > This is intended to be usable independently from Python. 
> > That's why there are no calls to Python runtime functions in > thread_pthread.h (etc) files now; e.g., they call malloc() and free() > directly, and don't reference any PyExc_XXX symbols. That's a lot to > overcome just to break existing code. Actually, this isn't true. The current thread_nt.h creates a Python dictionary to keep track of thread handles. This was what my earlier patch was for--the dictionary isn't even used (and creates a memory leak to boot). I proposed removing it entirely (along with the #include ). I'll update my previous patch with one based on current CVS sources. -Jerry -O Gerald S. Williams, 22Y-103GA : mailto:gsw@agere.com O- -O AGERE SYSTEMS, 555 UNION BLVD : office:610-712-8661 O- -O ALLENTOWN, PA, USA 18109-3286 : mobile:908-672-7592 O- From guido@python.org Thu Feb 28 13:49:12 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 28 Feb 2002 08:49:12 -0500 Subject: [Python-Dev] Version updates etc. Message-ID: <200202281349.g1SDnCi28561@pcp742651pcs.reston01.va.comcast.net> Maybe it's time for a quick informative PEP explaining where, when and how version numbers, copyright dates and the like should be updated? This info is currently spread all over the place (PEP 101 and 102 have some, the rest is in the minds of various PythonLabs folks). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Feb 28 13:58:43 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 28 Feb 2002 08:58:43 -0500 Subject: [Python-Dev] PEP 0275 -- Switching on Multiple Values In-Reply-To: Your message of "Thu, 28 Feb 2002 13:11:37 +0100." <3C7E1E79.751AF37A@lemburg.com> References: <3C7E1E79.751AF37A@lemburg.com> Message-ID: <200202281358.g1SDwh428640@pcp742651pcs.reston01.va.comcast.net> > I consider the PEP 0275 ready for review by the developers. > Comments please.
> > http://python.sourceforge.net/peps/pep-0275.html I think it's fine to look into this, but I believe for Python 2.3 we should focus more on stabilization than on new language features. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Thu Feb 28 13:57:49 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 28 Feb 2002 14:57:49 +0100 Subject: [Python-Dev] PEP 273 - Import from Zip Archives References: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook> <3C7E3454.B6690A7@lemburg.com> Message-ID: <053201c1c05f$e86a9610$e000a8c0@thomasnotebook> From: "M.-A. Lemburg" > Thomas Heller wrote: > > Wouldn't it be the right time now, instead of complicating > > the builtin import mechanism further, to simplify the builtin > > import code, and use it as the foundation of a Python coded > > implementation - imputil, or better Gordon's iu.py, or whatever? > > This would be nice to have, but how do you bootstrap the > importer if it's written in Python ? > Have you looked at imputil? It bootstraps itself only from builtin modules (which may be the only mechanism to be in the core). Probably everything else, even packages can be implemented outside. How did ni do it? Also I think Gordon's rimport and aimport are good ideas. Thomas From guido@python.org Thu Feb 28 14:03:51 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 28 Feb 2002 09:03:51 -0500 Subject: [Python-Dev] Alignment assumptions In-Reply-To: Your message of "Thu, 28 Feb 2002 01:57:36 EST." References: Message-ID: <200202281403.g1SE3ph28665@pcp742651pcs.reston01.va.comcast.net> > This is in > > #ifdef USE_CACHE_ALIGNED > long aligner; > #endif > > and AFAIK nobody ever defines the symbol. It's a cache-line > optimization gimmick, but is effectively a nop (except to waste > memory) on "almost all" machines. IIRC, the author never measured > any improvement by using it (not surprising, since I believe almost > all mallocs at least 8-byte align now). 
I vote we delete it. The malloc 8-byte align argument doesn't apply, since this struct is used in an array. Since the struct itself doesn't require alignment beyond 4 bytes, the array entries can be 12 bytes apart. So I don't think this is a nop -- I think it would waste 4 bytes per hash table entry on most machines. This was added by Jack Jansen ages ago -- I think he did measure a speedup on an old Mac compiler, or he wouldn't have added it, and I bet there was a #define USE_CACHE_ALIGNED in his config.h then. But that's all history; I agree it should be deleted. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Feb 28 14:07:47 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 28 Feb 2002 15:07:47 +0100 Subject: [Python-Dev] PEP 0275 -- Switching on Multiple Values References: <3C7E1E79.751AF37A@lemburg.com> <200202281358.g1SDwh428640@pcp742651pcs.reston01.va.comcast.net> Message-ID: <3C7E39B3.AB4203F7@lemburg.com> Guido van Rossum wrote: > > > I consider the PEP 0275 ready for review by the developers. > > Comments please. > > > > http://python.sourceforge.net/peps/pep-0275.html > > I think it's fine to look into this, but I believe for Python > 2.3 we should focus more on stabilization than on new language features. Should I move this to 2.4 then ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Thu Feb 28 14:12:44 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 28 Feb 2002 15:12:44 +0100 Subject: [Python-Dev] PEP 273 - Import from Zip Archives References: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook> <3C7E3454.B6690A7@lemburg.com> <053201c1c05f$e86a9610$e000a8c0@thomasnotebook> Message-ID: <3C7E3ADC.DCED6D15@lemburg.com> Thomas Heller wrote: > > From: "M.-A. 
Lemburg" > > Thomas Heller wrote: > > > Wouldn't it be the right time now, instead of complicating > > > the builtin import mechanism further, to simplify the builtin > > > import code, and use it as the foundation of a Python coded > > > implementation - imputil, or better Gordon's iu.py, or whatever? > > > > This would be nice to have, but how do you bootstrap the > > importer if it's written in Python ? > > > Have you looked at imputil? It bootstraps itself only from builtin > modules (which may be the only mechanism to be in the core). Sure, but for finding imputil itself you still need the C import mechanism. Even worse: if Python can't find imputil (for some reason), it would be completely broken. My only gripe with the existing C implementation is that I would like to have more hooks available. Currently, you have to replace the complete API in order to add new features -- not exactly OO :-/ BTW, how is progress on the ZIP import patch doing ? Perhaps Jim should just check in what he has so that the code gets a little more code review... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Thu Feb 28 14:15:58 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 28 Feb 2002 09:15:58 -0500 Subject: [Python-Dev] PEP 0275 -- Switching on Multiple Values In-Reply-To: Your message of "Thu, 28 Feb 2002 15:07:47 +0100." <3C7E39B3.AB4203F7@lemburg.com> References: <3C7E1E79.751AF37A@lemburg.com> <200202281358.g1SDwh428640@pcp742651pcs.reston01.va.comcast.net> <3C7E39B3.AB4203F7@lemburg.com> Message-ID: <200202281415.g1SEFwU28798@pcp742651pcs.reston01.va.comcast.net> > > > I consider the PEP 0275 ready for review by the developers. > > > Comments please. 
> > > > > > http://python.sourceforge.net/peps/pep-0275.html > > > > I think it's fine to look into this, but I believe for Python > > 2.3 we should focus more on stabilization than on new language features. > > Should I move this to 2.4 then ? Yes, if that. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Thu Feb 28 08:35:35 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 28 Feb 2002 03:35:35 -0500 Subject: [Python-Dev] POSIX thread code In-Reply-To: Message-ID: [Martin v. Loewis] > ... > You didn't ask google for _POSIX_SEMAPHORES, right? The first hit > brings you to > > http://www.opengroup.org/onlinepubs/7908799/xsh/feature.html > > _POSIX_SEMAPHORES > Implementation supports the Semaphores option. Good catch! I didn't get a hit from the Open Group's SUS search box: http://www.opengroup.org/onlinepubs/7908799/ > A quick check shows that both Solaris 8 and glibc 2.2 do indeed define > the symbol. Cool. > ... > For glibc 2.1, semaphores have been reimplemented; they now provide > FIFO wakeup (sorted by thread priority). Same for mutexes: the > highest-priority oldest-waiting thread will be resumed. My impression is that some at Zope Corp would find it hard to believe that works. >> do { >> ... call the right one ... >> } while (status < 0 && errno == EINTR); > Shouldn't EINTR check for KeyboardInterrupt? Sorry, too much a can of worms for me -- the question and the possible answers are irrelevant on my box . Complications include that interrupts weren't able to break out of a wait on a Python lock before (so you'd change endcase semantics). If you don't care about that, how would you go about "checking for KeyboardInterrupt"? Note thread.c's initial comment: /* Thread package. This is intended to be usable independently from Python. That's why there are no calls to Python runtime functions in thread_pthread.h (etc) files now; e.g., they call malloc() and free() directly, and don't reference any PyExc_XXX symbols. 
That's a lot to overcome just to break existing code. From jim@interet.com Thu Feb 28 14:44:44 2002 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 28 Feb 2002 09:44:44 -0500 Subject: [Python-Dev] PEP 273 - Import from Zip Archives References: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook> <3C7E3454.B6690A7@lemburg.com> <053201c1c05f$e86a9610$e000a8c0@thomasnotebook> <3C7E3ADC.DCED6D15@lemburg.com> Message-ID: <3C7E425C.5060003@interet.com> M.-A. Lemburg wrote: > Sure, but for finding imputil itself you still need the C import > mechanism. Even worse: if Python can't find imputil (for some > reason), it would be completely broken. The other objection raised at the time was the possible slowdown of imports. I think the existing C module search code is basically good, although I wouldn't mind moving module import into a Python method. But since the C works, I have little motivation to replace it. > My only gripe with the existing C implementation is that > I would like to have more hooks available. Currently, you > have to replace the complete API in order to add new > features -- not exactly OO :-/ My code uses os.listdir to cache directory contents, but defers its use until the os module can be imported using the C import code. I think a similar trick could be used to replace imports with a new module. This would make it easy to replace imports. But this would not make it easy to add features unless a module were available which implemented the current import semantics in Python. > BTW, how is progress on the ZIP import patch doing ? > Perhaps Jim should just check in what he has so that the code > gets a little more code review... The code is "done" and has been in SourceForge patch 492105 for some time. I am leaving for Panama tomorrow for 8 days, so if I seem to disappear, that's why. I would be happy to work hard on this after I get back, because I think it is an important addition for Python.
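
[A minimal sketch of the directory-caching idea in the message above, written in present-day Python rather than the C of the patch. DirectoryCache and find_module_file are hypothetical names chosen for illustration; nothing here is taken from patch 492105, and the suffix list is an assumption. The point is only that one os.listdir call per directory can answer many "does this file exist?" questions during module lookup.]

```python
import os


class DirectoryCache:
    """Remember os.listdir() results so repeated lookups skip the filesystem.

    Hypothetical helper for illustration; not the code from the patch.
    """

    def __init__(self, listdir=os.listdir):
        self._listdir = listdir   # injectable so callers can count or fake calls
        self._entries = {}        # directory path -> frozenset of file names

    def contains(self, directory, filename):
        names = self._entries.get(directory)
        if names is None:
            try:
                names = frozenset(self._listdir(directory))
            except OSError:
                # Missing or unreadable directories are cached as empty.
                names = frozenset()
            self._entries[directory] = names
        # Exact-name match; case-insensitive filesystems are not handled here.
        return filename in names


def find_module_file(cache, name, path, suffixes=(".py", ".pyc")):
    """Return the first path entry file matching `name`, or None."""
    for directory in path:
        for suffix in suffixes:
            candidate = name + suffix
            if cache.contains(directory, candidate):
                return os.path.join(directory, candidate)
    return None
```

[The cache pays for one directory listing up front, so it only wins when a directory is consulted more than a handful of times.]
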
JimA From thomas.heller@ion-tof.com Thu Feb 28 14:57:07 2002 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 28 Feb 2002 15:57:07 +0100 Subject: [Python-Dev] Version updates etc. References: <200202281349.g1SDnCi28561@pcp742651pcs.reston01.va.comcast.net> Message-ID: <05e501c1c068$3124c670$e000a8c0@thomasnotebook> From: "Guido van Rossum" > Maybe it's time for a quick informative PEP explaining where, when and > how version numbers, copyright dates and the like should be updated? > This info is currently spread all over the place (PEP 101 and 102 have > some, the rest is in the minds of various PythonLabs folks). This PEP should also define the policy for the distutils version number - when there is one. Maybe distutils should simply use the Python version number, because Python is released more often than distutils. Thomas From gmcm@hypernet.com Thu Feb 28 15:55:39 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 28 Feb 2002 10:55:39 -0500 Subject: [Python-Dev] PEP 273 - Import from Zip Archives In-Reply-To: <3C7E3ADC.DCED6D15@lemburg.com> Message-ID: <3C7E0CAB.19066.47536147@localhost> [M.-A. Lemburg] > > > This would be nice to have, but how do you > > > bootstrap the importer if it's written in Python ? The response to "let's revamp C import" is "Oh no, we need a chance to play with it in Python first." [Thomas Heller] > > Have you looked at imputil? It bootstraps itself only > > from builtin modules (which may be the only mechanism > > to be in the core). True when Greg wrote it, but strop is now deprecated, and not necessarily builtin. It's still the best route, because strop has no dependencies, while string does. > Sure, but for finding imputil itself you still need the > C import mechanism. Even worse: if Python can't find > imputil (for some reason), it would be completely > broken. If Python can't find the std lib, it's broken. No change there.
> My only gripe with the existing C implementation is > that I would like to have more hooks available. All the more reason to try it in Python first. There's never been agreement about what hooks should be available. The import-sig was founded so ihooks defenders could hash it out with imputil defenders (the ihooks camp has never said a word). It's my observation that most import hacks these days are really namespace hacks anyway (that is, they do a relatively normal import, and then alter the way it's exposed so that "replace dots with slashes and look in the filesystem" no longer applies). -- Gordon http://www.mcmillan-inc.com/ From gmcm@hypernet.com Thu Feb 28 15:55:39 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 28 Feb 2002 10:55:39 -0500 Subject: [Python-Dev] PEP 273 - Import from Zip Archives In-Reply-To: <3C7E425C.5060003@interet.com> Message-ID: <3C7E0CAB.6729.475361AB@localhost> On 28 Feb 2002 at 9:44, James C. Ahlstrom wrote: > The other objection raised at the time was the > possible slow down of imports. imputil was 30 to 40% slower than C import. iu is about 10 to 15% slower under normal usage, but can be faster if you use archives and arrange sys.path intelligently. > I think the existing C module search code is > basically good, although I wouldn't mind moving > module import into a Python method. But since the C > works, I have little motivation to replace it. It works because its implementation is the definition of what works. Note that while the import namespace (pkg.submodule.module) is mapped to the filesystem, the two namespaces are not isomorphic. > My code uses os.listdir to cache directory > contents, but defers its use until the os module can > be imported using the C import code. A win over some threshold of number of hits on that directory; a loss under that threshold. > I think a > similar trick could be used to replace imports with > a new module. This would make it easy to replace > imports. 
But this would not make it easy to add > features unless a module were available which > implemented the current import semantics in Python. The only incompatibility I'm aware of in iu.py is that it doesn't have an import lock. -- Gordon http://www.mcmillan-inc.com/ From skip@pobox.com Thu Feb 28 16:15:38 2002 From: skip@pobox.com (Skip Montanaro) Date: Thu, 28 Feb 2002 10:15:38 -0600 Subject: [Python-Dev] PEP 273 - Import from Zip Archives In-Reply-To: <3C7E0CAB.19066.47536147@localhost> References: <3C7E3ADC.DCED6D15@lemburg.com> <3C7E0CAB.19066.47536147@localhost> Message-ID: <15486.22442.738570.615670@beluga.mojam.com> Gordon> [Thomas Heller] >> > Have you looked at imputil? It bootstraps itself only from builtin >> > modules (which may be the only mechanism to be in the core). Gordon> True when Greg wrote it, but strop is now deprecated, and not Gordon> necessarily builtin. It's still the best route, because strop Gordon> has no dependencies, while string does. What do strop or string provide that string methods don't? It's likely that if you needed to import either in the past, you don't need to now. Skip From sdm7g@virginia.edu Thu Feb 28 16:15:18 2002 From: sdm7g@virginia.edu (Steven Majewski) Date: Thu, 28 Feb 2002 11:15:18 -0500 (EST) Subject: [Python-Dev] PEP 273 - Import from Zip Archives In-Reply-To: <3C7E3ADC.DCED6D15@lemburg.com> Message-ID: On Thu, 28 Feb 2002, M.-A. Lemburg wrote: > My only gripe with the existing C implementation is that > I would like to have more hooks available. Currently, you > have to replace the complete API in order to add new > features -- not exactly OO :-/ It might be time to consider, rather than a special case for zip files only, adding an extensible import mechanism (something like the protocol or mime-type handlers for browsers). If there's a zipfile in sys.path, then import calls the zipfile handler to search it; if there's a URL in the path, it calls a handler for that, etc.
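
[One way to picture the handler-per-path-entry idea above, as a rough sketch in present-day Python. Everything named here (HANDLERS, directory_handler, zip_handler, find_source) is hypothetical and only illustrates the dispatch; it is not an existing import API, it locates source text rather than performing a real import, and a URL handler is left out.]

```python
import os
import zipfile


def directory_handler(entry, module_name):
    """Look for module_name.py inside a plain directory."""
    candidate = os.path.join(entry, module_name + ".py")
    if os.path.isfile(candidate):
        with open(candidate, "r", encoding="utf-8") as source_file:
            return source_file.read()
    return None


def zip_handler(entry, module_name):
    """Look for module_name.py at the top level of a zip archive."""
    member = module_name + ".py"
    with zipfile.ZipFile(entry) as archive:
        if member in archive.namelist():
            return archive.read(member).decode("utf-8")
    return None


# Each pair is (predicate deciding whether the handler owns a path entry,
# handler). New kinds of path entries are supported by appending a pair.
HANDLERS = [
    (os.path.isdir, directory_handler),
    (zipfile.is_zipfile, zip_handler),
]


def find_source(module_name, path):
    """Return the source of module_name from the first path entry that has it."""
    for entry in path:
        for accepts, handler in HANDLERS:
            if accepts(entry):
                source = handler(entry, module_name)
                if source is not None:
                    return source
                break  # this entry was handled; move on to the next one
    return None
```

[The search loop never needs to know what kinds of entries exist; that knowledge lives entirely in the handler table.]
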
( Maybe even a url for some sort of directory service that finds the module for you. ) -- Steve [Obviously thinking about TimBL's talk...] From barry@zope.com Thu Feb 28 16:17:01 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 28 Feb 2002 11:17:01 -0500 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> <15485.25422.524082.109890@anthem.wooz.org> <3C7DE6DC.893E594B@lemburg.com> Message-ID: <15486.22525.324049.844325@anthem.wooz.org> [This thread probably ought to be moved to i18n-sig, so I'm CC'ing them and will remove all future cc's to python-dev. -BAW] >>>>> "MAL" == M writes: MAL> You could (and probably should) add Tamito's codecs in MAL> Python, but the others have licensing problems :-/ I believe I am using Tamito KAJIYAMA's codecs, from: http://pseudo.grad.sccs.chukyo-u.ac.jp/~kajiyama/python/ Or were you thinking about some different Japanese codecs? The ones at this url are BSD-ish and so should be compatible with the PSF license, GPL, etc. MAL> It shouldn't be hard though for native speakers and MAL> programmers to build upon the work of Tamito and get those MAL> codecs done as well. Alternatively, the PSF or some company MAL> interested in having these codecs available could fund the MAL> development. All good points. I still think that by giving more visibility to the codecs (i.e. adding them to the Python distro) would help bring muscle to the effort. 
>>>>> "MvL" == Martin v Loewis writes: MvL> I would not recommend to incorporate any of this into Python MvL> without asking the author(s). When doing so, it would be MvL> appropriate, IMO, to ask them whether they would fill out the MvL> contributor agreement. Then, the presumed licensing problems MvL> would be gone. Agreed on both points! -Barry From barry@zope.com Thu Feb 28 16:18:21 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 28 Feb 2002 11:18:21 -0500 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> <3C7D42B6.A88568CD@lemburg.com> <15485.25475.913116.826208@anthem.wooz.org> <3C7DEE41.F31FAEA@lemburg.com> Message-ID: <15486.22605.863259.997769@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> Hmm, make that ZIP-ball (we have no .tar support in the MAL> standard lib, only ZIP-file support). Also, the setup.py will MAL> have to check whether it has to grab a level 0 compression MAL> ZIP file or a level 9 one. MAL> Nothing which cannot be done, of course... net installers are MAL> quite common these days (see e.g. Mozilla, IE and others), so MAL> people are probably quite used to them already. And we can MAL> always provide a full install download as well. Isn't there some PEP about all this? 
-Barry From tree@basistech.com Thu Feb 28 16:27:41 2002 From: tree@basistech.com (Tom Emerson) Date: Thu, 28 Feb 2002 11:27:41 -0500 Subject: [I18n-sig] Re: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) In-Reply-To: <15486.22525.324049.844325@anthem.wooz.org> References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> <15485.25422.524082.109890@anthem.wooz.org> <3C7DE6DC.893E594B@lemburg.com> <15486.22525.324049.844325@anthem.wooz.org> Message-ID: <15486.23165.349397.260521@magrathea.basistech.com> I've been working on a unified architecture for the Asian codecs. I presented a paper about it at the last Unicode Conference in Washington D.C. You can find it at http://www.basistech.com/articles/python-zh-transcoding_iuc20_TE2.pdf The presentation concentrates on Chinese, but the architecture will work for JK as well. -tree -- Tom Emerson Basis Technology Corp. Sr. Computational Linguist http://www.basistech.com "Beware the lollipop of mediocrity: lick it once and you suck forever" From guido@python.org Thu Feb 28 16:31:10 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 28 Feb 2002 11:31:10 -0500 Subject: [Python-Dev] PEP 273 - Import from Zip Archives In-Reply-To: Your message of "Thu, 28 Feb 2002 10:55:39 EST." <3C7E0CAB.19066.47536147@localhost> References: <3C7E0CAB.19066.47536147@localhost> Message-ID: <200202281631.g1SGVAw29092@pcp742651pcs.reston01.va.comcast.net> > True when Greg wrote it, but strop is now > depecrated, and not necessarily builtin. It's > still the best route, because strop has no > dependencies, while string does. 
Have a look at the code. It no longer import strop -- it uses string methods now. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Thu Feb 28 16:38:27 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 28 Feb 2002 11:38:27 -0500 Subject: [Python-Dev] PEP 273 - Import from Zip Archives In-Reply-To: <15486.22442.738570.615670@beluga.mojam.com> References: <3C7E0CAB.19066.47536147@localhost> Message-ID: <3C7E16B3.1124.477A9147@localhost> On 28 Feb 2002 at 10:15, Skip Montanaro wrote: > What do strop or string provide that string methods > don't? It's likely that if you needed to import > either in the past, you don't need to now. Oops, you're right. iu doesn't use strop. Just sys, imp and marshal (and optionally Win32api if required and found). -- Gordon http://www.mcmillan-inc.com/ From mal@lemburg.com Thu Feb 28 16:40:28 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 28 Feb 2002 17:40:28 +0100 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> <15485.25422.524082.109890@anthem.wooz.org> <3C7DE6DC.893E594B@lemburg.com> <15486.22525.324049.844325@anthem.wooz.org> Message-ID: <3C7E5D7C.A62CC10F@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "MvL" == Martin v Loewis writes: > > MvL> I would not recommend to incorporate any of this into Python > MvL> without asking the author(s). When doing so, it would be > MvL> appropriate, IMO, to ask them whether they would fill out the > MvL> contributor agreement. 
Then, the presumed licensing problems > MvL> would be gone. > > Agreed on both points! +1. The PSF will have to agree on the contribution docs first, though. Since there's no discussion on the PSF docs discussion list, I suppose everybody is happy with them :-) BTW, I was referring to the other codecs in the python-codecs project on SF. Most of those are encumbered by the GPL and thus unusable in non-GPL projects. Tamito has switched to a BSD-license after some private discussions about this, which is goodness :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From David Abrahams" Message-ID: <18e001c1c078$1cb02940$0500a8c0@boostconsulting.com> ----- Original Message ----- From: "Gordon McMillan" > That's not even part of import. import is done when it > has [name1, name2, name3]. It's ceval.c that > does the binding. Yep, so I discovered. > Sounds to me like you want to override __setitem__ > on the module's __dict__. Not necessarily, though that might be one approach. I might want to treat explicit setting of attributes differently from an import. > Tricky, 'cause a module > is hardly in charge of its own __dict__. > > But if you see value in it, you'd better pursue it > now, because Jeremy's plans for optimization of > module __dict__ will likely make things harder. I thought this /was/ pursuing it. What did you have in mind? -Dave From barry@zope.com Thu Feb 28 16:46:35 2002 From: barry@zope.com (Barry A.
Warsaw) Date: Thu, 28 Feb 2002 11:46:35 -0500 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> <15485.25422.524082.109890@anthem.wooz.org> <3C7DE6DC.893E594B@lemburg.com> <15486.22525.324049.844325@anthem.wooz.org> <3C7E5D7C.A62CC10F@lemburg.com> Message-ID: <15486.24299.262770.702438@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> The PSF will have to agree on the contribution docs first, MAL> though. Since there's no discussion on the PSF docs MAL> discussion list, I suppose everybody is happy with them :-) I am. What do we need to do next? MAL> BTW, I was referring to the other codecs in the python-codecs MAL> project on SF. Most of those are encumbered by the GPL and MAL> thus unusable in non-GPL projects. MAL> Tamito has switched to a BSD-license after some private MAL> discussions about this, which is goodness :-) From what I've been told, the Japanese codecs are the most stable. I'm really not qualified to judge though. -Barry From mal@lemburg.com Thu Feb 28 16:54:16 2002 From: mal@lemburg.com (M.-A.
Lemburg) Date: Thu, 28 Feb 2002 17:54:16 +0100 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> <15485.25422.524082.109890@anthem.wooz.org> <3C7DE6DC.893E594B@lemburg.com> <15486.22525.324049.844325@anthem.wooz.org> <3C7E5D7C.A62CC10F@lemburg.com> <15486.24299.262770.702438@anthem.wooz.org> Message-ID: <3C7E60B8.41458EE8@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "MAL" == M writes: > > MAL> The PSF will have to agree on the contribution docs first, > MAL> though. Since there's no discussion on the PSF docs > MAL> discussion list, I suppose everybody is happy with them :-) > > I am. What do we need to do next? Wait. The deadline is mid-March. After that the docs will have to go to the lawyer and only then we can use them... > MAL> BTW, I was referring to the other codecs in the python-codecs > MAL> project on SF. Most of those are encumbered by the GPL and > MAL> thus unusable in non-GPL projects. > > MAL> Tamito has switched to a BSD-license after some private > MAL> discussions about this, which is goodness :-) > > From what I've been told, the Japanese codecs are the most stable. > I'm really not qualified to judge though. Me neither, but Tamito has put a lot of work into them and with his move to C for the codec engine, speed is not an issue anymore either. Also, I've asked him about his thoughts about having them included in the core before. He would be happy with that move. 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From barry@zope.com Thu Feb 28 16:56:42 2002 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 28 Feb 2002 11:56:42 -0500 Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding) References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <3C7CA3E2.C3705289@lemburg.com> <3C7CAD5D.6692F44@lemburg.com> <15485.15623.543255.443894@anthem.wooz.org> <15485.25422.524082.109890@anthem.wooz.org> <3C7DE6DC.893E594B@lemburg.com> <15486.22525.324049.844325@anthem.wooz.org> <3C7E5D7C.A62CC10F@lemburg.com> <15486.24299.262770.702438@anthem.wooz.org> <3C7E60B8.41458EE8@lemburg.com> Message-ID: <15486.24906.938762.229351@anthem.wooz.org> >>>>> "MAL" == M writes: MAL> Wait. The deadline is mid-March. After that the docs will MAL> have to go to the lawyer and only then we can use them... Right, I forgot. ;) MAL> Me neither, but Tamito has put a lot of work into them and MAL> with his move to C for the codec engine, speed is not an MAL> issue anymore either. MAL> Also, I've asked him about his thoughts about having them MAL> included in the core before. He would be happy with that MAL> move. Cool! -Barry From mal@lemburg.com Thu Feb 28 17:00:24 2002 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 28 Feb 2002 18:00:24 +0100 Subject: [Python-Dev] PEP 273 - Import from Zip Archives References: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook> <3C7E3454.B6690A7@lemburg.com> <053201c1c05f$e86a9610$e000a8c0@thomasnotebook> <3C7E3ADC.DCED6D15@lemburg.com> <3C7E425C.5060003@interet.com> Message-ID: <3C7E6228.F8B3E59D@lemburg.com> "James C. Ahlstrom" wrote: > > M.-A. Lemburg wrote: > > BTW, how is progress on the ZIP import patch doing ? > > Perhaps Jim should just check in what he has so that the code > > gets a little more code review... > > The code is "done" and has been in SourceForge patch 492105 > for some time. > > I am leaving for Panama tomorrow for 8 days, so if I > seem to disappear, that's why. I would be happy to work > hard on this after I get back, because I think it is an > important addition for Python. Great ! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jim@interet.com Thu Feb 28 17:02:27 2002 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 28 Feb 2002 12:02:27 -0500 Subject: [Python-Dev] PEP 273 - Import from Zip Archives References: <3C7E3ADC.DCED6D15@lemburg.com> <3C7E0CAB.19066.47536147@localhost> <15486.22442.738570.615670@beluga.mojam.com> Message-ID: <3C7E62A3.5090404@interet.com> Skip Montanaro wrote: > Gordon> [Thomas Heller] > >> > Have you looked at imputil? It bootstraps itself only from builtin > >> > modules (which may be the only mechanism to be in the core). > > Gordon> True when Greg wrote it, but strop is now deprecated, and not > Gordon> necessarily builtin. It's still the best route, because strop > Gordon> has no dependencies, while string does. > > What do strop or string provide that string methods don't? It's likely that > if you needed to import either in the past, you don't need to now.
The real problem isn't the string module, it is the os module. Any importer will need this. The usual hack is to duplicate its logic in the my_importer module. That is, the selection of the correct builtin os functions. And MAL's point that you need a C importer to import your Python importer is inescapable. And suppose the whole Python library is in a zip file? You must have additional C code to extract and load your Python importer as well as the modules it imports. It seems to me that the correct solution is to use the C importer to import the my_importer Python module, plus all the imports that my_importer needs. Then you switch to resolving imports with my_importer.py. Something like this is already in my import.c patch. I don't think this discussion should hold up installing my zip import patches. I believe these patches are required, and can be the basis of a subsequent patch to add an external Python importer. JimA From mal@lemburg.com Thu Feb 28 17:09:46 2002 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 28 Feb 2002 18:09:46 +0100 Subject: [Python-Dev] PEP 273 - Import from Zip Archives References: <3C7E0CAB.19066.47536147@localhost> Message-ID: <3C7E645A.3D464207@lemburg.com> Gordon McMillan wrote: > > [M.-A. Lemburg] > > > > This would be nice to have, but how do you > > > > bootstrap the importer if it's written in Python ? > > The response to "let's revamp C import" is "Oh no, > we need a chance to play with it in Python first." I think you misunderstood my request: I *don't* want to revamp import.c, I would just like some extra hooks to be able to only replace those few parts which I'd like to extend from time to time, e.g. instead of replacing the complete __import__ machinery, it would be nice to have a callback hook in the finder and another one in the module loader. 
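A minimal sketch of what separate finder and loader callbacks could look like. It assumes a present-day Python 3 and its `sys.meta_path` / `importlib` machinery, not the 2002 import.c discussed in this thread; `HookedFinder`, `HookedLoader`, `on_find`, `on_load` and the module name `demo_hooked_mod` are hypothetical names invented for illustration.

```python
import importlib.abc
import importlib.util
import sys


class HookedLoader(importlib.abc.Loader):
    """Loader half: runs a callback, then executes in-memory source."""

    def __init__(self, source, on_load):
        self.source = source
        self.on_load = on_load  # hypothetical "module loader" callback

    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        self.on_load(module.__name__)
        exec(self.source, module.__dict__)


class HookedFinder(importlib.abc.MetaPathFinder):
    """Finder half: runs a callback, then answers only for known names."""

    def __init__(self, sources, on_find, on_load):
        self.sources = sources  # maps module name -> source text
        self.on_find = on_find  # hypothetical "finder" callback
        self.on_load = on_load

    def find_spec(self, fullname, path=None, target=None):
        self.on_find(fullname)
        source = self.sources.get(fullname)
        if source is None:
            return None  # let the regular import machinery try
        loader = HookedLoader(source, self.on_load)
        return importlib.util.spec_from_loader(fullname, loader)


found, loaded = [], []
finder = HookedFinder({"demo_hooked_mod": "ANSWER = 42\n"},
                      found.append, loaded.append)
sys.meta_path.insert(0, finder)
try:
    import demo_hooked_mod
finally:
    sys.meta_path.remove(finder)
```

Only the two small pieces are replaced here; every name the finder does not recognise still goes through the standard machinery, which is roughly the "extend a few parts, keep the rest" behaviour asked for above.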
All this has nothing to do with the PEP, though :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From jim@interet.com Thu Feb 28 17:16:05 2002 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 28 Feb 2002 12:16:05 -0500 Subject: [Python-Dev] PEP 273 - Import from Zip Archives References: <3C7E0CAB.6729.475361AB@localhost> Message-ID: <3C7E65D5.4080005@interet.com> Gordon McMillan wrote: > On 28 Feb 2002 at 9:44, James C. Ahlstrom wrote: > >>The other objection raised at the time was the >>possible slow down of imports. >> > > imputil was 30 to 40% slower than C import. iu > is about 10 to 15% slower under normal usage, but > can be faster if you use archives and arrange sys.path > intelligently. I think I can add iu.py as the standard Python importer to my import.c patches. That is, if iu.py can be imported (using C), then it takes over imports. Note that the C code changes to import.c are still required. Also note that iu.py may be in a zip file, and so the import.c changes are still required. >>My code uses os.listdir to cache directory >>contents, but defers its use until the os module can >>be imported using the C import code. >> > > A win over some threshold of number of hits on > that directory; a loss under that threshold. Exactly correct. It is a tradeoff between the OS caching directory hits from fopen() versus using a Python cache and os.listdir(). Dramatic gains are obtained when importing from network file systems, an important case. JimA From aahz@rahul.net Thu Feb 28 18:17:06 2002 From: aahz@rahul.net (Aahz Maruch) Date: Thu, 28 Feb 2002 10:17:06 -0800 (PST) Subject: [Python-Dev] PEP 1 update In-Reply-To: <3C7B6322.440D21E7@lemburg.com> from "M.-A. Lemburg" at Feb 26, 2002 11:27:46 AM Message-ID: <20020228181706.D0335E8C7@waltz.rahul.net> M.-A.
Lemburg wrote: > > I consider the above PEP ready for review by the developers. > Please comment. > > http://python.sourceforge.net/peps/pep-0263.html After looking at several PEPs over the last couple of days, I suggest that PEP 1 be updated to require inclusion of the Last-Modified: field. At the very least, I suggest that Post-History: be checked more rigorously. (PEP 263 contains a Post-History: field, but it is blank.) I don't think it's necessary to retrofit every PEP, but I think that every PEP up for consideration should be required to comply. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From pedroni@inf.ethz.ch Thu Feb 28 18:22:40 2002 From: pedroni@inf.ethz.ch (Samuele Pedroni) Date: Thu, 28 Feb 2002 19:22:40 +0100 Subject: [Python-Dev] PEP 1 update References: <20020228181706.D0335E8C7@waltz.rahul.net> Message-ID: <017101c1c084$e85cdaa0$6d94fea9@newmexico> [Ahz Maruch] > > After looking at several PEPs over the last couple of days, I suggest > that PEP 1 be updated to require inclusion of the Last-Modified: > field. At the very least, I suggest that Post-History: be checked more > rigorously. (PEP 263 contains a Post-History: field, but it is blank.) > > I don't think it's necessary to retrofit every PEP, but I think that > every PEP up for consideration should be required to comply. 
> -- From some posts on comp.lang.python it seems that people have some problems keeping track of PEPs and understanding their status /iter: - whether they are there hanging around from version to version for possible consideration until the BDFL picks them up - whether they are open to changes or just pending and pushed for approval (there is only the draft/final distinction) - wondering whether some things under consideration are just oddballs hanging around for long spans of time and why they are not rapidly rejected or improbably accepted. I know what PEP 1 says but anyway the PEP summary and PEP headers don't seem to properly and completely capture the right information needed to make sense for a casual reader. Another problem is that there are PEPs that have multiple phases but are marked as finished just because the main changes are implemented (division changes) and PEPs with important changes already done that are reported somehow just as unimplemented. Even Alex Martelli was wondering what was happening e.g. with PEP 246 (I think he has solved that at IPC10). Just my impressions. regards, Samuele Pedroni. From guido@python.org Thu Feb 28 18:58:48 2002 From: guido@python.org (Guido van Rossum) Date: Thu, 28 Feb 2002 13:58:48 -0500 Subject: [Python-Dev] Re: Of slots and metaclasses... In-Reply-To: Your message of "Thu, 28 Feb 2002 09:30:51 EST." References: Message-ID: <200202281858.g1SIwm930118@pcp742651pcs.reston01.va.comcast.net> [Kevin Jacobs wrote me in private to ask my position on __slots__. I'm posting my reply here, quoting his full message -- I see no reason to carry this on as a private conversation. Sorry, Kevin, if this wasn't your intention.] > Hi Guido; > > Now that you are back from your travels, I'll start bugging you, as > gently as possible, for some insight into your intent wrt slots and > metaclasses.
As you can read from the python-dev archives, I've > instigated a fair amount of discussion on the topic, though the > conversation is almost meaningless without your input. Hi Kevin, you got me to finally browse the thread "Meta-reflections". My first response was: "you've got it all wrong." My second response was a bit more nuanced: "that's not how I intended it to be at all!" OK, let me elaborate. :-) You want to be able to find out which instance attributes are defined by __slots__, so that (by combining this with the instance's __dict__) you can obtain the full set of attribute values. But this defeats the purpose of unifying built-in types and user-defined classes. A new-style class, with or without __slots__, should be considered no different from a new-style built-in type, except that all of the methods happen to be defined in Python (except maybe for inherited methods). In order to find all attributes, you should *never* look at __slots__. You should search the __dict__ of the class and its base classes, in MRO order, looking for descriptors, and *then* add the keys of the __dict__ as a special case. This is how PEP 252 wants it to be. If the descriptors don't tell you everything you need, too bad -- some types just are like that. For example, if you're deriving from a list or tuple, there's no attribute that leads to the items: you have to use __len__ and __getitem__ to find out about these, and you have to "know" that that's how you get at them (although the presence of __getitem__ should be a clue). Why do I reject your suggestion of making __slots__ (more) usable for introspection? Because it would create another split between built-in types and user-defined classes: built-in types don't have __slots__, so any strategy based on __slots__ will only work for user-defined types. And that's exactly what I'm trying to avoid! You may complain that there are so many things to be found in a class's __dict__, it's hard to tell which things are descriptors.
Actually, it's easy: if it has a __get__ (method) attribute, it's a descriptor; if it also has a __set__ attribute, it's a data attribute, otherwise it's a method. (Note that read-only data attributes have a descriptor that has a __set__ method that always raises TypeError or AttributeError.) Given this viewpoint, you won't be surprised that I have little desire to implement your other proposals, in particular, I reject all these: - Proxy the instance __dict__ with something that makes the slots visible - Flatten slot lists and make them immutable - Alter vars(obj) to return a dict of all attrs - Flatten slot inheritance (see below) - Change descriptors to fall back on class variables for unfilled slots I'll be the first to admit that some details are broken in 2.2. In particular, the fact that instances of classes with __slots__ appear picklable but lose all their slot values is a bug -- these should either not be picklable unless you add a __reduce__ method, or they should be pickled properly. This is a bug of the same kind as the problem with pickling time.localtime() (SF bug #496873), so I'm glad this problem has now been entered in the SF database (as #520644). I haven't made up my mind on how to fix this -- it would be nice if __slots__ would automatically be pickled, but it's tricky (although I think it's doable -- without ever referencing the __slots__ variable :-). I'm not so sure that the fact that you can "override" or "hide" slots defined in a base class should be classified as a bug. I see it more as a "don't do that" issue: If you're deriving a class that overrides a base class slot, you haven't done your homework. PyChecker could warn about this though. I think you're mostly right with your proposal "Update standard library to use new reflection API". 
Insofar as there are standard support classes that use introspection to provide generic services for classic classes, it would be nice if these could work correctly for new-style classes even if they use slots or are derived from non-trivial built-in types like dict or list. This is a big job, and I'd love some help. Adding the right things to the inspect module (without breaking pydoc :-) would probably be a first priority. Now let me get to the rest of your letter. > So I've been sitting on my hands and waiting for you to dive in and > set us all straight. Actually, that is not entirely true; I picked > up a copy of 'Putting Metaclasses to Work' and read it cover to > cover. Wow. That's more than I've ever managed (due to what I hope can still be called a mild case of ADD :-). But I think I studied all the important parts. (I should ask the authors for a percentage -- I think they've made quite some sales because of my frequent quoting of their book. :-) > Many things you've done in Python 2.2 are much clearer now, > though new questions have emerged. I would greatly appreciate it if > you would answer a few of them at a time. In return, I will > synthesize your ideas with my own and compile a document that > clearly defines and justifies the new Python object model and > metaclass protocol. Maybe you can formulate it as a set of tentative clarifying patches to PEPs 252, 253, and 254? > To start, there are some fairly broad and overlapping questions to get > started: > > 1) How much of IBM's SOMobject MetaClass Protocol (SOMMCP) do you > want to adapt to Python? For now (Python 2.2/2.3/2.4 time > frame)? And in the future (Python 3.0/3000)? Not much more than what I've done so far. A lot of what they describe is awfully C++ specific anyway; a lot of the things they struggle with (such as the redispatch hacks and requestFirstCooperativeMethodCall) can be done so much simpler in a dynamic language like Python that I doubt we should follow their examples literally.
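Going back to the attribute-discovery rule described earlier in this message (walk the class and its bases in MRO order looking for descriptors, then add the keys of the instance __dict__), here is a minimal sketch of how that could be written. It assumes a present-day Python 3; `classify_attributes` and the `Point` class are hypothetical names made up for the example, and checking `__get__`/`__set__` on the type of each value is one reasonable reading of the rule, not a quote of any library code.

```python
def classify_attributes(obj):
    """Classify attributes without ever consulting __slots__."""
    kinds = {}
    for klass in type(obj).__mro__:
        for name, value in vars(klass).items():
            if name in kinds:
                continue  # a class earlier in the MRO already defined it
            value_type = type(value)
            if hasattr(value_type, "__get__"):
                # Descriptor: with __set__ it is a data attribute,
                # without it is treated as a method.
                has_set = hasattr(value_type, "__set__")
                kinds[name] = "data" if has_set else "method"
            else:
                kinds[name] = "class attribute"
    # Special case: keys of the instance __dict__, if there is one.
    for name in getattr(obj, "__dict__", {}):
        if kinds.get(name) != "data":  # data descriptors take precedence
            kinds[name] = "instance"
    return kinds


class Point:
    # "__dict__" is listed so instances can also carry ad hoc attributes.
    __slots__ = ("x", "y", "__dict__")

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm1(self):
        return abs(self.x) + abs(self.y)


p = Point(1, 2)
p.label = "demo"
kinds = classify_attributes(p)
```

The slot names show up as data attributes purely because the class __dict__ holds descriptors for them, so the same function would work unchanged on a built-in type that has no __slots__ at all.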
> 2) In Python 2.2, what intentional deviations have you chosen from the > SOMMCP and what differences are incidental or accidental? Hard to say, unless you specifically list all the things that you consider part of the SOMMCP. Here are some things I know: - In descrintro.html, I describe a slightly different algorithm for calculating the MRO than they use. But my implementation is theirs -- I didn't realize the two were different until it was too late, and it only matters in uninteresting corner cases. - I currently don't complain when there are serious order disagreements. I haven't decided yet whether to make these an error (then I'd have to implement an overridable way of defining "serious") or whether it's more Pythonic to leave this up to the user. - I don't enforce any of their rules about cooperative methods. This is Pythonic: you can be cooperative but you don't have to be. It would also be too incompatible with current practice (I expect few people will adopt super().) - I don't automatically derive a new metaclass if multiple base classes have different metaclasses. Instead, I see if any of the metaclasses of the bases is usable (i.e. I don't need to derive one anyway), and then use that; instead of deriving a new metaclass, I raise an exception. To fix this, the user can derive a metaclass and provide it in the __metaclass__ variable in the class statement. I'm not sure whether I should automatically derive metaclasses; I haven't got enough experience with this stuff to get a good feel for when it's needed. Since I expect that non-trivial metaclasses are often implemented in C, I'm not so comfortable with automatically merging multiple metaclasses -- I can't prove to myself that it's always safe. - I don't check that a base class doesn't override instance variables. As I stated above, I don't think I should, but I'm not 100% sure. > 3) Do you intend to enforce monotonicity for all methods and slots? 
> (Clearly, this is not desirable for instance __dict__ attributes.) If I understand the concept of monotonicity, no. Python traditionally allows you to override methods in ways that are incompatible with the contract of the base class method, and I don't intend to forbid this. It would be good if PyChecker checked for accidental mistakes in this area, and maybe there should be a way to declare that you do want this enforced; I don't know how though. There's also the issue that (again, if I remember the concepts right) there are some semantic requirements that would be really hard to check at compile time for Python. > 4) Should descriptors work cooperatively? i.e., allowing a > 'super' call within __get__ and __set__. I don't think so, but I haven't thought through all the consequences (I'm not sure why you're asking this, and whether it's still a relevant question after my responses above). You can do this for properties though. Thanks for the dialogue! --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Thu Feb 28 19:06:33 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 28 Feb 2002 14:06:33 -0500 Subject: [Python-Dev] PEP 273 - Import from Zip Archives In-Reply-To: <3C7E65D5.4080005@interet.com> Message-ID: <3C7E3969.14214.480227F9@localhost> On 28 Feb 2002 at 12:16, James C. Ahlstrom wrote: > I think I can add iu.py as the standard Python > importer to my import.c patches. That is, if iu.py > can be imported (using C), then it takes over > imports. Note that the C code changes to import.c > are still required. Also note that iu.py may be in a > zip file, and so the import.c changes are still > required. Thanks, but I don't want iu.py to be used instead of c import in normal Python installations. I'll use it that way in Installer, but since that's an embedding app, it's not hard to bootstrap. 
In the context of python-dev, iu is, I think, useful because it (a) emulates nearly exactly Python's import rules and (b) it does so in a nicely OO framework with some interesting facilities. In other words, as a model of what some future revamp of c import might be. -- Gordon http://www.mcmillan-inc.com/ From gmcm@hypernet.com Thu Feb 28 19:06:33 2002 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 28 Feb 2002 14:06:33 -0500 Subject: [Python-Dev] PEP 273 - Import from Zip Archives In-Reply-To: <3C7E62A3.5090404@interet.com> Message-ID: <3C7E3969.16843.4802279F@localhost> On 28 Feb 2002 at 12:02, James C. Ahlstrom wrote: > The real problem isn't the string module, it is the > os module. Any importer will need this. The usual > hack is to duplicate its logic in the my_importer > module. That is, the selection of the correct > builtin os functions. getpath.c has to invent the same filesystem primitives, since it runs before builtins are loaded. > And MAL's point that you need a C importer to import > your Python importer is inescapable. Everybody has the same bootstrap problem. > And suppose the whole Python library is in a zip > file? You must have additional C code to extract > and load your Python importer as well as the modules > it imports. Right. Primitives have to come from somewhere. > It seems to me that the correct solution is to use > the C importer to import the my_importer Python > module, plus all the imports that my_importer needs. > Then you switch to resolving imports with > my_importer.py. Something like this is already in > my import.c patch. Which is what almost everybody does, the exception being macPython. They use resources a lot, and most of the import extensions are built in at a very low level. > I don't think this discussion should hold up > installing my zip import patches. Not at all. Getting zip files onto sys.path is a very good thing. 
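A minimal sketch of what "zip files on sys.path" looks like from the user's side. It assumes a present-day Python 3, where an archive path on sys.path is importable; the archive name `bundle.zip` and the module `zipped_demo` are hypothetical and created on the fly.

```python
import importlib
import os
import sys
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "bundle.zip")

    # Build a tiny archive holding one pure-Python module.
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("zipped_demo.py", "GREETING = 'hello from a zip'\n")

    # Put the archive itself (not a directory) on sys.path and import.
    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        import zipped_demo
    finally:
        sys.path.remove(archive)
```

The module stays usable after the archive is taken off sys.path again, because it is already cached in sys.modules.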
-- Gordon http://www.mcmillan-inc.com/ From nas@python.ca Thu Feb 28 19:33:28 2002 From: nas@python.ca (Neil Schemenauer) Date: Thu, 28 Feb 2002 11:33:28 -0800 Subject: [Python-Dev] PEP 273 - Import from Zip Archives In-Reply-To: <3C7E645A.3D464207@lemburg.com>; from mal@lemburg.com on Thu, Feb 28, 2002 at 06:09:46PM +0100 References: <3C7E0CAB.19066.47536147@localhost> <3C7E645A.3D464207@lemburg.com> Message-ID: <20020228113328.A3275@glacier.arctrix.com> M.-A. Lemburg wrote: > I think you misunderstood my request: I *don't* want > to revamp import.c, I would just like some extra hooks > to be able to only replace those few parts which I'd > like to extend from time to time, e.g. instead of replacing > the complete __import__ machinery, it would be nice > to have a callback hook in the finder and another one > in the module loader. I have some rough code that does this. I've stuck it on my web site at: http://arctrix.com/nas/python/cimport-20020228.tar.gz if anyone is interested. I found that for my application (importing .ptl modules that need to be compiled with a different compiler), imputil did not have the right kind of hooks. ihooks was better but still clunky and slow. Neil From tim.one@comcast.net Thu Feb 28 19:42:35 2002 From: tim.one@comcast.net (Tim Peters) Date: Thu, 28 Feb 2002 14:42:35 -0500 Subject: [Python-Dev] Alignment assumptions In-Reply-To: <172601c1c05e$0c0ea630$0500a8c0@boostconsulting.com> Message-ID: [Jack, skip to the end please] [David Abrahams, on Include/objimpl.h:275: double dummy; /* force worst-case alignment */ ] > As I read the code, it affects all types (doesn't this header begin every > object, regardless of its GC flags?) Nope, only objects that go through _PyObject_GC_Malloc(). It could be a nightmare if, e.g., every string and int object consumed another (at least) 12 bytes. > and I think that's a very unhappy circumstance for your numeric > community. Remember, the type that raised the alarm here was just a > long double.
The *Python* numeric community is far more likely to embed a float than a long double, and in any case seems unlikely to build a container type mixing long double with PyObject* members (i.e., one that ought to participate in cyclic gc). I expect we have a blind spot towards long double in general since Python doesn't expose or use such a thing, all the developers run on platforms where (as far as they know) it's the same as a double, and "long double" was introduced after K&R (so some old-timers likely aren't even aware C89 introduced it). But I'll change the code here to use long double instead -- it's harmless, as it doesn't make a lick of difference on any platform that matters <0.7 wink>. >> Only the objimpl.h trick might benefit from maximal alignment. > I'm not actually after maximal alignment; I look for a minimally- > sized/aligned type whose alignment is a multiple of the target > type's alignment. In any case, I was just using the assumption that > double was maximally aligned since I was linking with Python code > and the EDG front-end was too slow to handle the metaprogram -- I > figured that if the assumption was good enough for Python Well, nobody has complained yet, but the core never needs alignment stricter than double, and-- as above --an extension type that both did and needed to participate in GC is unlikely. > and my clients were depending on it anyway, it was good enough for > my code (not!). One of the secrets to Python's success is that we tell unreasonable users to go away and bother the C++ committee instead. [128-byte alignment needed for KSR's _subpage type] > I was aware that this was a theoretical possibility, but not that it > was a practical one. What's KSR? Kendall Square Research, my (and Tani's, Tamah's and Steve Breit's) employer before Dragon.
The address space was carved into 128-byte "subpages", and
the hardware supported Python-style (non-owned non-reentrant) locks directly
on a per-subpage basis (Python's lock.acquire() and lock.release() were one
machine instruction each!). Subpages were also the unit for cache coherency
across processors. So use of _subpage in our system code, and in
speed-obsessed app code, was ubiquitous. I guess the main thing KSR proved
was that you can't stay in business designing custom hardware to execute
Python's semantics directly.

> ...
> Seriously, though, I think it would be reasonable to stick to aligning
> the standard builtin types, in which case you can do the test without
> calling malloc, FWIW.

I checked this in:

    long double dummy; /* force worst-case alignment */

[Guido, on
    #ifdef USE_CACHE_ALIGNED
    long aligner;
    #endif
]
> The malloc 8-byte align argument doesn't apply, since this struct is
> used in an array. I was composing email while asleep.

Gotcha.

> ...
> This was added by Jack Jansen ages ago -- I think he did measure a
> speedup on an old Mac compiler, or he wouldn't have added it, and I
> bet there was a #define USE_CACHE_ALIGNED in his config.h then.
>
> But that's all history; I agree it should be deleted.

Jack, do you still want this?

fighting-code-rot-ly y'rs - tim

From David Abrahams" Message-ID: <1a2401c1c099$1088e1e0$0500a8c0@boostconsulting.com>

----- Original Message -----
From: "Tim Peters"

> [David Abrahams, on
>     Include/objimpl.h:275: double dummy; /* force worst-case alignment */
> ]
> > As I read the code, it affects all types (doesn't this header begin every
> > object, regardless of its GC flags?)
>
> Nope, only objects that go through _PyObject_GC_Malloc(). It could be a
> nightmare if, e.g., every string and int object consumed another (at least)
> 12 bytes.

Oh! I guess I should explicitly avoid _PyObject_GC_Malloc() unless I'm
supporting GC, then. As you can see, there's a lot of basic stuff I still
don't understand.
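
A small sketch of the point made above, that only objects which can take
part in reference cycles go through the GC allocator and so carry the extra
header. It assumes a modern CPython 3 interpreter, where gc.is_tracked()
exists (it was not available in 2.2), and the Node class is purely
illustrative.

```python
import gc


class Node(object):
    """Hypothetical container type: it can hold references to other objects."""

    def __init__(self):
        self.children = []


# Atomic objects cannot be part of a reference cycle, so CPython does not
# track them and they pay nothing for the GC header.
untracked = [gc.is_tracked(42), gc.is_tracked("spam"), gc.is_tracked(3.14)]

# Objects that can refer to other objects are tracked by the collector.
tracked = [gc.is_tracked([]), gc.is_tracked(Node())]

print(untracked)
print(tracked)
```

An extension type opts in the same way a Python-level container does: by
being allocated as a GC object, not by default.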
> > and I think that's a very unhappy circumstance for your numeric
> > community. Remember, the type that raised the alarm here was just a
> > long double.
>
> The *Python* numeric community is far more likely to embed a float than a
> long double, and in any case seems unlikely to build a container type
> mixing long double with PyObject* members (i.e., one that ought to
> participate in cyclic gc).

OK, I get it. I'm still not clear on what happens by default, but I was
under the mistaken impression that some types get GC support
"automatically" and thus that people would be subject to undesired
alignment problems without explicitly choosing them.

> I expect we have a blind spot towards long double in general since Python
> doesn't expose or use such a thing, all the developers run on platforms
> where (as far as they know) it's the same as a double, and "long
> double" was introduced after K&R (so some old-timers likely aren't even
> aware C89 introduced it).
>
> But I'll change the code here to use long double instead -- it's harmless,
> as it doesn't make a lick of difference on any platform that matters <0.7
> wink>.

Just for the record, I didn't twist your arm about this (only the ends of
your moustache).

> Well, nobody has complained yet, but the core never needs alignment stricter
> than double, and-- as above --an extension type that both did and needed to
> participate in GC is unlikely.

Makes sense. And I guess because this is 'C', hacking in the appropriate
alignment if such a type ever arose wouldn't be that hard.

> > and my clients were depending on it anyway, it was good enough for
> > my code (not!).
>
> One of the secrets to Python's success is that we tell unreasonable users to
> go away and bother the C++ committee instead.

That explains everything, thank you (especially the loving relationship we
have with our lusers)!
> [128-byte alignment needed for KSR's _subpage type]
> > I was aware that this was a theoretical possibility, but not that it
> > was a practical one. What's KSR?
>
> Kendall Square Research, my (and Tani's, Tamah's and Steve Breit's) employer
> before Dragon. The address space was carved into 128-byte "subpages", and
> the hardware supported Python-style (non-owned non-reentrant) locks directly
> on a per-subpage basis (Python's lock.acquire() and lock.release() were one
> machine instruction each!). Subpages were also the unit for cache coherency
> across processors. So use of _subpage in our system code, and in
> speed-obsessed app code, was ubiquitous. I guess the main thing KSR proved
> was that you can't stay in business designing custom hardware to execute
> Python's semantics directly.

/Please/ tell me you weren't trying to build a parallel Python machine
<5.99wink>.

From jacobs@penguin.theopalgroup.com Thu Feb 28 20:48:05 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 28 Feb 2002 15:48:05 -0500 (EST)
Subject: [Python-Dev] Re: Of slots and metaclasses...
In-Reply-To: <200202281858.g1SIwm930118@pcp742651pcs.reston01.va.comcast.net>
Message-ID:

On Thu, 28 Feb 2002, Guido van Rossum wrote:
> [Kevin Jacobs wrote me in private to ask my position on __slots__.
> I'm posting my reply here, quoting his full message -- I see no reason
> to carry this on as a private conversation. Sorry, Kevin, if this
> wasn't your intention.]

No problem -- I sent it privately only to spare python-dev if you happened
to be too busy for a coherent reply.

> Hi Kevin, you got me to finally browse the thread "Meta-reflections".
> My first response was: "you've got it all wrong." My second response
> was a bit more nuanced: "that's not how I intended it to be at all!"
> OK, let me elaborate. :-)

Yes -- I can see why my initial efforts at making slots work "just like
__dict__ attributes" were a bad idea.
However, it took reading 'Putting Metaclasses to Work' for me to
realize that.

> You want to be able to find out which instance attributes are defined
> by __slots__, so that (by combining this with the instance's __dict__)
> you can obtain the full set of attribute values. But this defeats the
> purpose of unifying built-in types and user-defined classes.

I suppose the purpose of unifying built-in types and user-defined classes is
rather subjective. There are many roads that will get us there, and I
happened to fixate on another one...

> A new-style class, with or without __slots__, should be considered no
> different from a new-style built-in type, except that all of the
> methods happen to be defined in Python (except maybe for inherited
> methods).

Sure. Except that I also want to be able to extend existing new-style
classes/types in C, as well as Python. Here is how I do it now (minus error
checking and ref-counting):

  static PyMethodDef PyRow_methods[] = {
    {"__init__",    (PyCFunction)rowinit,    METH_VARARGS},
    {"__repr__",    (PyCFunction)rowstrrepr, METH_NOARGS },
    {"__getitem__", (PyCFunction)rowgetitem, METH_VARARGS}
    /* etc... */ };

  PyRow_Type = (PyTypeObject*)PyType_Type.tp_call((PyObject*)&PyType_Type,args, NULL);

  /* Methods must be added _after_ PyRow_Type has been created
     since the type is an argument to PyDescr_NewMethod */
  dict = PyRow_Type->tp_dict;
  meth = PyRow_methods;
  for (; meth->ml_name != NULL; meth++)
  {
    PyObject* method = PyDescr_NewMethod(PyRow_Type, meth);
    PyDict_SetItemString(dict,meth->ml_name,method);
  }

Though this doesn't look nearly as ugly as it did when I first wrote it,
before I read 'Putting Metaclasses to Work'; strangely enough it ends up
looking a lot like their metaclass interface.

> In order to find all attributes, you should *never* look at __slots__.
> You should search the __dict__ of the class and its base classes, in
> MRO order, looking for descriptors, and *then* add the keys of the
> __dict__ as a special case.
> This is how PEP 252 wants it to be.

Sure. I was just hoping to have that list of descriptors pre-computed and
stored in the class (like __mro__). I suppose the question is why even
expose __slots__ if it is so worthless?

> If the descriptors don't tell you everything you need, too bad -- some
> types just are like that.

This has _never_ been a concern of mine -- I don't mind if the C
implementation chooses to hide things.

> Why do I reject your suggestion of making __slots__ (more) usable for
> introspection? Because it would create another split between built-in
> types and user-defined classes: built-in types don't have __slots__,
> so any strategy based on __slots__ will only work for user-defined
> types. And that's exactly what I'm trying to avoid!

Well, I'm busy creating C extension types that *do* have slots! One of my
many current projects is to create a better type to store the results of
relational database queries. I want the memory efficiency of tuples and the
ability to query by name (via __getitem__ or __getattr__). So I basically
need to re-invent a magic tuple type that adds descriptors for every named
field. Strangely enough, this is basically what the slots mechanism does.
I do realize that I could accomplish the same end by sub-classing tuple and
adding a bunch of descriptors.

> Given this viewpoint, you won't be surprised that I have little desire
> to implement your other proposals, in particular, I reject all these:
>
> - Proxy the instance __dict__ with something that makes the slots
>   visible

I wasn't real thrilled with this idea myself. Among all the other reasons
why not to do this, it has some terrible performance implications.

> - Flatten slot lists and make them immutable

Again, why even have __slots__ if they are so useless? Assuming that there
is a legitimate reason to peek at __slots__, why not at least make them
immutable? Or, even better, why not use __slots__ to expose the etype slot
tuple instead?
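
For reference, the lookup recipe quoted earlier in this message (walk the
classes in MRO order collecting descriptors, then add the keys of the
instance __dict__) can be sketched roughly as follows. This is only an
illustration written for a modern Python 3 interpreter, not the 2.2
implementation; the helper name instance_attribute_names and the two Point
classes are made up, and skipping double-underscore names is a
simplification chosen here to keep __dict__, __weakref__ and __class__ out
of the result.

```python
def instance_attribute_names(obj):
    """Return data-descriptor names found along the MRO, plus __dict__ keys."""
    names = []
    for klass in type(obj).__mro__:
        for name, value in vars(klass).items():
            if name.startswith("__") and name.endswith("__"):
                continue  # skip __dict__, __weakref__, __class__ and friends
            kind = type(value)
            # A data descriptor defines both __get__ and __set__ on its type;
            # slot descriptors and properties qualify, plain functions do not.
            if hasattr(kind, "__get__") and hasattr(kind, "__set__"):
                if name not in names:
                    names.append(name)
    # The instance __dict__ (if there is one) is the special case added last.
    for name in getattr(obj, "__dict__", {}):
        if name not in names:
            names.append(name)
    return names


class Point(object):
    __slots__ = ("x", "y")


class LabelledPoint(Point):
    # No __slots__ here, so instances also get a __dict__.
    def __init__(self, x, y, label):
        self.x = x
        self.y = y
        self.label = label


p = LabelledPoint(1, 2, "corner")
print(instance_attribute_names(p))
```

Note that __slots__ itself is never consulted: the slot names are found
because each slot leaves a descriptor in the class __dict__.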
> - Alter vars(obj) to return a dict of all attrs

Ok, I'm a little baffled by this. Why not?

> I'll be the first to admit that some details are broken in 2.2.
>
> In particular, the fact that instances of classes with __slots__
> appear picklable but lose all their slot values is a bug -- these
> should either not be picklable unless you add a __reduce__ method, or
> they should be pickled properly.

My vote is that they should be pickled properly by default. In my mind,
slots are a more static type of attribute. Since they are more static, my
feeling is that they should be as or more accessible than dict attributes.
Descriptors are fine for handling the black magic of making them addressable
by name, but it just feels wrong to hide them from access by other means.
Of course, I am really talking about slots defined at the Python level --
not necessarily all storage allocated in the 'members' array.

> I'm not so sure that the fact that you can "override" or "hide" slots
> defined in a base class should be classified as a bug. I see it more
> as a "don't do that" issue: If you're deriving a class that overrides
> a base class slot, you haven't done your homework. PyChecker could
> warn about this though.

Unless attribute access becomes scoped based on the static type of the
method, then I think it is a bug. Re-declared slots become effectively
orphaned and just waste memory. Coalescing them or raising an exception
when they are re-declared seem much better alternatives.

> I think you're mostly right with your proposal "Update standard
> library to use new reflection API". Insofar as there are standard
> support classes that use introspection to provide generic services for
> classic classes, it would be nice if these could work correctly for
> new-style classes even if they use slots or are derived from
> non-trivial built-in types like dict or list. This is a big job, and
> I'd love some help.
> Adding the right things to the inspect module
> (without breaking pydoc :-) would probably be a first priority.

Well, I'm happy to contribute, though my primary concern (other than
correctness and completeness) is efficiency. The whole reason I'm using
slots is to save space when allocating huge numbers of fairly small objects.
I believe that there is a big performance difference between being able to
pickle based on arbitrary descriptors and pickling just slots. Slots are
already nicely laid out in rows, just waiting to be plucked out and stuffed
into a pickle. Even without flattened __slots__ lists, it is a fast and
trivial operation to iterate over a class and all its bases and extract
slots. Doing so over dictionaries is not nearly so trivial.

> Maybe you can formulate it as a set of tentative clarifying patches to
> PEPs 252, 253, and 254?

To be honest, I forgot that those PEPs existed! I've been working off of
the Python 2.2 source and the tutorials. I'll read them over tonight and
see.

> > 2) In Python 2.2, what intentional deviations have you chosen from the
> > SOMMCP and what differences are incidental or accidental?
>
> Hard to say, unless you specifically list all the things that you
> consider part of the SOMMCP.

When I say SOMMCP, I really mean the "metaclass protocol" defined by the
various postulates and theorems in the first few chapters of the book.

> - I currently don't complain when there are serious order
>   disagreements. I haven't decided yet whether to make these an error
>   (then I'd have to implement an overridable way of defining
>   "serious") or whether it's more Pythonic to leave this up to the
>   user.

Sure -- I noticed this. Maybe you should store the order-safety in the
metaclass? That way, the user can inspect it when they decide it is
important.

> - I don't enforce any of their rules about cooperative methods. This
>   is Pythonic: you can be cooperative but you don't have to be.
>   It would also be too incompatible with current practice (I expect few
>   people will adopt super().)

I agree with most of that, except that I expect that MANY people will start
using 'super'. I've trained an office full of Java programmers to program
in Python and they are always complaining about the lack of super calls.
Also, I've _always_ considered this idiom ugly and hackish:

  class Foo(Bar,Baz):
    def __init__(self):
      Bar.__init__(self)
      Baz.__init__(self)

It's so much better as:

  class Foo(Bar,Baz):
    def __init__(self):
      # when super becomes a keyword and we write nice cooperative __init__
      # methods
      super.__init__(self)

> - I don't automatically derive a new metaclass if multiple base
>   classes have different metaclasses.

I have my own ideas about this, but like you, don't have enough experience
with them in practice to do anything about it.

>   Since I expect that non-trivial metaclasses are
>   often implemented in C, I'm not so comfortable with automatically
>   merging multiple metaclasses -- I can't prove to myself that it's
>   always safe.

It is always safe when the assumption of monotonicity is not violated.

> - I don't check that a base class doesn't override instance
>   variables. As I stated above, I don't think I should, but I'm not
>   100% sure.

Do you mean slots or all Python instance attributes in this statement?

> > 3) Do you intend to enforce monotonicity for all methods and slots?
> > (Clearly, this is not desirable for instance __dict__ attributes.)
>
> If I understand the concept of monotonicity, no. Python traditionally
> allows you to override methods in ways that are incompatible with the
> contract of the base class method, and I don't intend to forbid this.

For Python, monotonicity means that the instance attributes and instance
methods of a class are a superset of those of all its ancestors. This is
not the way that normal __dict__ attributes work in Python, so let's talk
only about slots when discussing monotonic properties.
In other words, it means that the metaclass interface does not provide a way
to delete a slot or a method, only ways to add and override them. Combined
with some static type information, the assumption of monotonicity will be
very helpful when we can eventually compile Python.

> It would be good if PyChecker checked for accidental mistakes in this
> area, and maybe there should be a way to declare that you do want this
> enforced; I don't know how though.

I have a pretty good idea how. It's essentially a proof-based method that
works by solving metatype constraints.

> There's also the issue that (again, if I remember the concepts right)
> there are some semantic requirements that would be really hard to
> check at compile time for Python.

True for __dict__ instance attributes, not for slots!

> > 4) Should descriptors work cooperatively? i.e., allowing a
> > 'super' call within __get__ and __set__.
>
> I don't think so, but I haven't thought through all the consequences
> (I'm not sure why you're asking this, and whether it's still a
> relevant question after my responses above). You can do this for
> properties though.

  class Foo(object):
    __slots__=()
    a = 1

  class Bar(Foo):
    __slots__ = ('a',)

  bar = Bar()
  print dir(a)
  print a

The resolution rule for descriptors could work cooperatively to find Foo's
class attribute 'a' instead of giving up with an AttributeError.
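
To make the behaviour in question concrete, here is a minimal sketch of the
same two classes, looking the attribute up on the instance. It assumes a
modern Python 3 interpreter rather than 2.2; the variable names
descriptor_kind and outcome are only there for illustration.

```python
class Foo(object):
    __slots__ = ()
    a = 1                      # plain class attribute on the base


class Bar(Foo):
    __slots__ = ('a',)         # slot descriptor with the same name


bar = Bar()

# Bar's slot descriptor is found first in the MRO and hides Foo.a.
descriptor_kind = type(vars(Bar)['a']).__name__

try:
    bar.a                      # the slot has not been filled yet
    outcome = "found a value"
except AttributeError:
    outcome = "AttributeError"

bar.a = 2                      # filling the slot makes the lookup succeed
print(descriptor_kind, outcome, bar.a, Foo.a)
```

The empty slot raises rather than falling back to Foo's class attribute,
which is still reachable explicitly as Foo.a.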
Thanks for the very useful answers,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19   E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714        WWW:    http://www.theopalgroup.com

From Jack.Jansen@oratrix.com Thu Feb 28 21:34:05 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Thu, 28 Feb 2002 22:34:05 +0100
Subject: [Python-Dev] Alignment assumptions
In-Reply-To:
Message-ID:

On Thursday, February 28, 2002, at 07:57, Tim Peters wrote:

> [David Abrahams]
>> A quick grep-find through the Python-2.2 sources reveals the
>> following:
>>
>> Include/dictobject.h:49: long aligner;
>
> This is in
>
> #ifdef USE_CACHE_ALIGNED
> long aligner;
> #endif
>
> and AFAIK nobody ever defines the symbol. It's a cache-line optimization
> gimmick, but is effectively a nop (except to waste memory) on "almost all"
> machines. IIRC, the author never measured any improvement by using it (not
> surprising, since I believe almost all mallocs at least 8-byte align now).
> I vote we delete it.

MacPython uses it. At the time it was put in it caused a 15%
increase in Pystones because dictionary entries were aligned in
cache lines. But: this was in the PPC 601 and 604 era, I must say
that I've never tested whether it made any difference on G3 and G4.

Put in a bug report in my name, and one day I'll get around to
testing whether it still makes a difference on current hardware
and rip it out if it doesn't.

--
- Jack Jansen http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From guido@python.org Thu Feb 28 21:51:45 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 28 Feb 2002 16:51:45 -0500
Subject: [Python-Dev] Re: Of slots and metaclasses...
In-Reply-To: Your message of "Thu, 28 Feb 2002 15:48:05 EST."
References:
Message-ID: <200202282151.g1SLpjE30957@pcp742651pcs.reston01.va.comcast.net>

[me]
> > A new-style class, with or without __slots__, should be considered
> > no different from a new-style built-in type, except that all of
> > the methods happen to be defined in Python (except maybe for
> > inherited methods).

[Kevin]
> Sure. Except that I also want to be able to extend existing
> new-style classes/types in C, as well as Python. Here is how I do
> it now (minus error checking and ref-counting):
>
>   static PyMethodDef PyRow_methods[] = {
>     {"__init__",    (PyCFunction)rowinit,    METH_VARARGS},
>     {"__repr__",    (PyCFunction)rowstrrepr, METH_NOARGS },
>     {"__getitem__", (PyCFunction)rowgetitem, METH_VARARGS}
>     /* etc... */ };
>
>   PyRow_Type = (PyTypeObject*)PyType_Type.tp_call((PyObject*)&PyType_Type,args, NULL);
>
>   /* Methods must be added _after_ PyRow_Type has been created
>      since the type is an argument to PyDescr_NewMethod */
>   dict = PyRow_Type->tp_dict;
>   meth = PyRow_methods;
>   for (; meth->ml_name != NULL; meth++)
>   {
>     PyObject* method = PyDescr_NewMethod(PyRow_Type, meth);
>     PyDict_SetItemString(dict,meth->ml_name,method);
>   }

Heh?!?!!! Why can't you declare PyRow_Type as a statically initialized
struct like all extensions and the core do?

[snip]
> Sure. I was just hoping to have that list of descriptors
> pre-computed and stored in the class (like __mro__).

__mro__ gets used *all the time*; on every method lookup at least.
The list of instance variable descriptors is only interesting to a
small number of highly introspective tools.

> I suppose the question is why even expose __slots__ if it is so
> worthless?

It's found in the dict when the class is defined. Why delete it? The
idea is that you can make it a dict that has other info about the
slots. It's got a __foo__ name. I can give it any semantics I damn
well please. :-)

> > If the descriptors don't tell you everything you need, too bad --
> > some types just are like that.
>
> This has _never_ been a concern of mine -- I don't mind if the C
> implementation chooses to hide things.

Exactly, and I'm telling you to have the same attitude about slots.

Let me repeat something I just sent someone else about slots:

It seems that unfortunately __slots__ is Python 2.2's most
misunderstood feature...

I see it as a hack that lets me define a special-purpose class whose
instances are (almost) as efficient as I can do using C, but without
having to write a C extension. (I say "almost", because a C extension
can store simple values as C ints, while __slots__ only lets you store
PyObject pointers. But still, it's a big savings compared to adding a
__dict__ to every instance, and sometimes the slot value is picked
from a small number of interned or cached ints or strings.)

It has different semantics from regular attributes, and I don't try to
hide that: introspection doesn't find slots the same way as it finds
regular instance vars, you can't provide a default via a class
variable, and there are a bunch of "don't do that" things like
modifying __slots__ of an existing class or overriding a slot defined
by a base class. (There's a whole list of warnings in
http://www.python.org/2.2/descrintro.html!)

I think as such, the feature is just right (except for the no-pickling
bug). It's unfortunate that people have jumped on it as the answer to
all their questions. I guess that means there's a big demand for more
control over instance variables -- whether that demand is created by a
real need or simply because that's how most other languages do it
remains to be seen...

> > Why do I reject your suggestion of making __slots__ (more) usable
> > for introspection? Because it would create another split between
> > built-in types and user-defined classes: built-in types don't have
> > __slots__, so any strategy based on __slots__ will only work for
> > user-defined types. And that's exactly what I'm trying to avoid!
>
> Well, I'm busy creating C extension types that *do* have slots!
> One of my many current projects is to create a better type to store
> the results of relational database queries. I want the memory
> efficiency of tuples and the ability to query by name (via
> __getitem__ or __getattr__). So I basically need to re-invent a
> magic tuple type that adds descriptors for every named field.
> Strangely enough, this is basically what the slots mechanism does.
> I do realize that I could accomplish the same end by sub-classing
> tuple and adding a bunch of descriptors.

Note that there's something already there that you might reuse:
Objects/structseq.c, which is used to create the return values of
localtime(), stat() and a few others in a way that looks both like a
tuple and like a read-only record. It may not be powerful enough
because I think the assumption is that the set of field names is
static, but you may be able to extend it or copy some good ideas.
(Just don't try to understand what it does to make the tuple shorter
than the record in some cases -- that's for backwards compatibility
because lots of code would break if e.g. struct() returned a longer
tuple than in previous Python versions, but we still want to provide
new fields when using named fields. This part is not for the weak of
heart, and I didn't write it, and can't guarantee that it's 100%
bugfree.)

[items I rejected]
> > - Alter vars(obj) to return a dict of all attrs
>
> Ok, I'm a little baffled by this. Why not?

Currently, the assumption is that vars() returns a dict that can be
modified to modify the underlying object's attributes. If it were to
return a synthetic dict, that wouldn't work, or it would require more
implementation effort than I care for -- again, since I doubt there is
much demand for this outside a small set of introspection tools.

> > I'll be the first to admit that some details are broken in 2.2.
> >
> > In particular, the fact that instances of classes with __slots__
> > appear picklable but lose all their slot values is a bug -- these
> > should either not be picklable unless you add a __reduce__ method,
> > or they should be pickled properly.
>
> My vote is that they should be pickled properly by default. In my
> mind, slots are a more static type of attribute. Since they are
> more static, my feeling is that they should be as or more accessible
> than dict attributes. Descriptors are fine for handling the black
> magic of making them addressable by name, but it just feels wrong to
> hide them from access by other means. Of course, I am really
> talking about slots defined at the Python level -- not necessarily
> all storage allocated in the 'members' array.

Slots share their descriptor implementation with anything defined by
the tp_members array in a type object. E.g. file.softspace is a
descriptor of the same type as used by slots. What they share is that
they refer to "real" data stored in the instance -- either a PyObject*
or some basic C type like int or double.

I don't want to trust that __slots__ has the right data: even if I
made it immutable, someone could still do C.__dict__['__slots__'] = ,
and I don't want to go so far as to make __slots__ a property stored
in the type object.

So I can't really tell which descriptors are slots and which are other
things -- and I don't want to, because I believe that would be
breaking through an abstraction.

> Unless attribute access becomes scoped based on the static type of
> the method, then I think it is a bug. Re-declared slots become
> effectively orphaned and just waste memory. Coalescing them or
> raising an exception when they are re-declared seem much better
> alternatives.

It's a bug to redeclare a slot. I don't find it Python's job to make
it an error.

> > I think you're mostly right with your proposal "Update standard
> > library to use new reflection API".
> > Insofar as there are standard
> > support classes that use introspection to provide generic services
> > for classic classes, it would be nice if these could work
> > correctly for new-style classes even if they use slots or are
> > derived from non-trivial built-in types like dict or list. This
> > is a big job, and I'd love some help. Adding the right things to
> > the inspect module (without breaking pydoc :-) would probably be a
> > first priority.
>
> Well, I'm happy to contribute, though my primary concern (other than
> correctness and completeness) is efficiency. The whole reason I'm
> using slots is to save space when allocating huge numbers of fairly
> small objects. I believe that there is a big performance difference
> between being able to pickle based on arbitrary descriptors and
> pickling just slots. Slots are already nicely laid out in rows,
> just waiting to be plucked out and stuffed into a pickle. Even
> without flattened __slots__ lists, it is a fast and trivial
> operation to iterate over a class and all its bases and extract
> slots. Doing so over dictionaries is not nearly so trivial.

I think you're overstating the simplicity of pickling slots. There is
no guarantee that the slots of a derived class are contiguous with the
slots of a base class; a __weakref__ and a __dict__ field may be
placed in between, and another metaclass could add other things. For
example, you could write a metaclass in C that took the __slots__ idea
one step further and let you declare the types of the slots as basic C
types, so that other structmember keys could be used, e.g. T_INT or
T_FLOAT.

If you want your instances to be pickled *efficiently*, you should
write a custom reduce method in C anyway -- right now, new-style
classes are pickled by a piece of Python code at the end of
copy_reg.py.

> > Maybe you can formulate it as a set of tentative clarifying
> > patches to PEPs 252, 253, and 254?
>
> To be honest, I forgot that those PEPs existed!
> I've been working
> off of the Python 2.2 source and the tutorials. I'll read them over
> tonight and see.

I had a feeling you were missing something basic. :-)

> When I say SOMMCP, I really mean the "metaclass protocol" defined by the
> various postulates and theorems in the first few chapters of the book.

As I said, I don't have the whole set in my head, so you'll have to be
more specific in your questions. (Basically, I don't expect to be
adding much from the book, but I'll be looking to the book for clues
as we find problems with how things are implemented now, e.g. the
automatically derived metaclass issue below.)

> > - I currently don't complain when there are serious order
> >   disagreements. I haven't decided yet whether to make these an
> >   error (then I'd have to implement an overridable way of defining
> >   "serious") or whether it's more Pythonic to leave this up to the
> >   user.
>
> Sure -- I noticed this. Maybe you should store the order-safety in the
> metaclass? That way, the user can inspect it when they decide it is
> important.

You mean in the class object? I'm not sure what you mean by "storing
the order-safety". I currently don't calculate whether there are any
order conflicts: serious_order_disagreements() returns 0 without doing
anything. Someone who wants it can easily implement the check from
the book though.

> > - I don't enforce any of their rules about cooperative methods.
> >   This is Pythonic: you can be cooperative but you don't have to
> >   be. It would also be too incompatible with current practice (I
> >   expect few people will adopt super().)
>
> I agree with most of that, except that I expect that MANY people
> will start using 'super'.

I doubt it with the current super(Class,self).method(args) notation.
Probably they will once super is a keyword so you can write
super.method(args).

> I've trained an office full of Java
> programmers to program in Python and they are always complaining
> about the lack of super calls.
> Also, I've _always_ considered this
> idiom ugly and hackish:
>
>   class Foo(Bar,Baz):
>     def __init__(self):
>       Bar.__init__(self)
>       Baz.__init__(self)

Strange that you mention Java in the same paragraph as an example
using multiple inheritance. ;-/

Also note that this is pretty much what C++ wants you to do, except it
uses '::' instead of '.' and doesn't require you to pass self (which
is a different issue). I don't see this as a serious issue, just
syntactic sugar.

> It's so much better as:
>
>   class Foo(Bar,Baz):
>     def __init__(self):
>       # when super becomes a keyword and we write nice cooperative __init__
>       # methods
>       super.__init__(self)

But that's not what you'd be writing -- you'd be writing
super.__init__().

> > - I don't automatically derive a new metaclass if multiple base
> >   classes have different metaclasses.
>
> I have my own ideas about this, but like you, don't have enough
> experience with them in practice to do anything about it.

Can you share them? This might be interesting.

> >   Since I expect that non-trivial metaclasses are
> >   often implemented in C, I'm not so comfortable with automatically
> >   merging multiple metaclasses -- I can't prove to myself that it's
> >   always safe.
>
> It is always safe when the assumption of monotonicity is not violated.

And that we can't know.

> > - I don't check that a base class doesn't override instance
> >   variables. As I stated above, I don't think I should, but I'm not
> >   100% sure.
>
> Do you mean slots or all Python instance attributes in this statement?

I just meant slots, but in a sense it's also true for other ivars: if
you don't know that your base class defines an ivar 'foo', you might
create your own ivar named 'foo' and use it in a way that's
inconsistent with the base class. Because there are no type checks
and no ivar declarations, that's much harder to avoid in Python than
in more static languages like C++ or Java (I assume those will
complain when you redefine an ivar, even with the same type).
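
The kind of accidental ivar clash described in the last paragraph can be
sketched as follows. The Base and Derived classes are hypothetical and the
example is written for a modern Python 3 interpreter; it only illustrates
the failure mode and is not taken from any real code base.

```python
class Base(object):
    def __init__(self):
        self.foo = []              # Base uses 'foo' as a list of items

    def add(self, item):
        self.foo.append(item)


class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.foo = 0               # author unaware of Base.foo: rebinds it


d = Derived()
try:
    d.add("spam")                  # Base.add now finds an int, not a list
    clash_detected = False
except AttributeError:
    clash_detected = True

print(clash_detected)
```

Nothing complains when Derived is defined; the inconsistency only shows up
when a base class method runs against the rebound attribute.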
> > > 3) Do you intend to enforce monotonicity for all methods and
> > >    slots?  (Clearly, this is not desirable for instance
> > >    __dict__ attributes.)
> >
> > If I understand the concept of monotonicity, no.  Python
> > traditionally allows you to override methods in ways that are
> > incompatible with the contract of the base class method, and I
> > don't intend to forbid this.
>
> For Python, monotonicity means that the instance attributes and
> instance methods of a class are a superset of those of all its
> ancestors.  This is not the way that normal __dict__ attributes work
> in Python, so let's talk only about slots when discussing monotonic
> properties.

I'm not sure what you mean by "this is not the way that normal
__dict__ attrs work", unless you are talking about overriding __init__
without calling the base class __init__ (and perhaps the same for
other methods), which of course can mean that a derived class instance
lacks an ivar that a base class instance would have.  This is Pythonic
freedom IMO.  Since it's not true for regular ivars, why worry about
it for slots?

> In other words, it means that the metaclass interface
> does not provide a way to delete a slot or a method, only ways to
> add and override them.  Combined with some static type information,
> the assumption of monotonicity will be very helpful when we can
> eventually compile Python.

I don't think we should be guided here by what might be needed by a
compiler.  Without actually trying to build a compiler, we'll probably
miss important requirements that mean we'll have to change the
language anyway, and we'll impose requirements that we think might be
important without a good reason.  (E.g. structured programming was
once thought of as an aid to compiler technology as well as to the
human reader.  Nowadays, optimizers reduce all control flow to labels
and goto statements.
:-)

> > It would be good if PyChecker checked for accidental mistakes in
> > this area, and maybe there should be a way to declare that you do
> > want this enforced; I don't know how though.
>
> I have a pretty good idea how.  It's essentially a proof-based method
> that works by solving metatype constraints.

Isn't that how most of PyChecker works?  At least the proof-based part?

> > There's also the issue that (again, if I remember the concepts right)
> > there are some semantic requirements that would be really hard to
> > check at compile time for Python.
>
> True for __dict__ instance attributes, not for slots!

Again, you're trying to hijack slots for purposes for which they
weren't created.  Think of slots as an efficiency hack, *not* as a
better way to declare ivars.

> > > 4) Should descriptors work cooperatively?  i.e., allowing a
> > >    'super' call within __get__ and __set__.
> >
> > I don't think so, but I haven't thought through all the
> > consequences (I'm not sure why you're asking this, and whether
> > it's still a relevant question after my responses above).  You can
> > do this for properties though.
>
>     class Foo(object):
>         __slots__=()
>         a = 1
>
>     class Bar(Foo):
>         __slots__ = ('a',)
>
>     bar = Bar()
>     print dir(a)
>     print a

That's a NameError, I suppose you meant 'bar' instead of 'a' in the
last two lines, then it makes sense. :-)

> The resolution rule for descriptors could work cooperatively to find
> Foo's class attribute 'a' instead of giving up with an
> AttributeError.

Once a descriptor is found, that's the end of the line.  When you find
a method, you call it, and it raises an exception, you're not going to
continue looking for a base class method either!

The descriptor type used to implement slots could do this, but
doesn't.  I don't care about this feature.  With a __dict__, there's
some real saving in not storing default values, since it means a
smaller dict, which can save space.  The slot space is always there,
so you might as well initialize it.
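
The behaviour discussed above can be sketched with the corrected form
of the quoted example, i.e. using 'bar' in the last two lines.  This
is a minimal sketch, assuming that current Python versions still
behave this way; print is written as a function call and the
try/except is an addition so that the script runs to completion.

```python
class Foo(object):
    __slots__ = ()
    a = 1                      # plain class attribute on the base


class Bar(Foo):
    __slots__ = ('a',)         # slot descriptor with the same name


bar = Bar()

# The slot descriptor defined on Bar makes 'a' show up in dir().
in_dir = 'a' in dir(bar)

# Lookup stops at Bar's slot descriptor; the slot is still empty, so
# this raises AttributeError instead of falling back to Foo.a.
try:
    value = bar.a
except AttributeError:
    value = None

print(in_dir, value)

# Initializing the slot, as suggested above, avoids the problem.
bar.a = 2
print(bar.a, Foo.a)
```

Foo.a itself is untouched: it is only hidden from instances of Bar by
the descriptor that implements the slot.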

Concluding: don't expect that you can take an arbitrary class, analyze
what ivars it uses, and add a __slots__ variable to speed it up.
There are lots of differences in semantics when you use slots, and I
don't want to hide those.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin@v.loewis.de Thu Feb 28 21:51:46 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 28 Feb 2002 22:51:46 +0100
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding)
In-Reply-To: <3C7DF381.C2E1335A@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <3C7CA3E2.C3705289@lemburg.com>
 <3C7CAD5D.6692F44@lemburg.com>
 <15485.15623.543255.443894@anthem.wooz.org>
 <3C7DF381.C2E1335A@lemburg.com>
Message-ID:

"M.-A. Lemburg" writes:

> Which wrapper APIs do we currently have which could actually
> be made part of the Python core ?

On Unix, we have iconv(3).

On Windows, we have MultiByteToWideChar, which would need to be
wrapped with a map translating codec names to codepage numbers.  There
is also a codec API through a COM interface provided by Internet
Exploder; I don't have the name of that interface right now.

On all platforms, we could easily wrap the Tcl encodings, which are
available everywhere where Python is available.  Not sure what the
performance implications would be.  There also could be a wrapper
around ICU.

On OS X, CFStringCreateFromExternalRepresentation could be used.

> Aside: while it's true that we could use those, the Unicode
> implementation has shown that rolling our own has worked out
> quite well too.

There have been a few correctness glitches in those, but overall, I'd
agree that they have worked quite well.
Performance is a different issue, though; people just haven't
complained, yet, IMO.

Regards,
Martin

From martin@v.loewis.de Thu Feb 28 22:00:01 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 28 Feb 2002 23:00:01 +0100
Subject: [Python-Dev] PEP 1 update
In-Reply-To: <017101c1c084$e85cdaa0$6d94fea9@newmexico>
References: <20020228181706.D0335E8C7@waltz.rahul.net>
 <017101c1c084$e85cdaa0$6d94fea9@newmexico>
Message-ID:

"Samuele Pedroni" writes:

> Just my impressions.

I agree with the observations, but what would you do about this?

Regards,
Martin

From tim@zope.com Thu Feb 28 22:30:49 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 28 Feb 2002 17:30:49 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <3C7CD8B7.3E9A89A3@zope.com>
Message-ID:

[Jim Fulton]
>>> ZODB has a TimeStamp type that uses a 32-bit unsigned integer
>>> to store year, month, day, hour, and minute in a way that
>>> makes it dirt simple to extract a component.

[Tim]
>> You really think so?  It's a mixed-radix scheme:
>>
>>     v=((((y-1900)*12+mo-1)*31+d-1)*24+h)*60+m;
>>
>> so requires lots of expensive integer division and remainder ...

[Jim]
> Compared to storing date-times as offsets from an epoch, this is
> much simpler and cheaper.

OK, as with most things, it boils down to the definition of dirt:
you're contrasting hard-packed dirt with a 21%-dirt 79%-concrete mix,
and I'm contrasting hard-packed dirt with household dust.  I'm sure
you'll agree that's a rigorously correct summary.

From pedroni@inf.ethz.ch Thu Feb 28 23:21:29 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Fri, 1 Mar 2002 00:21:29 +0100
Subject: [Python-Dev] PEP 1 update
References: <20020228181706.D0335E8C7@waltz.rahul.net>
 <017101c1c084$e85cdaa0$6d94fea9@newmexico>
Message-ID: <035601c1c0ae$a6c11a00$6d94fea9@newmexico>

From: Martin v. Loewis
> "Samuele Pedroni" writes:
>
> > Just my impressions.
>
> I agree with the observations, but what would you do about this?

Some possible proposals (more or less easy to implement)

From PEP 1:

    Standards track PEPs must have a Python-Version: header which
    indicates the version of Python that the feature will be released
    with.  Informational PEPs do not need a Python-Version: header.

- have in the summary an active (standard track) PEP category:
  e.g. PEP 237, PEP 252, PEP 253, PEP 238 should go there;
  maybe use for them the Python-Version (possibly renamed
  Implementation-Python-Versions: ) in a reasonable imaginative way

    PEP 237: 2.2-2.3-2.4,...3.0
    PEP 238: 2.2...3.0
    PEP 252: 2.2....

- PEPs for which it is not clear whether they will be implemented
  should have Python-Version: ?,
  I think that for example PEPs 273 and 277 are fine reporting
  Python-Version: 2.3

- Maybe open PEPs should be divided between those that have
  at least a proof-of-concept or ref impl, and those that don't
  have one (the latter for obvious reasons are less likely to be
  implemented).
  Maybe other/richer categorizations would make sense but those
  would require more bureaucracy.

- Maybe status should go a bit beyond the actual draft/final
  dichotomy but this needs discussion
  (thinking out loud: draft -> draft-stable vs. draft-incomplete
  or open-draft)
  OTOH the above proposals should already improve things
  a bit (if they are practicable).

- PEP workflow: at the moment it seems that a PEP champion
  can ask the BDFL to accept/reject and then things
  should reach "quickly" a final settlement.
  (Are all the PEPpers aware of this?  Sometimes it seems not,
  for some of the PEPs hanging around.)
  Now if this would happen for all the PEPs on the plate,
  Guido would have a hard time :-)
  I think it is up to Guido to think/decide/change things
  in this respect.
  (For sure I miss the pie-in-the-sky category, maybe
  Guido should sometimes go over all the PEPs and
  assign acceptance likelihood measures, half-kidding.)

Just some vague ideas.

regards, Samuele Pedroni.

From pedroni@inf.ethz.ch Thu Feb 28 23:31:27 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Fri, 1 Mar 2002 00:31:27 +0100
Subject: [Python-Dev] PEP 1 update
References: <20020228181706.D0335E8C7@waltz.rahul.net>
 <017101c1c084$e85cdaa0$6d94fea9@newmexico>
 <035601c1c0ae$a6c11a00$6d94fea9@newmexico>
Message-ID: <036a01c1c0b0$0b6f38a0$6d94fea9@newmexico>

Another maybe valuable thing:

probably another useful heuristic
to divide the open PEPs
beyond proof-of-concept/no-proof-of-concept

is new-syntax/new-keywords/new-"funny"-semantics/
non-backward compatible

vs. infrastructure/library/etc/BDFL championed

Those PEPs especially make people wonder:
will that happen to my favorite language,
oh god, when?, it seems real soon now - gulp, gasp.

regards, Samuele Pedroni.

From pedroni@inf.ethz.ch Thu Feb 28 23:59:08 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Fri, 1 Mar 2002 00:59:08 +0100
Subject: R: [Python-Dev] PEP 1 update
References: <20020228181706.D0335E8C7@waltz.rahul.net>
 <017101c1c084$e85cdaa0$6d94fea9@newmexico>
 <035601c1c0ae$a6c11a00$6d94fea9@newmexico>
 <036a01c1c0b0$0b6f38a0$6d94fea9@newmexico>
Message-ID: <039a01c1c0b3$e99946e0$6d94fea9@newmexico>

From: Samuele Pedroni
> Another maybe valuable thing:
>
> probably another useful heuristic
> to divide the open PEPs
> beyond proof-of-concept/no-proof-of-concept
>
> is new-syntax/new-keywords/new-"funny"-semantics/
> non-backward compatible
>
> vs. infrastructure/library/etc/BDFL championed
>
> Those PEPs especially make people wonder:
> will that happen to my favorite language,
> oh god, when?, it seems real soon now - gulp, gasp.
>

Clearly my point is not against "disruptive" changes
but about making clear for people what is ongoing-work-in-progress
and what is still just undecided.

regards, Samuele Pedroni.