From gmcm@hypernet.com Thu Feb 3 13:41:29 2000 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 3 Feb 2000 08:41:29 -0500 Subject: [Import-sig] Kick-off Message-ID: <1262537223-4632885@hypernet.com> I guess the first order of business is to establish some objectives. I see two goals (my version of what happened at dev-day): Short-term: Provide a "new architecture import hooks" module for the standard library. This would deprecate ihooks and friends, and provide developers with a way of learning the new architecture. Long-term: Reform the entire import architecture of Python. This affects Python start-up, the semantics of sys.path, and the C API to importing. The model for this is, of course, Greg's imputil.py (Greg, your latest version is not yet on your website, which still has a November version). If this seems like a reasonable statement, I'll massage and expand it into the import SIG's homepage. I'll also try to produce legible downloads of the handouts I passed out at the dev day session. - Gordon From guido@python.org Thu Feb 3 13:48:56 2000 From: guido@python.org (Guido van Rossum) Date: Thu, 03 Feb 2000 08:48:56 -0500 Subject: [Import-sig] Kick-off In-Reply-To: Your message of "Thu, 03 Feb 2000 08:41:29 EST." <1262537223-4632885@hypernet.com> References: <1262537223-4632885@hypernet.com> Message-ID: <200002031348.IAA26846@eric.cnri.reston.va.us> > I guess the first order of business is to establish some > objectives. I see two goals (my version of what happened at > dev-day): > > Short-term: Provide a "new architecture import hooks" module > for the standard library. This would deprecate ihooks and > friends, and provide developers with a way of learning the new > architecture. > > Long-term: Reform the entire import architecture of Python. > This affects Python start-up, the semantics of sys.path, and > the C API to importing.
> > The model for this is, of course, Greg's imputil.py (Greg, your > latest version is not yet on your website which still has a > November version). > > If this seems like a reasonable statement, I'll massage and > expand it into the import SIG's homepage. I'll also try to > produce legible downloads of the handouts I passed out at the > dev day session. One addition: at the devday meeting, Michael Reilly objected to the notion of deprecating ihooks -- he has been using ihooks successfully to meet his needs. I think we should think long and hard about thowing ihooks out -- it may be that the problem is simply that it's not well documented (actually, undocumented is better :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Thu Feb 3 13:53:55 2000 From: gstein@lyra.org (Greg Stein) Date: Thu, 3 Feb 2000 05:53:55 -0800 (PST) Subject: [Import-sig] Kick-off In-Reply-To: <1262537223-4632885@hypernet.com> Message-ID: On Thu, 3 Feb 2000, Gordon McMillan wrote: >... > I guess the first order of business is to establish some > objectives. I see two goals (my version of what happend at dev- > day): > > Short-term: Provide a "new architecture import hooks" module > for the standard library. This would deprecate ihooks and > friends, and provide developers with a way of learning the new > architecture. > > Long-term: Reform the entire import architecture of Python. > This affects Python start-up, the semantics of sys.path, and > the C API to importing. > > The model for this is, of course, Greg's imputil.py (Greg, your > latest version is not yet on your website which still has a > November version). My latest version is available from the CVS repository. The module is easily accessed via: http://www.lyra.org/cgi-bin/viewcvs.cgi/gjspy/imputil.py I haven't posted it to the web site because the web version is the "public, stable" version. I'll add the above link so that people can get the "development" version. 
Specifically, the new version has some outstanding feedback from MAL and a couple others that I need to handle. There is also some basic cleanup to do. And dealing with any feedback from Guido (which was deferred (partially) due to the types-sig). [ of course, please feel free to link this stuff from the import-sig web pages ] > If this seems like a reasonable statement, I'll massage and > expand it into the import SIG's homepage. I'll also try to > produce legible downloads of the handouts I passed out at the > dev day session. I'm not quite sure what the "short-term" means/implies, so any summary from the dev-day session would be great. For the long-term stuff, I'm fine with your goal statement. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Feb 3 14:02:46 2000 From: gstein@lyra.org (Greg Stein) Date: Thu, 3 Feb 2000 06:02:46 -0800 (PST) Subject: [Import-sig] deprecate ihooks? (was: Kick-off) In-Reply-To: <200002031348.IAA26846@eric.cnri.reston.va.us> Message-ID: On Thu, 3 Feb 2000, Guido van Rossum wrote: > Gordon wrote: >... > > Short-term: Provide a "new architecture import hooks" module > > for the standard library. This would deprecate ihooks and > > friends, and provide developers with a way of learning the new > > architecture. >... > One addition: at the devday meeting, Michael Reilly objected to the > notion of deprecating ihooks -- he has been using ihooks successfully > to meet his needs. I think we should think long and hard about > thowing ihooks out -- it may be that the problem is simply that it's > not well documented (actually, undocumented is better :-). There are a lot of modules that people have used in the past, which are now deprecated (I count 17 modules in Lib/lib-old). Deprecating a module is simply signalling an intent to move to a new system (and, hopefully, a better one). As long as we're improving things, then I don't see a problem with noting some older stuff should not be used. 
In other words, there will sometimes be sacrifices in the name of overall improvement. ihooks will continue to be available in the lib-old library in the distribution (or redistributed with the apps that require it). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gmcm@hypernet.com Thu Feb 3 14:15:19 2000 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 3 Feb 2000 09:15:19 -0500 Subject: [Import-sig] Kick-off In-Reply-To: <200002031348.IAA26846@eric.cnri.reston.va.us> References: Your message of "Thu, 03 Feb 2000 08:41:29 EST." <1262537223-4632885@hypernet.com> Message-ID: <1262535193-4755007@hypernet.com> [Gordon] > > Short-term: Provide a "new architecture import hooks" module > > for the standard library. This would deprecate ihooks and > > friends, and provide developers with a way of learning the new > > architecture. [Guido] > One addition: at the devday meeting, Michael Reilly objected to the > notion of deprecating ihooks -- he has been using ihooks successfully > to meet his needs. I think we should think long and hard about > thowing ihooks out -- it may be that the problem is simply that it's > not well documented (actually, undocumented is better :-). I have badgered Michael into joining the SIG, so hopefully we can iron this out. I didn't want to pollute the objectives with a bunch of issues, but I'll do that now. Controversies ------------------- 1) Speed (at least when imputil is used as _the_ import mechanism, without making use of it's new features). 2) Lack of certain hook features (I'm unclear on this; I *think* Michael's complaints fit in here). 3) Lack of package __path__ mechanism. 4) The need to flesh out the ImportManager. Anything else? - Gordon From guido@python.org Thu Feb 3 14:15:56 2000 From: guido@python.org (Guido van Rossum) Date: Thu, 03 Feb 2000 09:15:56 -0500 Subject: [Import-sig] deprecate ihooks? (was: Kick-off) In-Reply-To: Your message of "Thu, 03 Feb 2000 06:02:46 PST." 
References: Message-ID: <200002031415.JAA28876@eric.cnri.reston.va.us> > > One addition: at the devday meeting, Michael Reilly objected to the > > notion of deprecating ihooks -- he has been using ihooks successfully > > to meet his needs. I think we should think long and hard about > > thowing ihooks out -- it may be that the problem is simply that it's > > not well documented (actually, undocumented is better :-). > > There are a lot of modules that people have used in the past, which are > now deprecated (I count 17 modules in Lib/lib-old). Deprecating a module > is simply signalling an intent to move to a new system (and, hopefully, a > better one). As long as we're improving things, then I don't see a problem > with noting some older stuff should not be used. In other words, there > will sometimes be sacrifices in the name of overall improvement. > > ihooks will continue to be available in the lib-old library in the > distribution (or redistributed with the apps that require it). Maybe I didn't explain it well enough. I think ihooks actually has a better base architecture than your imputil (except it's missing the import manager, which is a separate thing anyway). --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Fri Feb 4 09:27:51 2000 From: gstein@lyra.org (Greg Stein) Date: Fri, 4 Feb 2000 01:27:51 -0800 (PST) Subject: [Import-sig] deprecate ihooks? In-Reply-To: <200002031415.JAA28876@eric.cnri.reston.va.us> Message-ID: On Thu, 3 Feb 2000, Guido van Rossum wrote: >... > Maybe I didn't explain it well enough. I think ihooks actually has a > better base architecture than your imputil (except it's missing the > import manager, which is a separate thing anyway). Ah! Now we get to it :-) I'd love to hear any feedback that you have on imputil. I've received some from a few others that is awaiting incorporation. How would you like to do this to minimize your review time? 
Should I fold in the other feedback first (to eliminate duplicate feedback)? Do you want to do a quick, high-level feedback? Or go for broke with a fully detailed review? :-) It may also be constructive to compare it against the requirements for a new import mechanism that you set up in: Initial requirements list: http://www.python.org/pipermail/python-dev/1999-November/002867.html My response, with imputil in mind: http://www.python.org/pipermail/python-dev/1999-November/002899.html Your response and modified requirements: http://www.python.org/pipermail/python-dev/1999-December/002973.html I'll get my Python page updated in a moment with a pointer to the CVS version of imputil. For reference: http://www.lyra.org/greg/python/ Cheers, -g -- Greg Stein, http://www.lyra.org/ From gmcm@hypernet.com Fri Feb 4 15:52:20 2000 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 4 Feb 2000 10:52:20 -0500 Subject: [Import-sig] Requirements Message-ID: <1262442956-256070@hypernet.com> Going through the links Greg posted to the dev list discussion, I find some things that I think need clarification: [Guido] > - the core API may be incompatible, as long as > compatibility layers can be provided in pure Python From the C API, we have PyImport_Import which does the same as (keyword) import. But PyImport_ImportModule and ...Ex are lower level. I assume that modulo some arg munging, these also will do the same as (keyword) import. Decent assumption? [Guido] > - support for freeze functionality Heh, heh. The current modulefinder works by (yet again) emulating the entire import process, but not letting the "imported" code leak out. In imputil, it's the Importer base class that does the "leaking", not code in a (well-behaved) derived class. So that opens the possibility of replacing the Importer object in the derived class's bases with a PhonyImporter that doesn't leak. So modulefinder could use the derived class and wouldn't have to emulate. 
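A rough sketch of the "non-leaking" idea above, in present-day Python rather than the 1.5-era code under discussion. Everything here is hypothetical: `Importer`, `DictImporter`, `PhonyMixin`, `make_phony`, and the `get_code`/`import_module` method names are invented for illustration and are not imputil's actual API. Instead of literally rewriting a class's bases, the sketch places a mixin ahead of the derived class in the method resolution order, which has the same effect of overriding the one method that touches sys.modules.

```python
import sys
import types


class Importer:
    """Hypothetical base class: the only code that installs modules."""

    def get_code(self, name):
        # Derived classes return source text for `name`, or None.
        raise NotImplementedError

    def import_module(self, name):
        source = self.get_code(name)
        if source is None:
            raise ImportError(name)
        module = types.ModuleType(name)
        sys.modules[name] = module  # this is the "leak"
        exec(compile(source, "<%s>" % name, "exec"), module.__dict__)
        return module


class DictImporter(Importer):
    """A well-behaved derived class: it only knows how to find code."""

    def __init__(self, sources):
        self.sources = sources

    def get_code(self, name):
        return self.sources.get(name)


class PhonyMixin:
    """Overrides the leaking method: find the code, but never install it."""

    def import_module(self, name):
        source = self.get_code(name)
        if source is None:
            raise ImportError(name)
        # Report what was found and which importer found it.
        return {"name": name, "importer": self, "source": source}


def make_phony(importer_class):
    # Build a variant of importer_class whose lookups stay out of sys.modules.
    return type("Phony" + importer_class.__name__,
                (PhonyMixin, importer_class), {})
```

A modulefinder-style tool could then call `make_phony(DictImporter)({...}).import_module(name)` and get back a record of what was found, while the derived class's own lookup logic is reused unchanged.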
However, modulefinder would have to report more information - the importer that found the module, as well as the file/URL/whatever it found it in. [Guido] > - sys.path and sys.modules should still exist; sys.path > might have a slightly different meaning and > - Standard import from zip or jar files, in two ways: > (1) an entry on sys.path can be a zip/jar file instead of a > directory; its contents will be searched for modules or > packages > (2) a file in a directory that's on sys.path can be a zip/jar > file;its contents will be considered as a package It looks like we're very close to this. Maybe already there (once a suffix importer has been written for a zip file). In the current version, items on sys.path can be directory names or importer instances. Obviously at startup, sys.path is nothing more that strings. Also (per other discussions), sys.path starts as a minimal boot path, and gets expanded from Python. What is this mechanism? Do we worry about: Network installations in heterogeneous environments? Ditto in homogeneous environments? Multiple incompatible installations? (I vote "yes" on the last two, and "maybe" on the first; mainly because the latter two can be solved by figuring a boot path based on the location of the executable). Should the syntax of .pth files be expanded to allow specifying importer instances? Or do we use sitecustomize? Mmph. Enough for now... - Gordon From Fredrik Lundh" Gordon: > > Short-term: Provide a "new architecture import hooks" module=20 > > for the standard library. This would deprecate ihooks and=20 > > friends, and provide developers with a way of learning the new=20 > > architecture. Guido: > One addition: at the devday meeting, Michael Reilly objected to the > notion of deprecating ihooks -- he has been using ihooks successfully > to meet his needs. 
I think we should think long and hard about > throwing ihooks out -- it may be that the problem is simply that it's > not well documented (actually, undocumented is better :-). Greg: > There are a lot of modules that people have used in the past, which are > now deprecated (I count 17 modules in Lib/lib-old). Deprecating a module > is simply signalling an intent to move to a new system (and, hopefully, a > better one). As long as we're improving things, then I don't see a problem > with noting some older stuff should not be used. In other words, there > will sometimes be sacrifices in the name of overall improvement. we've also used ihooks in a number of places, with great success. on the other hand, changing to imputil was hardly any work at all... so I guess The Question is whether the find/load separation is really necessary. I cannot think of a reason, but that's probably just me... cheers /Gredrik (at home) "Sometimes, when you are a Bear of Very Little Brain, and you Think of Things, you find sometimes that a Thing which seemed very Thingish inside you is quite different when it gets out into the open and has other people looking at it." -- Pooh From Fredrik Lundh" Gordon wrote: > Short-term: Provide a "new architecture import hooks" module > for the standard library. This would deprecate ihooks and > friends, and provide developers with a way of learning the new > architecture. 1.6.1 <= version <= 1.7, right? > Long-term: Reform the entire import architecture of Python. > This affects Python start-up, the semantics of sys.path, and > the C API to importing. 1.7 <= version < 3000, right? > The model for this is, of course, Greg's imputil.py (Greg, your > latest version is not yet on your website which still has a > November version).
I'd like to add an ultra-short-term issue: possible changes to 1.6.0 that make it easier to experiment with alternate import strategies, mostly for installation tools like gordon's install and pythonworks' deployment subsystem. (as discussed on last week's consortium meeting) most importantly, I'd like to come up with a way to execute small snippets of script code *before* Python attempts to import stuff like exceptions.py. From guido@python.org Fri Feb 4 18:48:07 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 04 Feb 2000 13:48:07 -0500 Subject: [Import-sig] Kick-off In-Reply-To: Your message of "Fri, 04 Feb 2000 19:47:28 +0100." <025601bf6f40$4aff5220$f4a7b5d4@hagrid> References: <025601bf6f40$4aff5220$f4a7b5d4@hagrid> Message-ID: <200002041848.NAA14651@eric.cnri.reston.va.us> > I'd like to add an ultra-short-term issue: possible changes > to 1.6.0 that make it easier to experiment with alternate > import strategies, mostly for installation tools like gordon's > install and pythonworks' deployment subsystem. > > (as discussed on last week's consortium meeting) > > most importantly, I'd like to come up with a way to execute > small snippets of script code *before* Python attempts to > import stuff like exceptions.py. In my notes of the consortium meeting, I have "punch a tiny hole in the interpreter through which /F can drive his truck." Unfortunately I don't recall where the hole should be punched. Since you have an application for this, can you remind me? --Guido van Rossum (home page: http://www.python.org/~guido/) From Fredrik Lundh" <200002041848.NAA14651@eric.cnri.reston.va.us> Message-ID: <026801bf6f43$16b7cee0$f4a7b5d4@hagrid> Guido van Rossum wrote: > > I'd like to add an ultra-short-term issue: possible changes > > to 1.6.0 that make it easier to experiment with alternate > > import strategies > > In my notes of the consortium meeting, I have "punch a tiny hole in the > interpreter through which /F can drive his truck."
> > Unfortunately I don't recall where the hole should be punched. Since > you have an application for this, can you remind me? I have an idea or two, but I gotta ship that SRE kit first... (soon) (just thought that the import-siggers might come up with some additional ideas while I'm busy doing that) From guido@python.org Fri Feb 4 19:26:34 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 04 Feb 2000 14:26:34 -0500 Subject: [Import-sig] Requirements In-Reply-To: Your message of "Fri, 04 Feb 2000 10:52:20 EST." <1262442956-256070@hypernet.com> References: <1262442956-256070@hypernet.com> Message-ID: <200002041926.OAA14816@eric.cnri.reston.va.us> > From the C API, we have PyImport_Import which does the > same as (keyword) import. But PyImport_ImportModule and > ...Ex are lower level. I assume that modulo some arg > munging, these also will do the same as (keyword) import. > Decent assumption? I suppose you mean they should do the same in the new design, because they would only be there for b/w compatibility? Right now they are designed to be different -- in particular PyImport_Import() calls __import__(), which calls PyImport_ImportModule[Ex](). Do we want to keep the override-__import__ hook? Who else uses PyImport_ImportModule[Ex]()? > [Guido] > > - support for freeze functionality > > Heh, heh. The current modulefinder works by (yet again) > emulating the entire import process, but not letting the > "imported" code leak out. Actually, it uses a wrapper around imp.find_module() that checks for a few special cases and otherwise hands the query off to imp.find_module()! (I don't understand why there's a special case for looking in the Windows registry; find_module() should already do that, too.) > In imputil, it's the Importer base > class that does the "leaking", not code in a (well-behaved) > derived class. So that opens the possibility of replacing the > Importer object in the derived class's bases with a > PhonyImporter that doesn't leak.
So modulefinder could use > the derived class and wouldn't have to emulate. However, > modulefinder would have to report more information - the > importer that found the module, as well as the > file/URL/whatever it found it in. I'm afraid you've lost me here. What does "leaking" refer to? > [Guido] > > - sys.path and sys.modules should still exist; sys.path > > might have a slightly different meaning > > and > > > - Standard import from zip or jar files, in two ways: > > (1) an entry on sys.path can be a zip/jar file instead of a > > directory; its contents will be searched for modules or > > packages > > (2) a file in a directory that's on sys.path can be a zip/jar > > file;its contents will be considered as a package > > It looks like we're very close to this. Maybe already there > (once a suffix importer has been written for a zip file). > > In the current version, items on sys.path can be directory > names or importer instances. Obviously at startup, sys.path is > nothing more that strings. Also (per other discussions), > sys.path starts as a minimal boot path, and gets expanded > from Python. > > What is this mechanism? Look at Modules/getpath.c and PC/getpathp.c. Or are you asking about how the mechanism should be redesigned? > Do we worry about: > Network installations in heterogeneous environments? Yes, by supporting sys.exec_prefix. This has consequences for getpath.c, see there. I think this support has lost its significance with the advent of fast disks, but I'm not going to fight millions of sysadmins stuck in the past, so we have to continue to support it. It's no big deal anyway. > Ditto in homogeneous environments? How can you even tell? Maybe I don't understand what you are talking about (and then my previous response also doesn't make sense?) > Multiple incompatible installations? Emphatically yes. A Python binary should be able to find out where the rest of its installation is. 
This is a platform specific problem (hence getpath.c and PC/getpathp.c). Note that on Windows there's the added problem of Mark Hammond's COM support. COM services implemented by Python can be started on the fly without starting python.exe, e.g. by embedding such a COM object in a Word document. The consequence of this (I've been told) is that the python15.dll file must live in the system directory (\WinNT, \Windows, etc.). This means that its path is useless to find the rest of the installation, and that's why we're using the registry. I don't know if all of this is still true; I would think that if a COM support DLL lives somewhere else, the registry could point to it? But who am I to argue with Microsoft. Anyway I wouldn't mind if this was somehow solved differently; for example there could be another copy of python15.dll in the Python install dir which was used normally, and in that case the registry wouldn't be needed. > (I vote "yes" on the last two, and "maybe" on the first; mainly > because the latter two can be solved by figuring a boot path > based on the location of the executable). > > Should the syntax of .pth files be expanded to allow specifying > importer instances? Or do we use sitecustomize? Do you really think that will be used? There would seem to be a chicken/egg problem. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Fri Feb 4 21:29:45 2000 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 4 Feb 2000 16:29:45 -0500 Subject: [Import-sig] Requirements In-Reply-To: <200002041926.OAA14816@eric.cnri.reston.va.us> References: Your message of "Fri, 04 Feb 2000 10:52:20 EST." <1262442956-256070@hypernet.com> Message-ID: <1262422710-1474030@hypernet.com> [me] > > From the C API, we have PyImport_Import which does the > > same as (keyword) import. But PyImport_ImportModule and > > ...Ex are lower level. I assume that modulo some arg > > munging, these also will do the same as (keyword) import. 
> > Decent assumption? [Guido] > I suppose you mean they should do the same in the new design, because > they would only be there for b/w compatibility? Yes. > Right now they are > designed to be different -- in particular PyImport_Import() calls > __import__() calls PyImport_ImportModule[Ex](). > > Do we want to keep the override-__import__ hook? We need a builtin function (so you can use a runtime arg; and not be forced to exec). But there's not much sense in making it hookable, when the whole import system is a set of hooks. > Who else uses PyImport_ImportModule[Ex]()? In my experience, almost all extension writers use PyImport_ImportModule, not PyImport_Import. I think this is speed-freakism, not for functionality (which could only be to avoid hooks). > > [Guido] > > > - support for freeze functionality > > > > Heh, heh. The current modulefinder works by (yet again) > > emulating the entire import process, but not letting the > > "imported" code leak out. > > In imputil, it's the Importer base > > class that does the "leaking", not code in a (well-behaved) > > derived class. So that opens the possibility of replacing the > > Importer object in the derived class's bases with a > > PhonyImporter that doesn't leak. So modulefinder could use > > the derived class and wouldn't have to emulate. However, > > modulefinder would have to report more information - the > > importer that found the module, as well as the > > file/URL/whatever it found it in. > > I'm afraid you've lost me here. What does "leaking" refer to? Letting the module into sys.modules or any real namespace. Just pointing out that a new modulefinder should be able to follow the hooks without excessive effort. [zip files on or in sys.path...] > > In the current version, items on sys.path can be directory > > names or importer instances. Obviously at startup, sys.path is > > nothing more that strings. 
Also (per other discussions), > > sys.path starts as a minimal boot path, and gets expanded > > from Python. > > > > What is this mechanism? > > Look at Modules/getpath.c and PC/getpathp.c. Or are you asking about > how the mechanism should be redesigned? Yes. "Current version" meant "of imputil". Sorry. > > Do we worry about: > > Network installations in heterogeneous environments? > > Yes, by supporting sys.exec_prefix. This has consequences for > getpath.c, see there. I think this support has lost its significance > with the advent of fast disks, but I'm not going to fight millions of > sysadmins stuck in the past, so we have to continue to support it. > It's no big deal anyway. > > > Ditto in homogeneous environments? > > How can you even tell? Maybe I don't understand what you are talking > about (and then my previous response also doesn't make sense?) Terms: by "heterogeneous" I meant, eg, a Solaris server with Solaris, Windows and Linux clients. By "homogeneous" I meant clients (and probably server) are all binary compatible. I *think* "homogeneous" is more-or-less solved when "multiple incompatible installations" is solved. The added complexity of "heterogeneous" being the plat_xxx libraries (and what package authors have to do), which appears to be getting deprecated(?). > > Multiple incompatible installations? > > Emphatically yes. A Python binary should be able to find out where > the rest of its installation is. This is a platform specific problem > (hence getpath.c and PC/getpathp.c). Um, yes and no (to it being platform specific). Yes in that you can't follow symlinks on Windows, or easily get the absolute path name of the executable in some *nixen. No, in that I feel strongly (modulo some of the COM stuff below) that the psuedo-code should be the same - just think of distutils and package authors! > Note that on Windows there's the added problem of Mark Hammond's COM > support. 
COM services implemented by Python can be started on the fly > without starting python.exe, e.g. by embedding such a COM object in a > Word document. The consequence of this (I've been told) is that the > python15.dll file must live in the system directory (\WinNT, \Windows, > etc.). This means that its path is useless to find the rest of the > installation, and that's why we're using the registry. > > I don't know if all of this is still true; I would think that if a COM > support DLL lives somewhere else, the registry could point to it? But > who am I to argue with Microsoft. I *think* this problem has been solved, and the registry can point wherever it wants, but I'm not the expert. If this stuff is still needed, perhaps it could be fallback: "Oops, I can't figure out PYTHONPATH, so I'll look in the registry". I'll forward this question to Mark. > Anyway I wouldn't mind if this was somehow solved differently; Amen. > > Should the syntax of .pth files be expanded to allow specifying > > importer instances? Or do we use sitecustomize? > > Do you really think that will be used? There would seem to be a > chicken/egg problem. Categorizing it doesn't solve it ;-). OK, we don't need a concrete solution now, but is this a reasonable approach? 1) Py_Initialize calls getpath.c 2) getpath.c returns a directory (or very short list thereof). 3) Fredrik drives his truck through (me too) 4) A frozen-in exceptions.py gets imported 5) sys.path gets expanded by looking for (something | some things) in the existing sys.path and ( executing | reading ) them. Maybe 3 & 4 are swapped? Maybe some of this is written in Python and frozen in? The, err, obsession here being to make it (1) highly customizable AND (2) generally idiot-resistant. A few simple controls under a bright red hatch cover that says "Warning - touching this stuff will void your warranty". 
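The five steps above can be sketched in a few lines of present-day Python, purely to make the shape of the proposal concrete. This is a simplification under stated assumptions, not what Py_Initialize, getpath.c, or site.py actually do: `bootstrap_path`, `read_pth_entries`, and the `early_hook` argument are hypothetical names, the "something" that gets read is assumed to be plain .pth files with one relative directory per line, and the frozen-in exceptions.py step has no counterpart here. The hook runs before expansion; whether it should run before or after the exceptions import is exactly the open question.

```python
import os


def read_pth_entries(directory):
    """Return the path entries named by every *.pth file in `directory`."""
    entries = []
    try:
        names = sorted(os.listdir(directory))
    except OSError:
        # Missing or unreadable boot directory: contribute nothing.
        return entries
    for name in names:
        if not name.endswith(".pth"):
            continue
        with open(os.path.join(directory, name)) as handle:
            for line in handle:
                line = line.strip()
                # Skip blank lines and comments; treat the rest as
                # directories relative to the one holding the .pth file.
                if line and not line.startswith("#"):
                    entries.append(os.path.join(directory, line))
    return entries


def bootstrap_path(boot_path, early_hook=None):
    """Expand a minimal boot path into a full search path."""
    path = list(boot_path)       # steps 1-2: the short list from "getpath"
    if early_hook is not None:
        early_hook(path)         # step 3: the hole for the truck; may edit path
    for directory in list(path):  # step 5: read things found on the boot path
        for entry in read_pth_entries(directory):
            if entry not in path:
                path.append(entry)
    return path
```

For example, `bootstrap_path(["/opt/py/boot"], early_hook=my_hook)` would let `my_hook` adjust the boot list before any .pth file is consulted, which is the "few simple controls under the hatch cover" part; the directory name and `my_hook` are placeholders.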
- Gordon From guido@python.org Fri Feb 4 22:07:04 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 04 Feb 2000 17:07:04 -0500 Subject: [Import-sig] Requirements In-Reply-To: Your message of "Fri, 04 Feb 2000 16:29:45 EST." <1262422710-1474030@hypernet.com> References: Your message of "Fri, 04 Feb 2000 10:52:20 EST." <1262442956-256070@hypernet.com> <1262422710-1474030@hypernet.com> Message-ID: <200002042207.RAA16458@eric.cnri.reston.va.us> [Guido] > > Right now they are > > designed to be different -- in particular PyImport_Import() calls > > __import__() calls PyImport_ImportModule[Ex](). > > > > Do we want to keep the override-__import__ hook? [Gordon] > We need a builtin function (so you can use a runtime arg; and > not be forced to exec). But there's not much sense in making > it hookable, when the whole import system is a set of hooks. Agreed, except for b/w compat. > > Who else uses PyImport_ImportModule[Ex]()? > > In my experience, almost all extension writers use > PyImport_ImportModule, not PyImport_Import. I think this is > speed-freakism, not for functionality (which could only be to > avoid hooks). I think that's more likely because for a long time, PyImport_ImportModule() was the only interface -- PyImport_Import() was added much later (by Jim F who needed access to the hooked code from inside cPickle). > > > Do we worry about: > > > Network installations in heterogeneous environments? > > > > Yes, by supporting sys.exec_prefix. This has consequences for > > getpath.c, see there. I think this support has lost its significance > > with the advent of fast disks, but I'm not going to fight millions of > > sysadmins stuck in the past, so we have to continue to support it. > > It's no big deal anyway. > > > > > Ditto in homogeneous environments? > > > > How can you even tell? Maybe I don't understand what you are talking > > about (and then my previous response also doesn't make sense?) 
> > Terms: by "heterogeneous" I meant, eg, a Solaris server with > Solaris, Windows and Linux clients. By "homogeneous" I > meant clients (and probably server) are all binary compatible. OK, I knew that. > I *think* "homogeneous" is more-or-less solved when "multiple > incompatible installations" is solved. I don't see any problems with homogeneous environments -- what possible problem could there be (that doesn't exist when there's no sharing and that isn't caused by multiple versions)? > The added complexity of "heterogeneous" being the plat_xxx > libraries (and what package authors have to do), which > appears to be getting deprecated(?). > > > > Multiple incompatible installations? > > > > Emphatically yes. A Python binary should be able to find out where > > the rest of its installation is. This is a platform specific problem > > (hence getpath.c and PC/getpathp.c). > > Um, yes and no (to it being platform specific). Yes in that you > can't follow symlinks on Windows, or easily get the absolute > path name of the executable in some *nixen. No, in that I feel > strongly (modulo some of the COM stuff below) that the > psuedo-code should be the same - just think of distutils and > package authors! The pseudo code is also different because the structure of site-packages etc. is different on Windows (it doesn't exist). But I agree that it's a shame that there are two copies of code with very similar functionality, and I'd gladly get rid of one. (There's even a third copy, in the os2 subdirectory!) > > > Should the syntax of .pth files be expanded to allow specifying > > > importer instances? Or do we use sitecustomize? > > > > Do you really think that will be used? There would seem to be a > > chicken/egg problem. > > Categorizing it doesn't solve it ;-). OK, we don't need a > concrete solution now, but is this a reasonable approach? > > 1) Py_Initialize calls getpath.c > 2) getpath.c returns a directory (or very short list thereof). 
> 3) Fredrik drives his truck through (me too) > 4) A frozen-in exceptions.py gets imported > 5) sys.path gets expanded by looking for (something | some > things) in the existing sys.path and ( executing | reading ) > them. > > Maybe 3 & 4 are swapped? Yes, better. > Maybe some of this is written in Python and frozen in? Possibly. > The, err, obsession here being to make it (1) highly > customizable AND (2) generally idiot-resistant. A few simple > controls under a bright red hatch cover that says "Warning - > touching this stuff will void your warranty". Good metaphor. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Fri Feb 4 23:32:17 2000 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 4 Feb 2000 18:32:17 -0500 Subject: [Import-sig] RE: pythonpath and COM support In-Reply-To: References: <1262422703-1474374@hypernet.com> Message-ID: <1262415358-1916426@hypernet.com> Mark, [also posting this to the import-SIG] > I believe that we could drop putting python1x.dll in the system directory, > but this would have the following implications: > > * All Python executables would need to exist in the same directory as the > .dll. Python.exe and Pythonw.exe already would, but "3rd party" > executables, such as Pythonwin.exe or any other .exes supplied by extension > authors would also need to live in that directory. Ditto for other DLLs, > such as pythoncom15.dll, pywintypes15.dll, etc. Couldn't pythoncom15.dll etc. live in /DLLs? Or are they loaded by other C extensions (using LoadLibrary instead of PyImport_x)? > * The path searching code would need to use the location of Python1x.dll, > rather than the .exe, to locate the PYTHONHOME. This would not be a huge > change, but necessary none-the-less. Um, why? Especially if they're the same directory ;-)? Oh, because of COM (and exposing python15.dll as a COM server)? > I think this would definitely be a win, and would be happy to help make this > happen.
At the very least, would get around the must-have-admin-rights on NT problem. Thanks, - Gordon From mhammond@skippinet.com.au Fri Feb 4 23:44:38 2000 From: mhammond@skippinet.com.au (Mark Hammond) Date: Sat, 5 Feb 2000 10:44:38 +1100 Subject: [Import-sig] RE: pythonpath and COM support In-Reply-To: <1262415358-1916426@hypernet.com> Message-ID: > [also posting this to the import-SIG] Sheesh - too many new sigs. I'm not on that one, so please CC me where appropriate. > Couldn't pythoncom15.dll etc. live in /DLLs? Or are they > loaded by other C extensions (using LoadLibrary instead of > PyImport_x)? pythoncom15.dll and pywintypes15.dll (in fact, anything I release with the extension ".dll") is used as a "standard DLL". There exists the possibility that a .exe will have an implicit reference to one of these .DLLs. So unless they are in the same path as the executables, or on the %PATH%, they will not be found. > > * The path searching code would need to use the location of Python1x.dll, > > rather than the .exe, to locate the PYTHONHOME. This would not be a huge > > change, but necessary none-the-less. > > Um, why? Especially if they're the same directory ;-)? > > Oh, because of COM (and exposing python15.dll as a COM > server)? Exactly - when a Python COM object is being used, the .exe may well be something created by VB, and nowhere near the Python directory. Thus, the full path to the .exe will be useless, but the full path to Python1x.dll will be OK. > At the very least, would get around the must-have-admin-rights > on NT problem. The admin problem is due to writing to the registry, rather than copying something to the system32 directory. At the moment, the installation package writes 2 classes of information: * Core Python stuff - pythonpath etc. * Information for other installers and IDEs The 2nd category includes stuff like the "Start Menu" group the user selected, and the path where Python was installed.
Later installers (such as win32all) can read this information and avoid asking the user the same questions - thereby making the installation process more robust. Another example is "help files" - eg, a list of all documentation installed by Python or extensions, thereby allowing IDEs to be smart. If we can drop the registry altogether for the first category, then it would seem a shame to keep using the registry just for the 2nd category. OTOH, I think the information made available by the 2nd category is valuable. Back-to-ini-files-ly, Mark. From mal@lemburg.com Fri Feb 4 20:10:54 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 04 Feb 2000 21:10:54 +0100 Subject: [Import-sig] deprecate ihooks? References: <025101bf6f3d$ea552500$f4a7b5d4@hagrid> Message-ID: <389B324E.E779DB0F@lemburg.com> This is a multi-part message in MIME format. --------------6726D2007785ACB228CD285A Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Just to throw in some old 2 cents, I've attached some code I wrote way back in 1997 on top of ihooks.py. It turns modules into real classes with all the goodies of __getattr__ et al. at no extra cost. Perhaps this mechanism offers some new insights: by delegating work to the objects in question (the modules) rather than hooking together some meta objects... note that you can do subclassing to add functionality to modules using this approach, e.g. packages could be subclasses of a general package class, etc. Anyway, just a thought you might want to consider...
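For readers on a current Python, the "modules as real classes" idea can be illustrated roughly as follows. This is not the attached 1997 code (which targets ihooks and Python 1.5); it is a hypothetical sketch assuming Python 3, where types.ModuleType can be subclassed directly and importlib.import_module does the real import. The class name LazyModule only mirrors the attachment, and the proxy is deliberately not registered in sys.modules.

```python
import importlib
import types


class LazyModule(types.ModuleType):
    """Hypothetical sketch: a module object that loads its target on demand."""

    def __getattr__(self, name):
        # Only reached when normal attribute lookup on the proxy fails.
        real = importlib.import_module(self.__name__)
        # Copy the real namespace so later lookups never come back here.
        self.__dict__.update(real.__dict__)
        try:
            return self.__dict__[name]
        except KeyError:
            raise AttributeError(
                "%s %r has no attribute %r"
                % (type(self).__name__, self.__name__, name)
            ) from None


lazy_json = LazyModule("json")      # nothing has been looked up on the proxy yet
print(lazy_json.dumps({"a": 1}))    # first attribute access triggers the import
```

Copying the namespace into the proxy trades some memory for lookup speed, which is the same trade-off the attachment's install_module describes.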
I'm too busy right now to jump into this discussion again ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ --------------6726D2007785ACB228CD285A Content-Type: text/python; charset=us-ascii; name="ClassModules.py" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="ClassModules.py" #!/usr/local/bin/python """ Module-Class-Importer for Python (Version 0.6) Modules in Python behave almost like classes, but do not provide the same mechanisms, like inheritance, baseclasses, special methods, etc. This module provides an alternative module loader, that is built on top of the ihooks.py-interface for the builtin import statement. It works in a similar way to the normal import, but provides some extra features: * when a module is requested, an instance of the Module-class (or some subclass of Module) is created and the actions 'find' and 'load' are redirected to this instance via method calls * after loading, a call to install_module copies all the attributes from the "real" module object to the Module instance (which costs some memory, but increases lookup speed), thereby making it behave just like the original * a reference to the original module object is kept, so that 'from...import...' also works (since this statement needs a real module object) * whenever a module is referenced, the Module object is returned, if possible, so even after having done 'from x import y' at some point, 'import x' will return the Module object, so hopefully all references to a module made by a Python program should return the Module object, with all its nice advantages (like catching AttributeErrors) * Module provides a basic skeleton -- you can subclass it and then give the ModuleClassImporter your class to use (LazyModule is an example for this), if you don't like some things, like copying attributes (e.g.
use __getattr__ to redirect the lookup) This module contains all necessary base classes (working ones, not simply a bare framework), some Loaders, and of course, the LazyModule which started this whole thing in the first place. For more information on how importing works, see ihooks.py and ni.py. ---------------------------------------------------------------- Example of usage: Lazy Import for Python (see LazyImp.py) --------------------------------------------------------------- History: - 0.6: fixed for Python 1.5 Bugs: - none, only unsupported features :-) - I have tested it with Tkinter and a 10.000 line framework, but of course... there may still be some imports out there, I haven't taken into account yet. (c) Marc-Andre Lemburg; all rights reserved """ __version__ = '0.6' import sys,ihooks,imp,os # so that it also works under Python 1.4 try: __debug__ except: __debug__ = 0 # # A fast ModuleLoader # class FastModuleLoader(ihooks.ModuleLoader): """ works like ModuleLoader, but uses imp's find_module, which makes it somewhat faster * note: file system hooks won't work here !!! """ def find_module(self,name,path=None): m = self.find_builtin_module(name) if m: return m if path is None: path = sys.path return imp.find_module(name,path) # # A preprocessing loader # # (parts taken from py_compile.py) import marshal def clong(x): """ return the 4-byte long x as 4-byte string """ return chr(x&0xff)+chr((x>>8)&0xff)+chr((x>>16)&0xff)+chr((x>>24)&0xff) class PreProcessingLoader(FastModuleLoader): """ do some preprocessing when importing a module, that has to be compiled first, i.e. is read in as source file * leaves the rest to FastModuleLoader """ def load_module(self, name, stuff): """ load the module name using stuff """ file, filename, (suff, mode, type) = stuff # check if there already is a properly compiled version pass # if we have to handle a source file... 
if type == imp.PY_SOURCE: # read file program = file.read() # process program program = self.preprocess(program) # compile and try to write the .pyc-file (copied from py_compile.py) code = compile(program, filename, 'exec') codefilename = filename + (__debug__ and 'c' or 'o') try: fc = open(codefilename,'wb') fc.write(imp.get_magic()) timestamp = long(os.stat(filename)[8]) fc.write(clong(timestamp)) marshal.dump(code,fc) fc.close() if os.name == 'mac': import macfs macfs.FSSpec(codefilename).SetCreatorType('Pyth', 'PYC ') macfs.FSSpec(filename).SetCreatorType('Pyth', 'TEXT') except IOError: pass else: return FastModuleLoader.load_module(self, name, stuff) # register and initialize module m = self.hooks.add_module(name) m.__file__ = filename exec code in m.__dict__ return m def preprocess(self,program): """ do something with the code in program and return the modified string """ program = "The_PreProcessingLoader_was_here = ':-)'\n" + program return program # # The Module base class # class InternalVars: # container class pass class Module: """ The module-works-as-a-class base class * this class is instantiated for every new module loaded by the SimulateImport mechanism * you can subclass the class to add functionality and pass the subclass to SimulateImport for it to be used * important: local variables should always reside in self.__moduleobj__, not in self directly (to avoid name clashes) * note: module initialization is done in the usual way, the modules namespaces then copied to this object * this class emulates the normal import-operation """ def __init__(self,name,loader,fromlist=None): """ a module name is requested * this method should NOT be overridden, instead override startup() which is called, when this method finishes """ self.__moduleobj__ = m = InternalVars() self.__name__ = name m.loader = loader m.fromlist = fromlist m.found = 0 m.loaded = 0 m.modules = loader.modules_dict() m.self = self m.module = None # gets filled by load_module() 
self.startup() def startup(self): """ module startup * called when a module is requested """ self.find_module() self.load_module() def real_module(self): """ return a real module object """ return self.__moduleobj__.module def register(self): """ makes an entry in modules pointing to this object * loading a module through the loader normally also registers the module, so a call to this method is not needed * note: if you want to do 'from..import..' with this module later on, the registering MUST be done by loader """ self.__moduleobj__.modules[self.__name__] = self def find_module(self): """ find the module """ m = self.__moduleobj__ m.stuff = m.loader.find_module(self.__name__) if not m.stuff: raise ImportError, 'Module: No module named %s'%self.__name__ m.found = 1 def load_module(self): """ load the module and initialize it * the module must already be found * uses __moduleobj__.loader for loading * calls .install_module to complete the job """ m = self.__moduleobj__ if m.loaded: return if not m.found: raise ImportError, 'Module: call %s.find_module() first'%self.__name__ else: module = m.loader.load_module(self.__name__,m.stuff) self.install_module(module) m.loaded = 1 def install_module(self,module): """ install the module in this object's namespace * must be called after a module is loaded """ # keep a reference to the original self.__moduleobj__.module = module # copy all module attributes to this object for k,v in module.__dict__.items(): setattr(self,k,v) # create a reference in the real module object setattr(module,'__moduleobj__',self.__moduleobj__) def __repr__(self): """ return some meaningful string describing self """ if self.__moduleobj__.loaded: return "<%s '%s'>"%(self.__class__.__name__,self.__name__) elif self.__moduleobj__.found: return "<%s '%s', loading deferred>"%(self.__class__.__name__,self.__name__) else: return "<%s '%s', finding deferred>"%(self.__class__.__name__,self.__name__) __str__ = __repr__ def __getattr__(self,x): """ some unknown
attribute is being requested """ raise AttributeError,'%s "%s" was looking for "%s"'%(self.__class__.__name__,self.__name__,x) # # The module-as-class importer # # ModuleImporter to be used: ImporterBaseClass = ihooks.ModuleImporter class ModuleClassImporter(ImporterBaseClass): """ Module importer, that knows how to handle Module-objects correctly """ def __init__(self,module_class,*importer_class_init): """ import modules by encapsulating them in an instance of module_class * module_class must be a subclass of Module * the other parameters are passed to the ImportClass (see ihooks.py for details) """ apply(ImporterBaseClass.__init__,(self,)+importer_class_init) self.module_class = module_class def import_module(self, name, globals={}, locals={}, fromlist=None): """ module import hook """ if self.modules.has_key(name): # fast path m = self.modules[name] # return the object, if possible if fromlist is None: #print 'Importer: import',name,'(found in sys.modules)',m try: return m.__moduleobj__.self except: return m else: # from..import.. insists on having the real thing ! #print 'Importer: from',name,'import',fromlist,'(found in sys.modules)',m try: return m.__moduleobj__.module except: return m else: if fromlist is None: # normal 'import modulename' #print 'Importer: import "%s" with %s'%(name,self.module_class.__name__) module = apply(self.module_class,(name,self.loader,fromlist)) else: # emulate 'from modulename import something' # (note: this is a hack... and not a nice one !)
#print 'Importer: from',name,'import',fromlist,'with',self.module_class.__name__ module = apply(self.module_class,(name,self.loader,fromlist)) # module has to be loaded for this to work module.load_module() module = module.real_module() #print 'Importer: %s returned %s'%(self.module_class.__name__,module) return module --------------6726D2007785ACB228CD285A Content-Type: text/python; charset=us-ascii; name="LazyImp.py" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="LazyImp.py" #!/usr/local/bin/python """ Lazy Import for Python (Version 0.6) Loads modules only if they are needed and referenced. This is done by overloading the builtin 'import' statement, so no code changes are necessary. Everything should work as normal, except that the actual loading process is deferred until a module's attribute is requested (you have to keep in mind, that this can cause exceptions from the module initialization process -- use lazyimp after debugging !) * depends on the module ClassModules.py *** Importing this module autoinstalls the Lazy Import Feature. *** All subsequent imports will be done lazy. *** If you don't like this, comment out the last line ! For more information, see the LazyModule-doc string below. --------------------------------------------------------------- History: - 0.6: fixed for Python 1.5 Bugs: - none, only unsupported features :-) - I have tested it with Tkinter and a 10.000 line framework, but of course... there may still be some imports out there, I haven't taken into account yet. (c) Marc-Andre Lemburg; all rights reserved """ __version__ = '0.6' import sys,ihooks,imp,os from ClassModules import * # base class to be used: LazyModuleBaseClass = Module class LazyModule(LazyModuleBaseClass): """ Lazy Import for Python Loads modules only if they are needed and referenced. This is done by overloading the builtin 'import' statement, so no code changes are necessary. 
Everything should work as normal, except that the actual loading process is deferred until a module's attribute is requested (you have to keep in mind, that this can cause exceptions from the module initialization process -- use lazyimp after debugging !) Hints: - you can call the method load_module() of a lazy module to force loading of the module (or simply reference some attribute) Caveats: - attributes like __dict__ and __name__, that are provided by the LazyImport-class, do not cause loading - due to a Python internal limitation, from ... import ... is not handled in a lazy fashion (wouldn't be too efficient anyway) - debugging circular imports can become an even harder task (uncomment the #print-statements to see what's going on) """ # finding the module is normally done when the object is created # -- setting this to 1 defers finding too __defer_find = 0 def startup(self): """ lazy import module """ self.__moduleobj__.defer_find = self.__defer_find if not self.__moduleobj__.defer_find: self.find_module() self.register() def load_module(self,cause='*'): """ do the actual import * this can cause ImportErrors and raise exceptions, that must be handled by the caller, i.e. the first reference to a module might raise an exception ! * modules are only loaded once; any subsequent calls to this method are silently ignored (i.e. 
ImportErrors are only raised the first time, this method is used) """ if self.__moduleobj__.loaded: return #print 'LazyModule: loading module "%s", looking for "%s" ...'%(self.__name__,cause) if self.__moduleobj__.defer_find: # find now self.find_module() # let the base class handle the rest LazyModuleBaseClass.load_module(self) #print 'LazyModule: module "%s" loaded'%self.__name__ def __getattr__(self,x): """ the module's needed, so load it and return the requested attribute afterwards """ #print self.__name__,'is looking for',x if not self.__moduleobj__.loaded: self.load_module(x) return getattr(self,x) else: raise AttributeError,'%s "%s" was looking for "%s"'%(self.__class__.__name__,self.__name__,x) def autoinstall(): """ install the Lazy Module Import feature """ mloader = FastModuleLoader() mhandler = LazyModule newimport = ModuleClassImporter(mhandler,mloader) newimport.install() # # auto-install as new 'import' (comment out, if you don't like this) # autoinstall() --------------6726D2007785ACB228CD285A-- From gstein@lyra.org Sat Feb 5 12:03:50 2000 From: gstein@lyra.org (Greg Stein) Date: Sat, 5 Feb 2000 04:03:50 -0800 (PST) Subject: [Import-sig] find/load? (was: deprecate ihooks?)
In-Reply-To: <025101bf6f3d$ea552500$f4a7b5d4@hagrid> Message-ID: On Fri, 4 Feb 2000, Fredrik Lundh wrote: >... > we've also used ihooks in a number of places, with great > success. on the other hand, changing to imputil was hardly > any work at all... > > so I guess The Question is whether the find/load separation > is really necessary. I cannot think of a reason, but that's > probably just me... I've argued in the past that find/load is an inappropriate model for a flexible import mechanism. I won't repeat that here, but if people would like to see some reference material then I'll see if I can dig up those threads. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Feb 5 12:06:36 2000 From: gstein@lyra.org (Greg Stein) Date: Sat, 5 Feb 2000 04:06:36 -0800 (PST) Subject: [Import-sig] versions? (was: Kick-off) In-Reply-To: <025601bf6f40$4aff5220$f4a7b5d4@hagrid> Message-ID: How about we just start blazing a path. If we get it done before 1.6.0, then we'll be happy. I don't see a particular reason to partition the releases *before* we even know where we're going, what we'll build, and how long it may take to complete that. In other words, let's ignore the versions -- that's putting the cart before the horse. I read Gordon's opening note as a way to focus attention. We start with the short-term, finish that, then move onto the long-term. Cheers, -g On Fri, 4 Feb 2000, Fredrik Lundh wrote: > Gordon wrote: > > Short-term: Provide a "new architecture import hooks" module > > for the standard library. This would deprecate ihooks and > > friends, and provide developers with a way of learning the new > > architecture. > > 1.6.1 <= version <= 1.7, right? > > > Long-term: Reform the entire import architecture of Python. > > This affects Python start-up, the semantics of sys.path, and > > the C API to importing. > > 1.7 <= version < 3000, right? 
>
> > The model for this is, of course, Greg's imputil.py (Greg, your
> > latest version is not yet on your website which still has a
> > November version).
>
> I'd like to add an ultra-short-term issue: possible changes
> to 1.6.0 that make it easier to experiment with alternate
> import strategies, mostly for installation tools like gordon's
> install and pythonworks' deployment subsystem.
>
> (as discussed on last week's consortium meeting)
>
> most importantly, I'd like to come up with a way to execute
> small snippets of script code *before* Python attempts to
> import stuff like exceptions.py.
>
> </F>

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Sat Feb 5 12:28:49 2000
From: gstein@lyra.org (Greg Stein)
Date: Sat, 5 Feb 2000 04:28:49 -0800 (PST)
Subject: [Import-sig] RE: pythonpath and COM support
In-Reply-To:
Message-ID:

On Sat, 5 Feb 2000, Mark Hammond wrote:
>...
> > Couldn't pythoncom15.dll etc. live in /DLLs? Or are they
> > loaded by other C extensions (using LoadLibrary instead of
> > PyImport_x)?
>
> pythoncom15.dll and pywintypes15.dll (in fact, anything I release with the
> extension ".dll") is used as a "standard DLL". There exists the possibility
> that a .exe will have an implicit reference to one of these .DLLs. So
> unless they are in the same path as the executables, or on the %PATH%, they
> will not be found.

To be quite explicit about what Mark means here:

If we have DLL foo.dll and an application or *another* DLL links against
foo.dll, then we must place foo.dll on %PATH%. Current "design preference"
in Windows is to avoid changing PATH, so the DLLs go into the System or
System32 directory.

What DLLs go in there, and why?

python15.dll: any Python extension module is going to link against this.
    Thus, when the extension is loaded (from wherever!), this module must
    be found.
pywintypes15.dll: most of the Python/Win32 DLLs link against this for
    some basic types.

pythoncom15.dll: Python/COM extensions will link against this for various
    pieces of functionality (base classes and support functions)

> > > * The path searching code would need to use the location of Python1x.dll,
> > > rather than the .exe, to locate the PYTHONHOME. This would not be a huge
> > > change, but necessary none-the-less.
> >
> > Um, why? Especially if they're the same directory ;-)?
> >
> > Oh, because of COM (and exposing python15.dll as a COM
> > server)?
>
> Exactly - when a Python COM object is being used, the .exe may well be
> something created by VB, and nowhere near the Python directory. Thus, the
> full path to the .exe will be useless, but the full path to Python1x.dll
> will be OK.

The path to the python15.dll won't help us -- it is sitting in System32.

There might be some weird voodoo in the COM stuff that can alter the load
path (AppPath?) for a COM server DLL, but I've never looked into it. I
seem to recall that it exists and allows the NT Loader to look in
specified directories for the dependent DLLs. For example, a COM server
registration says "use MyComExtension.dll for the COM server" and it also
says "use C:\Program Files\Python\DLLs" as an additional path. If it
exists, that would be nice.

I don't see that we can avoid putting those DLLs into the System directory
without some special provision for modifying the NT Loader's paths.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From mhammond@skippinet.com.au Sat Feb 5 23:10:46 2000
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Sun, 6 Feb 2000 10:10:46 +1100
Subject: [Import-sig] RE: pythonpath and COM support
In-Reply-To:
Message-ID:

[Greg writes]
> > Exactly - when a Python COM object is being used, the .exe may well be
> > something created by VB, and nowhere near the Python directory.
Thus, the
> > full path to the .exe will be useless, but the full path to Python1x.dll
> > will be OK.
>
> The path to the python15.dll won't help us -- it is sitting in System32.

Except I thought we were discussing how to get python15.dll _out_ of
System32?

> There might be some weird voodoo in the COM stuff that can alter the load
> path (AppPath?) for a COM server DLL, but I've never looked into it. I
> seem to recall that it exists and allows the NT Loader to look in
> specified directories for the dependent DLLs. For example, a COM server
> registration says "use MyComExtension.dll for the COM server" and it also
> says "use C:\Program Files\Python\DLLs" as an additional path. If it
> exists, that would be nice.

Damn - yes - I believe you are correct. [Although my recollection is that
the AppPath exists for executables rather than COM objects - eg, we could
set up an AppPath for "Python.exe", but not for arbitrary COM object names]

Let's say "MyVBApp.exe" uses a PythonCOM object. This PythonCOM object can
happily point to a full path - eg "C:\Program
Files\Python\Pythoncom15.dll". Now, Pythoncom15.dll obviously links
against Python15.dll. However, the search path rules dictate that the
"path of the .exe" is used to locate the extra DLLs. So even though
Python15.dll is in the same directory as Pythoncom15.dll, Windows will not
search that directory for Python15.dll - it will search the path that
"MyVBApp.exe" lives in, but not the paths that other DLLs were found in.

*sigh* - I should have remembered that without Greg's prodding :-(

OTOH, I really should test this out - maybe Windows is a little smarter
than the rules say it is.

Mark.

From guido@python.org Sun Feb 13 17:05:52 2000
From: guido@python.org (Guido van Rossum)
Date: Sun, 13 Feb 2000 12:05:52 -0500
Subject: [Import-sig] Long-awaited imputil comments
Message-ID: <200002131705.MAA20578@eric.cnri.reston.va.us>

Hi Greg,

I've finally got to a full close-reading of imputil.py.
There sure is
a lot of good stuff there. At the same time, I think it's far from
ready for prime time in the distribution. Here are some detailed
comments. (I'd also like to have a high level discussion about its
fate, but I'll let you respond to this list first.)

class ImportManager:

    General comment:

        I would like to use/subclass the ImportManager in rexec (in
        order to emulate its behavior), but for that to work, I would
        need to change all references to sys.path and sys.modules (and
        sys.whatever) to self.whatever, but that would currently
        require rewriting large pieces of code. It would be nice if
        the sys module were somehow passed in. I have a feeling the
        same is true for all references to the os module (_os_stat
        etc.) because the rexec module also wants to have control over
        what pieces of the filesystem you have access to. This
        explains some of the complexity of ihooks (which is currently
        used by rexec).

    def install():

        The __chain_* code seems in transition (e.g. some functions
        end with both raise and return)

        The hook mechanism for 1.6 hasn't been designed yet; what
        should it be?

    def add_suffix():

        It seems the suffixes variable is only used by the
        _FilesystemImporter. Since it is shared, calls to add_suffix()
        will have an effect on the _FilesystemImporter instance. I
        think it would make more sense if the suffixes table was
        initialized and managed by the _FilesystemImporter; the
        add_suffix method on the ImportManager could then simply pass
        its arguments on to the _FilesystemImporter.

    def _import_hook():

        I think we need a hook here so that Marc-Andre can implement
        walk-me-up-Scotty; or Tim Peters could implement a policy that
        requires all imports to use the full module name, even from
        within the same package.

        top_module = sys.modules[parts[0]]:

            There's an undocumented convention that sys.modules[x] may
            be None, to indicate that module x doesn't exist and that
            we shouldn't try to look for it any further.
            This is used by package import to avoid excessive
            filesystem access in the case where modules in a package
            import top-level modules. E.g. we're in package P, module
            M. Now we see "import string". The "local imports
            override global imports" rule requires that we first look
            for P.string, which of course doesn't exist, before we
            look for string in sys.path. The first time we look for
            P.string, we actually do a bunch of stats: for
            P/string.py, P/string.pyc, P/string.pyd, P/string.dll.
            When other submodules of package P also import string,
            they would each incur all these stat() calls, unless we
            somehow remembered that there's no P.string. This is
            taken care of by setting sys.modules['P.string'] = None.

            Anyway, I think that your logic here doesn't expect that
            to happen. A fix could be to put "if not top_module:
            raise KeyError" inside the try/except.

    def _determine_import_context():

        Possible feature: the package could set a flag here to force
        all imports to go to the top-level (i.e., the flag would mean
        "no relative imports").

    def _import_top_module():

        Instead of type(item)..., use isinstance().

        Looking forward to _FilesystemImporter: I want to be able to
        have the *name* of a zip file (or other archive, whatever is
        supported) in sys.path; that seems more convenient for the
        user and simplifies $PYTHONPATH processing.

    def _reload_hook():

        Note that reload() does NOT blast the module's dict; for
        better or for worse. (Occasionally modules know this and save
        important global data.)

class Importer:

    def install():

        This should be a method on the manager. (That makes it easier
        to change references to sys.path etc.; see my rexec notes above.)

    def import_top():

        This appears a hook that a base class can override; but why?

    "PRIVATE METHODS":

        These aren't really private to the class; some are used from
        the ImportManager class.

        I note that none of these use any state of the class (it
        doesn't *have* any state) and they are essentially an
        implementation of the (cumbersome) package import policy.
        I wonder if the whole package import policy shouldn't be
        implemented in the ImportManager class -- or even in a
        separate policy class.

    def get_code():

        On the 2/3-tuple return value: a proposal for code to be
        included in 1.6 shouldn't be encumbered by backwards
        compatibility issues to previous versions of the proposal.

        I'm still worried about get_code() returning the code object
        (or even a module, in the case of extensions). This means
        that something like freeze might have to re-implement the
        whole module searching -- first, it might want the source code
        instead of the code object, and second, it might not want the
        extension to be loaded at all!

        I've noticed that all implementations of get_code() start with
        a test whether parent is None or not, and branch to completely
        different code. This suggests that we might have two separate
        methods???

"Some handy stuff for the Importers":

    This seems to be a remnant of an older imputil.py version; it
    appears to be unused by the current code or at least there is some
    code duplication; e.g. _c_suffixes is also calculated in
    ImportManager.

def _compile():

    You will have to strip CRLF from the code string read from the
    file; this is for Unix where opening the file in text mode
    doesn't do that, but users have come to expect this because
    the current implementation explicitly strips them.

    I've recently been notified of a race condition in the code
    here, when two processes are writing a .pyc file and a third
    is reading it. On Unix we will have to remove the .pyc file
    and then use os.open() with O_EXCL in order to avoid the race
    condition. I don't know how to avoid it on Windows.

def _os_bootstrap():

    This is ugly and contains a dangerous repetition of code from
    os.py. I know why you want it but for the "real" version we
    need a different approach here.
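
The Unix approach described under _compile() above (remove the old
.pyc, then create it with os.open() and O_EXCL) might look roughly
like the following. This is only a sketch in present-day Python 3, not
the 1.5-era imputil code; the helper name write_pyc_exclusive and its
return convention are assumptions made for illustration. It only keeps
two writers from interleaving in the same file; a reader may still
observe a partially written file.

```python
import os


def write_pyc_exclusive(path, data):
    """Hypothetical helper: write `data` to `path` unless another writer wins.

    Returns True if this call created and wrote the file, False if some
    other process created it between our unlink and our open.
    """
    # Remove any stale file first, so that O_EXCL can succeed.
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass

    # O_CREAT | O_EXCL fails if the file already exists, so at most
    # one of several competing writers gets a file descriptor.
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False

    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return True
```

A caller that gets False back can simply skip writing the cache file
and carry on with the code object it already compiled.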
def _fs_import(): I think this is unused now? Anyway, it hardcodes the suffix sequence. The test t_pyc >= t_py does not match the current implementation and should not affect the outcome. (But it does save a stat() call in a common case...) The call to _compile() may raise SyntaxError (and also OverflowError and maybe a few others). I don't know what to do about this, but the traceback in that case will look really ugly! class _FilesystemImporter: See comments above about suffix list management. class SuffixImporter: Why is this a class at all? It would seem to be sufficient to have a table of functions instead of a table of instances. These importers have no state and only one method that is always overridden. --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Wed Feb 16 14:01:32 2000 From: gstein@lyra.org (Greg Stein) Date: Wed, 16 Feb 2000 06:01:32 -0800 (PST) Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: <200002131705.MAA20578@eric.cnri.reston.va.us> Message-ID: On Sun, 13 Feb 2000, Guido van Rossum wrote: > I've finally got to a full close-reading of imputil.py. There sure is > a lot of good stuff there. Excellent! Thanx for taking the time and providing the great feedback. > At the same time, I think it's far from > ready for prime time in the distribution. Understood. I think some of your concerns are based on its historical cruftiness, rather than where it "should" be. I'll prep a new version with the backwards compat pulled out. People that relied on the old version can continue to use their older version (which is Public Domain, so they're quite free to do so :-) >... > class ImportManager: > > General comment: > > I would like to use/subclass the ImportManager in rexec (in > order to emulate its behavior), but for that to work, I would > need to change all references to sys.path and sys.modules (and > sys.whatever) to self.whatever, but that would currently > require rewriting large pieces of code. 
It would be nice if
>         the sys module were somehow passed in. I have a feeling the
>         same is true for all references to the os module (_os_stat
>         etc.) because the rexec module also wants to have control over
>         what pieces of the filesystem you have access to. This
>         explains some of the complexity of ihooks (which is currently
>         used by rexec).

All right. Shouldn't be a problem.

>     def install():
>
>         The __chain_* code seems in transition (e.g. some functions
>         end with both raise and return)

The raise/return thingy is simply because I wanted to catch all cases
where it didn't import the module and was about to head towards the
default importer via the chain.

>         The hook mechanism for 1.6 hasn't been designed yet; what
>         should it be?

Part of this depends on the policy that you would like to prescribe. There
are two issues that I can think of:

1) are import hooks per-builtin-namespace, or per-interpreter? The former
   is the current model, and may be necessary for rexec types of
   functionality. The latter would shift the hook to some functions in the
   sys module, much like the profiler/trace hooks.

2) currently, Python has a single hook, with no provisions for multiple
   mechanisms to be used simultaneously. This was one of the primary
   advantages of imputil and its policies/chaining -- it would allow
   multiple import mechanisms. I believe that we want to continue the
   current policy: the core interpreter sees a single hook function.
   [ and Standard Operating Procedure is to install an ImportManager in
   there, or a subclass ]

For a while, I had thought about the "sys" approach, but just realized
that may be too limited for rexec types of environments (because we may
need a couple ImportManagers to be operating within an interpreter)

BUT: we may be able to design an RExecImportManager that remembers
restricted modules and uses different behavior when it determines the
import context is one of those modules. I haven't thought on this to
determine its true viability, tho...
> def add_suffix(): > > It seems the suffixes variable is only used by the > _FilesystemImporter. Since it is shared, calls to add_suffix() > will have an effect on the _FilesystemImporter instance. I > think it would make more sense if the suffixes table was > initialized and managed by the _FilesystemImporter; the > add_suffix method on the ImportManager could then simply pass > its arguments on to the _FilesystemImporter. Agreed. ImportManager had the suffix list for a while because it needed it. The code was shifted to _FilesystemImporter, I didn't revisit the placement of the suffixes. I think that I was also leaving it there with an intent that it is part of the public interface, subject to more complex manipulations than the simple add_suffix() would allow. I believe we can solve this latter problem, though, just by adding a get_suffixes() method that fetches the list from fs_imp. > def _import_hook(): > > I think we need a hook here so that Marc-Andre can implement > walk-me-up-Scotty; or Tim Peters could implement a policy that > requires all imports to use the full module name, even from > within the same package. I've been thinking of something along the lines of _determine_import_context() returning a list of things to try. Default is to return something like [current-context,] + sys.path (not exactly that, but you get the idea). The _import_hook would then operate as a simple scan over that list of places to attempt an import from. MAL could override _determine_import_context() to return the walk-me-up, intervening packages. Tim could just always return sys.path (and never bother trying to determine the current context). > top_module = sys.modules[parts[0]]: > There's an undocumented convention that sys.modules[x] may > be None, to indicate that module x doesn't exist and that >... Yes, I'm familiar with that mechanism, and it would be a good addition. >... > Anyway, I think that your logic here doesn't expect that > to happen. 
A fix could be to put "if not top_module:
>             raise KeyError" inside the try/except.

I've attempted to avoid it, but this recent rewrite may have introduced
things like this -- where ImportManager operates as if it is the *only*
thing performing imports. It's certainly friendly when it doesn't see
__ispkg__ or __importer__, but this top_module thing is a valid bug.

Quite fixable.

>     def _determine_import_context():
>
>         Possible feature: the package could set a flag here to force
>         all imports to go to the top-level (i.e., the flag would mean
>         "no relative imports").

Ah. Neat optimization. I'll insert some comments / prototype code to do
this.

>     def _import_top_module():
>
>         Instead of type(item)..., use isinstance().

Can do.

>         Looking forward to _FilesystemImporter: I want to be able to
>         have the *name* of a zip file (or other archive, whatever is
>         supported) in sys.path; that seems more convenient for the
>         user and simplifies $PYTHONPATH processing.

Hmm. This will complicate things quite a bit, but is doable. It will also
increase the processing time for sys.path elements.

I'll think of a design and maybe prototype something up after the current
round of changes.

>     def _reload_hook():
>
>         Note that reload() does NOT blast the module's dict; for
>         better or for worse. (Occasionally modules know this and save
>         important global data.)

All righty.

> class Importer:
>
>     def install():
>
>         This should be a method on the manager. (That makes it easier
>         to change references to sys.path etc.; see my rexec notes above.)

This is here for backwards compat. I'll remove it.

>     def import_top():
>
>         This appears a hook that a base class can override; but why?

It is for use by clients of the Importer. In particular, by the
ImportManager. It is not intended to be overridden.

>     "PRIVATE METHODS":
>
>         These aren't really private to the class; some are used from
>         the ImportManager class.

Private to the system, then :-)

>         I note that none of these use any state of the class (it
>         doesn't *have* any state) and they are essentially an
>         implementation of the (cumbersome) package import policy.
>         I wonder if the whole package import policy shouldn't be
>         implemented in the ImportManager class -- or even in a
>         separate policy class.

Nope. They do use a piece of state: self. Subclasses may add state and
they need to refer to that state via self. We store Importer instances
into modules as __importer__ and then use it later. The example Importers
(FuncImporter, PackageImporter, DirectoryImporter, and PathImporter) all
use instance variables. _FilesystemImporter definitely requires it.

Once we locate the Importer responsible for a package and its contained
modules, then we pass off control to that Importer. It is then responsible
for completing the import (including the notions of a Python package). The
completion of the import can be moved out of Importer, and we would
replace "self" by the Importer in question.

I attempted to minimize code in ImportManager because a typical execution
will only have a single instance of it -- there is no opportunity to
implement different policies and mechanisms. By shifting as much as
possible out to the Importer class, there is more opportunity for altering
behavior.

>     def get_code():
>
>         On the 2/3-tuple return value: a proposal for code to be
>         included in 1.6 shouldn't be encumbered by backwards
>         compatibility issues to previous versions of the proposal.

No problem. Consider the 2-tuple form to be gone.

>         I'm still worried about get_code() returning the code object
>         (or even a module, in the case of extensions). This means
>         that something like freeze might have to re-implement the
>         whole module searching -- first, it might want the source code
>         instead of the code object, and second, it might not want the
>         extension to be loaded at all!

You're asking for something that is impossible in practice.
Consider the following code fragment: ------------------------ import my_imp_tools my_imp_tools.HTTPImporter("http://www.lyra.org/greg/python/") import qp_xml ------------------------ There is no way that you can create a freeze tool that will be able to manage this kind of scenario. I've frozen software for three different, shipping products. In all cases, I had to custom-build a freeze mechanism. Each time, there would be a list of "root" scripts, a set of false-positives to ignore, a set of missed modules to force inclusion, and a set of binary dependencies that couldn't be determined otherwise. I think the desire for a fully-automated module finder and freezer isn't fulfillable. That said: I wouldn't be opposed to adding a get_source() method to an Importer. If the source is available, then it can return it (it may not be available in an archive!). > I've noticed that all implementations of get_code() start with > a test whether parent is None or not, and branch to completely > different code. This suggests that we might have two separate > methods??? Interesting point. I hadn't noticed that. They don't always branch to different points, however: consider DirectoryImporter and FuncImporter. Essentially, we have two forms of calls: get_code(None, modname, modname) # look for a top-level module get_code(parent, modname, fqname) # look in a package for a module imputil has been quite nice because you only had to worry about one hook, but separating these might be a good thing to do. My preference is to leave it as one hook, but let's see what others have to say... > "Some handy stuff for the Importers": Consider all this torched. I'll also move the Importer subclasses to a new file for placement under Demo/. That should trim imputil.py down quite a lot. >... 
> def _compile(): > > You will have to strip CRLF from the code string read from the > file; this is for Unix where opening the file in text mode > doesn't do that, but users have come to expect this because > the current implementation explicitly strips them. I'm not sure that I follow what you mean here. The existing code seems to work fine. We *append* a newline; are you suggesting stripping inside the codestring, at the end, ?? > I've recently been notified of a race condition in the code > here, when two processes are writing a .pyc file and a third > is reading it. On Unix we will have to remove the .pyc file > and then use os.open() with O_EXCL in order to avoid the race > condition. I don't know how to avoid it on Windows. The O_EXCL would be on writing. It would be nice if there was a "shared" mode for reading. How are two writers and a reader different from a single writer and a single reader? > def _os_bootstrap(): > > This is ugly and contains a dangerous repetition of code from > os.py. I know why you want it but for the "real" version we > need a different approach here. I agree :-) But short of some refactoring of os.py and the per-platform modules, this is the best that I could do. Importing "os" loads a lot of stuff; I wanted to ensure that we deferred that until we had the ImportManager and associated Importers in place (so the import could occur under the direction of imputil). > def _fs_import(): > > I think this is unused now? Anyway, it hardcodes the suffix > sequence. Unused. It can be ignored. > class SuffixImporter: > > Why is this a class at all? It would seem to be sufficient to > have a table of functions instead of a table of instances. These > importers have no state and only one method that is always > overridden. DynLoadSuffixImporter has state. We could still deal with that, however, by just storing a bound method into the table. I used instances because I wasn't sure what all we might want in there. 
If we don't add any other methods or attributes to the public interface, then yah: we could switch to a function-based approach. I'll release a new imputil later this week, incorporating these changes, MAL's feedback, and Finn Bock's feedback. Thanx! -g -- Greg Stein, http://www.lyra.org/ From guido@python.org Wed Feb 16 17:27:14 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 16 Feb 2000 12:27:14 -0500 Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: Your message of "Wed, 16 Feb 2000 06:01:32 PST." References: Message-ID: <200002161727.MAA28273@eric.cnri.reston.va.us> > Excellent! Thanx for taking the time and providing the great feedback. And thanks for the replies. Quick responses: > > At the same time, I think it's far from > > ready for prime time in the distribution. > > Understood. I think some of your concerns are based on its historical > cruftiness, rather than where it "should" be. I'll prep a new version with > the backwards compat pulled out. People that relied on the old version can > continue to use their older version (which is Public Domain, so they're > quite free to do so :-) OK, I'm awaiting a new imputil announcement. (You should really have a webpage for it pointing both to the old and the new version.) > > The hook mechanism for 1.6 hasn't been designed yet; what > > should it be? > > Part of this depends on the policy that you would like to proscribe. There > are two issues that I can think of: > > 1) are import hooks per-builtin-namespace, or per-interpreter? The former > is the current model, and may be necessary for rexec types of > functionality. The latter would shift the hook to some functions in the > sys module, much like the profiler/trace hooks. Yes, per-builtin-namespace is necessary because of rexec -- the system must translate an import statement into a call to a hook that depends on the builtin namespace, because that's how rexec environments (there may be many per interpreter) are denoted. 
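
To illustrate the per-builtin-namespace model described above: the
import statement looks up __import__ in the builtins of the namespace
the code runs in, so two environments in one interpreter can each see
a different hook. The following is a minimal sketch in present-day
Python 3 (not the 1.5/rexec code under discussion); make_environment
and the logging hook are hypothetical names, and this shows only how
the hook is selected, not a security boundary.

```python
import builtins


def make_environment(log):
    """Hypothetical helper: build globals whose imports go through a hook.

    Each environment gets its own copy of the builtins with its own
    __import__, which is how the interpreter tells environments apart.
    """
    def hook(name, globals=None, locals=None, fromlist=(), level=0):
        # Record the request, then defer to the standard machinery.
        log.append(name)
        return builtins.__import__(name, globals, locals, fromlist, level)

    env_builtins = dict(vars(builtins))
    env_builtins["__import__"] = hook
    return {"__builtins__": env_builtins}


log_a, log_b = [], []
exec("import string", make_environment(log_a))
exec("import os", make_environment(log_b))
print(log_a, log_b)
```

Each log records only the import executed inside its own environment;
imports performed internally by the imported modules use those
modules' own builtins and bypass the hook.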
> 2) currently, Python has a single hook, with no provisions for multiple > mechanisms to be used simultaneously. This was one of the primary > advantages of imputil and its policies/chaining -- it would allow > multiple import mechanisms. I believe that we want to continue the > current policy: the core interpreter sees a single hook function. Yes. > [ and Standard Operating Procedure is to install an ImportManager in > there, or a subclass ] Yes. > For a while, I had thought about the "sys" approach, but just realized > that may be too limited for rexec types of environments (because we may > need a couple ImportManagers to be operating within an interpreter) Agreed. It sounds like we're stuck with overriding __builtin__.__import__, like before. > BUT: we may be able to design an RExecImportManager that remembers > restricted modules and uses different behavior when it determines the > import context is one of those modules. I haven't thought on this to > determine is true viability, tho... Sounds tricky -- *all* modules imported in restricted mode must be treated differently. > > def _import_hook(): > > > > I think we need a hook here so that Marc-Andre can implement > > walk-me-up-Scotty; or Tim Peters could implement a policy that > > requires all imports to use the full module name, even from > > within the same package. > > I've been thinking of something along the lines of > _determine_import_context() returning a list of things to try. Default is > to return something like [current-context,] + sys.path (not exactly that, > but you get the idea). The _import_hook would then operate as a simple > scan over that list of places to attempt an import from. MAL could > override _determine_import_context() to return the walk-me-up, intervening > packages. Tim could just always return sys.path (and never bother trying > to determine the current context). Yes. In ni (remember ni?) we had this mechanism; it was called "domain" (not a great name for it). 
The domain was a list of packages
where relative imports were sought. A package could set its domain by
setting a variable __domain__. The current policy (current package,
then toplevel) corresponds to a 2-item domain: [<current package>, ""]
(where "" stands for the unnamed toplevel package).
Walk-me-up-Scotty corresponds to a domain containing the current
package, its parent, its grandparent, and so on, ending with "". The
"no relative imports" policy is represented by [""]. If we let
__domain__ be initialized by the importer but overridden by the
package, we can do everything we need.

> >     def _determine_import_context():
> >
> >         Possible feature: the package could set a flag here to force
> >         all imports to go to the top-level (i.e., the flag would mean
> >         "no relative imports").
>
> Ah. Neat optimization. I'll insert some comments / prototype code to do
> this.

See above -- the package could simply set its domain to [""].

> >         Looking forward to _FilesystemImporter: I want to be able to
> >         have the *name* of a zip file (or other archive, whatever is
> >         supported) in sys.path; that seems more convenient for the
> >         user and simplifies $PYTHONPATH processing.
>
> Hmm. This will complicate things quite a bit, but is doable. It will also
> increase the processing time for sys.path elements.

We could cache this in a dictionary: the ImportManager can have a
cache dict mapping pathnames to importer objects, and a separate
method for coming up with an importer given a pathname that's not yet
in the cache. The method should do a stat and/or look at the
extension to decide which importer class to use; you can register new
importer classes by registering a suffix or a Boolean function, plus a
class. If you register a new importer class, the cache is zapped.
The cache is independent from sys.path (but maintained per
ImportManager instance) so that rearrangements of sys.path do the
right thing. If a path is dropped from sys.path the corresponding
cache entry is simply no longer used.
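
The cache described above could be sketched as follows, in present-day
Python 3. Everything here is an assumption made for illustration, not
imputil's API: the class name ImporterCache, its methods, and the two
stub importer classes are hypothetical. Only the Boolean-function form
of registration is shown; a suffix can be registered as a predicate
that tests the extension.

```python
import os


class StubArchiveImporter:
    """Hypothetical stand-in for an importer that reads an archive."""
    def __init__(self, path):
        self.path = path


class StubDirImporter:
    """Hypothetical stand-in for an importer that reads a directory."""
    def __init__(self, path):
        self.path = path


class ImporterCache:
    """Maps sys.path entries to importer objects, one cache per manager."""

    def __init__(self):
        self._registry = []  # list of (predicate, importer_class)
        self._cache = {}     # pathname -> importer instance (or None)

    def register(self, predicate, importer_class):
        # A new importer class may claim paths already cached for
        # another class, so the whole cache is zapped.
        self._registry.append((predicate, importer_class))
        self._cache.clear()

    def importer_for(self, path):
        # The cache is keyed by pathname, independent of the order of
        # sys.path; entries for dropped paths are simply never used.
        if path in self._cache:
            return self._cache[path]
        importer = None
        for predicate, importer_class in self._registry:
            if predicate(path):
                importer = importer_class(path)
                break
        self._cache[path] = importer
        return importer


cache = ImporterCache()
cache.register(lambda p: p.endswith(".zip"), StubArchiveImporter)
cache.register(os.path.isdir, StubDirImporter)
```

With this in place, a manager would call cache.importer_for(entry) for
each sys.path entry and pay for the stat or suffix test only the first
time an entry is seen.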
> > def import_top(): > > > > This appears a hook that a base class can override; but why? > > It is for use by clients of the Importer. In particular, by the > ImportManager. It is not intended to be overridden. Are there any other clients of the Importer class? Since import_top() simply calls _import_one(), do we even need it? > > "PRIVATE METHODS": > > > > These aren't really private to the class; some are used from > > the ImportManager class. > > Private to the system, then :-) Then use a different word -- "private" has a well-defined meaning for C++ and Java programmers. To me, "internal" sounds better. > > I note that none of these use any state of the class (it > > doesn''t *have* any state) and they are essentially an > > implementation of the (cumbersome) package import policy. > > I wonder if the whole package import policy shouldn't be > > implemented in the ImportManager class -- or even in a > > separate policy class. > > Nope. They do use a piece of state: self. Subclasses may add state and > they need to refer to that state via self. We store Importer instances > into modules as __importer__ and then use it later. The example Importers > (FuncImporter, PackageImporter, DirectoryImporter, and PathImporter) all > use instance variables. _FilesystemImporter definitely requires it. > > Once we locate the Importer responsible for a package and its contained > modules, then we pass off control to that Importer. It is then responsible > for completing the import (including the notions of a Python package). The > completion of the import can be moved out of Importer, and we would > replace "self" by the Importer in question. > > I attempted to minimize code in ImportManager because a typical execution > will only have a single instance of it -- there is no opportunity to > implement different policies and mechanisms. By shifting as much as > possible out to the Importer class, there is more opportunity for altering > behavior. 
But the importer is the wrong place to change the policy globally. The example is walk-me-up-Scotty: to implement that (or, more generally, to implement the __domain__ hook) without editing imputil.py, Marc-Andre would have to subclass all the importers that are used. If the policy was embodied in the ImportManager class, he could subclass the ImportManager, install it instead of the default one, but continue to use the existing importers (e.g. to import from zip files). > > I'm still worried about get_code() returning the code object > > (or even a module, in the case of extensions). This means > > that something like freeze might have to re-implement the > > whole module searching -- first, it might want the source code > > instead of the code object, and second, it might not want the > > extension to be loaded at all! > > You're asking for something that is impossible in practice. Consider the > following code fragment:
>
> ------------------------
> import my_imp_tools
> my_imp_tools.HTTPImporter("http://www.lyra.org/greg/python/")
>
> import qp_xml
> ------------------------
>
> There is no way that you can create a freeze tool that will be able to > manage this kind of scenario. > > I've frozen software for three different, shipping products. In all cases, > I had to custom-build a freeze mechanism. Each time, there would be a list > of "root" scripts, a set of false-positives to ignore, a set of missed > modules to force inclusion, and a set of binary dependencies that couldn't > be determined otherwise. I think the desire for a fully-automated module > finder and freezer isn't fulfillable. I know you can't do it fully automatically. But I still want to be able to reuse as much of existing importer and importmanager classes as possible. Currently, Tools/freeze/modulefinder.py contains a lot of code that reimplements the entire package resolution mechanism. It should really be able to reuse all that -- so that e.g.
the addition of a __domain__ feature doesn't require changes both to imputil.py and to modulefinder.py. > That said: I wouldn't be opposed to adding a get_source() method to an > Importer. If the source is available, then it can return it (it may not be > available in an archive!). How would you invoke it? I would need to have a different call into the ImportManager that would invoke get_source() rather than get_code(). My desire is that freeze's modulefinder should be able to instantiate another ImportManager (maybe a slight subclass) which would find the source code for modules for it. For this, I need to have an API that says "in this context, what module does 'import X' return?". It's okay to return some kind of descriptor object that has methods to (1) get the code, (2) get the source, (3) describe itself. The description should return whether this is a Python or an extension module, whether source is available, and the filename if available. If it came from an archive, the descriptor could be archive-aware, and have additional methods to find out the filename of the archive and the name of the module inside the archive, as well as the type of archive. If it was an extension, there would be extra calls to load the library and to initialize the module, and a way to get the actual filename. This way, freeze could issue reasonable errors if it found a module but couldn't find the source. Freeze also needs to deal specially with extensions (and differently with built-in extensions and shared library ones). > > I've noticed that all implementations of get_code() start with > > a test whether parent is None or not, and branch to completely > > different code. This suggests that we might have two separate > > methods??? > > Interesting point. I hadn't noticed that. They don't always branch to > different points, however: consider DirectoryImporter and FuncImporter. But those are the most trivial examples.
> Essentially, we have two forms of calls: > > get_code(None, modname, modname) # look for a top-level module > get_code(parent, modname, fqname) # look in a package for a module > > imputil has been quite nice because you only had to worry about one hook, > but separating these might be a good thing to do. My preference is to > leave it as one hook, but let's see what others have to say... You might provide a default implementation for get_code() that calls either get_subcode() or get_topcode() depending on whether parent is None; then subclasses can separately override those, or override get_code() when it's more convenient. I noticed there's only one call to get_code(), from _import_one(). Not sure where this leads though. > I'll also move the Importer subclasses to a new file for placement under > Demo/. That should trim imputil.py down quite a lot. Except for the _FilesystemImporter class, I presume, which is needed in normal use. > >... > > def _compile(): > > > > You will have to strip CRLF from the code string read from the > > file; this is for Unix where opening the file in text mode > > doesn't do that, but users have come to expect this because > > the current implementation explicitly strips them. > > I'm not sure that I follow what you mean here. The existing code seems to > work fine. We *append* a newline; are you suggesting stripping inside the > codestring, at the end, ?? I mean that you have to do codestring = re.sub(r"\r\n", r"\n", codestring) on the code string after reading it. This has nothing to do with the trailing newline. It is needed because the tokenizer chokes on \r\n when it finds it in a string, but not when it reads it from a file -- this has been reported a few times as a bug because it means that exec compile(open(fn).read(), fn, "exec") is not completely equivalent to execfile(fn) -- the compile() may choke if the read() returns lines ending in \r\n, as it may when a Windows file was transplanted to a Unix system. 
Again, when the parser itself reads the file, this is dealt with correctly. This is a feature: many people share filesystems between Unix and Windows, and just like most Windows compilers don't insist on the \r being there, Unix tools shouldn't insist on it being absent. Fixing compile() is hard, unfortunately, hence this request for a workaround. > > I've recently been notified of a race condition in the code > > here, when two processes are writing a .pyc file and a third > > is reading it. On Unix we will have to remove the .pyc file > > and then use os.open() with O_EXCL in order to avoid the race > > condition. I don't know how to avoid it on Windows. > > The O_EXCL would be on writing. Yes of course, sorry for not clarifying that. > It would be nice if there was a "shared" > mode for reading. How are two writers and a reader different from a single > writer and a single reader? OK, I'll explain the problem. (Cut from a mail explaining it to Jeremy:)

| Unfortunately, there's still a race condition, involving three or more
| processes:
|
| A sees no .pyc file and starts writing it
|
| B sees an invalid .pyc file and decides to go write it later
|
| A finishes writing and fills in the mtime
|
| C sees the valid magic and mtime and decides to go read the .pyc file
|
| B overwrites the .pyc file, truncating it at first
|
| C continues to read the .pyc file, but sees a truncated file
|
| B finishes writing and fills in the mtime
|
| At this point, the .pyc file is valid, but process C has probably
| crashed in the unmarshalling code.
|
| I have devised the following solution (which may even work on
| Windows) but not yet implemented it:
|
| when writing the .pyc file, use unlink() to remove the .pyc file and
| then use low-level open() with the proper flags to require that the
| file doesn't yet exist (O_EXCL?); then use fdopen(). If the open()
| fails, don't write (treat it the same as a failing fopen() now.)
| | (You'd think that you could use a temporary file, but it's hard to | come up with a temp filename that's unique -- and if it's not unique, | the same race condition could still happen.) > > def _os_bootstrap(): > > > > This is ugly and contains a dangerous repetition of code from > > os.py. I know why you want it but for the "real" version we > > need a different approach here. > > I agree :-) > > But short of some refactoring of os.py and the per-platform modules, this > is the best that I could do. Importing "os" loads a lot of stuff; I wanted > to ensure that we deferred that until we had the ImportManager and > associated Importers in place (so the import could occur under the > direction of imputil). OK. Let's table this one until we feel we know how to refactor os.py. (Maybe a platform-specific os.py could be frozen into the interpreter.) > > class SuffixImporter: > > > > Why is this a class at all? It would seem to be sufficient to > > have a table of functions instead of a table of instances. These > > importers have no state and only one method that is always > > overridden. > > DynLoadSuffixImporter has state. We could still deal with that, however, > by just storing a bound method into the table. Exactly. > I used instances because I wasn't sure what all we might want in there. If > we don't add any other methods or attributes to the public interface, then > yah: we could switch to a function-based approach. See http://c2.com/cgi-bin/wiki?YouArentGonnaNeedIt (and the rest of this wikiweb on refactoring, patterns etc.) for why you shouldn't plan ahead this far. > I'll release a new imputil later this week, incorporating these changes, > MAL's feedback, and Finn Bock's feedback. Great! 
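The function-table alternative to SuffixImporter instances discussed above could look roughly like the following. This is only a sketch in present-day Python: suffix_table, import_by_suffix(), import_py_file() and the body of DynLoadSuffixImporter are hypothetical, and the "import" functions just return tuples so that the dispatch is visible.

```python
class DynLoadSuffixImporter:
    """Hypothetical importer that carries state (a description string)."""
    def __init__(self, desc):
        self.desc = desc

    def import_file(self, filename):
        # A real version would load the shared library here.
        return ("dynload", self.desc, filename)


def import_py_file(filename):
    # A stateless importer can be a plain function.
    return ("source", filename)


# suffix -> callable taking a filename; a bound method carries the
# instance state along with it, so no wrapper class is needed.
suffix_table = {
    ".py": import_py_file,
    ".so": DynLoadSuffixImporter("shared library").import_file,
}


def import_by_suffix(filename):
    """Dispatch on the filename suffix using the table above."""
    for suffix, func in suffix_table.items():
        if filename.endswith(suffix):
            return func(filename)
    raise ImportError("no importer registered for %r" % filename)
```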
--Guido van Rossum (home page: http://www.python.org/~guido/) From just@letterror.com Wed Feb 16 18:40:51 2000 From: just@letterror.com (Just van Rossum) Date: Wed, 16 Feb 2000 19:40:51 +0100 Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: <200002161727.MAA28273@eric.cnri.reston.va.us> References: Your message of "Wed, 16 Feb 2000 06:01:32 PST." Message-ID: I kind of lost track here, is it ok if I take a step back? I don't like the chaining aspect of imputil.py. I think I remember that Greg did this mainly to remain as compatible as possible, but I don't see a reason to keep it that way. What you end up with is a linked list, which seems, uh, a little un-pythonic... (and a little awkward to manipulate.) What's needed is pluggable importers. One proposal I remember (I think it was also Greg's) was that elements on sys.path could be importer instances. Is this still being proposed? I also vaguely remember someone saying that this was not optimal, since it's a 2-dimensional problem: there's a list of directories/files to search, and a list of importers. Would it make sense to add a new variable to the sys module, called "importers" or something, which contains a list of, erm, importers? And drop the __import__ hook. People would plug the import mechanism by manipulating sys.importers instead of mucking with __builtin__.__import__. Importers could then traverse sys.path, or use their own list of things. Just From gmcm@hypernet.com Wed Feb 16 19:56:49 2000 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 16 Feb 2000 14:56:49 -0500 Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: References: <200002161727.MAA28273@eric.cnri.reston.va.us> Message-ID: <1261391513-4256760@hypernet.com> Just wrote: > I kind of lost track here, is it ok if I take a step back? > > I don't like the chaining aspect of imputil.py. [snip] > What's needed is pluggable importers.
One proposal I remember (I think it > was also Greg's) was that elements on sys.path could be importer instances. > Is this still being proposed? The "stable" version of imputil uses a chain. The CVS version uses sys.path to hold importers (well, it's got one foot in each world). > I also vaguely remember someone saying that this was not optimal, since > it's a 2-dimensional problem: there's a list of directories/files to > search, and a list of importers. People were thinking in terms of "policy" importers - which would, indeed, be unmanageable. Importers are based on "turf" (to which we're currently trying to add "policy" hooks, at least for certain kinds of importers). So a decently written importer knows very quickly whether the request belongs to him. > Would it make sense to add a new variable > to the sys module, called "importers" or something, which contains a list > of, erm, importers? And drop the __import__ hook. People would plug the > import mechanism by manipulating sys.importers instead of mucking with > __builtin__.__import__. Importers could then traverse sys.path, or use > their own list of things. As it stands now (CVS version, looking to be in the std dist, but not the core), an ImportManager uses the __import__ hook and developers put their importers on sys.path. A policy hook would probably go into the FileSystemImporter (where it would get activated when a directory on sys.path was searched), but it wouldn't go searching sys.path itself. - Gordon From gmcm@hypernet.com Wed Feb 16 20:43:54 2000 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 16 Feb 2000 15:43:54 -0500 Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: References: <200002131705.MAA20578@eric.cnri.reston.va.us> Message-ID: <1261388683-4430097@hypernet.com> [Guido] > > I'm still worried about get_code() returning the code object > > (or even a module, in the case of extensions).
This means > > that something like freeze might have to re-implement the > > whole module searching -- first, it might want the source code > > instead of the code object, and second, it might not want the > > extension to be loaded at all! If a freeze mechanism is analyzing source, then it needs source, but I don't think that's necessary. The only other reason I can see for wanting source is if freeze is running with one magic number, but the frozen code will run with another, (to which I say "tough toenails, tootsie"). [Greg] > You're asking for something that is impossible in practice. Consider the > following code fragment:
>
> ------------------------
> import my_imp_tools
> my_imp_tools.HTTPImporter("http://www.lyra.org/greg/python/")
>
> import qp_xml
> ------------------------
>
> There is no way that you can create a freeze tool that will be able to > manage this kind of scenario. Sure. Dynamically replace the Importer in __bases__ with a hacked one that doesn't affect sys.modules, grabs the code object and analyzes byte code (like modulefinder does) to find further imports. > I've frozen software for three different, shipping products. In all cases, > I had to custom-build a freeze mechanism. Each time, there would be a list > of "root" scripts, a set of false-positives to ignore, a set of missed > modules to force inclusion, and a set of binary dependencies that couldn't > be determined otherwise. I think the desire for a fully-automated module > finder and freezer isn't fulfillable. False-positives are unavoidable. Missed modules would be no worse and probably better than today (since this scheme would use the importer to trace the actions of the importer).
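The byte-code analysis mentioned above (grab the code object, look for further imports without touching sys.modules) can be sketched with the dis module of present-day Python 3. This is an illustration only, not what modulefinder or imputil actually do; find_imports() is a hypothetical helper, and it reports only the module named in each import statement.

```python
import dis
import types


def find_imports(code):
    """Return the sorted names of modules imported anywhere in `code`."""
    names = set()
    todo = [code]
    while todo:
        co = todo.pop()
        for instr in dis.get_instructions(co):
            # Both "import x" and "from x import y" compile to IMPORT_NAME.
            if instr.opname == "IMPORT_NAME":
                names.add(instr.argval)
        # Function and class bodies are nested code objects in co_consts.
        todo.extend(c for c in co.co_consts if isinstance(c, types.CodeType))
    return sorted(names)


source = "import os\nfrom xml import dom\ndef f():\n    import qp_xml\n"
# compile() only builds the code object; nothing is imported or executed.
print(find_imports(compile(source, "<example>", "exec")))
```

Imports hidden behind a condition that is never true show up here as well, which is one source of the false positives mentioned above.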
- Gordon From gmcm@hypernet.com Wed Feb 16 20:43:53 2000 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 16 Feb 2000 15:43:53 -0500 Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: <200002161727.MAA28273@eric.cnri.reston.va.us> References: Your message of "Wed, 16 Feb 2000 06:01:32 PST." Message-ID: <1261388665-4430113@hypernet.com> [Guido] > > > I think we need a hook here so that Marc-Andre can implement > > > walk-me-up-Scotty; or Tim Peters could implement a policy that > > > requires all imports to use the full module name, even from > > > within the same package. Hmm, after thinking about it, I can't see these as "import hooks". At least, if you are installing these system-wide, these are changing the semantics of "import", not grabbing code from strange places, or transforming .xyz files into Python or ... I can have working code; now add some mxX or TP extension and have existing code (not using the new extension) break. OTOH, if MAL / TP provides an importer, and fixes that importer to follow their preferred policy, that's fine; and I can pretend that that's an "import hook". - Gordon From guido@python.org Wed Feb 16 20:52:08 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 16 Feb 2000 15:52:08 -0500 Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: Your message of "Wed, 16 Feb 2000 15:43:54 EST." <1261388683-4430097@hypernet.com> References: <200002131705.MAA20578@eric.cnri.reston.va.us> <1261388683-4430097@hypernet.com> Message-ID: <200002162052.PAA29056@eric.cnri.reston.va.us> > [Guido] > > > I'm still worried about get_code() returning the code object > > > (or even a module, in the case of extensions). This means > > > that something like freeze might have to re-implement the > > > whole module searching -- first, it might want the source code > > > instead of the code object, and second, it might not want the > > > extension to be loaded at all! 
[Gordon] > If a freeze mechanism is analyzing source, then it needs > source, but I don't think that's necessary. The only other > reason I can see for wanting source is if freeze is running with > one magic number, but the frozen code will run with another, > (to which I say "tough toenails, tootsie"). Maybe my freezer wants to store the source as well as the code objects, so it can give decent tracebacks. Which reminds me -- we need to introduce a standard API to retrieve the source for a module that's been imported (if it's available at all). I can easily see how archives can be distributed containing both .pyc and .py files; the zip access module could easily find the .py file on request. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Feb 16 20:54:27 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 16 Feb 2000 15:54:27 -0500 Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: Your message of "Wed, 16 Feb 2000 15:43:53 EST." <1261388665-4430113@hypernet.com> References: Your message of "Wed, 16 Feb 2000 06:01:32 PST." <1261388665-4430113@hypernet.com> Message-ID: <200002162054.PAA29068@eric.cnri.reston.va.us> > [Guido] > > > > I think we need a hook here so that Marc-Andre can implement > > > > walk-me-up-Scotty; or Tim Peters could implement a policy that > > > > requires all imports to use the full module name, even from > > > > within the same package. [Gordon] > Hmm, after thinking about it, I can't see these as "import > hooks". At least, if you are installing these system-wide, > these are changing the semantics of "import", not grabbing > code from strange places, or transforming .xyz files into > Python or ... > > I can have working code; now add some mxX or TP extension > and have existing code (not using the new extension) break.
> > OTOH, if MAL / TP provides an importer, and fixes that > importer to follow their preferred policy, that's fine; and I can > pretend that that's an "import hook". Good point. Fortunately, the proposed solution (reintroducing __domain__) lets this be solved on a per-package basis. Still, I want to be able to subclass ImportManager to change the global policy; supporting __domain__ is an example of such a change of policy. I also want to avoid having to reimplement the policy, with all its warts, in freeze. --Guido van Rossum (home page: http://www.python.org/~guido/) From Fredrik Lundh" <1261388683-4430097@hypernet.com> Message-ID: <014901bf78c2$35503260$34aab5d4@hagrid> > If a freeze mechanism is analyzing source, then it needs > source, but I don't think that's necessary. The only other > reason I can see for wanting source Which reminds me... it would be nice if an import handler can provide optional "find corresponding source" hooks for traceback.py and friends. (among other things, this would allow pythonworks to use an archive file as the "workspace"...) From gmcm@hypernet.com Wed Feb 16 21:24:45 2000 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 16 Feb 2000 16:24:45 -0500 Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: <200002162052.PAA29056@eric.cnri.reston.va.us> References: Your message of "Wed, 16 Feb 2000 15:43:54 EST." <1261388683-4430097@hypernet.com> Message-ID: <1261386237-4574590@hypernet.com> [Gordon] > > If a freeze mechanism is analyzing source, then it needs > > source, but I don't think that's necessary. The only other > > reason I can see for wanting source is if freeze is running with > > one magic number, but the frozen code will run with another, > > (to which I say "tough toenails, tootsie"). [Guido] > Maybe my freezer wants to store the source as well as the code > objects, so it can give decent tracebacks.
> > Which reminds me -- we need to introduce a standard API to retrieve > the source for a module that's been imported (if it's available at > all). I can easily see how archives can be distributed containing > both .pyc and .py files; the zip access module could easily find the > .py file on request. [Fredrik] > Which reminds me... it would be nice if an import handler > can provide optional "find corresponding source" hooks for > traceback.py and friends. > > (among other things, this would allow pythonworks to use > an archive file as the "workspace"...) Hmm, wasn't there a reference earlier today to "You ain't gonna need it"? Java doesn't do it. You can already do it if you install source, then archive it, leaving the __file__ attribute alone - IDLE / Pythonwin will pop up the source. Nobody sane is going to put code under active development in an archive. A developer who wants run from an archive, yet see (but not alter) the source at a traceback can do as above (install source, then archive it). Users who don't know and don't care can snip the traceback and send it to the developer, who can find the source. Yeah, it can be supported, but Pythonworks is the only people who are going to use it, and the mad scientist can code it up in 10 minutes ;-). - Gordon From guido@python.org Wed Feb 16 21:47:34 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 16 Feb 2000 16:47:34 -0500 Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: Your message of "Wed, 16 Feb 2000 16:24:45 EST." <1261386237-4574590@hypernet.com> References: Your message of "Wed, 16 Feb 2000 15:43:54 EST." <1261388683-4430097@hypernet.com> <1261386237-4574590@hypernet.com> Message-ID: <200002162147.QAA29542@eric.cnri.reston.va.us> > Hmm, wasn't there a reference earlier today to "You ain't > gonna need it"? I claim we need it. > Java doesn't do it. So? 
> You can already do it if you install source, then archive it, > leaving the __file__ attribute alone - IDLE / Pythonwin will pop > up the source. > > Nobody sane is going to put code under active development in > an archive. But there are other reasons why you would want to see tracebacks even if you're not actively developing. Plenty of people distribute mostly-working code to end users and ask them to report tracebacks. E.g. the Ultraseek product from Infoseek (used for the python.org search) occasionally displays tracebacks. The Zope guys also do this (they hide the traceback in an HTML comment I believe, but it's there). Sure, you can take a traceback without source lines and match up the line numbers manually with your source, assuming you have the exact version of the source -- but it's a pain. > A developer who wants run from an archive, yet see (but not > alter) the source at a traceback can do as above (install > source, then archive it). That's no option for distributions -- the archive is the only distribution! > Users who don't know and don't care can snip the traceback > and send it to the developer, who can find the source. As I said, very inconvenient. > Yeah, it can be supported, but Pythonworks is the only people > who are going to use it, and the mad scientist can code it up > in 10 minutes ;-). I didn't say I wanted *you* to code it. I just said that I want the API. Accessing the source code is a common need in lots of places. Adding the source to the archive is a nice solution. Why don't you like it? --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Wed Feb 16 22:49:14 2000 From: gstein@lyra.org (Greg Stein) Date: Wed, 16 Feb 2000 14:49:14 -0800 (PST) Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: <200002161727.MAA28273@eric.cnri.reston.va.us> Message-ID: On Wed, 16 Feb 2000, Guido van Rossum wrote: >... > > Understood. 
I think some of your concerns are based on its historical > > cruftiness, rather than where it "should" be. I'll prep a new version with > > the backwards compat pulled out. People that relied on the old version can > > continue to use their older version (which is Public Domain, so they're > > quite free to do so :-) > > OK, I'm awaiting a new imputil announcement. (You should really have > a webpage for it pointing both to the old and the new version.) I do. http://www.lyra.org/greg/python/. The stable module is provided, and the latest is available via ViewCVS (linked from the imputil section on the page). >... > Agreed. It sounds like we're stuck with overriding > __builtin__.__import__, like before. Yes. I'll rebuild around this. If somebody comes up with a new/better design at some point, then we switch. > > BUT: we may be able to design an RExecImportManager that remembers > > restricted modules and uses different behavior when it determines the > > import context is one of those modules. I haven't thought on this to > > determine is true viability, tho... > > Sounds tricky -- *all* modules imported in restricted mode must be > treated differently. Yah. We'll just use the __import__ thing. It will allow multiple ImportManager objects to exist. >... > If we let __domain__ be initialized by the importer but overridden by > the package, we can do everything we need. All right. I'll look at adding that. >... > > > Looking forward to _FilesystemImporter: I want to be able to > > > have the *name* of a zip file (or other archive, whatever is > > > supported) in sys.path; that seems more convenient for the > > > user and simplifies $PYTHONPATH processing. > > > > Hmm. This will complicate things quite a bit, but is doable. It will also > > increase the processing time for sys.path elements. > > We could cache this in a dictionary: the ImportManager can have a Sounds like a good plan. I'll add this. >... 
> > > def import_top(): > > > > > > This appears a hook that a base class can override; but why? > > > > It is for use by clients of the Importer. In particular, by the > > ImportManager. It is not intended to be overridden. > > Are there any other clients of the Importer class? Since import_top() > simply calls _import_one(), do we even need it? Manual invocation. This code works:

-------------------------
fetch = my_imp_tools.HTTPImporter("...")
qp_xml = fetch.import_top("qp_xml")
-------------------------

In other words: it is entirely possible to use an Importer without it being installed into an ImportManager. >... > Then use a different word -- "private" has a well-defined meaning for > C++ and Java programmers. To me, "internal" sounds better. I'll comment/doc appropriately. >... > But the importer is the wrong place to change the policy globally. > The example is walk-me-up-Scotty: to implement that (or, more > generally, to implement the __domain__ hook) without editing > imputil.py, Marc-Andre would have to subclass all the > importers that are used. If the policy was embodied in the > ImportManager class, he could subclass the ImportManager, install it > instead of the default one, but continue to use the existing importers > (e.g. to import from zip files). You're falling right back into the classic import hook problem. If MAL alters ImportManager and installs it, then he blows away whatever TP has done. The only time that anybody should ever consider modifying or subclassing ImportManager:

1) An rexec-like environment. This is allowed because you are installing the new ImportManager into a specific namespace, rather than __builtin__.__import__

2) A shipping application. This is allowed because it's your app :-). The app is fully self-contained and all imports are known ahead of time.
[ if your app is going to import unknown third-party code, then we're still okay, as that third-party stuff will be in an rexec environment, or it won't fall into the "I'm an app" category ] Given that modules/packages should not be altering ImportManager at any point, then any policy changes must go elsewhere. I claim that it is the Importer that is managing that package. Since a package is normally imported and managed by only *one* Importer, then the problem falls down to altering the Importer used for the package while it is loading. This is doable:

1) import foo.bar.baz is executed
2) ImportManager locates the package via an Importer. That Importer calls get_code() (or import_from_dir() for the _FilesystemImporter)
3) The Importer loads the package code object from wherever. (_FilesystemImporter loads the code object for __init__ and returns it)
4) The Importer creates a module object and stores __importer__ into it, pointing at itself.
5) The code object is executed, overwriting __importer__. (it is also possible to do this by returning a new value in result[2] of the get_code() call, but this scenario doesn't have a custom Importer installed yet that would do this)
6) The Importer finishes loading the package and returns to the ImportManager.
7) The ImportManager calls _finish_import on the Importer found in the package module's __importer__ attribute (the custom Importer)
8) The custom Importer Does Its Thing

In essence, people should be highly discouraged from ever touching ImportManager. The particular __domain__ thing can be defined as a package-private modification to the import process (for that package only!), thus the package should fix up the Importer used for itself. Specifically, the MAL and TP "import style" is implemented by overriding the algorithm in Importer._do_import(). >... > I know you can't do it fully automatically. But I still want to be > able to reuse as much of existing importer and importmanager classes > as possible.
Currently, Tools/freeze/modulefinder.py contains a lot
> of code that reimplements the entire package resolution mechanism.
> It should really be able to reuse all that -- so that e.g. the
> addition of a __domain__ feature doesn't require changes both to
> imputil.py and to modulefinder.py.

All right. I'll see if I can come up with something for this.

> > That said: I wouldn't be opposed to adding a get_source() method to an
> > Importer. If the source is available, then it can return it (it may not be
> > available in an archive!).
>
> How would you invoke it?  I would need to have a different call into
> the ImportManager that would invoke get_source() rather than
> get_code().

Yes: a different call into the ImportManager.

I'll do the get_source thing as a second step (possibly as a subclass, as
you mentioned). First is to fold in the rest of the feedback.

>...
> > > I've noticed that all implementations of get_code() start with
> > > a test whether parent is None or not, and branch to completely
> > > different code.  This suggests that we might have two separate
> > > methods???
> >
> > Interesting point. I hadn't noticed that. They don't always branch to
> > different points, however: consider DirectoryImporter and FuncImporter.
>
> But those are the most trivial examples.

So? That doesn't negate their use as an example.

class HTTPImporter:
    def __init__(self, url):
        self.url = url

    def get_code(self, parent, modname, fqname):
        if parent:
            url = parent.__url__
        else:
            url = self.url
        # look for modname at url

Granted, we could also rewrite the "look for" part as a method which is
called by get_subcode() and get_topcode().

> > Essentially, we have two forms of calls:
> >
> >   get_code(None, modname, modname)    # look for a top-level module
> >   get_code(parent, modname, fqname)   # look in a package for a module
> >
> > imputil has been quite nice because you only had to worry about one hook,
> > but separating these might be a good thing to do.
My preference is to
> > leave it as one hook, but let's see what others have to say...
>
> You might provide a default implementation for get_code() that calls
> either get_subcode() or get_topcode() depending on whether parent is
> None; then subclasses can separately override those, or override
> get_code() when it's more convenient.

Sure.

>...
> > I'll also move the Importer subclasses to a new file for placement under
> > Demo/. That should trim imputil.py down quite a lot.
>
> Except for the _FilesystemImporter class, I presume, which is needed
> in normal use.

Yes.

>...
> I mean that you have to do
>
>     codestring = re.sub(r"\r\n", r"\n", codestring)
>
> on the code string after reading it.  This has nothing to do with the

Ah! Okay... not a problem.

It would be nice to invoke the compiler on an open file object. That
would obviate this problem entirely. I think that I'll look into doing a
patch for this, rather than using re.sub().

>...
> Unix tools shouldn't insist on it being absent.  Fixing compile() is
> hard, unfortunately, hence this request for a workaround.

If I can't figure out a way to do it, then I'll fall back to re.sub() :-)

>...
> > It would be nice if there was a "shared"
> > mode for reading. How are two writers and a reader different from a single
> > writer and a single reader?
>
> OK, I'll explain the problem.  (Cut from a mail explaining it to
> Jeremy:)
>...
> | I have devised the following solution (which may even work on
> | Windows) but not yet implemented it:
> |
> | when writing the .pyc file, use unlink() to remove the .pyc file and
> | then use low-level open() with the proper flags to require that the
> | file doesn't yet exist (O_EXCL?); then use fdopen().  If the open()
> | fails, don't write (treat it the same as a failing fopen() now.)
> |
> | (You'd think that you could use a temporary file, but it's hard to
> | come up with a temp filename that's unique -- and if it's not unique,
> | the same race condition could still happen.)
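The unlink-then-exclusive-create idea quoted above can be sketched roughly as follows. This is only an illustration in present-day Python, not code from imputil or the interpreter; the function name write_pyc_exclusively and the 0o644 mode are assumptions made for the example.

```python
import os


def write_pyc_exclusively(path, data):
    """Write data to path only if this process wins the race to create it.

    Hypothetical helper: returns True if the file was written, False if
    the write was skipped (for instance because another writer got there
    first), which mirrors "if the open() fails, don't write".
    """
    try:
        os.unlink(path)                    # remove any stale file first
    except FileNotFoundError:
        pass                               # nothing to remove
    except OSError:
        return False                       # cannot remove it; skip the write

    # O_EXCL together with O_CREAT makes open() fail if the file exists.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    flags |= getattr(os, "O_BINARY", 0)    # only defined on Windows
    try:
        fd = os.open(path, flags, 0o644)
    except OSError:
        return False                       # lost the race; don't write

    with os.fdopen(fd, "wb") as f:         # wrap the descriptor in a file object
        f.write(data)
    return True
```

A caller would treat a False result the same way a failed fopen() is treated: carry on without a cached file.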
Consider it fixed. >... > OK. Let's table this one until we feel we know how to refactor os.py. > (Maybe a platform-specific os.py could be frozen into the > interpreter.) I'll leave appropriate comments in the source as a reminder. > > > class SuffixImporter: > > > > > > Why is this a class at all? It would seem to be sufficient to > > > have a table of functions instead of a table of instances. These > > > importers have no state and only one method that is always > > > overridden. >... > > I used instances because I wasn't sure what all we might want in there. If > > we don't add any other methods or attributes to the public interface, then > > yah: we could switch to a function-based approach. > > See http://c2.com/cgi-bin/wiki?YouArentGonnaNeedIt (and the > rest of this wikiweb on refactoring, patterns etc.) for why you > shouldn't plan ahead this far. I wasn't planning far ahead at all. Just banging out some code :-) Now that that piece is (ahem) done, I'll revise the use of ".import_file". And yes... YouArentGonnaNeedIt is a very familiar mantra to me. You should have been at MSFT with me to see how many times I wielded that bat against the developers :-) [ the mantra applies whole-heartedly to Python; it gets a little less rigid when you're talking about hard-to-maintain languages like C++ ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Feb 16 23:05:11 2000 From: gstein@lyra.org (Greg Stein) Date: Wed, 16 Feb 2000 15:05:11 -0800 (PST) Subject: [Import-sig] freezing (was: Long-awaited imputil comments) In-Reply-To: <1261388683-4430097@hypernet.com> Message-ID: On Wed, 16 Feb 2000, Gordon McMillan wrote: >... > [Greg] > > You're asking for something that is impossible in practice. 
Consider the > > following code fragment: > > > > ------------------------ > > import my_imp_tools > > my_imp_tools.HTTPImporter("http://www.lyra.org/greg/python/") > > > > import qp_xml > > ------------------------ > > > > There is no way that you can create a freeze tool that will be able to > > manage this kind of scenario. > > Sure. Dynamically replace the Importer in __bases__ with a > hacked one that doesn't affect sys.modules, grabs the code > object and analyzes byte code (like modulefinder does) to find > further imports. The HTTPImporter is parameterized -- analyzing bytecode or a parse tree won't discover those parameter values (without a lot of work). You'd have to run the code to get the Importers instantiated and installed, but then you could have a problem with code that is executing outside of a classdef or funcdef. Guido suggested a custom ImportManager, but that would run into the same kind of problem. Effectively, I think what needs to happen is that the freeze tool understands Importer classes. The configuration input to the tool would specify how to set up the Importers, which the tool would directly query. In other words... it would still be a pretty custom approach *if* the application uses any custom import stuff. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gmcm@hypernet.com Wed Feb 16 23:03:20 2000 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 16 Feb 2000 18:03:20 -0500 Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: <200002162147.QAA29542@eric.cnri.reston.va.us> References: Your message of "Wed, 16 Feb 2000 16:24:45 EST." <1261386237-4574590@hypernet.com> Message-ID: <1261380325-4930086@hypernet.com> [source in archives] > > Java doesn't do it. > > So? So if there were demand for it, I would expect JavaSoft to invest web real estate in describing it as a feature. They don't. > But there are other reasons why you would want to see tracebacks even > if you're not actively developing. 
>
> Plenty of people distribute mostly-working code to end users and ask
> them to report tracebacks.  E.g. the Ultraseek product from Infoseek
> (used for the python.org search) occasionally displays tracebacks.
> The Zope guys also do this (they hide the traceback in an HTML comment
> I believe, but it's there).
>
> Sure, you can take a traceback without source lines and match up the
> line numbers manually with your source, assuming you have the exact
> version of the source -- but it's a pain.

It's an inconvenience which I think will cause far less pain and
suffering than you're predicting. I can't double click in my browser and
go to the source, nor in an email containing a traceback. So unless I'm
intimately familiar with the bug, I'll be entering line numbers into my
editor anyway.

Let them distribute alphas and betas in source form if that's a problem.
They'll still enter line numbers into their editors.

> > A developer who wants to run from an archive, yet see (but not
> > alter) the source at a traceback can do as above (install
> > source, then archive it).
>
> That's no option for distributions -- the archive is the only
> distribution!

Archives aren't a convenience for distribution - you zip / tgz anyway.
They're only a minor aid in installing (unless you're talking about a
"freeze" type situation, in which case you almost certainly don't want
source) - it's that much less you need to unpack, but you'll almost
certainly be uncompressing and unpacking anyway - even if just to get to
the README.

We've already thrown "disk space" out, since zlib isn't available
everywhere. That leaves speed. We've interfered with that by adopting a
complex file format, but I can buy the reasoning - the existence of
tools.

> > Users who don't know and don't care can snip the traceback
> > and send it to the developer, who can find the source.
>
> As I said, very inconvenient.
> > > Yeah, it can be supported, but Pythonworks is the only people > > who are going to use it, and the mad scientist can code it up > > in 10 minutes ;-). > > I didn't say I wanted *you* to code it. I just said that I want the > API. Accessing the source code is a common need in lots of places. > Adding the source to the archive is a nice solution. > > Why don't you like it? It complicates based on a predicted need that I think is inaccurate. I've got a file folder of nearly 500 msgs about my installer, and not one mentions lack of access to source on a traceback as a problem. Yes, that's "different", because it's a freeze like situation, and people don't make that complaint about freeze, either. Which takes me back to Java as a real life example. I say the only people who would be bothered are developers using archives - and as developers, they have easy ways of dealing with it. OK, I'm being irate. No, it's not that big a deal. Maybe by Py3K we'll have agreed on what exception to raise when get_source fails... - Gordon From gstein@lyra.org Wed Feb 16 23:13:03 2000 From: gstein@lyra.org (Greg Stein) Date: Wed, 16 Feb 2000 15:13:03 -0800 (PST) Subject: [Import-sig] fetching source (was: Long-awaited imputil comments) In-Reply-To: <200002162052.PAA29056@eric.cnri.reston.va.us> Message-ID: On Wed, 16 Feb 2000, Guido van Rossum wrote: >... > Which reminds me -- we need to introduce a standard API to retrieve > the source for a module that's been imported (if it's available at > all). I can easily see how archives can be distributed containing > both .pyc and .py files; the zip access module could easily find the > .py file on request. We could do something like the following: source = module.__importer__.get_module_source(module) Note that we also have: source = importer.get_source(parent, modname, fqname) Something along those lines... 
Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From guido@python.org Thu Feb 17 14:46:22 2000
From: guido@python.org (Guido van Rossum)
Date: Thu, 17 Feb 2000 09:46:22 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: Your message of "Wed, 16 Feb 2000 18:03:20 EST." <1261380325-4930086@hypernet.com>
References: Your message of "Wed, 16 Feb 2000 16:24:45 EST." <1261386237-4574590@hypernet.com> <1261380325-4930086@hypernet.com>
Message-ID: <200002171446.JAA00480@eric.cnri.reston.va.us>

OK, OK.  No need to get all wound up about it.  I'll stop now, after
this:

I don't mind that the implementation for get_source() raises an error
(any error) when the code came from an archive.  I just want a
standard API that people who write alternative code repositories
can implement.  Greg's proposal seems fine:
module.__importer__.get_module_source(module).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From Fredrik Lundh" <1261386237-4574590@hypernet.com>
Message-ID: <002e01bf7957$9598d4c0$34aab5d4@hagrid>

Gordon wrote:
> Java doesn't do it.

so?

> You can already do it if you install source, then archive it,
> leaving the __file__ attribute alone - IDLE / Pythonwin will pop
> up the source.

pythonworks users install pythonworks in a directory of their own
choosing.  they don't necessarily install source, archive it, and keep
the source files around in the file system.

> Nobody sane is going to put code under active development in
> an archive.

I didn't say that, did I?  Just said that *I* thought it was a good
idea ;-)

(if it makes you feel better about the idea, replace the word "archive"
with "database").

> Yeah, it can be supported, but Pythonworks is the only people
> who are going to use it, and the mad scientist can code it up
> in 10 minutes ;-).

sure, but then he has to ship a custom python library (and a custom
interpreter, for that matter).
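The source-retrieval API discussed in this thread, module.__importer__.get_module_source(module) backed by importer.get_source(parent, modname, fqname), could look roughly like the sketch below. It is an illustration only: the Importer class here is a stand-in and not the real imputil class, DictImporter is a hypothetical subclass, and returning None when no source is available is just one of the options raised in the thread (the other being to raise an error).

```python
import sys
import types


class Importer:
    """Stand-in base class for this sketch (not the real imputil.Importer)."""

    def get_source(self, parent, modname, fqname):
        # Subclasses override this. Returning None for "no source
        # available" is an assumption, chosen to mirror get_code().
        return None

    def get_module_source(self, module):
        # Recover parent/modname/fqname from the module itself, so that
        # clients such as a traceback printer do not have to.
        fqname = module.__name__
        parent_name, _, modname = fqname.rpartition(".")
        parent = sys.modules.get(parent_name) if parent_name else None
        return self.get_source(parent, modname, fqname)


class DictImporter(Importer):
    """Hypothetical importer whose "archive" is a dict of source strings."""

    def __init__(self, sources):
        self.sources = sources

    def get_source(self, parent, modname, fqname):
        return self.sources.get(fqname)


importer = DictImporter({"qp_xml": "print('parsed')\n"})
module = types.ModuleType("qp_xml")
module.__importer__ = importer   # normally stored by the importer at load time
source = module.__importer__.get_module_source(module)
```

With this split, an alternative code repository (an archive, a database) only has to supply get_source(); the module-based entry point comes from the base class.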
From guido@python.org Thu Feb 17 14:56:32 2000 From: guido@python.org (Guido van Rossum) Date: Thu, 17 Feb 2000 09:56:32 -0500 Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: Your message of "Wed, 16 Feb 2000 14:49:14 PST." References: Message-ID: <200002171456.JAA00490@eric.cnri.reston.va.us> > You're falling right back into the classic import hook problem. If MAL > alters ImportManager and installs it, then he blows away whatever TP has > done. Fair enough. I believe I was thinking of exactly the situations you were mentioning later: in legitimate situations where the policy needs to be changed, such as rexec or a closed app, it would be helpful if the policy was all implemented as part of ImportManager. The importers shouldn't typically deal with the policy: they are there to deal with the intricacies of importing code from a particular archive format, or from the web (e.g. webDAV :-), or from a database or version control management system. If I have a legitimate situation (see above) where I need to change the policy, I want to be able to subclass one class. With the current architecture, I would need to subclass each of the importer classes that I am using to change the policy. Instead, I want to be able to change the ImportManager and hook it up with the existing importers. (This also suggests that the relationship between the ImportManager and the _FilesystemImporter should be more loosely coupled.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Thu Feb 17 17:20:21 2000 From: gstein@lyra.org (Greg Stein) Date: Thu, 17 Feb 2000 09:20:21 -0800 (PST) Subject: [Import-sig] Re: Long-awaited imputil comments In-Reply-To: <200002171456.JAA00490@eric.cnri.reston.va.us> Message-ID: On Thu, 17 Feb 2000, Guido van Rossum wrote: >... > If I have a legitimate situation (see above) where I need to change > the policy, I want to be able to subclass one class. 
With the current
> architecture, I would need to subclass each of the importer classes
> that I am using to change the policy.  Instead, I want to be able to
> change the ImportManager and hook it up with the existing importers.

All righty. It looks like we're in agreement on what "legitimate changes
to ImportManager" means. I'll try to capture the essence of this into
some doc/comments somewhere (definitely into doc when this is stable).

However, we still have a tension occurring here:

1) implementing policy in ImportManager assists in single-point policy
   changes for app/rexec situations
2) implementing policy in Importer assists in package-private policy
   changes for normal operating conditions

I'll see if I can sort out a way to do this. Maybe the Importer class
will implement the methods (which can be overridden to change policy) by
delegating to ImportManager.

> (This also suggests that the relationship between the ImportManager
> and the _FilesystemImporter should be more loosely coupled.)

Per a suggestion from MAL, I'm going to allow a user to pass an fs_imp
at ImportManager construction time. If you write a custom ImportManager,
then you can pass in your own fs_imp when you instantiate it. I'll also
move the default class (_FilesystemImporter) into a class variable.

Is that the uncoupling you were thinking of? (we're also uncoupling the
suffixes stuff somewhat)

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From guido@python.org Thu Feb 17 17:27:54 2000
From: guido@python.org (Guido van Rossum)
Date: Thu, 17 Feb 2000 12:27:54 -0500
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: Your message of "Thu, 17 Feb 2000 09:20:21 PST."
References:
Message-ID: <200002171727.MAA01339@eric.cnri.reston.va.us>

> However, we still have a tension occurring here:
>
> 1) implementing policy in ImportManager assists in single-point policy
>    changes for app/rexec situations
> 2) implementing policy in Importer assists in package-private policy
>    changes for normal operating conditions
>
> I'll see if I can sort out a way to do this. Maybe the Importer class will
> implement the methods (which can be overridden to change policy) by
> delegating to ImportManager.

Maybe also think about what kind of policies an Importer would be
likely to want to change.  I have a feeling that a lot of the code
there is actually not so much policy but a *necessity* to get things
working given the calling conventions for the __import__ hook: whether
to return the head or tail of a dotted name, or when to do the "finish
fromlist" stuff.

> > (This also suggests that the relationship between the ImportManager
> > and the _FilesystemImporter should be more loosely coupled.)
>
> Per a suggestion from MAL, I'm going to allow a user to pass an fs_imp at
> ImportManager construction time. If you write a custom ImportManager, then
> you can pass in your own fs_imp when you instantiate it. I'll also move
> the default class (_FilesystemImporter) into a class variable.
>
> Is that the uncoupling you were thinking of? (we're also uncoupling the
> suffixes stuff somewhat)

Great!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gstein@lyra.org Thu Feb 17 18:01:49 2000
From: gstein@lyra.org (Greg Stein)
Date: Thu, 17 Feb 2000 10:01:49 -0800 (PST)
Subject: [Import-sig] Re: Long-awaited imputil comments
In-Reply-To: <200002171727.MAA01339@eric.cnri.reston.va.us>
Message-ID:

On Thu, 17 Feb 2000, Guido van Rossum wrote:
>...
> Maybe also think about what kind of policies an Importer would be
> likely to want to change.
I have a feeling that a lot of the code
> there is actually not so much policy but a *necessity* to get things
> working given the calling conventions for the __import__ hook: whether
> to return the head or tail of a dotted name, or when to do the "finish
> fromlist" stuff.

Agreed!

Thanx for all the feedback. Time to write some code...

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Thu Feb 17 18:41:51 2000
From: gstein@lyra.org (Greg Stein)
Date: Thu, 17 Feb 2000 10:41:51 -0800 (PST)
Subject: [Import-sig] getting source (was: Long-awaited imputil comments)
In-Reply-To: <200002171446.JAA00480@eric.cnri.reston.va.us>
Message-ID:

On Thu, 17 Feb 2000, Guido van Rossum wrote:
>...
> I don't mind that the implementation for get_source() raises an error
> (any error) when the code came from an archive.

I was thinking of returning None, to follow the get_code() pattern.

> I just want a
> standard API that people who write alternative code repositories
> can implement.  Greg's proposal seems fine:
> module.__importer__.get_module_source(module).

I also plan to have importer.get_source(parent, modname, fqname).

* get_module_source() is needed for things like tracebacks, where it is
  somewhat difficult to recover parent/modname/fqname (also described as:
  why make clients repeat the code to recover that data)

* get_source() is needed for cases where a module hasn't been imported at
  that point

Importer subclasses will only need to implement get_source(). The base
class will extract the parent/modname/fqname from information in the
module.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Fri Feb 18 14:25:21 2000
From: gstein@lyra.org (Greg Stein)
Date: Fri, 18 Feb 2000 06:25:21 -0800 (PST)
Subject: [Import-sig] (partially) updated imputil
Message-ID:

I've made some more updates to imputil.
Change log and the updated module are available at: http://www.lyra.org/cgi-bin/viewcvs.cgi/gjspy/imputil.py (revisions 1.10 and 1.11) I'm going to be out of town Saturday thru Tuesday. I'll probably do some more work on imputil before I leave. Not sure if I'll work on it while I'm gone (or on that LONG plane flight to Boston...) Anyhow, there is still some more work queued up on it. I haven't made myself an exhaustive list yet, so I can't list that here. But I'll probably get that list done before leaving. Cheers, -g p.s. the demo importers are now in a module named importers.py accessible thru ViewCVS (from the link above, just navigate up one level) -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Feb 19 13:50:13 2000 From: gstein@lyra.org (Greg Stein) Date: Sat, 19 Feb 2000 05:50:13 -0800 (PST) Subject: [Import-sig] Re: (partially) updated imputil In-Reply-To: Message-ID: On Fri, 18 Feb 2000, Greg Stein wrote: > I've made some more updates to imputil. Change log and the updated module > are available at: > http://www.lyra.org/cgi-bin/viewcvs.cgi/gjspy/imputil.py > > (revisions 1.10 and 1.11) > > > I'm going to be out of town Saturday thru Tuesday. I'll probably do some > more work on imputil before I leave. Not sure if I'll work on it while I'm > gone (or on that LONG plane flight to Boston...) > > Anyhow, there is still some more work queued up on it. I haven't made > myself an exhaustive list yet, so I can't list that here. But I'll > probably get that list done before leaving. I just checked in rev 1.12 which includes the TODO/wish list. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Feb 19 13:26:00 2000 From: gstein@lyra.org (Greg Stein) Date: Sat, 19 Feb 2000 05:26:00 -0800 (PST) Subject: [Import-sig] import "domain" question (was: Long-awaited imputil comments) In-Reply-To: <200002161727.MAA28273@eric.cnri.reston.va.us> Message-ID: On Wed, 16 Feb 2000, Guido van Rossum wrote: >... 
> > I've been thinking of something along the lines of
> > _determine_import_context() returning a list of things to try. Default is
> > to return something like [current-context,] + sys.path (not exactly that,
> > but you get the idea). The _import_hook would then operate as a simple
> > scan over that list of places to attempt an import from. MAL could
> > override _determine_import_context() to return the walk-me-up, intervening
> > packages. Tim could just always return sys.path (and never bother trying
> > to determine the current context).
>
> Yes.  In ni (remember ni?) we had this mechanism; it was called
> "domain" (not a great name for it).  The domain was a list of packages
> where relative imports were sought.  A package could set its domain by
> setting a variable __domain__.  The current policy (current package,
> then toplevel) corresponds to a 2-item domain: [<current package>, ""]
> (where "" stands for the unnamed toplevel package).
> Walk-me-up-Scotty corresponds to a domain containing the current
> package, its parent, its grandparent, and so on, ending with "".  The
> "no relative imports" policy is represented by [""].
>
> If we let __domain__ be initialized by the importer but overridden by
> the package, we can do everything we need.

How is this different from __path__ ??

Is it simply that __path__ refers to the filesystem, while __domain__
refers to the package namespace? If so, then that seems like duplicate
functionality. I can easily see constructing a __path__ using the
__file__ attribute and "walking up the directory tree". Certainly a bit
nicer/faster than checking two "paths" inside the import system.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From guido@python.org Mon Feb 21 18:32:39 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 21 Feb 2000 13:32:39 -0500
Subject: [Import-sig] Re: import "domain" question (was: Long-awaited imputil comments)
In-Reply-To: Your message of "Sat, 19 Feb 2000 05:26:00 PST."
References:
Message-ID: <200002211832.NAA03051@eric.cnri.reston.va.us>

[me]
> > Yes.  In ni (remember ni?) we had this mechanism; it was called
> > "domain" (not a great name for it).  The domain was a list of packages
> > where relative imports were sought.  A package could set its domain by
> > setting a variable __domain__.  The current policy (current package,
> > then toplevel) corresponds to a 2-item domain: [<current package>, ""]
> > (where "" stands for the unnamed toplevel package).
> > Walk-me-up-Scotty corresponds to a domain containing the current
> > package, its parent, its grandparent, and so on, ending with "".  The
> > "no relative imports" policy is represented by [""].
> >
> > If we let __domain__ be initialized by the importer but overridden by
> > the package, we can do everything we need.

[Greg]
> How is this different from __path__ ??
>
> Is it simply that __path__ refers to the filesystem, while __domain__
> refers to the package namespace? If so, then that seems like duplicate
> functionality. I can easily see constructing a __path__ using the __file__
> attribute and "walking up the directory tree". Certainly a bit
> nicer/faster than checking two "paths" inside the import system.

No, no, no!

If you are looking for foo.py in sys.path, the full module name will
be "foo", no matter where you find it.  If you are looking for it in
various packages, the module name will be foo *prefixed with the name
of the package where you found it*!  This doesn't affect the importer
much (since they asked for it by foo anyway), but it greatly affects
the sys.modules administration, and it affects what happens when you
have hard name conflicts.

--Guido van Rossum (home page: http://www.python.org/~guido/)
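The difference between a domain and a path can be illustrated with a small sketch. It is a toy model in present-day Python, not code from ni or imputil; all function names here are invented for the example, and the set of "available" modules stands in for whatever the importers can actually find.

```python
def default_domain(package):
    """Current policy: the current package, then the unnamed toplevel."""
    return [package, ""] if package else [""]


def walk_up_domain(package):
    """Walk-me-up-Scotty: 'a.b.c' -> ['a.b.c', 'a.b', 'a', '']."""
    parts = package.split(".") if package else []
    return [".".join(parts[:i]) for i in range(len(parts), -1, -1)]


def candidate_names(domain, modname):
    """Fully qualified names to try, in order, for a relative import."""
    return [pkg + "." + modname if pkg else modname for pkg in domain]


def resolve(domain, modname, available):
    """Return the first candidate that exists, or None if none does."""
    for fqname in candidate_names(domain, modname):
        if fqname in available:
            return fqname
    return None


# Hypothetical set of importable modules.
available = {"foo", "a.foo"}

found_walking_up = resolve(walk_up_domain("a.b.c"), "foo", available)
found_default = resolve(default_domain("a.b.c"), "foo", available)
found_toplevel_only = resolve([""], "foo", available)
```

Here the same request for "foo" resolves to "a.foo" under the walk-up policy but to plain "foo" under the other two, so the key used in sys.modules depends on the package in which the module was found. A filesystem-style __path__ search would not capture that, since it yields the same module name wherever the file turns up.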