From sanner@scripps.edu Mon May 3 21:51:55 1999
From: sanner@scripps.edu (Michel Sanner)
Date: Mon, 3 May 1999 13:51:55 -0700
Subject: [Distutils] Python packages
Message-ID: <990503135155.ZM168661@noah.scripps.edu>

Hello,

I am once again in the Python update phase and (of course) running into the same type of problems regarding the installation of packages. I believe that Python should provide support for installing packages that contain both platform independent files (.py) and platform dependent files (.so, .dso, .pyd). The reason is that I do not want to install (and maintain) multiple copies of the .py files (one for each platform I support). It seems to me that this is one of the benefits of platform independence.

The problem right now is that the only way to do this (that I know of) is to hack together an __init__.py file for the package, placed in the platform independent part of the installation tree, that adds the right directory to the path for importing .so files. The problem with that is that I need to do it for every single package.

Would it be unreasonable to have the Python import mechanism check for packages in the $prefix AND the $exec_prefix directory?

-Michel

From sanner@scripps.edu Mon May 3 23:04:26 1999
From: sanner@scripps.edu (Michel Sanner)
Date: Mon, 3 May 1999 15:04:26 -0700
Subject: [Distutils] More about packages
In-Reply-To: Greg Ward "[Distutils] Some code to play with" (Mar 22, 10:10am)
References: <19990322101021.A489@cnri.reston.va.us>
Message-ID: <990503150426.ZM168872@noah.scripps.edu>

It seems to me that the platform dependent and independent trees of a Python installation are not symmetric in some sense. On one side we have:

    $prefix/include/python$version/
    $prefix/lib/python$version/
    $prefix/man

and on the other side:

    $exec_prefix/bin
    $exec_prefix/include/python$version
    $exec_prefix/lib/python$version/config
    $exec_prefix/lib/python$version/lib-dynload
    $exec_prefix/lib/python$version/site-packages

What bothers me is that we do not have that extra level under lib in the platform independent tree. I'd like to have something like:

    $prefix/lib/python$version/standard/    (equivalent of lib-dynload)
    $prefix/lib/python$version/packages/    (for platform independent packages)

and these directories should be part of the Python path built by default. I am not sure where pure Python packages are supposed to be installed right now?

-Michel

From gward@cnri.reston.va.us Thu May 20 22:09:45 1999
From: gward@cnri.reston.va.us (Greg Ward)
Date: Thu, 20 May 1999 17:09:45 -0400
Subject: [Distutils] extensions in packages
Message-ID: <19990520170945.A6434@cnri.reston.va.us>

[I'm going to try to yank this thread over from PSA-members, which should have been done *long* ago!]

[on psa-members, Michel Sanner opined:]
> I do not think that trying to bend over backwards to circumvent this
> "limitation" is the right way to proceed... unless changing the import
> mechanism in Python itself is something that would be extremely
> difficult to do.

I agree that a change to the import mechanism is in order. My understanding (from another one of those office-hallway conversations with Fred) is that it would be very sensible to add a package's platform-dependent directory to the package's __path__ attribute. No frobbing of sys.path is necessary, and thus no danger of stupid name conflicts in extension modules that are supposed to be buried deep in some package structure.
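A minimal sketch of what this could look like in a package's __init__.py (everything here is hypothetical -- the package name, the "plat-" subdirectory convention, and the layout are assumptions for illustration, not an agreed mechanism):

    # mypkg/__init__.py -- sketch only
    import os, sys

    # Assume the compiled extensions for this platform were installed
    # into a per-platform subdirectory of the package.
    _platdir = os.path.join(__path__[0], 'plat-' + sys.platform)
    if os.path.isdir(_platdir):
        # Extending __path__ lets "import mypkg.foo" find foo.so in the
        # platform directory without touching sys.path at all.
        __path__.append(_platdir)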
Perhaps imputil.py can help us in the playing-around stage; I'm not familiar with it, though, so I'll refrain from further comment.

Greg
--
Greg Ward - software developer                 gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive                       voice: +1-703-620-8990
Reston, Virginia, USA  20191-5434              fax:   +1-703-620-0913

From just@letterror.com Fri May 21 10:00:22 1999
From: just@letterror.com (Just van Rossum)
Date: Fri, 21 May 1999 11:00:22 +0200
Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
In-Reply-To:
References: <990520114522.ZM223757@noah.scripps.edu>
Message-ID:

(I just subscribed here, so maybe I've missed earlier replies to David's post in the PSA list)

At 11:57 AM -0700 5/20/99, David Ascher wrote:
> I'm (slowly) getting to the point where I agree. Two thoughts:
>
> 1) imputil.py (greg stein's thing) might be a good place to start working
>    out a better system. See distutils-sig for URLs.
>
> 2) the problem of statically compiled package-enclosed modules is
>    separate and needs to be addressed in the core. In other words, it
>    won't make it before 1.6.

Point 2 shouldn't be too hard. It is already possible (since a 1.5.2 alpha, I think) to statically link submodules in a frozen build. I guess it's relatively easy to patch find_module() to do something like this: foo.bar is registered as a "builtin" in the config.c file as

    {"foo.bar", initbar},

(Hmm, this is problematic if there is a distinct global builtin module "bar".) find_module() should then first check sys.builtin_module_names with the full name before doing anything else (probably only when it is confirmed that "foo" is a package). No time to play with that right now, but it sure seems trivial.

Just

From hinsen@dirac.cnrs-orleans.fr Fri May 21 10:22:55 1999
From: hinsen@dirac.cnrs-orleans.fr (hinsen@dirac.cnrs-orleans.fr)
Date: Fri, 21 May 1999 11:22:55 +0200
Subject: [Distutils] extensions in packages
Message-ID: <199905210922.LAA02670@chinon.cnrs-orleans.fr>

Just van Rossum wrote:
> foo.bar is registered as a "builtin" in the config.c file as
>
>     {"foo.bar", initbar},
>
> (Hmm, this is problematic if there is a distinct global builtin
> module "bar".)

Or if any other package has a module "bar"!

> find_module() should then first check sys.builtin_module_names with the
> full name before doing anything else (probably only when it is confirmed
> that "foo" is a package).

All that would be doable, but the real problem is the name of the init function! Only one module can define a global symbol "initbar". So the one for foo.bar would have to be called "initfoo.bar" (or something similar). On the other hand, when the same module is used dynamically, the init function must be called "initbar" again (unless the current import mechanism is changed).

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------
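A rough Python rendering of the lookup order Just proposes (find_module() itself lives in C, so this only illustrates the idea; the helper name is made up, and dotted names in sys.builtin_module_names are an assumption of the proposal rather than current behaviour):

    import sys

    def is_static_submodule(fullname):
        # fullname is e.g. 'foo.bar', registered statically under its
        # full dotted name as sketched above.
        return fullname in sys.builtin_module_names

    # find_module() would consult this first -- ideally only after
    # confirming that 'foo' really is a package -- and fall back to
    # the normal filesystem search otherwise.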
From just@letterror.com Fri May 21 11:20:47 1999
From: just@letterror.com (Just van Rossum)
Date: Fri, 21 May 1999 12:20:47 +0200
Subject: [Distutils] extensions in packages
In-Reply-To: <199905210922.LAA02670@chinon.cnrs-orleans.fr>
Message-ID:

At 11:22 AM +0200 5/21/99, hinsen@dirac.cnrs-orleans.fr wrote:
>> find_module() should then first check sys.builtin_module_names with the
>> full name before doing anything else (probably only when it is confirmed
>> that "foo" is a package).
>
> All that would be doable, but the real problem is the name of the init
> function!

Right, I was being naive: I thought that was just "a" problem...

> Only one module can define a global symbol "initbar". So the
> one for foo.bar would have to be called "initfoo.bar" (or something
> similar). On the other hand, when the same module is used dynamically,
> the init function must be called "initbar" again (unless the current
> import mechanism is changed).

So there are really two options:

1) Define a switch that C extensions can check to determine whether the init func should be called initbar or initfoo_bar (or something). This means it's up to the extension developer to cater for statically linked builtin submodules by doing something like this in the extension source:

    #ifdef PY_STATIC_SUBMODULES
    #define initbar initfoo_bar
    #endif

2) Change the DL import mechanism so the init function *has* to be called initfoo_bar. But then, to remain backwards compatible, you'd still have to use a switch, so it doesn't help much now.

Just

From mal@lemburg.com Fri May 21 14:50:19 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 21 May 1999 15:50:19 +0200
Subject: [Distutils] Packages with C extensions
Message-ID: <3745649B.4B56F9C2@lemburg.com>

[Problem with dynamic extensions in packages being platform dependent]

I haven't followed the thread too closely, but was alarmed by the recent proposals of splitting .so files out of the "normal" package distribution under a separate dir tree. This is really not such a good idea because it would cause the package information stored in the extension module to be lost (you can't have two top-level packages with the same name on the path: only the first one on the path will be used).

Here is the scheme I would use: create a subpackage for the extension and have it take care of importing the correct shared lib for the platform Python is currently running on. The libs themselves could be placed in plat-<platform> subdirs of that subpackage, and the __init__.py would then load the shared lib using either a sys.path+__import__() hack or, thread-safe, via imp.load_dynamic().

An even simpler solution is installing the whole package under .../python1.5/plat-<platform> separately for each supported platform rather than putting it under site-packages. [Disk space is no argument nowadays, and it's likely that different platforms need different Setup files anyway.]

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 224 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
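The subpackage __init__.py Marc-Andre describes might be sketched like this (the package/module names and the '.so' suffix are illustrative assumptions; imp.load_dynamic() is the real API he names):

    # mypkg/_C/__init__.py -- sketch of the platform-selecting subpackage
    import imp, os, sys

    _here = os.path.dirname(__file__)
    _libdir = os.path.join(_here, 'plat-' + sys.platform)

    # Load the shared lib for the current platform; load_dynamic()
    # avoids the sys.path hack and is thread-safe, as noted above.
    bar = imp.load_dynamic('bar', os.path.join(_libdir, 'bar.so'))

A caller would then simply write "from mypkg._C import bar" and never see the platform directories.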
From hinsen@cnrs-orleans.fr Fri May 21 15:08:29 1999
From: hinsen@cnrs-orleans.fr (Konrad Hinsen)
Date: Fri, 21 May 1999 16:08:29 +0200
Subject: [Distutils] extensions in packages
In-Reply-To: (message from Just van Rossum on Fri, 21 May 1999 12:20:47 +0200)
References:
Message-ID: <199905211408.QAA03506@chinon.cnrs-orleans.fr>

> 1) Define a switch that C extensions can check to determine whether the
>    init func should be called initbar or initfoo_bar (or something).

I'd rather have a set of macros that automatically do the right thing, but that's a minor detail. Changing the name of the init function is certainly doable. But if the init function contains the complete package path (and I see no other way to avoid name clashes), then we have to worry about the limitations that various systems impose on the names of global symbols. I doubt that there are still many systems around that use only eight characters, but I think 32 is a common limit. Although I am not really sure about the current state of the art!

> 2) Change the DL import mechanism so the init function *has* to be called
>    initfoo_bar. But then, to remain backwards compatible, you'd still have
>    to use a switch, so it doesn't help much now.

Backwards compatible with what? Currently builtin modules can't be in packages at all, so nothing's lost.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From just@letterror.com Fri May 21 15:57:27 1999
From: just@letterror.com (Just van Rossum)
Date: Fri, 21 May 1999 16:57:27 +0200
Subject: [Distutils] extensions in packages
In-Reply-To: <199905211408.QAA03506@chinon.cnrs-orleans.fr>
References: (message from Just van Rossum on Fri, 21 May 1999 12:20:47 +0200)
Message-ID:

At 4:08 PM +0200 5/21/99, Konrad Hinsen wrote:
>> 2) Change the DL import mechanism so the init function *has* to be called
>>    initfoo_bar. But then, to remain backwards compatible, you'd still have
>>    to use a switch, so it doesn't help much now.
>
> Backwards compatible with what? Currently builtin modules can't be
> in packages at all, so nothing's lost.

But DLLs *can* be (that's the whole point, no?). If the rules for the init func change, I think at least Marc-Andre L. won't be too happy: all (?) of his extensions use DLLs as submodules, so he would need to add switches to remain compatible with 1.5.2. I'm sure he's not the only one.

Just

From hinsen@cnrs-orleans.fr Fri May 21 16:07:45 1999
From: hinsen@cnrs-orleans.fr (Konrad Hinsen)
Date: Fri, 21 May 1999 17:07:45 +0200
Subject: [Distutils] extensions in packages
In-Reply-To: (message from Just van Rossum on Fri, 21 May 1999 16:57:27 +0200)
References: (message from Just van Rossum on Fri, 21 May 1999 12:20:47 +0200)
Message-ID: <199905211507.RAA04104@chinon.cnrs-orleans.fr>

> > Backwards compatible with what? Currently builtin modules can't be
> > in packages at all, so nothing's lost.
>
> But DLLs *can* be (that's the whole point, no?). If the rules for the init
> func change, I think at least Marc-Andre L.
> won't be too happy: all (?) of his extensions use DLLs as submodules,
> so he would need to add switches to remain compatible with 1.5.2.
> I'm sure he's not the only one.

I admit I hadn't thought about the possibility that someone might have used dynamic libraries in packages already; my development cycles always include statically linked modules at some stage, so all extension modules remain top-level.

Which makes me wonder how others develop extension modules: I always use a debugger at some point, and I haven't yet found one which lets me set breakpoints in dynamic libraries that haven't been loaded yet!

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From "Fred L. Drake, Jr."
References: <990520114522.ZM223757@noah.scripps.edu> <199905210904.LAA02630@chinon.cnrs-orleans.fr>
Message-ID: <14149.30596.737136.29729@weyr.cnri.reston.va.us>

Konrad Hinsen writes:
> > I would much prefer to have Python try to import from the
> > platform independent part of a package (installed under $prefix)
> > and, if it cannot find what we are looking for, look up
> > "automatically" the platform-dependent part of that package.
>
> Shouldn't that be the other way round? I'd expect to be able to override
> general modules by platform-specific modules.

Greg Ward and I were talking about this stuff the other day, and I think we decided that there was no good way to have multiple implementations of a module installed such that the platform dependent version was sure to take precedence over a platform independent version; this relies on the sequence of directories in the relevant search path (whether it be sys.path or a package's __path__).

The general solution seems to be that two things need to be done: a package's __path__ needs to include *all* the appropriate directories found along the search path, not just the one holding __init__.py*, AND the platform dependent modules should have different names from the platform independent modules. The platform independent module should be the public interface, and it can load platform dependent code the same way that string loads functions from strop.

The problem here is that the package's __path__ is not being created this way now; if anyone has time to work on a patch for Python/import.c, I'd be glad to help test it! ;-)

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From sanner@scripps.edu Fri May 21 17:39:50 1999
From: sanner@scripps.edu (Michel Sanner)
Date: Fri, 21 May 1999 09:39:50 -0700
Subject: [Distutils] extensions in packages
In-Reply-To: Konrad Hinsen "Re: [Distutils] extensions in packages" (May 21, 5:07pm)
References: (message from Just van Rossum on Fri, 21 May 1999 12:20:47 +0200) <199905211507.RAA04104@chinon.cnrs-orleans.fr>
Message-ID: <990521093950.ZM192336@noah.scripps.edu>

On May 21, 5:07pm, Konrad Hinsen wrote:
>
> Which makes me wonder how others develop extension modules: I always
> use a debugger at some point, and I haven't yet found one which lets
> me set breakpoints in dynamic libraries that haven't been loaded yet!

After you import the .so you can set a break point. I do that all the time on my SGI under dbx or cvd... no problem.
I also have about 10 extension modules, all of which use .so files!

-Michel

From sanner@scripps.edu Fri May 21 17:30:58 1999
From: sanner@scripps.edu (Michel Sanner)
Date: Fri, 21 May 1999 09:30:58 -0700
Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
In-Reply-To: "M.-A. Lemburg" "Re: [PSA MEMBERS] packages in Python" (May 21, 10:24am)
References: <37451849.CEF027B@lemburg.com>
Message-ID: <990521093058.ZM235238@noah.scripps.edu>

On May 21, 10:24am, M.-A. Lemburg wrote:
> Subject: Re: [PSA MEMBERS] packages in Python
> [Problem with dynamic extensions in packages being platform dependent]
>
> I haven't followed the thread too closely, but was alarmed by
> the recent proposals of splitting .so files out of the "normal"
> package distribution under a separate dir tree. This is really
> not such a good idea because it would cause the package information
> stored in the extension module to be lost (you can't have two
> top-level packages with the same name on the path: only the first one
> on the path will be used).
>
> Here is the scheme I would use: create a subpackage for the
> extension and have it take care of importing the correct
> shared lib for the platform Python is currently running on.
> The libs themselves could be placed in plat-<platform> subdirs
> of that subpackage, and the __init__.py would then load the
> shared lib using either a sys.path+__import__() hack or,
> thread-safe, via imp.load_dynamic().
>
> An even simpler solution is installing the whole package under
> .../python1.5/plat-<platform> separately for each supported
> platform rather than putting it under site-packages. [Disk space
> is no argument nowadays, and it's likely that different platforms
> need different Setup files anyway.]

As someone who maintains Python for several Unix-based architectures, I am not concerned with disk space but with file duplication, and the obvious risk of running out of sync. Also, the plat-<platform> scheme is far from being able to capture the complexity of this world. SGI alone has 3 ABIs (o32, n32, n64) multiplied by MIPS1, MIPS3, MIPS4 instruction sets times IRIX5.x, IRIX6.2, IRIX6.3, IRIX6.4, IRIX6.5 -- and many of these combinations are incompatible!

Finally, why have a $prefix and a $exec_prefix if they are not used to split platform dependent stuff from platform independent?

And we should really take this to distutils-sig :)

-Michel
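Michel's point suggests any platform tag would need to be much richer than a bare OS name. A hypothetical helper along those lines (a sketch only: the tag format and the use of the os.uname() fields are made-up assumptions, and os.uname() is Unix-only):

    import os, string

    def plat_tag():
        # Combine system name, major release and machine type into
        # something like 'irix6-mips' -- still far too coarse to capture
        # the ABI/instruction-set combinations described above.
        sysname, node, release, version, machine = os.uname()
        return '%s%s-%s' % (string.lower(sysname), release[0], machine)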
From sanner@scripps.edu Fri May 21 17:42:28 1999
From: sanner@scripps.edu (Michel Sanner)
Date: Fri, 21 May 1999 09:42:28 -0700
Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
In-Reply-To: "Fred L. Drake" "[Distutils] Re: [PSA MEMBERS] packages in Python" (May 21, 11:11am)
References: <990520114522.ZM223757@noah.scripps.edu> <199905210904.LAA02630@chinon.cnrs-orleans.fr> <14149.30596.737136.29729@weyr.cnri.reston.va.us>
Message-ID: <990521094228.ZM237910@noah.scripps.edu>

On May 21, 11:11am, Fred L. Drake wrote:
> Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
>
> Konrad Hinsen writes:
> > Shouldn't that be the other way round? I'd expect to be able to override
> > general modules by platform-specific modules.
>
> Greg Ward and I were talking about this stuff the other day, and I
> think we decided that there was no good way to have multiple
> implementations of a module installed such that the platform dependent
> version was sure to take precedence over a platform independent
> version; this relies on the sequence of directories in the relevant
> search path (whether it be sys.path or a package's __path__).
>
> The general solution seems to be that two things need to be done: a
> package's __path__ needs to include *all* the appropriate directories
> found along the search path, not just the one holding __init__.py*,
> AND the platform dependent modules should have different names from
> the platform independent modules. The platform independent module
> should be the public interface, and it can load platform dependent
> code the same way that string loads functions from strop.
>
> The problem here is that the package's __path__ is not being created
> this way now; if anyone has time to work on a patch for
> Python/import.c, I'd be glad to help test it! ;-)

I take care of this in my extension modules. If I have a platform dependent implementation of a module, I do:

    try:
        import extension      # platform dependent implementation
    except ImportError:
        import common         # platform independent fallback

This requires a minimal amount of coding. Personally I did not see the need for this to be automatic.

-Michel

From "Fred L. Drake, Jr."
References: <990520114522.ZM223757@noah.scripps.edu> <199905210904.LAA02630@chinon.cnrs-orleans.fr> <14149.30596.737136.29729@weyr.cnri.reston.va.us> <990521094228.ZM237910@noah.scripps.edu>
Message-ID: <14149.36341.651126.805299@weyr.cnri.reston.va.us>

Michel Sanner writes:
> This requires a minimal amount of coding. Personally I did not see the
> need for this to be automatic.

The only part that I think should be automatic is loading the additional directories into the package's __path__ (like $exec_prefix/lib/python$VERSION/site-packages/...). The rest seems to be sufficiently specific to the package to require that it be handled explicitly.

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From pas@scansoft.com Fri May 21 06:56:11 1999
From: pas@scansoft.com (Perry A. Stoll)
Date: Fri, 21 May 1999 01:56:11 -0400
Subject: [Distutils] Packages with C extensions
Message-ID: <01bea34e$a0be1160$3e3e79cf@nara.scansoft.com>

> M.-A. Lemburg writes:
> [Problem with dynamic extensions in packages being platform dependent]
>
> I haven't followed the thread too closely, but was alarmed by
> the recent proposals of splitting .so files out of the "normal"
> package distribution under a separate dir tree.

Fair enough. There's the GNU configure view of life, where $prefix and $exec_prefix are separate directories, and there is the perl view of life (where $PERL_ARCHLIB is usually a subdirectory of the install directory). M.-A. prefers the perl-ish approach. Fine with me, as long as we do it explicitly.

> (you can't have two top-level packages with the same name
> on the path: only the first one on the path will be used).

That's only because that's how it's done today. Just a matter of some code... (and the thought and design behind it).

> [snipped scheme for having packages do the platform specific import]

I'd rather not burden the package writer. I think it's better to include the batteries for this one.

> [recommendation that you just have a different install dir for each
> platform]
> Disk space is no argument nowadays

Ease of maintenance is the overriding argument here. The .py files are the same for all platforms, so why do I want different copies of those files when I have Python installed for three platforms?

> it's likely that different platforms need different Setup files anyway.

But that's a platform dependent file which goes in the $INSTALL_ARCHLIB.

In short, I think we need to get this infrastructure into Python itself to ease the creation of packages. But then I'm probably preaching to the choir.
-Perry

From pas@scansoft.com Fri May 21 07:55:56 1999
From: pas@scansoft.com (Perry A. Stoll)
Date: Fri, 21 May 1999 02:55:56 -0400
Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
Message-ID: <01bea356$fa01f5e0$3e3e79cf@nara.scansoft.com>

>> The problem here is that the package's __path__ is not being created
>> this way now; if anyone has time to work on a patch for
>> Python/import.c, I'd be glad to help test it! ;-)
>
> I take care of this in my extension modules. If I have a platform
> dependent implementation of a module, I do:
>
>     try:
>         import extension
>     except ImportError:
>         import common

I don't think this is the case that's causing the problem. The problem is when a submodule of a package is *always* platform dependent (because, for example, it interfaces to another library).

-Perry

From "Fred L. Drake, Jr."
References: <01bea356$fa01f5e0$3e3e79cf@nara.scansoft.com>
Message-ID: <14149.44654.562124.940757@weyr.cnri.reston.va.us>

Perry A. Stoll writes:
> I don't think this is the case that's causing the problem. The problem is
> when a submodule of a package is *always* platform dependent (because,
> for example, it interfaces to another library).

Perry,

I think in this case the only problem is getting all the right directories on the package's __path__; am I missing something? It avoids the need for the "conditional" import, but that's largely separate. If there are also platform independent modules, this is an issue; if *all* the "real" modules are platform dependent, I think the duplication of the (essentially empty) __init__.py* is something we can allow, and just install the package entirely under $exec_prefix.

Are there problems that I'm missing that can't be solved by locating all the parallel package directories and placing them on the package's __path__? I think the multiple SGI binary formats can be handled by using a different $exec_prefix for each (it sounds like anything less won't get the job done anyway for that case).

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From pas@scansoft.com Fri May 21 08:47:03 1999
From: pas@scansoft.com (Perry A. Stoll)
Date: Fri, 21 May 1999 03:47:03 -0400
Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
Message-ID: <01bea35e$1dffa8a0$3e3e79cf@nara.scansoft.com>

> I think in this case the only problem is getting all the right
> directories on the package's __path__; am I missing something? It
> avoids the need for the "conditional" import, but that's largely
> separate.

Fred,

Good point. Can you recommend a concise place where the import mechanism (in all its glory) is documented? That should solve the problem, except when freezing or making a static Python binary (as previously mentioned by Konrad).

I was poking around in ihooks.py. It looks like it should be possible to cook up something approximating this using ihooks. What do you think?

-Perry

From "Fred L. Drake, Jr."
References: <01bea35e$1dffa8a0$3e3e79cf@nara.scansoft.com>
Message-ID: <14149.47682.416406.221941@weyr.cnri.reston.va.us>

Perry A. Stoll writes:
> Good point. Can you recommend a concise place where the import mechanism
> (in all its glory) is documented?

Documentation? Ha! I don't have no stinkin' documentation! ;-) I think going over Python/import.c is the best bet. There's an import_package() function (I think that's the name); probably the best bet is to modify that to build the right __path__ value; at this point we know it's a package, so we're not interfering with the performance of importing non-packages, only the packages/subpackages themselves.

> That should solve the problem, except when freezing or making a static
> Python binary (as previously mentioned by Konrad).

I don't know enough about freezing, but I suspect that's not too difficult; probably about the same as statically linked package-ized modules. ;-) I don't think those will actually be that difficult for someone who has time to read the code; the only real problem is the public symbol for the module init function.

> I was poking around in ihooks.py. It looks like it should be possible to
> cook up something approximating this using ihooks. What do you think?

That can probably be done, but it places the import machinery in Python rather than in C, so it'll be slow.

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives
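A rough Python rendering of the __path__ construction Fred describes (a sketch only; the real change would live in C inside import.c, and the helper name is made up):

    import os, sys

    def build_package_path(pkg_name):
        # Collect every directory named pkg_name along sys.path, so that
        # a package split across $prefix and $exec_prefix is seen as one.
        path = []
        for entry in sys.path:
            subdir = os.path.join(entry, pkg_name)
            if os.path.isdir(subdir) and subdir not in path:
                path.append(subdir)
        return path

    # import_package() would assign this list (with the directory holding
    # __init__.py* first) to the new package's __path__.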
From MHammond@skippinet.com.au Fri May 21 23:51:02 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Sat, 22 May 1999 08:51:02 +1000
Subject: [Distutils] extensions in packages
In-Reply-To:
Message-ID: <000c01bea3dc$67574de0$0801a8c0@bobcat>

FWIW, I _do_ use DLLs in packages, and it causes me no end of grief. I need special runtime hacks that work with __path__, I need a special __init__ in the package where the DLL is to "appear", and I also need even further special casing for freeze! So although I can see the problems with the mechanisms, IMO it is very important that packages be capable of treating DLLs as first-class citizens.

Personally, I would not have a "compatibility" problem as such, but I would need to remove or update my hacks - but I find that reasonable.

Mark.

> > Backwards compatible with what? Currently builtin modules can't be
> > in packages at all, so nothing's lost.
>
> But DLLs *can* be (that's the whole point, no?). If the rules for the
> init func change, I think at least Marc-Andre L. won't be too happy:
> all (?) of his extensions use DLLs as submodules, so he would need to
> add switches to remain compatible with 1.5.2. I'm sure he's not the
> only one.

From mal@lemburg.com Tue May 25 10:14:06 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 25 May 1999 11:14:06 +0200
Subject: [Distutils] Packages with C extensions
References: <01bea34e$a0be1160$3e3e79cf@nara.scansoft.com>
Message-ID: <374A69DE.6A2BE81C@lemburg.com>

Perry A. Stoll wrote:
>
> > M.-A. Lemburg writes:
> > [Problem with dynamic extensions in packages being platform dependent]
> >
> > I haven't followed the thread too closely, but was alarmed by
> > the recent proposals of splitting .so files out of the "normal"
> > package distribution under a separate dir tree.
>
> Fair enough. There's the GNU configure view of life, where $prefix and
> $exec_prefix are separate directories, and there is the perl view of life
> (where $PERL_ARCHLIB is usually a subdirectory of the install directory).
> M.-A. prefers the perl-ish approach. Fine with me, as long as we do it
> explicitly.
>
> > (you can't have two top-level packages with the same name
> > on the path: only the first one on the path will be used).
>
> That's only because that's how it's done today. Just a matter of some
> code... (and the thought and design behind it).

Good point :-) ...
Having Python continue scanning the path after some import fails would definitely ease structuring of packages, mostly because it allows extending existing installations with user or platform specific modules. The latter is a basic building block for a possible future standard Python lib with a package layout (see a discussion about this on c.l.p last year, I think).

> > [snipped scheme for having packages do the platform specific import]
>
> I'd rather not burden the package writer. I think it's better to include
> the batteries for this one.

Fair enough, but I guess a simple platform-aware import helper would do the trick nicely, e.g.

    MyModule = platimport('MyModule')

> > [recommendation that you just have a different install dir for each
> > platform]
> > Disk space is no argument nowadays
>
> Ease of maintenance is the overriding argument here. The .py files are
> the same for all platforms, so why do I want different copies of those
> files when I have Python installed for three platforms?

The latter gets defeated by the disk space non-argument. Also, some future version of Python may very well use platform dependent optimized versions of .py files, e.g. JIT compiled ones. Besides, I don't think that a simple 'cp -a plat-1 plat-2' causes too much maintenance effort ;-) If you worry about disk space, you could even set up a linked copy for the new platform using e.g. Tools/scripts/linktree.py.

> > it's likely that different platforms need different Setup files anyway.
>
> But that's a platform dependent file which goes in the $INSTALL_ARCHLIB.

True. Not sure why you would want to install that file though (it's only needed for compilation).

> In short, I think we need to get this infrastructure into Python itself
> to ease the creation of packages.

Now, I think, you're taking the idea a bit too far ;-) ...

> But then I'm probably preaching to the choir.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 220 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal@lemburg.com Tue May 25 10:41:13 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 25 May 1999 11:41:13 +0200
Subject: [Distutils] extensions in packages
References: <000c01bea3dc$67574de0$0801a8c0@bobcat>
Message-ID: <374A7039.3A63B129@lemburg.com>

Mark Hammond wrote:
>
> FWIW, I _do_ use DLLs in packages, and it causes me no end of grief. I
> need special runtime hacks that work with __path__, I need a special
> __init__ in the package where the DLL is to "appear", and I also need
> even further special casing for freeze!

I've been using DLLs/SOs in packages with much success for some time now. Don't know why you need any hacks to get this going though: it works right out of the box for me.

The situation is a little different for frozen apps without shared libs though: the extension modules will become top-level modules. I haven't frozen those kinds of apps yet, but it should still work out of the box (except maybe when you pass pickles from an app using top-level modules to one using in-package modules).

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 220 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal@lemburg.com Mon May 24 21:53:19 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 24 May 1999 22:53:19 +0200
Subject: [Distutils] extensions in packages
References: (message from Just van Rossum on Fri, 21 May 1999 12:20:47 +0200) <199905211507.RAA04104@chinon.cnrs-orleans.fr>
Message-ID: <3749BC3F.3614E4DE@lemburg.com>

Konrad Hinsen wrote:
>
> > > Backwards compatible with what? Currently builtin modules can't be
> > > in packages at all, so nothing's lost.
> >
> > But DLLs *can* be (that's the whole point, no?). If the rules for the
> > init func change, I think at least Marc-Andre L. won't be too happy:
> > all (?) of his extensions use DLLs as submodules, so he would need to
> > add switches to remain compatible with 1.5.2. I'm sure he's not the
> > only one.

Yep, all my extensions are wrapped into packages, and all of them use subpackages which wrap extension modules included as submodules of those packages... that gives you a very flexible setup, since the __init__.py files let you do all kinds of nifty things to load the correct C extension (see my previous post).

> I admit I hadn't thought about the possibility that someone might have
> used dynamic libraries in packages already; my development cycles
> always include statically linked modules at some stage, so all
> extension modules remain top-level.

The main reason for including the extensions in the packages themselves rather than making them top-level was to simplify installation, e.g. on Windows (with pre-compiled binaries), you only have to unzip the archive and that's it... no "make install" or equivalent is necessary.

> Which makes me wonder how others develop extension modules: I always
> use a debugger at some point, and I haven't yet found one which lets
> me set breakpoints in dynamic libraries that haven't been loaded yet!

That one is simple: you run it twice. The first time to load the DLL and the second time with the break point set in the DLL. Works with gdb on Linux, not sure about other platforms.

Cheers,
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 221 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From da@ski.org Wed May 26 05:02:22 1999
From: da@ski.org (David Ascher)
Date: Tue, 25 May 1999 21:02:22 -0700 (Pacific Daylight Time)
Subject: [Distutils] extensions in packages
In-Reply-To: <374A7039.3A63B129@lemburg.com>
Message-ID:

On Tue, 25 May 1999, M.-A. Lemburg wrote:
>
> I've been using DLLs/SOs in packages with much success for some time
> now. Don't know why you need any hacks to get this going though:
> it works right out of the box for me.

Do you have the DLLs/.so's in directories that are children of $exec_prefix? If yes, please let us know how you do it. That's the task that we're trying to solve.

--david

From mal@lemburg.com Wed May 26 08:41:57 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 26 May 1999 09:41:57 +0200
Subject: [Distutils] extensions in packages
References:
Message-ID: <374BA5C5.706EFB1E@lemburg.com>

David Ascher wrote:
>
> On Tue, 25 May 1999, M.-A. Lemburg wrote:
> >
> > I've been using DLLs/SOs in packages with much success for some time
> > now. Don't know why you need any hacks to get this going though:
> > it works right out of the box for me.
>
> Do you have the DLLs/.so's in directories that are children of
> $exec_prefix? If yes, please let us know how you do it. That's the task
> that we're trying to solve.

No, I simply leave them in the package subdirectories.
The "make install" step is not needed if you have the users compile the extensions in the package subdirs. There's no magic to it. This doesn't allow you to have one installation for multiple platforms, but it makes the installation process very simple and currently is the only way to go with the classical Makefile.pre.in approach, since this does not allow you to install extensions in directories other than site-packages without tweaking. I still think that to get multi-platform installation working we'd definitely need to extend the package import mechanism to have it continue the search for a particular module in case the first try fails. Note that this kind of search will be very costly due the amount of IO needed to search the path. Some sort of fastpath hook should be included along with this extension to fix this. (Such a hook could also be used to do other PYTHONPATH mods at runtime which go far beyond the current sys.path tricks, e.g. to implement new module lookup schemes.) For a try at such a hook, see: http://starship.skyport.net/~lemburg/fastpath.zip -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 219 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From Fred L. Drake, Jr." References: <374BA5C5.706EFB1E@lemburg.com> Message-ID: <14155.63283.936747.273032@weyr.cnri.reston.va.us> M.-A. Lemburg writes: > Note that this kind of search will be very costly due the amount > of IO needed to search the path. Some sort of fastpath hook Marc-Andre, Why does this need to be so costly? Compared to the current scheme, there's little to add. Once a package has been identified (and *only* then!), search the path for all the appropriate subdirectories (one stat() for each path entry). The current approach requires about a half dozen stats for each path entry: foo.py, foo.py[co], foomodule.so, foo.so, foo/ + foo/__init__.py + foo.__init__.py[co]. It will typically be even cheaper for sub-packages, because the original path will usually be much shorter than sys.path. Note that I'm not saying there shouldn't be some sort of directory caching; loading Grail is still dog slow, and I've no doubt that the 600+ stat() calls contribute to that! 1-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From Fred L. Drake, Jr." References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> Message-ID: <14156.28035.739990.919106@weyr.cnri.reston.va.us> M.-A. Lemburg writes: > Well, I was referring to the additional lookup needed to find > the next package dir of the same name. Say you put the Python > package into site-packages and the binaries into plat-. I didn't say it was free; just that the cost was insignificant compared to the current cost. My sys.path in an interactive interpreter contains 11 entries. If I want to add a package with both $prefix and $exec_prefix components, the worst case is that the directory holding the __init__.py* is the last path entry, and the other directory is in the immediately preceeding path entry. After the current mechanism locates the __init__.py* file, it needs to build the __path__ for the package. It takes 10 stat() calls to locate the additional directory. 
Considering that the initial search that caused the package module to be created took 11 stats to see if the entries contained the appropriate directory, + 2 stats to determine that the first directory of the package (the one that doesn't have __init__.py*) wasn't it, + 36 to determine that the first 9 directories didn't contain a matching .so|module.so|.py|.py[co], plus at least one to actually find the __init__.pyc (two if only the .py is available), that's 59 system calls -- either stat() or open(), the latter hidden inside fdopen(). (I think I followed the code right. ;) I don't think the added 10 to get the right __path__ are worth worrying about.

It's the .py[co] files that are expensive to load! Once you've created the package, sub-modules are very cheap: you will typically have no more than two path entries to check even once all this is in place.

I said:
> caching; loading Grail is still dog slow, and I've no doubt that the
> 600+ stat() calls contribute to that! 1-)

Oops, after following through with the math, I'd have to adjust this to 6000 stat()/open() calls for Grail. Sorry!

And back to Marc-Andre:
> I would very much like to see some sort of caching in the
> interpreter. The fastpath hook I implemented uses a marshalled
> dict stored in the user's home dir for the lookup. Once created,

I don't think I'd store the cache; if a user's home directory is mounted via NFS (common), then it may often be wrong if the user actively works with a variety of hosts with different versions or installations of Python. The benefits of a cache are greatest for applications that import a lot of modules (like Grail!); the cache can be built using a directory scan as each directory is searched. (I think one of the guys from CWI did this at one point and had really good results; Jack?)

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From mal@lemburg.com Wed May 26 17:27:09 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 26 May 1999 18:27:09 +0200
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us>
Message-ID: <374C20DD.53B458DC@lemburg.com>

Fred L. Drake wrote:
>
> M.-A. Lemburg writes:
> > Note that this kind of search will be very costly due to the amount
> > of I/O needed to search the path. Some sort of fastpath hook
>
> Marc-Andre,
> Why does this need to be so costly? Compared to the current scheme,
> there's little to add. Once a package has been identified (and *only*
> then!), search the path for all the appropriate subdirectories (one
> stat() for each path entry). The current approach requires about a
> half dozen stats for each path entry: foo.py, foo.py[co],
> foomodule.so, foo.so, foo/ + foo/__init__.py + foo/__init__.py[co].
> It will typically be even cheaper for sub-packages, because the
> original path will usually be much shorter than sys.path.

Well, I was referring to the additional lookup needed to find the next package dir of the same name. Say you put the Python package into site-packages and the binaries into plat-<platform>. Since the platform subdirs come first on the standard sys.path, all imports of the form "import MyPackage.MyModule" will first look in the binary package, fail, and then continue to look (and hopefully find) the MyModule submodule in the Python package installed under site-packages. Since these imports are more common than importing binaries, imports would get even slower on average.
Ok, you could change the sys.path so that the binaries come *after* the source packages... but it's currently not the default.

> Note that I'm not saying there shouldn't be some sort of directory
> caching; loading Grail is still dog slow, and I've no doubt that the
> 600+ stat() calls contribute to that! 1-)

I would very much like to see some sort of caching in the interpreter. The fastpath hook I implemented uses a marshalled dict stored in the user's home dir for the lookup. Once created, it reduces startup time noticeably (cutting down stat() calls from around 200 for a typical utility script to around 20). The nice thing about the hack is that you can experiment with the cache logic using Python functions before possibly coding it in C.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 219 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
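The fastpath idea can be sketched in a few lines of Python (a sketch under stated assumptions: the cache file location, the keying on sys.path, and the helper names are illustrative guesses, not the actual contents of fastpath.zip):

    import marshal, os, sys

    _CACHE_FILE = os.path.expanduser('~/.python-import-cache')  # hypothetical

    def _load_cache():
        # The cache maps a key derived from sys.path to a dict of
        # {module name: file path}; marshal keeps loading it cheap.
        try:
            f = open(_CACHE_FILE, 'rb')
            cache = marshal.load(f)
            f.close()
        except (IOError, EOFError, ValueError):
            cache = {}
        return cache

    def lookup(modname):
        table = _load_cache().get(tuple(sys.path), {})
        return table.get(modname)   # file path, or None on a cache miss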
From mal@lemburg.com Thu May 27 09:21:11 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 27 May 1999 10:21:11 +0200
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us>
Message-ID: <374D0077.2699505F@lemburg.com>

Fred L. Drake wrote:
>
> M.-A. Lemburg writes:
> > Well, I was referring to the additional lookup needed to find
> > the next package dir of the same name. Say you put the Python
> > package into site-packages and the binaries into plat-<platform>.
>
> I didn't say it was free; just that the cost was insignificant
> compared to the current cost.

Agreed.

> My sys.path in an interactive interpreter contains 11 entries. If I
> want to add a package with both $prefix and $exec_prefix components,
> the worst case is that the directory holding the __init__.py* is the
> last path entry, and the other directory is in the immediately
> preceding path entry. After the current mechanism locates the
> __init__.py* file, it needs to build the __path__ for the package. It
> takes 10 stat() calls to locate the additional directory. Considering
> that the initial search that caused the package module to be created
> took 11 stats + 2 stats + 36 stats, plus at least one to actually find
> the __init__.pyc (two if only the .py is available), that's 59 system
> calls (either stat() or open(), the latter hidden inside fdopen()). I
> don't think the added 10 to get the right __path__ are worth worrying
> about.

Wow, what an analysis.

> It's the .py[co] files that are expensive to load! Once you've created
> the package, sub-modules are very cheap: you will typically have no
> more than two path entries to check even once all this is in place.

I'm not sure I follow you here: do you mean with a package dir cache in place or using the system implemented in the current release?

> I said:
> > caching; loading Grail is still dog slow, and I've no doubt that the
> > 600+ stat() calls contribute to that! 1-)
>
> Oops, after following through with the math, I'd have to adjust this
> to 6000 stat()/open() calls for Grail. Sorry!

This seems like something to worry about and probably also enough to try really hard to find a good solution, IMHO.

> And back to Marc-Andre:
> > I would very much like to see some sort of caching in the
> > interpreter. The fastpath hook I implemented uses a marshalled
> > dict stored in the user's home dir for the lookup. Once created,
>
> I don't think I'd store the cache; if a user's home directory is
> mounted via NFS (common), then it may often be wrong if the user
> actively works with a variety of hosts with different versions or
> installations of Python.

True, that's why the hook allows you to code the strategy in Python. Note that my current version uses the sys.path as key into a table of name:file mappings, so even when using different setups (which will certainly have some differences in sys.path), the cache should work. Maybe one should add some more information to the key... like the platform specifics or even the mtimes of the directories on the path.

> The benefits of a cache are greatest for
> applications that import a lot of modules (like Grail!); the cache can
> be built using a directory scan as each directory is searched. (I
> think one of the guys from CWI did this at one point and had really
> good results; Jack?)

Yep, I remember that too. The problem with these scans is that directories may contain huge amounts of files and you would need to check all of them against the module extensions Python uses.

Anyway, the dynamic and static versions are both implementable using the hook, so I'd opt for going into that direction rather than hard-wiring some logic into the interpreter's core.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 218 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From gstein@lyra.org Thu May 27 11:00:07 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 27 May 1999 03:00:07 -0700
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com>
Message-ID: <374D17A7.436D8260@lyra.org>

M.-A. Lemburg wrote:
> ...
> Anyway, the dynamic and static versions are both implementable
> using the hook, so I'd opt for going into that direction
> rather than hard-wiring some logic into the interpreter's core.

IMO, the interpreter core should perform as little searching as possible. Basically, it should only contain bootstrap stuff. It should look for a standard importing module and load that. After it is loaded, the import mechanism should defer to Python for all future imports. (The cost of running Python code is minimal against the I/O used by the import.)

IMO #2, the standard importing module should operate along the lines of imputil.py.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
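A tiny import hook in the spirit of what Greg describes might look like this (a sketch only: imputil.py's real interface differs, and this hook handles only plain .py modules before deferring everything else back to the builtin mechanism):

    import __builtin__, imp, os, sys

    _original_import = __builtin__.__import__

    def _hooked_import(name, globals=None, locals=None, fromlist=None):
        if sys.modules.has_key(name):          # already loaded
            return sys.modules[name]
        for entry in sys.path:
            source = os.path.join(entry, name + '.py')
            if os.path.exists(source):
                return imp.load_source(name, source)
        # Fall back to the C bootstrap for builtins, extensions, packages.
        return _original_import(name, globals, locals, fromlist)

    __builtin__.__import__ = _hooked_import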
From "Fred L. Drake, Jr."
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com> <374D17A7.436D8260@lyra.org>
Message-ID: <14157.21595.78742.142962@weyr.cnri.reston.va.us>

Greg Stein writes:
> IMO #2, the standard importing module should operate along the lines of
> imputil.py.

Which could then be implemented in C for efficiency, once everyone's agreed and if someone has the inclination. ;-)

Note: I'm not endorsing any of the magical import mechanisms; I'm just becoming increasingly concerned about the performance of whatever is "standard." And whatever is standard is the only one I'll use; using "ni" in Grail was somewhat useful, but painful as well! ;-) But I should be over it in a few more years.

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From "Fred L. Drake, Jr."
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com>
Message-ID: <14157.23376.400325.179191@weyr.cnri.reston.va.us>

M.-A. Lemburg writes:
> Wow, what an analysis.

And such fun, as well! ;-)

> > It's the .py[co] files that are expensive to load! Once you've created
> > the package, sub-modules are very cheap: you will typically have no
> > more than two path entries to check even once all this is in place.
>
> I'm not sure I follow you here: do you mean with a package dir
> cache in place or using the system implemented in the current
> release?

Anything contained within a package is relatively cheap to load because the search path is shorter. Currently, if the __init__.py* does nothing to the __path__, there's only one entry! In the current scheme, the .py[co] files are the last thing checked within a directory during the search. Loading one of these costs more in searching than any other type of module. Of course, parsing Python isn't free either, so loading a .py file for which no .py[co] exists is really more expensive; it's just found a little sooner.

I said:
> caching; loading Grail is still dog slow, and I've no doubt that the
> 600+ stat() calls contribute to that! 1-)

And then I corrected myself:
> Oops, after following through with the math, I'd have to adjust this
> to 6000 stat()/open() calls for Grail. Sorry!

Ok, I loaded Grail and looked more carefully. I was thinking it was loading about 100 modules. Well, that's at the point that it loads the user's .grail/user/grailrc.py (if it exists). By the time my home page was loaded, there were 145 distinct module objects loaded into sys.modules, and 17 entries on sys.path. Lots of Grail modules are in packages these days, but there are also a lot loaded from the standard library. So let's say there are probably around 5000 stat()/open() calls (reduce the number due to package use, then increase it again because (a) there are more modules being loaded than I'd estimated, and (b) the standard library is quite a ways down sys.path).

> This seems like something to worry about and probably also enough
> to try really hard to find a good solution, IMHO.

This is where a good caching system makes a lot of sense.

> True, that's why the hook allows you to code the strategy in
> Python. Note that my current version uses the sys.path as
> key into a table of name:file mappings, so even when using
> different setups (which will certainly have some differences in
> sys.path), the cache should work. Maybe one should add some
> more information to the key... like the platform specifics
> or even the mtimes of the directories on the path.

I'm not sure that keying on sys.path is sufficient. Around here, a Solaris/SPARC and a Solaris/x86 box are likely to share the same sys.path. That doesn't mean the directories are the same; the differences are taken care of via NFS.
Using the mtimes as part of the key means you don't have any way to clear the cache: an older mtime may just mean the version of the path for a different platform, which still wants to use the cache! Perhaps it could be keyed on (platform, dir), and the mtimes could be used to determine the need to refresh that directory.

Doing this right is hard, and can be substantially affected by a site's filesystem layout. Avoiding problems due to issues like these is a good reason to use a runtime-only cache. A site for which this isn't sufficient can then use the "hook" mechanism to install something that can do better within the context of specific filesystem management policies.

> Yep, I remember that too. The problem with these scans is that
> directories may contain huge amounts of files and you would
> need to check all of them against the module extensions Python
> uses.

They probably won't contain much other than Python modules in a reasonable installation. There's no need to filter the list; just include every file, and then test for the appropriate entries when attempting a specific import. This limits the up-front cost substantially. If we don't assume a reasonable installation (non-module files in the module dirs), it just gets slower and people have an incentive to clean up their installation. This is acceptable.

> Anyway, the dynamic and static versions are both implementable
> using the hook, so I'd opt for going into that direction
> rather than hard-wiring some logic into the interpreter's core.

I have no problems with using a "hook" to implement a more efficient mechanism. I just want the "standard" mechanism to be efficient, because that's the one I'll use.

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From ovidiu@cup.hp.com Thu May 27 17:32:41 1999
From: ovidiu@cup.hp.com (Ovidiu Predescu)
Date: Thu, 27 May 1999 09:32:41 -0700
Subject: [Distutils] complete GNU readline support, packaging issues
Message-ID: <199905271632.JAA26633@hpcll563.cup.hp.com>

Hi,

I've just started with Python and I discovered that the binding to GNU readline is really basic. So about a week ago I started working on a binding for it and I have most of the work done. I need to finish the bindings for the keymap functions and some of the functions in the completion part before the package can be considered complete.

I started to investigate ways to package the work I've done and I discovered the distutils package. I'm trying to write a setup.py file, but I need the following things and I didn't figure out how to express them:

- The main C file is obtained by running the m4 program on a .m4 file. How can I specify this dependency and the rule for generating the C file?

- I need to define a C preprocessor macro that contains the version of the readline library. The way things are set up now is by running a configure script that determines the version of the library. Can I do this with setup.py?

Thanks,
--
Ovidiu Predescu
http://www.geocities.com/SiliconValley/Monitor/7464/

From mal@lemburg.com Fri May 28 08:52:33 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 28 May 1999 09:52:33 +0200
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com> <374D17A7.436D8260@lyra.org>
Message-ID: <374E4B41.13A2A1FD@lemburg.com>

Greg Stein wrote:
>
> M.-A. Lemburg wrote:
> > ...
From mal@lemburg.com Fri May 28 08:52:33 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 28 May 1999 09:52:33 +0200
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com> <374D17A7.436D8260@lyra.org>
Message-ID: <374E4B41.13A2A1FD@lemburg.com>

Greg Stein wrote:
>
> M.-A. Lemburg wrote:
> >...
> > Anyway, the dynamic and static versions are both implementable
> > using the hook, so I'd opt for going into that direction
> > rather than hard-wiring some logic into the interpreter's core.
>
> IMO, the interpreter core should perform as little searching as
> possible. Basically, it should only contain bootstrap stuff. It should
> look for a standard importing module and load that. After it is loaded,
> the import mechanism should defer to Python for all future imports. (the
> cost of running Python code is minimal against the I/O used by the
> import)
>
> IMO #2, the standard importing module should operate along the lines of
> imputil.py.

You mean moving the whole import mechanism away from C and into Python ?
Have you tried such an approach with your imputil.py ?

I wonder whether all the things done in import.c can be coded in Python;
esp. the exotic things like the Windows registry stuff and the Mac fork
munging seem to be C-only (at least as long as there are no core Python
APIs for these C calls).

And just curious: why did Guido recode ni.py in C if he could have used
ni.py in your proposed way instead ?

Cheers,
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 217 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal@lemburg.com Fri May 28 09:02:42 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 28 May 1999 10:02:42 +0200
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com> <14157.23376.400325.179191@weyr.cnri.reston.va.us>
Message-ID: <374E4DA2.45B0E797@lemburg.com>

Fred L. Drake wrote:
>
> M.-A. Lemburg writes:
> > This seems like something to worry about and probably also enough
> > to try really hard to find a good solution, IMHO.
>
> This is where a good caching system makes a lot of sense.
>
> > True, that's why the hook allows you to code the strategy in
> > Python. Note that my current version uses the sys.path as
> > key into a table of name:file mappings, so even when using
> > different setups (which will certainly have some differences in
> > sys.path), the cache should work. Maybe one should add some
> > more information to the key... like the platform specifics
> > or even the mtimes of the directories on the path.
>
> I'm not sure that keying on sys.path is sufficient. Around here, a
> Solaris/SPARC and Solaris/x86 box are likely to share the same
> sys.path. That doesn't mean the directories are the same; the
> differences are taken care of via NFS. Using the mtimes as part of
> the key means you don't have any way to clear the cache: an older
> mtime may just mean the version of the path for a different platform,
> which still wants to use the cache! Perhaps it could be keyed on
> (platform, dir), and the mtimes could be used to determine the need to
> refresh that directory.
> Doing this right is hard, and can be substantially affected by a
> site's filesystem layout. Avoiding problems due to issues like these
> is a good reason to use a runtime-only cache. A site for which this
> isn't sufficient can then use the "hook" mechanism to install something
> that can do better within the context of specific filesystem
> management policies.

Right, and that's the key point in optionally moving (at least) the lookup
machinery into Python. Admins could then use site.py to add optimized
lookup cache implementations for their site. The default implementation
should probably be some sort of dynamic cache like the one you sketched
below.
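(The site-configurable lookup MAL wants here is, in outline, what today's
sys.path_hooks provides, and importlib's FileFinder is essentially the
dynamic cache under discussion: a per-directory listing invalidated by the
directory's mtime. A sketch of what a site.py could do in modern Python --
illustrative only, and not the 1999-era hook he is actually patching in:)

    import sys
    from importlib.machinery import (FileFinder, SourceFileLoader,
                                     SourcelessFileLoader,
                                     ExtensionFileLoader,
                                     EXTENSION_SUFFIXES, SOURCE_SUFFIXES,
                                     BYTECODE_SUFFIXES)

    def install_cached_finder():
        # FileFinder caches each directory's listing and refreshes it
        # when the directory's mtime changes -- the runtime-only cache
        # argued for in this thread, installed site-wide from site.py.
        loader_details = [
            (ExtensionFileLoader, EXTENSION_SUFFIXES),
            (SourceFileLoader, SOURCE_SUFFIXES),
            (SourcelessFileLoader, BYTECODE_SUFFIXES),
        ]
        sys.path_hooks.insert(0, FileFinder.path_hook(*loader_details))
        # Drop finders already built so new imports use the new hook.
        sys.path_importer_cache.clear()

    install_cached_finder()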
> > Yep, remember that too. The problem with these scans is that
> > directories may contain huge amounts of files and you would
> > need to check all of them against the module extensions Python
> They probably won't contain much other than Python modules in a
> reasonable installation. There's no need to filter the list; just
> include every file, and then test for the appropriate entries when
> attempting a specific import. This limits the up-front cost
> substantially.

Ok, point taken.

> If we don't assume a reasonable installation (non-module files in
> the module dirs), it just gets slower and people have an incentive to
> clean up their installation. This is acceptable.

True.

> > Anyway, the dynamic and static versions are both implementable
> > using the hook, so I'd opt for going into that direction
> > rather than hard-wiring some logic into the interpreter's core.
>
> I have no problems with using a "hook" to implement a more efficient
> mechanism. I just want the "standard" mechanism to be efficient,
> because that's the one I'll use.

The hook idea makes the implementation a little more open. Still, I think
that even the "standard" lookup/caching scheme should be implemented in
Python.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 217 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From gstein@lyra.org Fri May 28 22:24:16 1999
From: gstein@lyra.org (Greg Stein)
Date: Fri, 28 May 1999 14:24:16 -0700
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com> <374D17A7.436D8260@lyra.org> <374E4B41.13A2A1FD@lemburg.com>
Message-ID: <374F0980.171E34F6@lyra.org>

M.-A. Lemburg wrote:
>
> Greg Stein wrote:
> >...
> > IMO, the interpreter core should perform as little searching as
> > possible. Basically, it should only contain bootstrap stuff. It should
> > look for a standard importing module and load that. After it is loaded,
> > the import mechanism should defer to Python for all future imports. (the
> > cost of running Python code is minimal against the I/O used by the
> > import)
> >
> > IMO #2, the standard importing module should operate along the lines of
> > imputil.py.
>
> You mean moving the whole import mechanism away from C and into
> Python ? Have you tried such an approach with your imputil.py ?

Yes and yes.

Using Python's import hook effectively means that you completely take over
Python's import mechanism (one of its failings, IMO). imputil.py is
designed to provide for iterating through a list of importers, looking for
one that works. In any case... yes, I've used imputil to the exclusion of
Python's import logic.

You still need imp.new_module() and imp.get_magic(). But that does imply
that you can axe a lot of stuff outside of that. My tests don't have
loading of dynamic libraries, so you would still need an imp function to
load that (but strip the *searching* for the module).
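(imputil's actual interface isn't shown anywhere in the thread, so the
following is only the shape of the list-based scheme Greg describes, in
present-day Python: a chain of importer objects tried in order, with
types.ModuleType standing in for the imp.new_module() call he mentions.
All names are illustrative.)

    import os
    import sys
    import types

    class SourceDirImporter:
        """Illustrative importer that loads .py files from one
        directory; a chain of these replaces the built-in search."""
        def __init__(self, directory):
            self.directory = directory

        def import_module(self, name):
            path = os.path.join(self.directory, name + ".py")
            if not os.path.exists(path):
                return None                  # let the next importer try
            module = types.ModuleType(name)  # the job imp.new_module() did
            module.__file__ = path
            sys.modules[name] = module
            with open(path) as f:
                exec(compile(f.read(), path, "exec"), module.__dict__)
            return module

    importers = [SourceDirImporter(d) for d in sys.path if os.path.isdir(d)]

    def chained_import(name):
        # Try each importer in order -- the list-based approach Greg
        # favors over the single, winner-takes-all import hook.
        for importer in importers:
            module = importer.import_module(name)
            if module is not None:
                return module
        raise ImportError("no importer could load %r" % name)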
> I wonder whether all the things done in import.c can be coded in Python;
> esp. the exotic things like the Windows registry stuff and the Mac fork
> munging seem to be C-only (at least as long as there are no core Python
> APIs for these C calls).

win32api provides Registry access, so you just have to bootstrap that.

I haven't tried to remove a lot of Python's logic, so I can't say what can
actually be tossed, kept around, or just restructured a bit. IMO, the best
thing to do is to expose a few minimal functions and defer to Python.

> And just curious: why did Guido recode ni.py in C if he could have
> used ni.py in your proposed way instead ?

For two reasons that I can think of:

1) people had to manually import "ni"
2) it took over the import hook, which effectively prevents further use of
   it (or if somebody *did* use it, they would wipe out ni's functionality;
   again, this is why I dislike the current hook approach and like a
   list-based approach, which is possible via imputil)

And rather than respond to Fred's note in a separate thread, I'll tie it in
here:

Frankly: Fred is off-base on the need to "recode in C for efficiency". That
is a bogus argument. The cost is I/O operations, not the interpreter
overhead. You will gain no real benefit by moving the import mechanism to
C. C is *only* required to access the operating system in ways that are not
already available in the core, or which you cannot effectively bootstrap.

Python should strip away all of its C-based code for packages and for
locating modules. That should all move to Python. All that should remain in
C is the necessary functions for importing dynamic modules.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
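(History largely sided with this position: in today's CPython the search
and package logic live in Python code in importlib, and the C core exposes
only a handful of primitives for the steps that genuinely need C, such as
loading a shared library. A sketch of the resulting split, using the modern
importlib API; the helper name load_dynamic is illustrative:)

    from importlib.machinery import ExtensionFileLoader
    from importlib.util import module_from_spec, spec_from_file_location

    def load_dynamic(name, path):
        # All searching happened elsewhere, in Python; this wrapper only
        # performs the one step that needs C (ExtensionFileLoader calls
        # into the private _imp module to load the shared library).
        loader = ExtensionFileLoader(name, path)
        spec = spec_from_file_location(name, path, loader=loader)
        module = module_from_spec(spec)
        loader.exec_module(module)
        return module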