From sanner@scripps.edu Mon May 3 21:51:55 1999
From: sanner@scripps.edu (Michel Sanner)
Date: Mon, 3 May 1999 13:51:55 -0700
Subject: [Distutils] Python packages
Message-ID: <990503135155.ZM168661@noah.scripps.edu>

Hello,

I am once again in the Python update phase and (of course) running into the same type of problems regarding the installation of packages. I believe that Python should provide support for installing packages that contain both platform independent files (.py) and platform dependent files (.so, .dso, .pyd). The reason is that I do not want to install (and maintain) multiple copies of the .py files (one for each platform I support). It seems to me that this is one of the benefits of platform independence.

The problem right now is that the only way to do this (that I know of) is to hack together an __init__.py file for the package, placed in the platform independent part of the installation tree, that adds the right directory to the path for importing .so files. The problem with that is that I need to do it for every single package.

Would it be unreasonable to have the Python import mechanism check for packages in the $prefix AND the $exec_prefix directory?

-Michel

From sanner@scripps.edu Mon May 3 23:04:26 1999
From: sanner@scripps.edu (Michel Sanner)
Date: Mon, 3 May 1999 15:04:26 -0700
Subject: [Distutils] More about packages
In-Reply-To: Greg Ward "[Distutils] Some code to play with" (Mar 22, 10:10am)
References: <19990322101021.A489@cnri.reston.va.us>
Message-ID: <990503150426.ZM168872@noah.scripps.edu>

It seems to me that the platform dependent and independent trees of a Python installation are not symmetric in some sense. On one side we have:

    $prefix/include/python$version/
    $prefix/lib/python$version/
    $prefix/man

and on the other side:

    $exec_prefix/bin
    $exec_prefix/include/python$version
    $exec_prefix/lib/python$version/config
    $exec_prefix/lib/python$version/lib-dynload
    $exec_prefix/lib/python$version/site-packages

What bothers me is that we do not have that extra level under lib in the platform independent tree. I'd like to have something like:

    $prefix/lib/python$version/standard/    (equivalent of lib-dynload)
    $prefix/lib/python$version/packages/    (for platform independent packages)

and these directories should be part of the Python path built by default. I am not sure where pure Python packages are supposed to be installed right now?

-Michel

From gward@cnri.reston.va.us Thu May 20 22:09:45 1999
From: gward@cnri.reston.va.us (Greg Ward)
Date: Thu, 20 May 1999 17:09:45 -0400
Subject: [Distutils] extensions in packages
Message-ID: <19990520170945.A6434@cnri.reston.va.us>

[I'm going to try to yank this thread over from PSA-members, which should have been done *long* ago!]

[on psa-members, Michel Sanner opined:]
> I do not think that trying to bend over backwards to circumvent this
> "limitation" is the right way to proceed... unless changing the import
> mechanism in Python itself is something that would be extremely
> difficult to do.

I agree that a change to the import mechanism is in order. My understanding (from another one of those office-hallway conversations with Fred) is that it would be very sensible to add a package's platform-dependent directory to the package's __path__ attribute. No frobbing of sys.path is necessary, and thus no danger of stupid name conflicts in extension modules that are supposed to be buried deep in some package structure.
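A minimal sketch of what this could look like in a package's __init__.py (everything here is hypothetical -- the package name, the "plat-" subdirectory convention, and the layout are assumptions for illustration, not an agreed mechanism):

    # mypkg/__init__.py -- sketch only
    import os, sys

    # Assume the compiled extensions for this platform were installed
    # into a per-platform subdirectory of the package.
    _platdir = os.path.join(__path__[0], 'plat-' + sys.platform)
    if os.path.isdir(_platdir):
        # Extending __path__ lets "import mypkg.foo" find foo.so in the
        # platform directory without touching sys.path at all.
        __path__.append(_platdir)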
Perhaps imputil.py can help us in the playing-around stage; I'm not familiar with it, though, so I'll refrain from further comment.

Greg
--
Greg Ward - software developer                 gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive                       voice: +1-703-620-8990
Reston, Virginia, USA  20191-5434              fax:   +1-703-620-0913

From just@letterror.com Fri May 21 10:00:22 1999
From: just@letterror.com (Just van Rossum)
Date: Fri, 21 May 1999 11:00:22 +0200
Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
In-Reply-To:
References: <990520114522.ZM223757@noah.scripps.edu>
Message-ID:

(I just subscribed here, so maybe I've missed earlier replies to David's post in the PSA list)

At 11:57 AM -0700 5/20/99, David Ascher wrote:
> I'm (slowly) getting to the point where I agree. Two thoughts:
>
> 1) imputil.py (greg stein's thing) might be a good place to start working
>    out a better system. See distutils-sig for URLs.
>
> 2) the problem of statically compiled package-enclosed modules is
>    separate and needs to be addressed in the core. In other words, it
>    won't make it before 1.6.

Point 2 shouldn't be too hard. It is already possible (since a 1.5.2 alpha, I think) to statically link submodules in a frozen build. I guess it's relatively easy to patch find_module() to do something like this: foo.bar is registered as a "builtin" in the config.c file as

    {"foo.bar", initbar},

(Hmm, this is problematic if there is a distinct global builtin module "bar".) find_module() should then first check sys.builtin_module_names with the full name before doing anything else (probably only when it is confirmed that "foo" is a package). No time to play with that right now, but it sure seems trivial.

Just

From hinsen@dirac.cnrs-orleans.fr Fri May 21 10:22:55 1999
From: hinsen@dirac.cnrs-orleans.fr (hinsen@dirac.cnrs-orleans.fr)
Date: Fri, 21 May 1999 11:22:55 +0200
Subject: [Distutils] extensions in packages
Message-ID: <199905210922.LAA02670@chinon.cnrs-orleans.fr>

Just van Rossum wrote:
> foo.bar is registered as a "builtin" in the config.c file as
>
>     {"foo.bar", initbar},
>
> (Hmm, this is problematic if there is a distinct global builtin
> module "bar".)

Or if any other package has a module "bar"!

> find_module() should then first check sys.builtin_module_names with the
> full name before doing anything else (probably only when it is confirmed
> that "foo" is a package).

All that would be doable, but the real problem is the name of the init function! Only one module can define a global symbol "initbar". So the one for foo.bar would have to be called "initfoo.bar" (or something similar). On the other hand, when the same module is used dynamically, the init function must be called "initbar" again (unless the current import mechanism is changed).

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------
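A rough Python rendering of the lookup order Just proposes (find_module() itself lives in C, so this only illustrates the idea; the helper name is made up, and dotted names in sys.builtin_module_names are an assumption of the proposal rather than current behaviour):

    import sys

    def is_static_submodule(fullname):
        # fullname is e.g. 'foo.bar', registered statically under its
        # full dotted name as sketched above.
        return fullname in sys.builtin_module_names

    # find_module() would consult this first -- ideally only after
    # confirming that 'foo' really is a package -- and fall back to
    # the normal filesystem search otherwise.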
From just@letterror.com Fri May 21 11:20:47 1999
From: just@letterror.com (Just van Rossum)
Date: Fri, 21 May 1999 12:20:47 +0200
Subject: [Distutils] extensions in packages
In-Reply-To: <199905210922.LAA02670@chinon.cnrs-orleans.fr>
Message-ID:

At 11:22 AM +0200 5/21/99, hinsen@dirac.cnrs-orleans.fr wrote:
>> find_module() should then first check sys.builtin_module_names with the
>> full name before doing anything else (probably only when it is confirmed
>> that "foo" is a package).
>
> All that would be doable, but the real problem is the name of the init
> function!

Right, I was being naive: I thought that was just "a" problem...

> Only one module can define a global symbol "initbar". So the
> one for foo.bar would have to be called "initfoo.bar" (or something
> similar). On the other hand, when the same module is used dynamically,
> the init function must be called "initbar" again (unless the current
> import mechanism is changed).

So there are really two options:

1) Define a switch that C extensions can check to determine whether the init func should be called initbar or initfoo_bar (or something). This means it's up to the extension developer to cater for statically linked builtin submodules by doing something like this in the extension source:

    #ifdef PY_STATIC_SUBMODULES
    #define initbar initfoo_bar
    #endif

2) Change the DL import mechanism so the init function *has* to be called initfoo_bar. But then, to remain backwards compatible, you'd still have to use a switch, so it doesn't help much now.

Just

From mal@lemburg.com Fri May 21 14:50:19 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 21 May 1999 15:50:19 +0200
Subject: [Distutils] Packages with C extensions
Message-ID: <3745649B.4B56F9C2@lemburg.com>

[Problem with dynamic extensions in packages being platform dependent]

I haven't followed the thread too closely, but was alarmed by the recent proposals of splitting .so files out of the "normal" package distribution under a separate dir tree. This is really not such a good idea because it would cause the package information stored in the extension module to be lost (you can't have two top-level packages with the same name on the path: only the first one on the path will be used).

Here is the scheme I would use: create a subpackage for the extension and have it take care of importing the correct shared lib for the platform Python is currently running on. The libs themselves could be placed in plat-<platform> subdirs of that subpackage, and the __init__.py would then load the shared lib using either a sys.path+__import__() hack or, thread-safe, via imp.load_dynamic().

An even simpler solution is installing the whole package under .../python1.5/plat-<platform> separately for each supported platform rather than putting it under site-packages. [Disk space is no argument nowadays, and it's likely that different platforms need different Setup files anyway.]

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 224 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
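The subpackage __init__.py Marc-Andre describes might be sketched like this (the package/module names and the '.so' suffix are illustrative assumptions; imp.load_dynamic() is the real API he names):

    # mypkg/_C/__init__.py -- sketch of the platform-selecting subpackage
    import imp, os, sys

    _here = os.path.dirname(__file__)
    _libdir = os.path.join(_here, 'plat-' + sys.platform)

    # Load the shared lib for the current platform; load_dynamic()
    # avoids the sys.path hack and is thread-safe, as noted above.
    bar = imp.load_dynamic('bar', os.path.join(_libdir, 'bar.so'))

A caller would then simply write "from mypkg._C import bar" and never see the platform directories.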
From hinsen@cnrs-orleans.fr Fri May 21 15:08:29 1999
From: hinsen@cnrs-orleans.fr (Konrad Hinsen)
Date: Fri, 21 May 1999 16:08:29 +0200
Subject: [Distutils] extensions in packages
In-Reply-To: (message from Just van Rossum on Fri, 21 May 1999 12:20:47 +0200)
References:
Message-ID: <199905211408.QAA03506@chinon.cnrs-orleans.fr>

> 1) Define a switch that C extensions can check to determine whether the
>    init func should be called initbar or initfoo_bar (or something).

I'd rather have a set of macros that automatically do the right thing, but that's a minor detail. Changing the name of the init function is certainly doable. But if the init function contains the complete package path (and I see no other way to avoid name clashes), then we have to worry about the limitations that various systems impose on the names of global symbols. I doubt that there are still many systems around that use only eight characters, but I think 32 is a common limit. Although I am not really sure about the current state of the art!

> 2) Change the DL import mechanism so the init function *has* to be called
>    initfoo_bar. But then, to remain backwards compatible, you'd still have
>    to use a switch, so it doesn't help much now.

Backwards compatible with what? Currently builtin modules can't be in packages at all, so nothing's lost.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From just@letterror.com Fri May 21 15:57:27 1999
From: just@letterror.com (Just van Rossum)
Date: Fri, 21 May 1999 16:57:27 +0200
Subject: [Distutils] extensions in packages
In-Reply-To: <199905211408.QAA03506@chinon.cnrs-orleans.fr>
References: (message from Just van Rossum on Fri, 21 May 1999 12:20:47 +0200)
Message-ID:

At 4:08 PM +0200 5/21/99, Konrad Hinsen wrote:
>> 2) Change the DL import mechanism so the init function *has* to be called
>>    initfoo_bar. But then, to remain backwards compatible, you'd still have
>>    to use a switch, so it doesn't help much now.
>
> Backwards compatible with what? Currently builtin modules can't be
> in packages at all, so nothing's lost.

But DLLs *can* be (that's the whole point, no?). If the rules for the init func change, I think at least Marc-Andre L. won't be too happy: all (?) of his extensions use DLLs as submodules, so he would need to add switches to remain compatible with 1.5.2. I'm sure he's not the only one.

Just

From hinsen@cnrs-orleans.fr Fri May 21 16:07:45 1999
From: hinsen@cnrs-orleans.fr (Konrad Hinsen)
Date: Fri, 21 May 1999 17:07:45 +0200
Subject: [Distutils] extensions in packages
In-Reply-To: (message from Just van Rossum on Fri, 21 May 1999 16:57:27 +0200)
References: (message from Just van Rossum on Fri, 21 May 1999 12:20:47 +0200)
Message-ID: <199905211507.RAA04104@chinon.cnrs-orleans.fr>

> > Backwards compatible with what? Currently builtin modules can't be
> > in packages at all, so nothing's lost.
>
> But DLLs *can* be (that's the whole point, no?). If the rules for the init
> func change, I think at least Marc-Andre L.
> won't be too happy: all (?) of his extensions use DLLs as submodules,
> so he would need to add switches to remain compatible with 1.5.2.
> I'm sure he's not the only one.

I admit I hadn't thought about the possibility that someone might have used dynamic libraries in packages already; my development cycles always include statically linked modules at some stage, so all extension modules remain top-level.

Which makes me wonder how others develop extension modules: I always use a debugger at some point, and I haven't yet found one which lets me set breakpoints in dynamic libraries that haven't been loaded yet!

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From "Fred L. Drake, Jr."
References: <990520114522.ZM223757@noah.scripps.edu> <199905210904.LAA02630@chinon.cnrs-orleans.fr>
Message-ID: <14149.30596.737136.29729@weyr.cnri.reston.va.us>

Konrad Hinsen writes:
> > I would much prefer to have Python try to import from the
> > platform independent part of a package (installed under $prefix)
> > and, if it cannot find what we are looking for, look up
> > "automatically" the platform-dependent part of that package.
>
> Shouldn't that be the other way round? I'd expect to be able to override
> general modules by platform-specific modules.

Greg Ward and I were talking about this stuff the other day, and I think we decided that there was no good way to have multiple implementations of a module installed such that the platform dependent version was sure to take precedence over a platform independent version; this relies on the sequence of directories in the relevant search path (whether it be sys.path or a package's __path__).

The general solution seems to be that two things need to be done: a package's __path__ needs to include *all* the appropriate directories found along the search path, not just the one holding __init__.py*, AND the platform dependent modules should have different names from the platform independent modules. The platform independent module should be the public interface, and it can load platform dependent code the same way that string loads functions from strop.

The problem here is that the package's __path__ is not being created this way now; if anyone has time to work on a patch for Python/import.c, I'd be glad to help test it! ;-)

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From sanner@scripps.edu Fri May 21 17:39:50 1999
From: sanner@scripps.edu (Michel Sanner)
Date: Fri, 21 May 1999 09:39:50 -0700
Subject: [Distutils] extensions in packages
In-Reply-To: Konrad Hinsen "Re: [Distutils] extensions in packages" (May 21, 5:07pm)
References: (message from Just van Rossum on Fri, 21 May 1999 12:20:47 +0200) <199905211507.RAA04104@chinon.cnrs-orleans.fr>
Message-ID: <990521093950.ZM192336@noah.scripps.edu>

On May 21, 5:07pm, Konrad Hinsen wrote:
>
> Which makes me wonder how others develop extension modules: I always
> use a debugger at some point, and I haven't yet found one which lets
> me set breakpoints in dynamic libraries that haven't been loaded yet!

After you import the .so you can set a break point. I do that all the time on my SGI under dbx or cvd... no problem.
I also have about 10 extension modules, all of which use .so files!

-Michel

From sanner@scripps.edu Fri May 21 17:30:58 1999
From: sanner@scripps.edu (Michel Sanner)
Date: Fri, 21 May 1999 09:30:58 -0700
Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
In-Reply-To: "M.-A. Lemburg" "Re: [PSA MEMBERS] packages in Python" (May 21, 10:24am)
References: <37451849.CEF027B@lemburg.com>
Message-ID: <990521093058.ZM235238@noah.scripps.edu>

On May 21, 10:24am, M.-A. Lemburg wrote:
> Subject: Re: [PSA MEMBERS] packages in Python
> [Problem with dynamic extensions in packages being platform dependent]
>
> I haven't followed the thread too closely, but was alarmed by
> the recent proposals of splitting .so files out of the "normal"
> package distribution under a separate dir tree. This is really
> not such a good idea because it would cause the package information
> stored in the extension module to be lost (you can't have two
> top-level packages with the same name on the path: only the first one
> on the path will be used).
>
> Here is the scheme I would use: create a subpackage for the
> extension and have it take care of importing the correct
> shared lib for the platform Python is currently running on.
> The libs themselves could be placed in plat-<platform> subdirs
> of that subpackage, and the __init__.py would then load the
> shared lib using either a sys.path+__import__() hack or,
> thread-safe, via imp.load_dynamic().
>
> An even simpler solution is installing the whole package under
> .../python1.5/plat-<platform> separately for each supported
> platform rather than putting it under site-packages. [Disk space
> is no argument nowadays, and it's likely that different platforms
> need different Setup files anyway.]

As someone who maintains Python for several Unix-based architectures, I am not concerned with disk space but with file duplication, and the obvious risk of running out of sync. Also, the plat-<platform> scheme is far from being able to capture the complexity of this world. SGI alone has 3 ABIs (o32, n32, n64) multiplied by MIPS1, MIPS3, MIPS4 instruction sets times IRIX5.x, IRIX6.2, IRIX6.3, IRIX6.4, IRIX6.5 -- and many of these combinations are incompatible!

Finally, why have a $prefix and a $exec_prefix if they are not used to split platform dependent stuff from platform independent?

And we should really take this to distutils-sig :)

-Michel
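Michel's point suggests any platform tag would need to be much richer than a bare OS name. A hypothetical helper along those lines (a sketch only: the tag format and the use of the os.uname() fields are made-up assumptions, and os.uname() is Unix-only):

    import os, string

    def plat_tag():
        # Combine system name, major release and machine type into
        # something like 'irix6-mips' -- still far too coarse to capture
        # the ABI/instruction-set combinations described above.
        sysname, node, release, version, machine = os.uname()
        return '%s%s-%s' % (string.lower(sysname), release[0], machine)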
From sanner@scripps.edu Fri May 21 17:42:28 1999
From: sanner@scripps.edu (Michel Sanner)
Date: Fri, 21 May 1999 09:42:28 -0700
Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
In-Reply-To: "Fred L. Drake" "[Distutils] Re: [PSA MEMBERS] packages in Python" (May 21, 11:11am)
References: <990520114522.ZM223757@noah.scripps.edu> <199905210904.LAA02630@chinon.cnrs-orleans.fr> <14149.30596.737136.29729@weyr.cnri.reston.va.us>
Message-ID: <990521094228.ZM237910@noah.scripps.edu>

On May 21, 11:11am, Fred L. Drake wrote:
> Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
>
> Konrad Hinsen writes:
> > Shouldn't that be the other way round? I'd expect to be able to override
> > general modules by platform-specific modules.
>
> Greg Ward and I were talking about this stuff the other day, and I
> think we decided that there was no good way to have multiple
> implementations of a module installed such that the platform dependent
> version was sure to take precedence over a platform independent
> version; this relies on the sequence of directories in the relevant
> search path (whether it be sys.path or a package's __path__).
>
> The general solution seems to be that two things need to be done: a
> package's __path__ needs to include *all* the appropriate directories
> found along the search path, not just the one holding __init__.py*,
> AND the platform dependent modules should have different names from
> the platform independent modules. The platform independent module
> should be the public interface, and it can load platform dependent
> code the same way that string loads functions from strop.
>
> The problem here is that the package's __path__ is not being created
> this way now; if anyone has time to work on a patch for
> Python/import.c, I'd be glad to help test it! ;-)

I take care of this in my extension modules. If I have a platform dependent implementation of a module, I do:

    try:
        import extension      # platform dependent implementation
    except ImportError:
        import common         # platform independent fallback

This requires a minimal amount of coding. Personally I did not see the need for this to be automatic.

-Michel

From "Fred L. Drake, Jr."
References: <990520114522.ZM223757@noah.scripps.edu> <199905210904.LAA02630@chinon.cnrs-orleans.fr> <14149.30596.737136.29729@weyr.cnri.reston.va.us> <990521094228.ZM237910@noah.scripps.edu>
Message-ID: <14149.36341.651126.805299@weyr.cnri.reston.va.us>

Michel Sanner writes:
> This requires a minimal amount of coding. Personally I did not see the
> need for this to be automatic.

The only part that I think should be automatic is loading the additional directories into the package's __path__ (like $exec_prefix/lib/python$VERSION/site-packages/...). The rest seems to be sufficiently specific to the package to require that it be handled explicitly.

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From pas@scansoft.com Fri May 21 06:56:11 1999
From: pas@scansoft.com (Perry A. Stoll)
Date: Fri, 21 May 1999 01:56:11 -0400
Subject: [Distutils] Packages with C extensions
Message-ID: <01bea34e$a0be1160$3e3e79cf@nara.scansoft.com>

> M.-A. Lemburg writes:
> [Problem with dynamic extensions in packages being platform dependent]
>
> I haven't followed the thread too closely, but was alarmed by
> the recent proposals of splitting .so files out of the "normal"
> package distribution under a separate dir tree.

Fair enough. There's the GNU configure view of life, where $prefix and $exec_prefix are separate directories, and there is the perl view of life (where $PERL_ARCHLIB is usually a subdirectory of the install directory). M.-A. prefers the perl-ish approach. Fine with me, as long as we do it explicitly.

> (you can't have two top-level packages with the same name
> on the path: only the first one on the path will be used).

That's only because that's how it's done today. Just a matter of some code... (and the thought and design behind it).

> [snipped scheme for having packages do the platform specific import]

I'd rather not burden the package writer. I think it's better to include the batteries for this one.

> [recommendation that you just have a different install dir for each
> platform]
> Disk space is no argument nowadays

Ease of maintenance is the overriding argument here. The .py files are the same for all platforms, so why do I want different copies of those files when I have Python installed for three platforms?

> it's likely that different platforms need different Setup files anyway.

But that's a platform dependent file which goes in the $INSTALL_ARCHLIB.

In short, I think we need to get this infrastructure into Python itself to ease the creation of packages. But then I'm probably preaching to the choir.
-Perry

From pas@scansoft.com Fri May 21 07:55:56 1999
From: pas@scansoft.com (Perry A. Stoll)
Date: Fri, 21 May 1999 02:55:56 -0400
Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
Message-ID: <01bea356$fa01f5e0$3e3e79cf@nara.scansoft.com>

>> The problem here is that the package's __path__ is not being created
>> this way now; if anyone has time to work on a patch for
>> Python/import.c, I'd be glad to help test it! ;-)
>
> I take care of this in my extension modules. If I have a platform
> dependent implementation of a module, I do:
>
>     try:
>         import extension
>     except ImportError:
>         import common

I don't think this is the case that's causing the problem. The problem is when a submodule of a package is *always* platform dependent (because, for example, it interfaces to another library).

-Perry

From "Fred L. Drake, Jr."
References: <01bea356$fa01f5e0$3e3e79cf@nara.scansoft.com>
Message-ID: <14149.44654.562124.940757@weyr.cnri.reston.va.us>

Perry A. Stoll writes:
> I don't think this is the case that's causing the problem. The problem is
> when a submodule of a package is *always* platform dependent (because,
> for example, it interfaces to another library).

Perry,

I think in this case the only problem is getting all the right directories on the package's __path__; am I missing something? It avoids the need for the "conditional" import, but that's largely separate. If there are also platform independent modules, this is an issue; if *all* the "real" modules are platform dependent, I think the duplication of the (essentially empty) __init__.py* is something we can allow, and just install the package entirely under $exec_prefix.

Are there problems that I'm missing that can't be solved by locating all the parallel package directories and placing them on the package's __path__? I think the multiple SGI binary formats can be handled by using a different $exec_prefix for each (it sounds like anything less won't get the job done anyway for that case).

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From pas@scansoft.com Fri May 21 08:47:03 1999
From: pas@scansoft.com (Perry A. Stoll)
Date: Fri, 21 May 1999 03:47:03 -0400
Subject: [Distutils] Re: [PSA MEMBERS] packages in Python
Message-ID: <01bea35e$1dffa8a0$3e3e79cf@nara.scansoft.com>

> I think in this case the only problem is getting all the right
> directories on the package's __path__; am I missing something? It
> avoids the need for the "conditional" import, but that's largely
> separate.

Fred,

Good point. Can you recommend a concise place where the import mechanism (in all its glory) is documented? That should solve the problem, except when freezing or making a static Python binary (as previously mentioned by Konrad).

I was poking around in ihooks.py. It looks like it should be possible to cook up something approximating this using ihooks. What do you think?

-Perry

From "Fred L. Drake, Jr."
References: <01bea35e$1dffa8a0$3e3e79cf@nara.scansoft.com>
Message-ID: <14149.47682.416406.221941@weyr.cnri.reston.va.us>

Perry A. Stoll writes:
> Good point. Can you recommend a concise place where the import mechanism
> (in all its glory) is documented?

Documentation? Ha! I don't have no stinkin' documentation! ;-) I think going over Python/import.c is the best bet. There's an import_package() function (I think that's the name); probably the best bet is to modify that to build the right __path__ value; at this point we know it's a package, so we're not interfering with the performance of importing non-packages, only the packages/subpackages themselves.

> That should solve the problem, except when freezing or making a static
> Python binary (as previously mentioned by Konrad).

I don't know enough about freezing, but I suspect that's not too difficult; probably about the same as statically linked package-ized modules. ;-) I don't think those will actually be that difficult for someone who has time to read the code; the only real problem is the public symbol for the module init function.

> I was poking around in ihooks.py. It looks like it should be possible to
> cook up something approximating this using ihooks. What do you think?

That can probably be done, but it places the import machinery in Python rather than in C, so it'll be slow.

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives
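A rough Python rendering of the __path__ construction Fred describes (a sketch only; the real change would live in C inside import.c, and the helper name is made up):

    import os, sys

    def build_package_path(pkg_name):
        # Collect every directory named pkg_name along sys.path, so that
        # a package split across $prefix and $exec_prefix is seen as one.
        path = []
        for entry in sys.path:
            subdir = os.path.join(entry, pkg_name)
            if os.path.isdir(subdir) and subdir not in path:
                path.append(subdir)
        return path

    # import_package() would assign this list (with the directory holding
    # __init__.py* first) to the new package's __path__.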
From MHammond@skippinet.com.au Fri May 21 23:51:02 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Sat, 22 May 1999 08:51:02 +1000
Subject: [Distutils] extensions in packages
In-Reply-To:
Message-ID: <000c01bea3dc$67574de0$0801a8c0@bobcat>

FWIW, I _do_ use DLLs in packages, and it causes me no end of grief. I need special runtime hacks that work with __path__, I need a special __init__ in the package where the DLL is to "appear", and I also need even further special casing for freeze! So although I can see the problems with the mechanisms, IMO it is very important that packages be capable of treating DLLs as first-class citizens.

Personally, I would not have a "compatibility" problem as such, but I would need to remove or update my hacks - but I find that reasonable.

Mark.

> > Backwards compatible with what? Currently builtin modules can't be
> > in packages at all, so nothing's lost.
>
> But DLLs *can* be (that's the whole point, no?). If the rules for the
> init func change, I think at least Marc-Andre L. won't be too happy:
> all (?) of his extensions use DLLs as submodules, so he would need to
> add switches to remain compatible with 1.5.2. I'm sure he's not the
> only one.

From mal@lemburg.com Tue May 25 10:14:06 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 25 May 1999 11:14:06 +0200
Subject: [Distutils] Packages with C extensions
References: <01bea34e$a0be1160$3e3e79cf@nara.scansoft.com>
Message-ID: <374A69DE.6A2BE81C@lemburg.com>

Perry A. Stoll wrote:
>
> > M.-A. Lemburg writes:
> > [Problem with dynamic extensions in packages being platform dependent]
> >
> > I haven't followed the thread too closely, but was alarmed by
> > the recent proposals of splitting .so files out of the "normal"
> > package distribution under a separate dir tree.
>
> Fair enough. There's the GNU configure view of life, where $prefix and
> $exec_prefix are separate directories, and there is the perl view of life
> (where $PERL_ARCHLIB is usually a subdirectory of the install directory).
> M.-A. prefers the perl-ish approach. Fine with me, as long as we do it
> explicitly.
>
> > (you can't have two top-level packages with the same name
> > on the path: only the first one on the path will be used).
>
> That's only because that's how it's done today. Just a matter of some
> code... (and the thought and design behind it).

Good point :-) ...
Having Python continue scanning the path after some import fails would definitely ease structuring of packages, mostly because it allows extending existing installations with user or platform specific modules. The latter is a basic building block for a possible future standard Python lib with a package layout (see a discussion about this on c.l.p last year, I think).

> > [snipped scheme for having packages do the platform specific import]
>
> I'd rather not burden the package writer. I think it's better to include
> the batteries for this one.

Fair enough, but I guess a simple platform-aware import helper would do the trick nicely, e.g.

    MyModule = platimport('MyModule')

> > [recommendation that you just have a different install dir for each
> > platform]
> > Disk space is no argument nowadays
>
> Ease of maintenance is the overriding argument here. The .py files are
> the same for all platforms, so why do I want different copies of those
> files when I have Python installed for three platforms?

The latter gets defeated by the disk space non-argument. Also, some future version of Python may very well use platform dependent optimized versions of .py files, e.g. JIT compiled ones. Besides, I don't think that a simple 'cp -a plat-1 plat-2' causes too much maintenance effort ;-) If you worry about disk space, you could even set up a linked copy for the new platform using e.g. Tools/scripts/linktree.py.

> > it's likely that different platforms need different Setup files anyway.
>
> But that's a platform dependent file which goes in the $INSTALL_ARCHLIB.

True. Not sure why you would want to install that file though (it's only needed for compilation).

> In short, I think we need to get this infrastructure into Python itself
> to ease the creation of packages.

Now, I think, you're taking the idea a bit too far ;-) ...

> But then I'm probably preaching to the choir.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 220 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal@lemburg.com Tue May 25 10:41:13 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 25 May 1999 11:41:13 +0200
Subject: [Distutils] extensions in packages
References: <000c01bea3dc$67574de0$0801a8c0@bobcat>
Message-ID: <374A7039.3A63B129@lemburg.com>

Mark Hammond wrote:
>
> FWIW, I _do_ use DLLs in packages, and it causes me no end of grief. I
> need special runtime hacks that work with __path__, I need a special
> __init__ in the package where the DLL is to "appear", and I also need
> even further special casing for freeze!

I've been using DLLs/SOs in packages with much success for some time now. Don't know why you need any hacks to get this going though: it works right out of the box for me.

The situation is a little different for frozen apps without shared libs though: the extension modules will become top-level modules. I haven't frozen those kinds of apps yet, but it should still work out of the box (except maybe when you pass pickles from an app using top-level modules to one using in-package modules).

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 220 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal@lemburg.com Mon May 24 21:53:19 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 24 May 1999 22:53:19 +0200
Subject: [Distutils] extensions in packages
References: (message from Just van Rossum on Fri, 21 May 1999 12:20:47 +0200) <199905211507.RAA04104@chinon.cnrs-orleans.fr>
Message-ID: <3749BC3F.3614E4DE@lemburg.com>

Konrad Hinsen wrote:
>
> > > Backwards compatible with what? Currently builtin modules can't be
> > > in packages at all, so nothing's lost.
> >
> > But DLLs *can* be (that's the whole point, no?). If the rules for the
> > init func change, I think at least Marc-Andre L. won't be too happy:
> > all (?) of his extensions use DLLs as submodules, so he would need to
> > add switches to remain compatible with 1.5.2. I'm sure he's not the
> > only one.

Yep, all my extensions are wrapped into packages, and all of them use subpackages which wrap extension modules included as submodules of those packages... that gives you a very flexible setup, since the __init__.py files let you do all kinds of nifty things to load the correct C extension (see my previous post).

> I admit I hadn't thought about the possibility that someone might have
> used dynamic libraries in packages already; my development cycles
> always include statically linked modules at some stage, so all
> extension modules remain top-level.

The main reason for including the extensions in the packages themselves rather than making them top-level was to simplify installation, e.g. on Windows (with pre-compiled binaries), you only have to unzip the archive and that's it... no "make install" or equivalent is necessary.

> Which makes me wonder how others develop extension modules: I always
> use a debugger at some point, and I haven't yet found one which lets
> me set breakpoints in dynamic libraries that haven't been loaded yet!

That one is simple: you run it twice. The first time to load the DLL and the second time with the break point set in the DLL. Works with gdb on Linux, not sure about other platforms.

Cheers,
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 221 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From da@ski.org Wed May 26 05:02:22 1999
From: da@ski.org (David Ascher)
Date: Tue, 25 May 1999 21:02:22 -0700 (Pacific Daylight Time)
Subject: [Distutils] extensions in packages
In-Reply-To: <374A7039.3A63B129@lemburg.com>
Message-ID:

On Tue, 25 May 1999, M.-A. Lemburg wrote:
>
> I've been using DLLs/SOs in packages with much success for some time
> now. Don't know why you need any hacks to get this going though:
> it works right out of the box for me.

Do you have the DLLs/.so's in directories that are children of $exec_prefix? If yes, please let us know how you do it. That's the task that we're trying to solve.

--david

From mal@lemburg.com Wed May 26 08:41:57 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 26 May 1999 09:41:57 +0200
Subject: [Distutils] extensions in packages
References:
Message-ID: <374BA5C5.706EFB1E@lemburg.com>

David Ascher wrote:
>
> On Tue, 25 May 1999, M.-A. Lemburg wrote:
> >
> > I've been using DLLs/SOs in packages with much success for some time
> > now. Don't know why you need any hacks to get this going though:
> > it works right out of the box for me.
>
> Do you have the DLLs/.so's in directories that are children of
> $exec_prefix? If yes, please let us know how you do it. That's the task
> that we're trying to solve.

No, I simply leave them in the package subdirectories.
The "make install" step is not needed if you have the users compile the extensions in the package subdirs. There's no magic to it. This doesn't allow you to have one installation for multiple platforms, but it makes the installation process very simple and currently is the only way to go with the classical Makefile.pre.in approach, since this does not allow you to install extensions in directories other than site-packages without tweaking. I still think that to get multi-platform installation working we'd definitely need to extend the package import mechanism to have it continue the search for a particular module in case the first try fails. Note that this kind of search will be very costly due the amount of IO needed to search the path. Some sort of fastpath hook should be included along with this extension to fix this. (Such a hook could also be used to do other PYTHONPATH mods at runtime which go far beyond the current sys.path tricks, e.g. to implement new module lookup schemes.) For a try at such a hook, see: http://starship.skyport.net/~lemburg/fastpath.zip -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 219 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From Fred L. Drake, Jr." References: <374BA5C5.706EFB1E@lemburg.com> Message-ID: <14155.63283.936747.273032@weyr.cnri.reston.va.us> M.-A. Lemburg writes: > Note that this kind of search will be very costly due the amount > of IO needed to search the path. Some sort of fastpath hook Marc-Andre, Why does this need to be so costly? Compared to the current scheme, there's little to add. Once a package has been identified (and *only* then!), search the path for all the appropriate subdirectories (one stat() for each path entry). The current approach requires about a half dozen stats for each path entry: foo.py, foo.py[co], foomodule.so, foo.so, foo/ + foo/__init__.py + foo.__init__.py[co]. It will typically be even cheaper for sub-packages, because the original path will usually be much shorter than sys.path. Note that I'm not saying there shouldn't be some sort of directory caching; loading Grail is still dog slow, and I've no doubt that the 600+ stat() calls contribute to that! 1-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From Fred L. Drake, Jr." References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> Message-ID: <14156.28035.739990.919106@weyr.cnri.reston.va.us> M.-A. Lemburg writes: > Well, I was referring to the additional lookup needed to find > the next package dir of the same name. Say you put the Python > package into site-packages and the binaries into plat-. I didn't say it was free; just that the cost was insignificant compared to the current cost. My sys.path in an interactive interpreter contains 11 entries. If I want to add a package with both $prefix and $exec_prefix components, the worst case is that the directory holding the __init__.py* is the last path entry, and the other directory is in the immediately preceeding path entry. After the current mechanism locates the __init__.py* file, it needs to build the __path__ for the package. It takes 10 stat() calls to locate the additional directory. 
Considering that the initial search that caused the package module to be created took 11 stats to see if the entries contained the appropriate directory, + 2 stats to determine that the first directory of the package (the one that doesn't have __init__.py*) wasn't it, + 36 to determine that the first 9 directories didn't contain a matching .so|module.so|.py|.py[co], plus at least one to actually find the __init__.pyc (two if only the .py is available), that's 59 system calls -- either stat() or open(), the latter hidden inside fdopen(). (I think I followed the code right. ;) I don't think the added 10 to get the right __path__ are worth worrying about.

It's the .py[co] files that are expensive to load! Once you've created the package, sub-modules are very cheap: you will typically have no more than two path entries to check even once all this is in place.

I said:
> caching; loading Grail is still dog slow, and I've no doubt that the
> 600+ stat() calls contribute to that! 1-)

Oops, after following through with the math, I'd have to adjust this to 6000 stat()/open() calls for Grail. Sorry!

And back to Marc-Andre:
> I would very much like to see some sort of caching in the
> interpreter. The fastpath hook I implemented uses a marshalled
> dict stored in the user's home dir for the lookup. Once created,

I don't think I'd store the cache; if a user's home directory is mounted via NFS (common), then it may often be wrong if the user actively works with a variety of hosts with different versions or installations of Python. The benefits of a cache are greatest for applications that import a lot of modules (like Grail!); the cache can be built using a directory scan as each directory is searched. (I think one of the guys from CWI did this at one point and had really good results; Jack?)

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From mal@lemburg.com Wed May 26 17:27:09 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 26 May 1999 18:27:09 +0200
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us>
Message-ID: <374C20DD.53B458DC@lemburg.com>

Fred L. Drake wrote:
>
> M.-A. Lemburg writes:
> > Note that this kind of search will be very costly due to the amount
> > of I/O needed to search the path. Some sort of fastpath hook
>
> Marc-Andre,
> Why does this need to be so costly? Compared to the current scheme,
> there's little to add. Once a package has been identified (and *only*
> then!), search the path for all the appropriate subdirectories (one
> stat() for each path entry). The current approach requires about a
> half dozen stats for each path entry: foo.py, foo.py[co],
> foomodule.so, foo.so, foo/ + foo/__init__.py + foo/__init__.py[co].
> It will typically be even cheaper for sub-packages, because the
> original path will usually be much shorter than sys.path.

Well, I was referring to the additional lookup needed to find the next package dir of the same name. Say you put the Python package into site-packages and the binaries into plat-<platform>. Since the platform subdirs come first on the standard sys.path, all imports of the form "import MyPackage.MyModule" will first look in the binary package, fail, and then continue to look (and hopefully find) the MyModule submodule in the Python package installed under site-packages. Since these imports are more common than importing binaries, imports would get even slower on average.
Ok, you could change the sys.path so that the binaries come *after* the source packages... but it's currently not the default.

> Note that I'm not saying there shouldn't be some sort of directory
> caching; loading Grail is still dog slow, and I've no doubt that the
> 600+ stat() calls contribute to that! 1-)

I would very much like to see some sort of caching in the interpreter. The fastpath hook I implemented uses a marshalled dict stored in the user's home dir for the lookup. Once created, it reduces startup time noticeably (cutting down stat() calls from around 200 for a typical utility script to around 20). The nice thing about the hack is that you can experiment with the cache logic using Python functions before possibly coding it in C.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 219 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
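The fastpath idea can be sketched in a few lines of Python (a sketch under stated assumptions: the cache file location, the keying on sys.path, and the helper names are illustrative guesses, not the actual contents of fastpath.zip):

    import marshal, os, sys

    _CACHE_FILE = os.path.expanduser('~/.python-import-cache')  # hypothetical

    def _load_cache():
        # The cache maps a key derived from sys.path to a dict of
        # {module name: file path}; marshal keeps loading it cheap.
        try:
            f = open(_CACHE_FILE, 'rb')
            cache = marshal.load(f)
            f.close()
        except (IOError, EOFError, ValueError):
            cache = {}
        return cache

    def lookup(modname):
        table = _load_cache().get(tuple(sys.path), {})
        return table.get(modname)   # file path, or None on a cache miss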
From mal@lemburg.com Thu May 27 09:21:11 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 27 May 1999 10:21:11 +0200
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us>
Message-ID: <374D0077.2699505F@lemburg.com>

Fred L. Drake wrote:
>
> M.-A. Lemburg writes:
> > Well, I was referring to the additional lookup needed to find
> > the next package dir of the same name. Say you put the Python
> > package into site-packages and the binaries into plat-<platform>.
>
> I didn't say it was free; just that the cost was insignificant
> compared to the current cost.

Agreed.

> My sys.path in an interactive interpreter contains 11 entries. If I
> want to add a package with both $prefix and $exec_prefix components,
> the worst case is that the directory holding the __init__.py* is the
> last path entry, and the other directory is in the immediately
> preceding path entry. After the current mechanism locates the
> __init__.py* file, it needs to build the __path__ for the package. It
> takes 10 stat() calls to locate the additional directory. Considering
> that the initial search that caused the package module to be created
> took 11 stats + 2 stats + 36 stats, plus at least one to actually find
> the __init__.pyc (two if only the .py is available), that's 59 system
> calls (either stat() or open(), the latter hidden inside fdopen()). I
> don't think the added 10 to get the right __path__ are worth worrying
> about.

Wow, what an analysis.

> It's the .py[co] files that are expensive to load! Once you've created
> the package, sub-modules are very cheap: you will typically have no
> more than two path entries to check even once all this is in place.

I'm not sure I follow you here: do you mean with a package dir cache in place or using the system implemented in the current release?

> I said:
> > caching; loading Grail is still dog slow, and I've no doubt that the
> > 600+ stat() calls contribute to that! 1-)
>
> Oops, after following through with the math, I'd have to adjust this
> to 6000 stat()/open() calls for Grail. Sorry!

This seems like something to worry about and probably also enough to try really hard to find a good solution, IMHO.

> And back to Marc-Andre:
> > I would very much like to see some sort of caching in the
> > interpreter. The fastpath hook I implemented uses a marshalled
> > dict stored in the user's home dir for the lookup. Once created,
>
> I don't think I'd store the cache; if a user's home directory is
> mounted via NFS (common), then it may often be wrong if the user
> actively works with a variety of hosts with different versions or
> installations of Python.

True, that's why the hook allows you to code the strategy in Python. Note that my current version uses the sys.path as key into a table of name:file mappings, so even when using different setups (which will certainly have some differences in sys.path), the cache should work. Maybe one should add some more information to the key... like the platform specifics or even the mtimes of the directories on the path.

> The benefits of a cache are greatest for
> applications that import a lot of modules (like Grail!); the cache can
> be built using a directory scan as each directory is searched. (I
> think one of the guys from CWI did this at one point and had really
> good results; Jack?)

Yep, I remember that too. The problem with these scans is that directories may contain huge amounts of files and you would need to check all of them against the module extensions Python uses.

Anyway, the dynamic and static versions are both implementable using the hook, so I'd opt for going into that direction rather than hard-wiring some logic into the interpreter's core.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 218 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From gstein@lyra.org Thu May 27 11:00:07 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 27 May 1999 03:00:07 -0700
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com>
Message-ID: <374D17A7.436D8260@lyra.org>

M.-A. Lemburg wrote:
> ...
> Anyway, the dynamic and static versions are both implementable
> using the hook, so I'd opt for going into that direction
> rather than hard-wiring some logic into the interpreter's core.

IMO, the interpreter core should perform as little searching as possible. Basically, it should only contain bootstrap stuff. It should look for a standard importing module and load that. After it is loaded, the import mechanism should defer to Python for all future imports. (The cost of running Python code is minimal against the I/O used by the import.)

IMO #2, the standard importing module should operate along the lines of imputil.py.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
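A tiny import hook in the spirit of what Greg describes might look like this (a sketch only: imputil.py's real interface differs, and this hook handles only plain .py modules before deferring everything else back to the builtin mechanism):

    import __builtin__, imp, os, sys

    _original_import = __builtin__.__import__

    def _hooked_import(name, globals=None, locals=None, fromlist=None):
        if sys.modules.has_key(name):          # already loaded
            return sys.modules[name]
        for entry in sys.path:
            source = os.path.join(entry, name + '.py')
            if os.path.exists(source):
                return imp.load_source(name, source)
        # Fall back to the C bootstrap for builtins, extensions, packages.
        return _original_import(name, globals, locals, fromlist)

    __builtin__.__import__ = _hooked_import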
From "Fred L. Drake, Jr."
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com> <374D17A7.436D8260@lyra.org>
Message-ID: <14157.21595.78742.142962@weyr.cnri.reston.va.us>

Greg Stein writes:
> IMO #2, the standard importing module should operate along the lines of
> imputil.py.

Which could then be implemented in C for efficiency, once everyone's agreed and if someone has the inclination. ;-)

Note: I'm not endorsing any of the magical import mechanisms; I'm just becoming increasingly concerned about the performance of whatever is "standard." And whatever is standard is the only one I'll use; using "ni" in Grail was somewhat useful, but painful as well! ;-) But I should be over it in a few more years.

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From "Fred L. Drake, Jr."
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com>
Message-ID: <14157.23376.400325.179191@weyr.cnri.reston.va.us>

M.-A. Lemburg writes:
> Wow, what an analysis.

And such fun, as well! ;-)

> > It's the .py[co] files that are expensive to load! Once you've created
> > the package, sub-modules are very cheap: you will typically have no
> > more than two path entries to check even once all this is in place.
>
> I'm not sure I follow you here: do you mean with a package dir
> cache in place or using the system implemented in the current
> release?

Anything contained within a package is relatively cheap to load because the search path is shorter. Currently, if the __init__.py* does nothing to the __path__, there's only one entry! In the current scheme, the .py[co] files are the last thing checked within a directory during the search. Loading one of these costs more in searching than any other type of module. Of course, parsing Python isn't free either, so loading a .py file for which no .py[co] exists is really more expensive; it's just found a little sooner.

I said:
> caching; loading Grail is still dog slow, and I've no doubt that the
> 600+ stat() calls contribute to that! 1-)

And then I corrected myself:
> Oops, after following through with the math, I'd have to adjust this
> to 6000 stat()/open() calls for Grail. Sorry!

Ok, I loaded Grail and looked more carefully. I was thinking it was loading about 100 modules. Well, that's at the point that it loads the user's .grail/user/grailrc.py (if it exists). By the time my home page was loaded, there were 145 distinct module objects loaded into sys.modules, and 17 entries on sys.path. Lots of Grail modules are in packages these days, but there are also a lot loaded from the standard library. So let's say there are probably around 5000 stat()/open() calls (reduce the number due to package use, then increase it again because (a) there are more modules being loaded than I'd estimated, and (b) the standard library is quite a ways down sys.path).

> This seems like something to worry about and probably also enough
> to try really hard to find a good solution, IMHO.

This is where a good caching system makes a lot of sense.

> True, that's why the hook allows you to code the strategy in
> Python. Note that my current version uses the sys.path as
> key into a table of name:file mappings, so even when using
> different setups (which will certainly have some differences in
> sys.path), the cache should work. Maybe one should add some
> more information to the key... like the platform specifics
> or even the mtimes of the directories on the path.

I'm not sure that keying on sys.path is sufficient. Around here, a Solaris/SPARC and a Solaris/x86 box are likely to share the same sys.path. That doesn't mean the directories are the same; the differences are taken care of via NFS.
Using the mtimes as part of the key means you don't have any way to clear the cache: an older mtime may just mean the version of the path for a different platform, which still wants to use the cache! Perhaps it could be keyed on (platform, dir), and the mtimes could be used to determine the need to refresh that directory.

Doing this right is hard, and can be substantially affected by a site's filesystem layout. Avoiding problems due to issues like these is a good reason to use a runtime-only cache. A site for which this isn't sufficient can then use the "hook" mechanism to install something that can do better within the context of specific filesystem management policies.

> Yep, I remember that too. The problem with these scans is that
> directories may contain huge amounts of files and you would
> need to check all of them against the module extensions Python
> uses.

They probably won't contain much other than Python modules in a reasonable installation. There's no need to filter the list; just include every file, and then test for the appropriate entries when attempting a specific import. This limits the up-front cost substantially. If we don't assume a reasonable installation (non-module files in the module dirs), it just gets slower and people have an incentive to clean up their installation. This is acceptable.

> Anyway, the dynamic and static versions are both implementable
> using the hook, so I'd opt for going into that direction
> rather than hard-wiring some logic into the interpreter's core.

I have no problems with using a "hook" to implement a more efficient mechanism. I just want the "standard" mechanism to be efficient, because that's the one I'll use.

-Fred
--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From ovidiu@cup.hp.com Thu May 27 17:32:41 1999
From: ovidiu@cup.hp.com (Ovidiu Predescu)
Date: Thu, 27 May 1999 09:32:41 -0700
Subject: [Distutils] complete GNU readline support, packaging issues
Message-ID: <199905271632.JAA26633@hpcll563.cup.hp.com>

Hi,

I've just started with Python and I discovered that the binding to GNU readline is really basic. So about a week ago I started working on a binding for it and I have most of the work done. I need to finish the bindings for the keymap functions and some of the functions in the completion part before the package can be considered complete.

I started to investigate ways to package the work I've done and I discovered the distutils package. I'm trying to write a setup.py file, but I need the following things and I didn't figure out how to express them:

- The main C file is obtained by running the m4 program on a .m4 file. How can I specify this dependency and the rule for generating the C file?

- I need to define a C preprocessor macro that contains the version of the readline library. The way things are set up now is by running a configure script that determines the version of the library. Can I do this with setup.py?

Thanks,
--
Ovidiu Predescu
http://www.geocities.com/SiliconValley/Monitor/7464/

From mal@lemburg.com Fri May 28 08:52:33 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 28 May 1999 09:52:33 +0200
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com> <374D17A7.436D8260@lyra.org>
Message-ID: <374E4B41.13A2A1FD@lemburg.com>

Greg Stein wrote:
>
> M.-A. Lemburg wrote:
> > ...
From mal@lemburg.com Fri May 28 08:52:33 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 28 May 1999 09:52:33 +0200
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com> <374D17A7.436D8260@lyra.org>
Message-ID: <374E4B41.13A2A1FD@lemburg.com>

Greg Stein wrote:
>
> M.-A. Lemburg wrote:
> >...
> > Anyway, the dynamic and static versions are both implementable
> > using the hook, so I'd opt for going into that direction
> > rather than hard-wiring some logic into the interpreter's core.
>
> IMO, the interpreter core should perform as little searching as
> possible. Basically, it should only contain bootstrap stuff. It should
> look for a standard importing module and load that. After it is loaded,
> the import mechanism should defer to Python for all future imports. (the
> cost of running Python code is minimal against the I/O used by the
> import)
>
> IMO #2, the standard importing module should operate along the lines of
> imputil.py.

You mean moving the whole import mechanism away from C and into Python ?
Have you tried such an approach with your imputil.py ?

I wonder whether all the things done in import.c can be coded in Python;
esp. the exotic things like the Windows registry stuff and the Mac fork
munging seem to be C-only (at least as long as there are no core Python
APIs for these C calls).

And just curious: why did Guido recode ni.py in C if he could have used
ni.py in your proposed way instead ?

Cheers,
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 217 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal@lemburg.com Fri May 28 09:02:42 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 28 May 1999 10:02:42 +0200
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com> <14157.23376.400325.179191@weyr.cnri.reston.va.us>
Message-ID: <374E4DA2.45B0E797@lemburg.com>

Fred L. Drake wrote:
>
> M.-A. Lemburg writes:
> > This seems like something to worry about and probably also enough
> > to try really hard to find a good solution, IMHO.
>
> This is where a good caching system makes a lot of sense.
>
> > True, that's why the hook allows you to code the strategy in
> > Python. Note that my current version uses the sys.path as
> > key into a table of name:file mappings, so even when using
> > different setups (which will certainly have some differences in
> > sys.path), the cache should work. Maybe one should add some
> > more information to the key... like the platform specifics
> > or even the mtimes of the directories on the path.
>
> I'm not sure that keying on sys.path is sufficient. Around here, a
> Solaris/SPARC and Solaris/x86 box are likely to share the same
> sys.path. That doesn't mean the directories are the same; the
> differences are taken care of via NFS. Using the mtimes as part of
> the key means you don't have any way to clear the cache: an older
> mtime may just mean the version of the path for a different platform,
> which still wants to use the cache! Perhaps it could be keyed on
> (platform, dir), and the mtimes could be used to determine the need to
> refresh that directory.
> Doing this right is hard, and can be substantially affected by a
> site's filesystem layout. Avoiding problems due to issues like these
> is a good reason to use a runtime-only cache. A site for which this
> isn't sufficient can then use the "hook" mechanism to install something
> that can do better within the context of specific filesystem
> management policies.

Right, and that's the key point in optionally moving (at least) the lookup
machinery into Python. Admins could then use site.py to add optimized
lookup cache implementations for their site. The default implementation
should probably be some sort of dynamic cache like the one you sketched
below.
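(The site-configurable lookup MAL wants here is, in outline, what today's
sys.path_hooks provides, and importlib's FileFinder is essentially the
dynamic cache under discussion: a per-directory listing invalidated by the
directory's mtime. A sketch of what a site.py could do in modern Python --
illustrative only, and not the 1999-era hook he is actually patching in:)

    import sys
    from importlib.machinery import (FileFinder, SourceFileLoader,
                                     SourcelessFileLoader,
                                     ExtensionFileLoader,
                                     EXTENSION_SUFFIXES, SOURCE_SUFFIXES,
                                     BYTECODE_SUFFIXES)

    def install_cached_finder():
        # FileFinder caches each directory's listing and refreshes it
        # when the directory's mtime changes -- the runtime-only cache
        # argued for in this thread, installed site-wide from site.py.
        loader_details = [
            (ExtensionFileLoader, EXTENSION_SUFFIXES),
            (SourceFileLoader, SOURCE_SUFFIXES),
            (SourcelessFileLoader, BYTECODE_SUFFIXES),
        ]
        sys.path_hooks.insert(0, FileFinder.path_hook(*loader_details))
        # Drop finders already built so new imports use the new hook.
        sys.path_importer_cache.clear()

    install_cached_finder()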
> > Yep, remember that too. The problem with these scans is that
> > directories may contain huge amounts of files and you would
> > need to check all of them against the module extensions Python
> They probably won't contain much other than Python modules in a
> reasonable installation. There's no need to filter the list; just
> include every file, and then test for the appropriate entries when
> attempting a specific import. This limits the up-front cost
> substantially.

Ok, point taken.

> If we don't assume a reasonable installation (non-module files in
> the module dirs), it just gets slower and people have an incentive to
> clean up their installation. This is acceptable.

True.

> > Anyway, the dynamic and static versions are both implementable
> > using the hook, so I'd opt for going into that direction
> > rather than hard-wiring some logic into the interpreter's core.
>
> I have no problems with using a "hook" to implement a more efficient
> mechanism. I just want the "standard" mechanism to be efficient,
> because that's the one I'll use.

The hook idea makes the implementation a little more open. Still, I think
that even the "standard" lookup/caching scheme should be implemented in
Python.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 217 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From gstein@lyra.org Fri May 28 22:24:16 1999
From: gstein@lyra.org (Greg Stein)
Date: Fri, 28 May 1999 14:24:16 -0700
Subject: [Distutils] extensions in packages
References: <374BA5C5.706EFB1E@lemburg.com> <14155.63283.936747.273032@weyr.cnri.reston.va.us> <374C20DD.53B458DC@lemburg.com> <14156.28035.739990.919106@weyr.cnri.reston.va.us> <374D0077.2699505F@lemburg.com> <374D17A7.436D8260@lyra.org> <374E4B41.13A2A1FD@lemburg.com>
Message-ID: <374F0980.171E34F6@lyra.org>

M.-A. Lemburg wrote:
>
> Greg Stein wrote:
> >...
> > IMO, the interpreter core should perform as little searching as
> > possible. Basically, it should only contain bootstrap stuff. It should
> > look for a standard importing module and load that. After it is loaded,
> > the import mechanism should defer to Python for all future imports. (the
> > cost of running Python code is minimal against the I/O used by the
> > import)
> >
> > IMO #2, the standard importing module should operate along the lines of
> > imputil.py.
>
> You mean moving the whole import mechanism away from C and into
> Python ? Have you tried such an approach with your imputil.py ?

Yes and yes.

Using Python's import hook effectively means that you completely take over
Python's import mechanism (one of its failings, IMO). imputil.py is
designed to provide for iterating through a list of importers, looking for
one that works. In any case... yes, I've used imputil to the exclusion of
Python's import logic.

You still need imp.new_module() and imp.get_magic(). But that does imply
that you can axe a lot of stuff outside of that. My tests don't have
loading of dynamic libraries, so you would still need an imp function to
load that (but strip the *searching* for the module).
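(imputil's actual interface isn't shown anywhere in the thread, so the
following is only the shape of the list-based scheme Greg describes, in
present-day Python: a chain of importer objects tried in order, with
types.ModuleType standing in for the imp.new_module() call he mentions.
All names are illustrative.)

    import os
    import sys
    import types

    class SourceDirImporter:
        """Illustrative importer that loads .py files from one
        directory; a chain of these replaces the built-in search."""
        def __init__(self, directory):
            self.directory = directory

        def import_module(self, name):
            path = os.path.join(self.directory, name + ".py")
            if not os.path.exists(path):
                return None                  # let the next importer try
            module = types.ModuleType(name)  # the job imp.new_module() did
            module.__file__ = path
            sys.modules[name] = module
            with open(path) as f:
                exec(compile(f.read(), path, "exec"), module.__dict__)
            return module

    importers = [SourceDirImporter(d) for d in sys.path if os.path.isdir(d)]

    def chained_import(name):
        # Try each importer in order -- the list-based approach Greg
        # favors over the single, winner-takes-all import hook.
        for importer in importers:
            module = importer.import_module(name)
            if module is not None:
                return module
        raise ImportError("no importer could load %r" % name)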
> I wonder whether all the things done in import.c can be coded in Python;
> esp. the exotic things like the Windows registry stuff and the Mac fork
> munging seem to be C-only (at least as long as there are no core Python
> APIs for these C calls).

win32api provides Registry access, so you just have to bootstrap that.

I haven't tried to remove a lot of Python's logic, so I can't say what can
actually be tossed, kept around, or just restructured a bit. IMO, the best
thing to do is to expose a few minimal functions and defer to Python.

> And just curious: why did Guido recode ni.py in C if he could have
> used ni.py in your proposed way instead ?

For two reasons that I can think of:

1) people had to manually import "ni"
2) it took over the import hook, which effectively prevents further use of
   it (or if somebody *did* use it, they would wipe out ni's functionality;
   again, this is why I dislike the current hook approach and like a
   list-based approach, which is possible via imputil)

And rather than respond to Fred's note in a separate thread, I'll tie it in
here:

Frankly: Fred is off-base on the need to "recode in C for efficiency". That
is a bogus argument. The cost is I/O operations, not the interpreter
overhead. You will gain no real benefit by moving the import mechanism to
C. C is *only* required to access the operating system in ways that are not
already available in the core, or which you cannot effectively bootstrap.

Python should strip away all of its C-based code for packages and for
locating modules. That should all move to Python. All that should remain in
C is the necessary functions for importing dynamic modules.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
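(History largely sided with this position: in today's CPython the search
and package logic live in Python code in importlib, and the C core exposes
only a handful of primitives for the steps that genuinely need C, such as
loading a shared library. A sketch of the resulting split, using the modern
importlib API; the helper name load_dynamic is illustrative:)

    from importlib.machinery import ExtensionFileLoader
    from importlib.util import module_from_spec, spec_from_file_location

    def load_dynamic(name, path):
        # All searching happened elsewhere, in Python; this wrapper only
        # performs the one step that needs C (ExtensionFileLoader calls
        # into the private _imp module to load the shared library).
        loader = ExtensionFileLoader(name, path)
        spec = spec_from_file_location(name, path, loader=loader)
        module = module_from_spec(spec)
        loader.exec_module(module)
        return module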